Results 1-25 (869)

1.  Polar Bears Exhibit Genome-Wide Signatures of Bioenergetic Adaptation to Life in the Arctic Environment 
Genome Biology and Evolution  2014;6(2):433-450.
Polar bears (Ursus maritimus) face extremely cold temperatures and periods of fasting, which might result in more severe energetic challenges than those experienced by their sister species, the brown bear (U. arctos). We have examined the mitochondrial and nuclear genomes of polar and brown bears to investigate whether polar bears demonstrate lineage-specific signals of molecular adaptation in genes associated with cellular respiration/energy production. We observed increased evolutionary rates in the mitochondrial cytochrome c oxidase I gene in polar but not brown bears. An amino acid substitution occurred near the interaction site with a nuclear-encoded subunit of the cytochrome c oxidase complex and was predicted to lead to a functional change, although the significance of this remains unclear. The nuclear genomes of brown and polar bears demonstrate different adaptations related to cellular respiration. Analyses of the genomes of brown bears exhibited substitutions that may alter the function of proteins that regulate glucose uptake, which could be beneficial when feeding on carbohydrate-dominated diets during hyperphagia, followed by fasting during hibernation. In polar bears, genes demonstrating signatures of functional divergence and those potentially under positive selection were enriched in functions related to production of nitric oxide (NO), which can regulate energy production in several different ways. This suggests that polar bears may be able to fine-tune intracellular levels of NO as an adaptive response to control trade-offs between energy production in the form of adenosine triphosphate versus generation of heat (thermogenesis).
PMCID: PMC3942037  PMID: 24504087
cellular respiration; mitochondrial genome; nitric oxide; nuclear genome; oxidative phosphorylation
2.  Genomics of Ecological Adaptation in Cactophilic Drosophila 
Genome Biology and Evolution  2014;7(1):349-366.
Cactophilic Drosophila species provide a valuable model to study gene–environment interactions and ecological adaptation. Drosophila buzzatii and Drosophila mojavensis are two cactophilic species that belong to the repleta group, but have very different geographical distributions and primary host plants. To investigate the genomic basis of ecological adaptation, we sequenced the genome and developmental transcriptome of D. buzzatii and compared its gene content with that of D. mojavensis and two other noncactophilic Drosophila species in the same subgenus. The newly sequenced D. buzzatii genome (161.5 Mb) comprises 826 scaffolds (>3 kb) and contains 13,657 annotated protein-coding genes. Using RNA sequencing data of five life-stages we found expression of 15,026 genes, 80% protein-coding genes, and 20% noncoding RNA genes. In total, we detected 1,294 genes putatively under positive selection. Interestingly, among genes under positive selection in the D. mojavensis lineage, there is an excess of genes involved in metabolism of heterocyclic compounds that are abundant in Stenocereus cacti and toxic to nonresident Drosophila species. We found 117 orphan genes in the shared D. buzzatii–D. mojavensis lineage. In addition, gene duplication analysis identified lineage-specific expanded families with functional annotations associated with proteolysis, zinc ion binding, chitin binding, sensory perception, ethanol tolerance, immunity, physiology, and reproduction. In summary, we identified genetic signatures of adaptation in the shared D. buzzatii–D. mojavensis lineage, and in the two separate D. buzzatii and D. mojavensis lineages. Many of the novel lineage-specific genomic features are promising candidates for explaining the adaptation of these species to their distinct ecological niches.
PMCID: PMC4316639  PMID: 25552534
cactophilic Drosophila; genome sequence; ecological adaptation; positive selection; orphan genes; gene duplication
3.  Patterns of Evolutionary Conservation of Ascorbic Acid-Related Genes Following Whole-Genome Triplication in Brassica rapa 
Genome Biology and Evolution  2014;7(1):299-313.
Ascorbic acid (AsA) is an important antioxidant in plants and an essential vitamin for humans. Extending the study of AsA-related genes from Arabidopsis thaliana to Brassica rapa could shed light on the evolution of AsA in plants and inform crop breeding. In this study, we conducted whole-genome annotation, molecular-evolution and gene-expression analyses of all known AsA-related genes in B. rapa. The nucleobase–ascorbate transporter (NAT) gene family and AsA l-galactose pathway genes were also compared among plant species. Four important insights gained are that: 1) 102 AsA-related genes were identified in B. rapa and they mainly diverged 12–18 Ma accompanied by the Brassica-specific genome triplication event; 2) during their evolution, these AsA-related genes were preferentially retained, consistent with the gene dosage hypothesis; 3) the putative proteins were highly conserved, but their expression patterns varied; and 4) although the number of AsA-related genes is higher in B. rapa than in A. thaliana, the AsA contents and the numbers of expressed genes in leaves of both species are similar; the genes that are not generally expressed may serve as substitutes during emergencies. In summary, this study provides genome-wide insights into evolutionary history and mechanisms of AsA-related genes following whole-genome triplication in B. rapa.
PMCID: PMC4316640  PMID: 25552535
AsA-related genes; Brassica rapa; evolutionary conservation; synteny analysis; gene dosage hypothesis; expression pattern
4.  A Simple Method for Estimating the Strength of Natural Selection on Overlapping Genes 
Genome Biology and Evolution  2014;7(1):381-390.
Overlapping genes, where one DNA sequence codes for two proteins with different reading frames, are not uncommon in viruses and cellular organisms. Estimating the direction and strength of natural selection acting on overlapping genes is important for understanding their functionality, origin, evolution, maintenance, and potential interaction. However, the standard methods for estimating synonymous (dS) and nonsynonymous (dN) nucleotide substitution rates are inapplicable here because a nucleotide change can be simultaneously synonymous and nonsynonymous when both reading frames involved are considered. We have developed a simple method that can estimate dN/dS and test for the action of natural selection in each relevant reading frame of the overlapping genes. Our method is an extension of the modified Nei-Gojobori method previously developed for nonoverlapping genes. We confirmed the reliability of our method using extensive computer simulation. Applying this method, we studied the longest human sense–antisense overlapping gene pair, LRRC8E and ENSG00000214248. Although LRRC8E (leucine-rich repeat containing eight family, member E) is known to regulate cell size, the function of ENSG00000214248 is unknown. Our analysis revealed purifying selection on ENSG00000214248 and suggested that it originated in the common ancestor of bony vertebrates.
PMCID: PMC4316641  PMID: 25552532
synonymous substitution; nonsynonymous substitution; evolution
5.  Extreme Features of the Galdieria sulphuraria Organellar Genomes: A Consequence of Polyextremophily? 
Genome Biology and Evolution  2014;7(1):367-380.
Nuclear genome sequencing from extremophilic eukaryotes has revealed clues about the mechanisms of adaptation to extreme environments, but the functional consequences of extremophily on organellar genomes are unknown. To address this issue, we assembled the mitochondrial and plastid genomes from a polyextremophilic red alga, Galdieria sulphuraria strain 074 W, and performed a comparative genomic analysis with other red algae and more broadly across eukaryotes. The mitogenome is highly reduced in size and genetic content and exhibits the highest guanine–cytosine skew of any known genome and the fastest substitution rate among all red algae. The plastid genome contains a large number of intergenic stem-loop structures but is otherwise rather typical in size, structure, and content in comparison with other red algae. We suggest that these unique genomic modifications result not only from the harsh conditions in which Galdieria lives but also from its unusual capability to grow heterotrophically, endolithically, and in the dark. These conditions place additional mutational pressures on the mitogenome due to the increased reliance on the mitochondrion for energy production, whereas the decreased reliance on photosynthesis and the presence of numerous stem-loop structures may shield the plastome from similar genomic stress.
PMCID: PMC4316638  PMID: 25552531
Galdieria sulphuraria; red algae; facultative heterotrophy; polyextremophily; GC skew; substitution rate
6.  Comprehensive Transcriptome Analysis Reveals Accelerated Genic Evolution in a Tibet Fish, Gymnodiptychus pachycheilus 
Genome Biology and Evolution  2014;7(1):251-261.
Elucidating the genetic mechanisms of organismal adaptation to the Tibetan Plateau at a genomic scale can provide insights into the process of adaptive evolution. Many highland species have been investigated and various candidate genes that may be responsible for highland adaptation have been identified. However, we know little about the genomic basis of adaptation to Tibet in fishes. Here, we performed transcriptome sequencing of a schizothoracine fish (Gymnodiptychus pachycheilus) and used it to identify potential genetic mechanisms of highland adaptation. We obtained a total of 66,105 assembled unigenes, of which 7,232 were assigned as putative one-to-one orthologs in zebrafish. Comparative gene annotations from several species indicated that at least 350 genes were lost and 41 gained since the divergence between G. pachycheilus and zebrafish. An analysis of 6,324 orthologs among zebrafish, fugu, medaka, and spotted gar identified consistent evidence for genome-wide accelerated evolution in G. pachycheilus, and only the terminal branch of G. pachycheilus had a higher Ka/Ks ratio than the ancestral branch. Many functional categories related to hypoxia and energy metabolism exhibited rapid evolution in G. pachycheilus relative to zebrafish. Genes showing signatures of rapid evolution and positive selection in the G. pachycheilus lineage were also enriched in functions associated with energy metabolism and hypoxia. The first genomic resources for fish in the Tibetan Plateau and evolutionary analyses provided some novel insights into highland adaptation in fishes and served as a foundation for future studies aiming to identify candidate genes underlying the genetic bases of adaptation to Tibet in fishes.
PMCID: PMC4316632  PMID: 25543049
Tibetan Plateau; adaptation; positive selection; schizothoracine fish; transcriptome
7.  Octocoral Mitochondrial Genomes Provide Insights into the Phylogenetic History of Gene Order Rearrangements, Order Reversals, and Cnidarian Phylogenetics 
Genome Biology and Evolution  2014;7(1):391-409.
We use full mitochondrial genomes to test the robustness of the phylogeny of the Octocorallia, to determine the evolutionary pathway for the five known mitochondrial gene rearrangements in octocorals, and to test the suitability of using mitochondrial genomes for higher taxonomic-level phylogenetic reconstructions. Our phylogeny supports three major divisions within the Octocorallia and shows that Paragorgiidae is paraphyletic, with Sibogagorgia forming a sister branch to the Coralliidae. Furthermore, Sibogagorgia cauliflora has what is presumed to be the ancestral gene order in octocorals, but the presence of a pair of inverted repeat sequences suggests that this gene order was not conserved but rather evolved back to this apparent ancestral state. Based on this we recommend the resurrection of the family Sibogagorgiidae to fix the paraphyly of the Paragorgiidae.
This is the first study to show that in the Octocorallia, mitochondrial gene orders have evolved back to an ancestral state after going through a gene rearrangement, with at least one of the gene orders evolving independently in different lineages. A number of studies have used gene boundaries to determine the type of mitochondrial gene arrangement present. However, our findings suggest that this method known as gene junction screening may miss evolutionary reversals.
Additionally, substitution saturation analysis demonstrates that while whole mitochondrial genomes can be used effectively for phylogenetic analyses within Octocorallia, their utility at higher taxonomic levels within Cnidaria is inadequate. Therefore for phylogenetic reconstruction at taxonomic levels higher than subclass within the Cnidaria, nuclear genes will be required, even when whole mitochondrial genomes are available.
PMCID: PMC4316637  PMID: 25539723
Octocorallia; deep-sea corals; soft corals; cnidarian phylogenetics; gene rearrangement; substitution saturation
8.  A Neutrality Test for Detecting Selection on DNA Methylation Using Single Methylation Polymorphism Frequency Spectrum 
Genome Biology and Evolution  2014;7(1):154-171.
Inheritable epigenetic mutations (epimutations) can contribute to transmittable phenotypic variation. Thus, epimutations can be subject to natural selection and impact the fitness and evolution of organisms. Based on the framework of the modified Tajima’s D test for DNA mutations, we developed a neutrality test with the statistic “Dm” to detect selection forces on DNA methylation mutations using single methylation polymorphisms. With computer simulation and empirical data analysis, we compared the Dm test with the original and modified Tajima’s D tests and demonstrated that the Dm test is suitable for detecting selection on epimutations and outperforms original/modified Tajima’s D tests. Due to the higher resetting rate of epimutations, the interpretation of Dm on epimutations and Tajima’s D test on DNA mutations could be different in inferring natural selection. Analyses using simulated and empirical genome-wide polymorphism data suggested that genes under genetic and epigenetic selection behaved differently. We applied the Dm test to recently originated Arabidopsis and human genes, and showed that newly evolved genes contain a higher level of rare epialleles, suggesting that epimutation may play a role in the origination and evolution of genes and genomes. Overall, we demonstrate the utility of the Dm test to detect whether loci are under selection regarding DNA methylation. Our analytical metrics and methodology could contribute to our understanding of evolutionary processes of genes and genomes in the field of epigenetics. The Perl script for the “Dm” test is available at (last accessed December 18, 2014).
PMCID: PMC4316624  PMID: 25539727
epigenetics; epimutation; neutrality test; single methylation polymorphism; site frequency; Tajima’s D
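For orientation, the classic Tajima's D statistic that the Dm test builds on can be sketched in a few lines of Python. This is only an illustration of the standard DNA-based statistic, not the authors' Dm test or their Perl script; the function name `tajimas_d` and the input encoding (one equal-length string per sampled chromosome, one character per site, e.g. "1" for methylated or derived and "0" otherwise) are assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def tajimas_d(haplotypes):
    """Classic Tajima's D for a list of equal-length allele strings.

    Hypothetical helper for illustration; each string is one sampled
    chromosome and each character is the allele state at one site.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sequences")
    if len({len(h) for h in haplotypes}) != 1:
        raise ValueError("sequences must have equal length")

    seg_sites = 0      # S: number of segregating (polymorphic) sites
    pair_diffs = 0.0   # total pairwise differences summed over sites
    for column in zip(*haplotypes):
        counts = Counter(column)
        if len(counts) > 1:
            seg_sites += 1
            # number of sequence pairs that differ at this site
            pair_diffs += (n * n - sum(c * c for c in counts.values())) / 2
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of pairwise differences
    pi = pair_diffs / (n * (n - 1) / 2)

    # Standard constants from Tajima (1989)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# An excess of rare variants (every site a singleton) gives a negative D;
# an excess of intermediate-frequency variants gives a positive D.
rare = ["1000", "0100", "0010", "0001"]
balanced = ["1100", "1010", "0101", "0011"]
print(tajimas_d(rare), tajimas_d(balanced))
```

A negative value reflects an excess of rare alleles, which is the pattern the abstract reports for rare epialleles in newly evolved genes, although the abstract notes that Dm and Tajima's D may need to be interpreted differently.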
9.  Evolution of Spatially Coexpressed Families of Type-2 Vomeronasal Receptors in Rodents 
Genome Biology and Evolution  2014;7(1):272-285.
The vomeronasal organ (VNO) is an olfactory structure for the detection of pheromones. VNO neurons express three groups of unrelated G-protein-coupled receptors. Type-2 vomeronasal receptors (V2Rs) are specifically localized in the basal neurons of the VNO and are believed to sense protein pheromones eliciting specific reproductive behaviors. In murine species, V2Rs are organized into four families. Family-ABD V2Rs are expressed monogenically and coexpress with family-C V2Rs of either subfamily C1 (V2RC1) or subfamily C2 (V2RC2), according to a coordinate temporal diagram. Neurons expressing the phylogenetically ancient V2RC1 coexpress family-BD V2Rs or a specific group of subfamily-A V2Rs (V2RA8-10), whereas a second neuronal subset (V2RC2-positive) coexpresses a recently expanded group of five subfamily-A V2Rs (V2RA1-5) along with vomeronasal-specific Major Histocompatibility Complex molecules (H2-Mv). Through database mining and Sanger sequencing, we have analyzed the onset, diversification, and expansion of the V2R-families throughout the phylogeny of Rodentia. Our results suggest that the separation of V2RC1 and V2RC2 occurred in a Cricetidae ancestor in coincidence with the evolution of the H2-Mv genes; this phylogenetic event did not correspond with the origin of the coexpressing V2RA1-5 genes, which dates back to an ancestral myomorphan lineage. Interestingly, the evolution of receptors within the V2RA1-5 group may be implicated in the origin and diversification of some of the V2R putative cognate ligands, the exocrine secreting peptides. The establishment of V2RC2, which probably reflects the complex expansion and diversification of family-A V2Rs, generated receptors that have probably acquired a more subtle functional specificity.
PMCID: PMC4316634  PMID: 25539725
vomeronasal; pheromones; chemosensory; evolution; phylogeny; rodents
10.  Mutation Rate, Spectrum, Topology, and Context-Dependency in the DNA Mismatch Repair-Deficient Pseudomonas fluorescens ATCC948 
Genome Biology and Evolution  2014;7(1):262-271.
High levels of genetic diversity exist among natural isolates of the bacterium Pseudomonas fluorescens, and are especially elevated around the replication terminus of the genome, where strain-specific genes are found. In an effort to understand the role of genetic variation in the evolution of Pseudomonas, we analyzed 31,106 base substitutions from 45 mutation accumulation lines of P. fluorescens ATCC948, naturally deficient for mismatch repair, yielding a base-substitution mutation rate of 2.34 × 10⁻⁸ per site per generation (SE: 0.01 × 10⁻⁸) and a small-insertion-deletion mutation rate of 1.65 × 10⁻⁹ per site per generation (SE: 0.03 × 10⁻⁹). We find that the spectrum of mutations in prophage regions, which often contain virulence factors and antibiotic resistance, is highly similar to that in the intergenic regions of the host genome. Our results show that the mutation rate varies around the chromosome, with the lowest mutation rate found near the origin of replication. Consistent with observations from other studies, we find that site-specific mutation rates are heavily influenced by the immediately flanking nucleotides, indicating that mutations are context dependent.
PMCID: PMC4316635  PMID: 25539726
neutral evolution; mutation hotspots; nonrandom mutations; phage evolution
11.  More than Skin Deep: Functional Genomic Basis for Resistance to Amphibian Chytridiomycosis 
Genome Biology and Evolution  2014;7(1):286-298.
The amphibian-killing chytrid fungus Batrachochytrium dendrobatidis (Bd) is one of the most generalist pathogens known, capable of infecting hundreds of species globally and causing widespread population declines and extinctions. However, some host species are seemingly unaffected by Bd, tolerating or clearing infections without clinical signs of disease. Variation in host immune responses is commonly invoked for these resistant or tolerant species, yet to date, we have no direct comparison of amphibian species' responses to infection at the level of gene expression. In this study, we challenged four Central American frog species that vary in Bd susceptibility with a sympatric virulent strain of the pathogen. We compared skin and spleen orthologous gene expression using differential expression tests and coexpression gene network analyses. We found that resistant species have reduced skin inflammatory responses and increased expression of genes involved in skin integrity. In contrast, only highly susceptible species exhibited suppression of splenic T-cell genes. We conclude that resistance to chytridiomycosis may be related to a species’ ability to escape the immunosuppressive activity of the fungus. Moreover, our results indicate that within-species differences in splenic proteolytic enzyme gene expression may contribute to intraspecific variation in survival. This first comparison of amphibian functional immunogenomic architecture in response to Bd provides insights into key genetic mechanisms underlying variation in disease outcomes among amphibian species.
PMCID: PMC4316636  PMID: 25539724
Batrachochytrium dendrobatidis; immunogenomics; comparative transcriptomics; immunosuppression; amphibian immunity
12.  Recent Coselection in Human Populations Revealed by Protein–Protein Interaction Network 
Genome Biology and Evolution  2014;7(1):136-153.
Genome-wide scans for signals of natural selection in human populations have identified a large number of candidate loci that underlie local adaptations. This is surprising given the relatively short evolutionary time since the divergence of the human population. One hypothesis that has not been formally examined is whether and how recent human evolution may have been shaped by coselection in the context of a complex molecular interactome. In this study, genome-wide signals of selection were scanned in East Asians, Europeans, and Africans using 1000 Genomes data, and subsequently mapped onto the protein–protein interaction (PPI) network. We found that the candidate genes of recent positive selection localized significantly closer to each other on the PPI network than expected, revealing substantial clustering of selected genes. Furthermore, gene pairs of shorter PPI network distances showed higher similarities of their recent evolutionary paths than those further apart. Last, subnetworks enriched with recent coselection signals were identified, which are substantially overrepresented in biological pathways related to signal transduction, neurogenesis, and immune function. These results provide the first genome-wide evidence for association of recent selection signals with the PPI network, shedding light on the potential mechanisms of recent coselection in the human genome.
PMCID: PMC4316623  PMID: 25532814
recent positive selection; PPI network; network topology; coselection; coevolution; pathway selection
13.  Evolutionary Dynamics of hAT DNA Transposon Families in Saccharomycetaceae 
Genome Biology and Evolution  2014;7(1):172-190.
Transposable elements (TEs) are widespread in eukaryotes but uncommon in yeasts of the Saccharomycotina subphylum, in terms of both host species and genome fraction. The class II elements are especially scarce, but the hAT element Rover is a noteworthy exception that deserves further investigation.
Here, we conducted a genome-wide analysis of hAT elements in 40 ascomycota. A novel family, Roamer, was found in three species, whereas Rover was detected in 15 preduplicated species from Kluyveromyces, Eremothecium, and Lachancea genera, with up to 41 copies per genome. Rover acquisition seems to have occurred by horizontal transfer in a common ancestor of these genera. The detection of remote Rover copies in Naumovozyma dairenensis and in the sole Saccharomyces cerevisiae strain AWRI1631, without synteny, suggests that two additional independent horizontal transfers took place toward these genomes. Such patchy distribution of elements prevents any anticipation of TE presence in incoming sequenced genomes, even closely related ones.
The presence of both putative autonomous and defective Rover copies, as well as their diversification into five families, indicates particular dynamics of Rover elements in the Lachancea genus. In particular, we discovered the first miniature inverted-repeat transposable elements (MITEs) to be described in yeasts, together with their parental autonomous copies. Evidence of MITE insertion polymorphism among Lachancea waltii strains suggests their recent activity. Moreover, 40% of Rover copies appeared to be involved in chromosome rearrangements, showing the large structural impact of TEs on yeast genomes and opening the door to further investigations to understand their functional and evolutionary consequences.
PMCID: PMC4316626  PMID: 25532815
yeast; MITE; evolution; Rover; Roamer; horizontal transfer
14.  Sequence Diversity of Pan troglodytes Subspecies and the Impact of WFDC6 Selective Constraints in Reproductive Immunity 
Genome Biology and Evolution  2013;5(12):2512-2523.
Recent efforts have attempted to describe the population structure of the common chimpanzee, focusing on four subspecies: Pan troglodytes verus, P. t. ellioti, P. t. troglodytes, and P. t. schweinfurthii. However, few studies have pursued the effects of natural selection in shaping their response to pathogens and reproduction. Whey acidic protein (WAP) four-disulfide core domain (WFDC) genes and neighboring semenogelin (SEMG) genes encode proteins with combined roles in immunity and fertility. They display a strikingly high rate of amino acid replacement (dN/dS), indicative of adaptive pressures during primate evolution. In human populations, three signals of selection at the WFDC locus were described, possibly influencing the proteolytic profile and antimicrobial activities of the male reproductive tract. To evaluate the patterns of genomic variation and selection at the WFDC locus in chimpanzees, we sequenced 17 WFDC genes and 47 autosomal pseudogenes in 68 chimpanzees (15 P. t. troglodytes, 22 P. t. verus, and 31 P. t. ellioti). We found a clear differentiation of P. t. verus and estimated the divergence of the P. t. troglodytes and P. t. ellioti subspecies at 0.173 Myr; further, at the WFDC locus we identified a signature of strong selective constraints common to the three subspecies in WFDC6, a recent paralog of the epididymal protease inhibitor EPPIN. Overall, chimpanzees and humans do not display similar footprints of selection across the WFDC locus, possibly due to different selective pressures between the two species related to immune response and reproductive biology.
PMCID: PMC3879984  PMID: 24356879
WFDC; natural selection; chimpanzees; serine protease inhibitor; reproduction; innate immunity
15.  The Mitochondrial Genome of the Glomeromycete Rhizophagus sp. DAOM 213198 Reveals an Unusual Organization Consisting of Two Circular Chromosomes 
Genome Biology and Evolution  2014;7(1):96-105.
Mitochondrial (mt) genomes are intensively studied in Ascomycota and Basidiomycota, but they are poorly documented in basal fungal lineages. In this study, we sequenced the complete mtDNA of Rhizophagus sp. DAOM 213198, a close relative of Rhizophagus irregularis, a widespread, ecologically and economically relevant species belonging to Glomeromycota. Unlike all other known taxonomically close relatives harboring a full-length circular chromosome, mtDNA of Rhizophagus sp. reveals an unusual organization with two circular chromosomes of 61,964 and 29,078 bp. The large chromosome contained nine protein-coding genes (atp9, nad5, cob, nad4, nad1, nad4L, cox1, cox2, and atp8), the small subunit rRNA gene (rns), 20 tRNA-coding genes, and 10 orfs, while the small chromosome contained five protein-coding genes (atp6, nad2, nad3, nad6, and cox3), the large subunit rRNA gene (rnl), 5 tRNA-coding genes, and 8 plasmid-related DNA polymerases (dpo). Although structural variation of plant mt genomes is well documented, this study is the first report of the presence of two circular mt genomes in arbuscular mycorrhizal fungi. Interestingly, the presence of dpo at the breakage point in intergenes cox1-cox2 and rnl-atp6 for large and small mtDNAs, respectively, could be responsible for the conversion of Rhizophagus sp. mtDNA into two chromosomes. Using quantitative real-time polymerase chain reaction, we found that both mtDNAs have an equal abundance. This study reports a novel mtDNA organization in Glomeromycota and highlights the importance of studying early divergent fungal lineages to describe novel evolutionary pathways in the fungal kingdom.
PMCID: PMC4316621  PMID: 25527840
mitochondrial genome; genome sequencing; basal fungal lineages; fungi; plasmid-like DNA polymerase genes (dpo); arbuscular mycorrhizal fungi; Glomeromycota; Rhizophagus
16.  Contrasting Inter- and Intraspecies Recombination Patterns in the “Harveyi Clade” Vibrio Collected over Large Spatial and Temporal Scales 
Genome Biology and Evolution  2014;7(1):71-80.
Recombination plays an important role in the divergence of bacteria, but the frequency of interspecies and intraspecies recombination events remains poorly understood. We investigated recombination events that occurred within core genomes of 35 Vibrio strains (family Vibrionaceae, Gammaproteobacteria), from six closely related species in the so-called “Harveyi clade.” The strains were selected from a collection of strains isolated in the last 90 years, from various environments worldwide. We found a close relationship between the number of interspecies recombination events within core genomes of the 35 strains and the overall genomic identity, as inferred from calculations of the average nucleotide identity. The relationship between the overall nucleotide identity and the number of detected interspecies recombination events was comparable when analyzing strains isolated over 80 years apart, from different hemispheres, or from different ecologies, as well as in strains isolated from the same geographic location within a short time frame. We further applied the same method of detecting recombination events to analyze 11 strains of Vibrio campbellii, and identified a disproportionately high number of intraspecies recombination events within the core genomes of some, but not all, strains. The high number of recombination events was detected between V. campbellii strains that have significant temporal (over 18 years) and geographical (over 10,000 km) differences in their origins of isolation. Results of this study reveal a remarkable stability of Harveyi clade species, and give clues about the origins and persistence of species in the clade.
PMCID: PMC4316622  PMID: 25527835
Vibrio; recombination; bacterial species definition; bacterial speciation
17.  Rooting the Domain Archaea by Phylogenomic Analysis Supports the Foundation of the New Kingdom Proteoarchaeota 
Genome Biology and Evolution  2014;7(1):191-204.
The first 16S rRNA-based phylogenies of the Archaea showed a deep division between two groups, the kingdoms Euryarchaeota and Crenarchaeota. This bipartite classification has been challenged by the recent discovery of new deeply branching lineages (e.g., Thaumarchaeota, Aigarchaeota, Nanoarchaeota, Korarchaeota, Parvarchaeota, Aenigmarchaeota, Diapherotrites, and Nanohaloarchaeota) which have also been given the same taxonomic status of kingdoms. However, the phylogenetic position of some of these lineages is controversial. In addition, phylogenetic analyses of the Archaea have often been carried out without outgroup sequences, making it difficult to determine if these taxa actually define lineages at the same level as the Euryarchaeota and Crenarchaeota. We have addressed the question of the position of the root of the Archaea by reconstructing rooted archaeal phylogenetic trees using bacterial sequences as an outgroup. These trees were based on commonly used conserved protein markers (32 ribosomal proteins) as well as on 38 new markers identified through phylogenomic analysis. We thus gathered a total of 70 conserved markers that we analyzed as a concatenated data set. In contrast with previous analyses, our trees consistently placed the root of the archaeal tree between the Euryarchaeota (including the Nanoarchaeota and other fast-evolving lineages) and the rest of the archaeal species, which we propose to classify within the new kingdom Proteoarchaeota. This implies the relegation of several groups previously classified as kingdoms (e.g., Crenarchaeota, Thaumarchaeota, Aigarchaeota, and Korarchaeota) to a lower taxonomic rank. In addition to taxonomic implications, this profound reorganization of the archaeal phylogeny also has consequences for our appraisal of the nature of the last archaeal ancestor, which most likely was a complex organism with a gene-rich genome.
PMCID: PMC4316627  PMID: 25527841
Archaea; Euryarchaeota; Proteoarchaeota; root; phylogenomics
18.  Intraisolate Mitochondrial Genetic Polymorphism and Gene Variants Coexpression in Arbuscular Mycorrhizal Fungi 
Genome Biology and Evolution  2014;7(1):218-227.
Arbuscular mycorrhizal fungi (AMF) are multinucleated and coenocytic organisms, in which the extent of the intraisolate nuclear genetic variation has been a source of debate. Conversely, their mitochondrial genomes (mtDNAs) have appeared to be homogeneous within isolates in all next generation sequencing (NGS)-based studies. Although several lines of evidence have challenged mtDNA homogeneity in AMF, an extensive survey of intraisolate allelic diversity has not previously been undertaken. In this study, we used a conventional polymerase chain reaction-based approach on selected mitochondrial regions with a high-fidelity DNA polymerase, followed by cloning and Sanger sequencing. Two isolates of Rhizophagus irregularis were used, one cultivated in vitro for several generations (DAOM-197198) and the other recently isolated from the field (DAOM-242422). At different loci in both isolates, we found intraisolate allelic variation within the mtDNA and in a single copy nuclear marker, which highlighted the presence of several nonsynonymous mutations in protein coding genes. We confirmed that some of this variation persisted in the transcriptome, giving rise to at least four distinct nad4 transcripts in DAOM-197198. We also detected the presence of numerous mitochondrial DNA copies within nuclear genomes (numts), providing insights into this important evolutionary process in AMF. Our study reveals that genetic variation in Glomeromycota is higher than what had been previously assumed and also suggests that it could have been grossly underestimated in most NGS-based AMF studies, both in mitochondrial and nuclear genomes, due to the presence of low-level mutations.
PMCID: PMC4316628  PMID: 25527836
arbuscular mycorrhizal fungi; mitochondria; heteroplasmy; NGS and Sanger sequencing; polymorphism; gene variants coexpression
19.  Biased Gene Conversion and GC-Content Evolution in the Coding Sequences of Reptiles and Vertebrates 
Genome Biology and Evolution  2014;7(1):240-250.
Mammalian and avian genomes are characterized by a substantial spatial heterogeneity of GC-content, which is often interpreted as reflecting the effect of local GC-biased gene conversion (gBGC), a meiotic repair bias that favors G and C over A and T alleles in high-recombining genomic regions. Surprisingly, the first fully sequenced nonavian sauropsid (i.e., reptile), the green anole Anolis carolinensis, revealed a highly homogeneous genomic GC-content landscape, suggesting the possibility that gBGC might not be at work in this lineage. Here, we analyze GC-content evolution at third-codon positions (GC3) in 44 vertebrate species, including eight newly sequenced transcriptomes, with a specific focus on nonavian sauropsids. We report that reptiles, including the green anole, have a genome-wide distribution of GC3 similar to that of mammals and birds, and we infer a strong GC3-heterogeneity to be already present in the tetrapod ancestor. We further show that the dynamics of coding sequence GC-content are largely governed by karyotypic features in vertebrates, notably in the green anole, in agreement with the gBGC hypothesis. The discrepancy between third-codon positions and noncoding DNA regarding GC-content dynamics in the green anole could not be explained by the activity of transposable elements or selection on codon usage. This analysis highlights the unique value of third-codon positions as an insertion/deletion-free marker of nucleotide substitution biases that ultimately affect the evolution of proteins.
PMCID: PMC4316630  PMID: 25527834
third-codon positions; phylogeny; karyotype
20.  The Bimodal Distribution of Genic GC Content Is Ancestral to Monocot Species 
Genome Biology and Evolution  2014;7(1):336-348.
In grasses such as rice or maize, the distribution of genic GC content is well known to be bimodal. It is mainly driven by GC content at third codon positions (GC3 for short). This feature is thought to be specific to grasses, as closely related species like banana have a unimodal GC3 distribution. GC3 is associated with numerous genomic features, and uncovering the origin of this peculiar distribution will help in understanding the potential roles and consequences of GC3 variations within and between genomes. Until recently, the origin of the peculiar GC3 distribution in grasses has remained unknown. Thanks to the recent publication of several complete genomes and transcriptomes of nongrass monocots, we studied more than 1,000 groups of one-to-one orthologous genes in seven grasses and three outgroup species (banana, palm tree, and yam). Using a maximum likelihood-based method, we reconstructed GC3 at several ancestral nodes. We found that the bimodal GC3 distribution observed in extant grasses is ancestral to both grasses and most monocot species, and that other species studied here have lost this peculiar structure. We also found that GC3 in grass lineages is globally evolving very slowly and that the decreasing GC3 gradient observed from 5′ to 3′ along coding sequences is also conserved and ancestral to monocots. This result strongly challenges the previous views on the specificity of grass genomes and we discuss its implications for the possible causes of the evolution of GC content in monocots.
PMCID: PMC4316631  PMID: 25527839
GC content; coding regions; monocotyledons; ancestral reconstructions; GC gradient
21.  Satellite DNA as a Driver of Population Divergence in the Red Flour Beetle Tribolium castaneum 
Genome Biology and Evolution  2014;7(1):228-239.
Tandemly repeated satellite DNAs are among the most rapidly evolving sequences in eukaryotic genomes, usually differing significantly among closely related species. By inducing changes in heterochromatin and/or the centromere, satellite DNAs are expected to drive population and species divergence. However, despite high evolutionary dynamics, divergence of satellite DNA profiles at the level of natural populations, which precedes and possibly triggers the speciation process, is not readily detected. Here, we characterize the minor TCAST2 satellite DNA of the red flour beetle Tribolium castaneum and follow its dynamics among wild-type strains originating from diverse geographic locations. The investigation revealed the presence of three distinct subfamilies of TCAST2 satellite DNA which differ in monomer size, genome organization, and subfamily-specific mutations. Subfamilies Tcast2a and Tcast2b are tandemly arranged within pericentromeric heterochromatin whereas Tcast2c is preferentially dispersed within euchromatin of all chromosomes. Among strains, TCAST2 subfamilies are conserved in sequence but exhibit significant content variability. This results in overrepresentation or almost complete absence of a particular subfamily in some strains and enables discrimination between strains. It is proposed that homologous recombination, probably stimulated by environmental stress, is responsible for the emergence of TCAST2 satellite subfamilies, their copy number variation, and their dispersion within the genome. The results represent the first evidence for the existence of population-specific satellite DNA profiles. Partial organization of TCAST2 satellite DNA in the form of single repeats dispersed within euchromatin additionally contributes to genome divergence at the population level.
PMCID: PMC4316633  PMID: 25527837
satellite DNA; repetitive DNA; genome evolution; heterochromatin; population divergence; Tribolium castaneum
22.  The Secreted Proteins of Achlya hypogyna and Thraustotheca clavata Identify the Ancestral Oomycete Secretome and Reveal Gene Acquisitions by Horizontal Gene Transfer 
Genome Biology and Evolution  2014;7(1):120-135.
Saprotrophic and parasitic microorganisms secrete proteins into the environment to break down macromolecules and obtain nutrients. The molecules secreted are collectively termed the “secretome” and the composition and function of this set of proteins vary depending on the ecology, life cycle, and environment of an organism. Beyond the function of nutrient acquisition, parasitic lineages must also secrete molecules to manipulate their host. Here, we use a combination of de novo genome and transcriptome sequencing and bioinformatic identification of signal peptides to identify the putative secreted proteome of two oomycetes, the facultative parasite Achlya hypogyna and free-living Thraustotheca clavata. By comparing the secretomes of these saprolegnialean oomycetes with those of eight other oomycetes, we were able to characterize the evolution of this protein set across the oomycete clade. These species span the last common ancestor of the two major oomycete families, allowing us to identify the ancestral secretome. This putative ancestral secretome consists of at least 84 gene families. Only 11 of these gene families are conserved across all 10 secretomes analyzed and the two major branches in the oomycete radiation. Notably, we have identified expressed elicitin-like effector genes in the saprotrophic decomposer, T. clavata. Phylogenetic analyses show six novel horizontal gene transfers to the oomycete secretome from bacterial and fungal donor lineages, four of which are specific to the Saprolegnialeans. Comparisons between free-living and pathogenic taxa highlight the functional changes of oomycete secretomes associated with shifts in lifestyle.
PMCID: PMC4316629  PMID: 25527045
oomycete; horizontal gene transfer; evolution; comparative genomics; osmotrophy
23.  Genome-Wide Patterns of Genetic Polymorphism and Signatures of Selection in Plasmodium vivax 
Genome Biology and Evolution  2014;7(1):106-119.
Plasmodium vivax is the most prevalent human malaria parasite outside of Africa. Yet, studies aiming to identify genes with signatures consistent with natural selection are rare. Here, we present a comparative analysis of the pattern of genetic variation of five sequenced isolates of P. vivax and its divergence from two closely related species, Plasmodium cynomolgi and Plasmodium knowlesi, using a set of orthologous genes. In contrast to Plasmodium falciparum, the parasite that causes the most lethal form of human malaria, we did not find significant constraints on the evolution of synonymous sites genome wide in P. vivax. The comparative analysis of polymorphism and divergence across loci allowed us to identify 87 genes with patterns consistent with positive selection, including genes involved in the “exportome” of P. vivax, which are potentially involved in evasion of the host immune system. Nevertheless, we have found a pattern of polymorphism genome wide that is consistent with a significant amount of constraint on the replacement changes and prevalent negative selection. Our analyses also show that silent polymorphism tends to be larger toward the ends of the chromosomes, where many genes involved in antigenicity are located, suggesting that natural selection acts not only by shaping the patterns of variation within the genes but also by affecting genome organization.
PMCID: PMC4316620  PMID: 25523904
Plasmodium; natural selection; genome variation; Plasmodium vivax; genome architecture
24.  Deciphering the Genome Repertoire of Pseudomonas sp. M1 toward β-Myrcene Biotransformation 
Pseudomonas sp. M1 is able to mineralize several unusual substrates of natural and xenobiotic origin, contributing to its competence to thrive in different ecological niches. In this work, the genome of the M1 strain was resequenced by Illumina MiSeq to refine the quality of a published draft by resolving the majority of repeat-rich regions. In silico genome analysis led to the prediction of metabolic pathways involved in biotransformation of several unusual substrates (e.g., plant-derived volatiles), providing clues on the genomic complement required for such biodegrading/biotransformation functionalities. Pseudomonas sp. M1 exhibits a particular sensory and biotransformation/biocatalysis potential toward β-myrcene, a terpene widely used in industries worldwide. Therefore, the genomic responsiveness of the M1 strain toward β-myrcene was investigated using an RNA sequencing approach. M1 cells challenged with β-myrcene (compared with cells grown in lactate) undergo an extensive alteration of the transcriptome expression profile, including 1,873 genes evidencing at least 1.5-fold altered expression (627 upregulated and 1,246 downregulated), toward β-myrcene-imposed molecular adaptation and cellular specialization. A thorough data analysis identified a novel 28-kb genomic island, whose expression was strongly stimulated in β-myrcene-supplemented medium, that is essential for β-myrcene catabolism. This island includes β-myrcene-induced genes whose products are putatively involved in 1) substrate sensing, 2) gene expression regulation, and 3) β-myrcene oxidation and bioconversion of β-myrcene derivatives into central metabolism intermediates. In general, this locus does not show high homology with sequences available in databases and seems to have evolved through the assembly of several functional blocks acquired from different bacteria, probably at different evolutionary stages.
PMCID: PMC4316614  PMID: 25503374
genome sequencing; biocatalysis; terpenes; Pseudomonas; RNA-seq; genomic island
25.  Multiple Lineages of Ancient CR1 Retroposons Shaped the Early Genome Evolution of Amniotes 
Genome Biology and Evolution  2014;7(1):205-217.
Chicken repeat 1 (CR1) retroposons are long interspersed elements (LINEs) that are ubiquitous within amniote genomes and constitute the most abundant family of transposed elements in birds, crocodilians, turtles, and snakes. They are also present in mammalian genomes, where they reside as numerous relics of ancient retroposition events. Yet, despite their relevance for understanding amniote genome evolution, the diversity and evolution of CR1 elements have never been studied on an amniote-wide level. We reconstruct the temporal and quantitative activity of CR1 subfamilies via presence/absence analyses across crocodilian phylogeny and comparative analyses of 12 crocodilian genomes, revealing relative genomic stasis of retroposition during genome evolution of extant Crocodylia. Our large-scale phylogenetic analysis of amniote CR1 subfamilies suggests the presence of at least seven ancient CR1 lineages in the amniote ancestor, and amniote-wide analyses of CR1 successions and quantities reveal differential retention (presence of ancient relics or recent activity) of these CR1 lineages across amniote genome evolution. Interestingly, birds and lepidosaurs retained the fewest ancient CR1 lineages among amniotes and also exhibit smaller genome sizes. Our study is the first to analyze CR1 evolution in a genome-wide and amniote-wide context, and the data strongly suggest that the ancestral amniote genome contained myriad CR1 elements from multiple ancient lineages, and that remnants of these are still detectable in the relatively stable genomes of crocodilians and turtles. Early mammalian genome evolution was thus characterized by a drastic shift from CR1 prevalence to dominance and hyperactivity of L2 LINEs in monotremes and L1 LINEs in therians.
PMCID: PMC4316615  PMID: 25503085
transposable elements; chicken repeat 1; phylogenomics; comparative genomics; crocodilian genomes; amniotes
