Despite the availability of deep-sequencing techniques, genomic and transcriptomic data remain unevenly distributed across phylogenetic groups. For example, reptiles are poorly represented in sequence databases, hindering functional evolutionary and developmental studies in these lineages, which are substantially more diverse than mammals. In addition, different studies use different assembly and annotation protocols, inhibiting meaningful comparisons. Here, we present the “Reptilian Transcriptomes Database 2.0,” which provides extensive annotation of transcriptomes and genomes from species covering the major reptilian lineages. To this end, we sequenced normalized complementary DNA libraries of multiple adult tissues and various embryonic stages of the leopard gecko and the corn snake and gathered published reptilian sequence data sets from representatives of the four extant orders of reptiles: Squamata (snakes and lizards), the tuatara, crocodiles, and turtles. The LANE runner 2.0 software was implemented to annotate all assemblies within a single integrated pipeline. We show that this approach increases the annotation completeness of the assembled transcriptomes/genomes. We then built large concatenated protein alignments of single-copy genes and inferred phylogenetic trees that support the positions of turtles and the tuatara as sister groups of Archosauria and Squamata, respectively. The Reptilian Transcriptomes Database 2.0 resource will be updated to include selected new data sets as they become available, thus making it a reference for differential expression studies, comparative genomics and transcriptomics, linkage mapping, molecular ecology, and phylogenomic analyses involving reptiles. The database is available at www.reptilian-transcriptomes.org and can be queried using a wwwblast server installed at the University of Geneva.
reptiles; transcriptomes; deep sequencing; squamates; turtles; Archosauria
Mechanoreception, the sensing of mechanical forces, is an ancient means of orientation and communication and is tightly linked to the evolution of motile animals. In flies, the transient-receptor-potential N protein (TRP-N) was found to be a cilia-associated mechanoreceptor. TRP-N belongs to a large and diverse family of ion channels. Its unusually long N-terminal repeat of 28 ankyrin domains presumably acts as the gating spring by which mechanical energy induces channel gating. We analyzed the evolutionary origins and possible diversification of TRP-N. Using a custom-made set of highly discriminative sequence profiles we scanned a representative set of metazoan genomes and subsequently corrected several gene models. We find that, contrary to other ion channel families, TRP-N is remarkably conserved in its domain arrangements and copy number (1) in all Bilateria except for amniotes, even in the wake of several whole-genome duplications. TRP-N is absent in Porifera but present in Ctenophora and Placozoa. Exceptional multiplications of TRP-N occurred in Cnidaria, independently along the Hydra and the Nematostella lineage. Molecular signals of subfunctionalization can be attributed to different mechanisms of activation of the gating spring. In Hydra this is further supported by in situ hybridization and immunostaining, suggesting that at least three paralogs adapted to nematocyte discharge, which is key for predation and defense. We propose that these new candidate proteins help explain the sensory complexity of Cnidaria which has been previously observed but so far has lacked a molecular underpinning. Also, the ancient appearance of TRP-N supports a common origin of important components of the nervous systems in Ctenophora, Cnidaria, and Bilateria.
protein evolution; domain rearrangements; mechanosensation; neurobiology; Cnidaria; nematocyst evolution
It is not really helpful to consider modern environmental epigenetics as neo-Lamarckian; and there is no evidence that Lamarck considered the idea original to himself. We must all keep learning about inheritance, but attributing modern ideas to early researchers is not helpful, and can be misleading.
Darwinism; epigenetics; evolution; genetics; Lamarck
Genome reduction is a hallmark of symbiotic genomes, and the rate and patterns of gene loss associated with this process have been investigated in several different symbiotic systems. However, in long-term host-associated coevolving symbiont clades, the genome size differences between strains are normally quite small and hence patterns of large-scale genome reduction can only be inferred from distant relatives. Here we present the complete genome of a Coxiella-like symbiont from Rhipicephalus turanicus ticks (CRt), and compare it with other genomes from the genus Coxiella in order to investigate the process of genome reduction in a genus consisting of intracellular host-associated bacteria with variable genome sizes. The 1.7-Mb CRt genome is larger than the genomes of most obligate mutualists but has a very low protein-coding content (48.5%) and an extremely high number of identifiable pseudogenes, indicating that it is currently undergoing genome reduction. Analysis of encoded functions suggests that CRt is an obligate tick mutualist, as indicated by the possible provisioning of the tick with biotin (B7), riboflavin (B2) and other cofactors, and by the loss of most genes involved in host cell interactions, such as secretion systems. Comparative analyses between CRt and the 2.5 times smaller genome of Coxiella from the lone star tick Amblyomma americanum (CLEAA) show that many of the same gene functions are lost and suggest that the large size difference might be due to a higher rate of genome evolution in CLEAA generated by the loss of the mismatch repair genes mutSL. Finally, sequence polymorphisms in the CRt population sampled from field collected ticks reveal up to one distinct strain variant per tick, and analyses of mutational patterns within the population suggest that selection might be acting on synonymous sites. 
The CRt genome is an extreme example of a symbiont genome caught in the act of genome reduction, and the comparison between CLEAA and CRt indicates that losses of particular genes early on in this process can potentially greatly influence the speed of this process.
symbiosis; genome reduction; Coxiella
Venom peptides from predatory organisms are a resource for investigating evolutionary processes such as adaptive radiation or diversification, and exemplify promising targets for biomedical drug development. Terebridae are an understudied lineage of conoidean snails, which also includes cone snails and turrids. Characterization of cone snail venom peptides, conotoxins, has revealed a cocktail of bioactive compounds used to investigate physiological cellular function, predator-prey interactions, and to develop novel therapeutics. However, venom diversity of other conoidean snails remains poorly understood. The present research applies a venomics approach to characterize novel terebrid venom peptides, teretoxins, from the venom gland transcriptomes of Triplostephanus anilis and Terebra subulata. Next-generation sequencing and de novo assembly identified 139 putative teretoxins that were analyzed for the presence of canonical peptide features as identified in conotoxins. To meet the challenges of de novo assembly, multiple approaches for cross validation of findings were performed to achieve reliable assemblies of venom duct transcriptomes and to obtain a robust portrait of Terebridae venom. Phylogenetic methodology was used to identify 14 teretoxin gene superfamilies for the first time, 13 of which are unique to the Terebridae. Additionally, basic local alignment search tool (BLAST) homology-based searches to venom-related genes and posttranslational modification enzymes identified a convergence of certain venom proteins, such as actinoporin, commonly found in venoms. This research provides novel insights into venom evolution and recruitment in conoidean predatory marine snails and identifies a plethora of terebrid venom peptides that can be used to investigate fundamental questions pertaining to gene evolution.
venomics; venom evolution; Terebridae; teretoxins; transcriptomics; Conoidea
Eukaryotic genomes are colonized by various transposons including short interspersed elements (SINEs). The 5′ region (head) of the majority of SINEs is derived from one of three types of RNA genes (7SL RNA, transfer RNA [tRNA], or 5S ribosomal RNA [rRNA]), and the internal promoter inside the head promotes the transcription of the entire SINE. Here I report a new group of SINEs whose heads originate from either the U1 or U2 small nuclear RNA gene. These SINEs, named SINEU, are distributed among crocodilians and classified into three families. The structures of the SINEU-1 subfamilies indicate the recurrent addition of a U1- or U2-derived sequence onto the 5′ end of SINEU-1 elements. SINEU-1 and SINEU-3 are ancient and shared among alligators, crocodiles, and gharials, while SINEU-2 is absent in the alligator genome. SINEU-2 is the only SINE family that was active after the split of crocodiles and gharials. All SINEU families, especially SINEU-3, are preferentially inserted into a family of Mariner DNA transposons, Mariner-N4_AMi. A group of Tx1 non-long terminal repeat retrotransposons designated Tx1-Mar also show target preference for Mariner-N4_AMi, indicating that SINEU was mobilized by Tx1-Mar.
SINEU; U1; U2; crocodilians; gharial; alligator; crocodile; transposable elements; Mariner-N4_AMi; Tx1; Tx1-Mar
Elizabethkingia anophelis is an emerging pathogen that can cause life-threatening infections in neonates and in severely immunocompromised and postoperative patients. The lack of genomic information on E. anophelis hinders our understanding of its mechanisms of pathogenesis. Here, we report the first complete genome sequence of E. anophelis NUHP1 and assess its response to oxidative stress. Elizabethkingia anophelis NUHP1 has a circular genome of 4,369,828 base pairs and 4,141 predicted coding sequences. Sequence analysis indicates that E. anophelis has well-developed systems for scavenging iron and stress response. Many putative virulence factors and antibiotic resistance genes were identified, underscoring potential host–pathogen interactions and antibiotic resistance. RNA-sequencing-based transcriptome profiling indicates that expression of genes involved in synthesis of a yersiniabactin-like iron siderophore and heme utilization is highly induced as a protective mechanism toward oxidative stress caused by hydrogen peroxide treatment. Chrome azurol sulfonate assay verified that siderophore production of E. anophelis is increased in the presence of oxidative stress. We further showed that hemoglobin facilitates the growth, hydrogen peroxide tolerance, cell attachment, and biofilm formation of E. anophelis NUHP1. Our study suggests that siderophore production and heme uptake pathways might play essential roles in stress response and virulence of the emerging pathogen E. anophelis.
Elizabethkingia anophelis; genome; transcriptome; iron siderophore; heme; oxidative stress response
Antibiotic resistance poses a major threat to human health. It is therefore important to characterize the frequency of resistance within natural bacterial environments. Many studies have focused on characterizing the frequencies with which horizontally acquired resistance genes segregate within natural bacterial populations. Yet, very little is currently understood regarding the frequency of segregation of resistance alleles occurring within the housekeeping targets of antibiotics. We surveyed a large number of metagenomic datasets extracted from a large variety of host-associated and non host-associated environments for such alleles conferring resistance to three groups of broad spectrum antibiotics: streptomycin, rifamycins, and quinolones. We find notable segregation frequencies of resistance alleles occurring within the target genes of each of the three antibiotics, with quinolone resistance alleles being the most frequent and rifamycin resistance alleles being the least frequent. Resistance allele frequencies varied greatly between different phyla and as a function of environment. The frequency of quinolone resistance alleles was especially high within host-associated environments, where it averaged an alarming ∼40%. Within host-associated environments, resistance to quinolones was most often conferred by a specific resistance allele. High frequencies of quinolone resistance alleles were also found within hosts that were not directly treated with antibiotics. Therefore, the high segregation frequency of quinolone resistance alleles occurring within the housekeeping targets of antibiotics in host-associated environments does not seem to be the sole result of clinical antibiotic usage.
antibiotic resistance; microbiome; metagenomics; allele frequencies
Identification of retrotransposon insertions in nonmodel taxa can be technically challenging and costly. This has inhibited progress in understanding retrotransposon insertion dynamics outside of a few well-studied species. To address this problem, we have extended a retrotransposon-based capture and sequence method (ME-Scan [mobile element scanning]) to identify insertions belonging to the Ves family of short interspersed elements (SINEs) across seven species of the bat genus Myotis. We identified between 120,000 and 143,000 SINE insertions in six taxa lacking a draft genome by comparing to the M. lucifugus reference genome. On average, each Ves insertion was sequenced to 129.6× coverage. When mapped back to the M. lucifugus reference genome, all insertions were confidently assigned within a 10-bp window. Polymorphic Ves insertions were identified in each taxon based on their mapped locations. Using cross-species comparisons and the identified insertion positions, a presence–absence matrix was created for approximately 796,000 insertions. Dollo parsimony analysis of more than 85,000 phylogenetically informative insertions recovered strongly supported, monophyletic clades that correspond with the biogeography of each taxon. This phylogeny is similar to previously published mitochondrial phylogenies, with the exception of the placement of M. vivesi. These results support the utility of our variation on ME-Scan to identify polymorphic retrotransposon insertions in taxa without a reference genome and for large-scale retrotransposon-based phylogenetics.
rare genomic events; Dollo parsimony; retrotransposon; phylogenetics; Myotis lucifugus
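The filtering step implied by the Myotis abstract above, going from a presence–absence matrix to phylogenetically informative insertions, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the standard parsimony-informative criterion for binary characters (presence and absence must each be observed in at least two taxa), and the function name `informative_insertions`, the taxon labels, and the toy matrix are all hypothetical.

```python
from typing import Dict, List


def informative_insertions(matrix: Dict[str, List[int]]) -> List[int]:
    """Return indices of loci where presence (1) and absence (0)
    are each observed in at least two taxa.

    `matrix` maps a taxon label to its row of 0/1 calls; all rows
    are assumed to have the same length (hypothetical input format).
    """
    rows = list(matrix.values())
    n_loci = len(rows[0])
    keep = []
    for j in range(n_loci):
        present = sum(row[j] for row in rows)
        absent = len(rows) - present
        # Loci shared by all taxa, or unique to one taxon, cannot
        # discriminate between alternative tree topologies.
        if present >= 2 and absent >= 2:
            keep.append(j)
    return keep


# Toy data: four made-up taxa scored at four made-up loci.
toy = {
    "taxon_A": [1, 1, 0, 1],
    "taxon_B": [1, 1, 0, 0],
    "taxon_C": [1, 0, 0, 0],
    "taxon_D": [1, 0, 1, 0],
}
print(informative_insertions(toy))  # [1]
```

Only locus 1 survives here: locus 0 is present in every taxon, and loci 2 and 3 are each present in a single taxon.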
Many algal groups acquired complex plastids by the uptake of green and red algae through multiple secondary endosymbioses. As a result of gene loss and transfer during the endosymbiotic processes, algal endosymbiont nuclei disappeared in most cases. However, chlorarachniophytes and cryptophytes still possess a relict nucleus, the so-called nucleomorph, of the green and red algal endosymbiont, respectively. Nucleomorph genomes are an interesting and suitable model to study the reductive evolution of endosymbiotically derived genomes. To date, nucleomorph genomes have been sequenced in four cryptophyte species and two chlorarachniophyte species, including Bigelowiella natans (373 kb) and Lotharella oceanica (610 kb). In this study, we report complete nucleomorph genome sequences of two chlorarachniophytes, Amorphochlora amoebiformis and Lotharella vacuolata, to gain insight into the reductive evolution of nucleomorph genomes in the chlorarachniophytes. The nucleomorph genomes consist of three chromosomes totaling 374 and 432 kb in size in A. amoebiformis and L. vacuolata, respectively. Comparative analyses among four chlorarachniophyte nucleomorph genomes revealed that these sequences share 171 function-predicted genes (86% of total 198 function-predicted nucleomorph genes), including the same set of genes encoding 17 plastid-associated proteins, and no evidence of a recent nucleomorph-to-nucleus gene transfer was found. This suggests that chlorarachniophyte nucleomorph genomes underwent most of their reductive evolution prior to the radiation of extant members of the group. However, there are slight variations in genome size, GC content, duplicated gene number, and subtelomeric regions among the four nucleomorph genomes, suggesting that the genomes might be undergoing changes that do not affect the core functions in each species.
chlorarachniophyte; nucleomorph; endosymbiosis; genome reduction; secondary plastid
Protein-coding sequences can arise either from duplication and divergence of existing sequences, or de novo from noncoding DNA. Unfortunately, recently evolved de novo genes can be hard to distinguish from false positives, making their study difficult. Here, we study a more tractable version of the process of conversion of noncoding sequence into coding: the co-option of short segments of noncoding sequence into the C-termini of existing proteins via the loss of a stop codon. Because we study recent additions to potentially old genes, we are able to apply a variety of stringent quality filters to our annotations of what is a true protein-coding gene, discarding the putative proteins of unknown function that are typical of recent fully de novo genes. We identify 54 examples of C-terminal extensions in Saccharomyces and 28 in Drosophila, all of them recent enough to still be polymorphic. We find one putative gene fusion that turns out, on close inspection, to be the product of replicated assembly errors, further highlighting the issue of false positives in the study of rare events. Four of the Saccharomyces C-terminal extensions (to ADH1, ARP8, TPM2, and PIS1) that survived our quality filters are predicted to lead to significant modification of a protein domain structure.
gene birth; stop codon readthrough; origin of novelty; protein structure
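The mechanism described in the abstract above, in which loss of a stop codon co-opts downstream noncoding sequence into the C-terminus, can be illustrated with a short sketch. It assumes the standard nuclear genetic code and simply translates in frame from the mutated stop codon to the next stop; the function name `c_terminal_extension` and the example sequence are hypothetical and are not taken from the study.

```python
from itertools import product
from typing import Optional

# Standard genetic code, built in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def c_terminal_extension(seq_from_lost_stop: str) -> Optional[str]:
    """Translate in frame, starting at the codon that used to be the
    stop codon, until the next in-frame stop codon.

    Returns the added peptide, or None if no in-frame stop is found
    in the supplied sequence (the extension cannot be delimited).
    """
    seq = seq_from_lost_stop.upper()
    peptide = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3])
        if aa is None:       # ambiguous base such as N
            return None
        if aa == "*":        # next in-frame stop ends the extension
            return "".join(peptide)
        peptide.append(aa)
    return None


# Hypothetical example: TAA mutated to CAA, followed by made-up 3' sequence.
print(c_terminal_extension("CAAGCTGGTTAACC"))  # QAG
```

In the example, the former stop codon now encodes glutamine and two further codons are read before the next in-frame TAA, giving a three-residue extension.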
How genomic selection enables species to adapt to divergent environments is a fundamental question in ecology and evolution. We investigated the genomic signatures of local adaptation in Atlantic cod (Gadus morhua L.) along a natural salinity gradient, ranging from 35‰ in the North Sea to 7‰ within the Baltic Sea. By utilizing a 12K SNP chip, we simultaneously assessed neutral and adaptive genetic divergence across the Atlantic cod genome. Combining outlier analyses with a landscape genomic approach, we identified a set of directionally selected loci that are strongly correlated with habitat differences in salinity, oxygen, and temperature. Our results show that discrete regions within the Atlantic cod genome are subject to directional selection and associated with adaptation to the local environmental conditions in the Baltic and North Seas, indicating divergence hitchhiking and the presence of genomic islands of divergence. We report a suite of outlier single nucleotide polymorphisms within or closely located to genes associated with osmoregulation, as well as genes known to play important roles in the hydration and development of oocytes. These genes are likely to have key functions within a general osmoregulatory framework and are important for the survival of eggs and larvae, contributing to the buildup of reproductive isolation between the low-salinity adapted Baltic cod and the adjacent cod populations. Hence, our data suggest that adaptive responses to the environmental conditions in the Baltic Sea may contribute to a strong and effective reproductive barrier, and that Baltic cod can be viewed as an example of ongoing speciation.
Atlantic cod; Baltic Sea; ecological divergence; genomic adaptation; population genomics; SNPs; speciation
Bacterial outer membrane proteins require the beta-barrel assembly machinery (BAM) for their correct folding and function. The central component of this machinery is BamA, an Omp85 protein that is essential and found in all Gram-negative bacteria. An additional feature of the BAM is the translocation and assembly module (TAM), composed of TamA (an Omp85 family protein) and TamB. We report that TamA and a closely related protein TamL are confined almost exclusively to Proteobacteria and Bacteroidetes/Chlorobi respectively, whereas TamB is widely distributed across the majority of Gram-negative bacterial lineages. A comprehensive phylogenetic and secondary structure analysis of the TamB protein family revealed that TamB was present very early in the evolution of bacteria. Several sequence characteristics were discovered to define the TamB protein family: a signal-anchor linkage to the inner membrane, beta-helical structure, conserved domain architecture and a C-terminal region that mimics outer membrane protein beta-strands. Taken together, the structural and phylogenetic analyses suggest that the TAM likely evolved from an original combination of BamA and TamB, with a later gene duplication event of BamA, giving rise to an additional Omp85 sequence that evolved to be TamA in Proteobacteria and TamL in Bacteroidetes/Chlorobi.
TamA; TamB; beta-barrel assembly; translocation; outer membrane; membrane biogenesis
The internal compartmentation of eukaryotic cells not only allows separation of biochemical processes but it also creates the requirement for systems that can selectively transport proteins across the membrane boundaries. Although most proteins function in a single subcellular compartment, many are able to enter two or more compartments, a phenomenon known as dual or multiple targeting. The aminoacyl-tRNA synthetases (aaRSs), which catalyze the ligation of tRNAs to their cognate amino acids, are particularly prone to functioning in multiple subcellular compartments. They are essential for translation, so they are required in every compartment where translation takes place. In diatoms, there are three such compartments, the plastid, the mitochondrion, and the cytosol. In cryptophytes, translation also takes place in the periplastid compartment (PPC), which is the reduced cytoplasm of the plastid’s red algal ancestor and which retains a reduced red algal nucleus. We searched the organelle and nuclear genomes of the cryptophyte Guillardia theta and the diatoms Phaeodactylum tricornutum and Thalassiosira pseudonana for aaRS genes and found an insufficient number of genes to provide each compartment with a complete set of aaRSs. We therefore inferred, with support from localization predictions, that many aaRSs are dual targeted. We tested four of the predicted dual targeted aaRSs with green fluorescent protein fusion localizations in P. tricornutum and found evidence for dual targeting to the mitochondrion and plastid in P. tricornutum and G. theta, and indications for dual targeting to the PPC and cytosol in G. theta. This is the first report of dual targeting in diatoms or cryptophytes.
Guillardia; Phaeodactylum; pheRS; PPC; protein targeting; syfB
Pathogens and hosts are in an ongoing arms race and genes involved in host–pathogen interactions are likely to undergo diversifying selection. Fusarium plant pathogens have evolved diverse infection strategies, but how they interact with their hosts in the biotrophic infection stage remains puzzling. To address this, we analyzed the genomes of three Fusarium plant pathogens for genes that are under diversifying selection. We found a two-speed genome structure both on the chromosome and gene group level. Diversifying selection acts strongly on the dispensable chromosomes in Fusarium oxysporum f. sp. lycopersici and on distinct core chromosome regions in Fusarium graminearum, all of which have associations with virulence. Members of two gene groups evolve rapidly, namely those that encode proteins with an N-terminal [SG]-P-C-[KR]-P sequence motif and proteins that are conserved predominantly in pathogens. Specifically, 29 F. graminearum genes are rapidly evolving, in planta induced and encode secreted proteins, strongly pointing toward effector function. In summary, diversifying selection in Fusarium is strongly reflected as genomic footprints and can be used to predict a small gene set likely to be involved in host–pathogen interactions for experimental verification.
Fusarium; fungal pathogens; diversifying selection; effector; dispensable chromosomes; evolution
Population response to environmental variation involves adaptation, acclimation, or both. For long-lived organisms, acclimation likely generates a faster response but is only effective if the rates and limits of acclimation match the dynamics of local environmental variation. In coral reef habitats, heat stress from extreme ocean warming can occur over several weeks, resulting in symbiont expulsion and widespread coral death. However, transcriptome regulation during short-term acclimation is not well understood. We examined acclimation during an 11-day experiment in the coral Acropora nana. We acclimated colonies to three regimes: ambient temperature (29 °C), increased stable temperature (31 °C), and variable temperature (29–33 °C), mimicking local heat stress conditions. Within 7–11 days, individuals acclimated to increased temperatures had higher tolerance to acute heat stress. Despite physiological changes, no gene expression changes occurred during acclimation before acute heat stress. However, we found strikingly different transcriptional responses to heat stress between acclimation treatments across 893 contigs. Across these contigs, corals acclimated to higher temperatures (31 °C or 29–33 °C) exhibited a muted stress response: the magnitude of expression change before and after heat stress was less than in 29 °C acclimated corals. Our results show that corals have a rapid phase of acclimation that substantially increases their heat resilience within 7 days and that alters their transcriptional response to heat stress. This is in addition to a previously observed longer term response, distinguishable by its shift in baseline expression, under nonstressful conditions. Such rapid acclimation may provide some protection for this species of coral against slow onset of warming ocean temperatures.
acclimation; transcriptomics; thermal tolerance; coral; climate change
Oenococcus oeni is a lactic acid bacterium encountered particularly in wine, where it carries out the malolactic fermentation. Molecular typing methods have previously revealed that the species is made up of several genetic groups of strains, some being specific to certain types of wines, ciders or regions. Here, we describe 36 recently released O. oeni genomes and the phylogenomic analysis of these 36 plus 14 previously reported genomes. We also report three genome sequences of the sister species Oenococcus kitaharae that were used for phylogenomic reconstructions. Phylogenomic and population structure analyses revealed that the 50 O. oeni genomes delineate two major groups of 12 and 37 strains, respectively, named A and B, plus a putative group C, consisting of a single strain. A study of the ortholog and single nucleotide polymorphism contents of the genetic groups revealed that the domestication of some strains to products such as cider, wine, or champagne is reflected at the genetic level. While group A strains proved to be predominant in wine and to form subgroups adapted to specific types of wine such as champagne, group B strains were found in wine and cider. The strain from putative group C was isolated from cider and is genetically closer to group B strains. The results suggest that ancestral O. oeni strains were adapted to low-ethanol environments such as overripe fruits, and that they were domesticated to cider and wine, with group A strains being naturally selected in a process of further domestication to specific wines such as champagne.
Oenococcus oeni; genomics; phylogeny; population structure; domestication
Metagenomic analyses are challenging in metazoans, but high-copy number and repeat regions can be assembled from low-coverage sequencing by “genome skimming,” which is applied here as a new way of characterizing metagenomes obtained in an ecological or taxonomic context. Illumina shotgun sequences from two pools of Coleoptera (beetles) of approximately 200 species each were assembled into tens of thousands of scaffolds. Repeated low-coverage sequencing recovered similar scaffold sets consistently, although approximately 70% of scaffolds could not be identified against existing genome databases. Identifiable scaffolds included mitochondrial DNA, conserved sequences with hits to expressed sequence tag and protein databases, and known repeat elements of high and low complexity, including numerous copies of rRNA and histone genes. Assemblies of histones captured a diversity of gene order and primary sequence in Coleoptera. Scaffolds with similarity to multiple sites in available coleopteran genome sequences for Dendroctonus and Tribolium revealed high specificity of scaffolds to either of these genomes, in particular for high-copy number repeats. Numerous “clusters” of scaffolds mapped to the same genomic site revealed intra- and/or intergenomic variation within a metagenome pool. In addition to the effect of the taxonomic composition of the metagenomes, the number of mapped scaffolds also revealed structural differences between the two reference genomes, although the significance of this striking finding remains unclear. Finally, apparently exogenous sequences were recovered, including potential food plants, fungal pathogens, and bacterial symbionts. The “metagenome skimming” approach is useful for capturing the genomic diversity of poorly studied, species-rich lineages and opens new prospects in environmental genomics.
environmental genomics; repetitive DNA; histone genes; Coleoptera; bacterial endosymbionts; genome evolution
We set out to investigate potential differences and similarities between the selective forces acting upon the coding and noncoding regions of five different sets of genes defined according to functional and evolutionary criteria: 1) two reference gene sets presenting accelerated and slow rates of protein evolution (the Complement and Actin pathways); 2) a set of genes with evidence of accelerated evolution in at least one of their introns; and 3) two gene sets related to neurological function (Parkinson’s and Alzheimer’s diseases). To that effect, we combine human–chimpanzee divergence patterns with polymorphism data obtained from targeted resequencing of 20 central chimpanzees, our closest relatives with the largest long-term effective population size. By using the distribution of fitness effect-alpha extension of the McDonald–Kreitman test, we reproduce inferences of rates of evolution previously based only on divergence data on both coding and intronic sequences and also obtain inferences for other classes of genomic elements (untranslated regions, promoters, and conserved noncoding sequences). Our results suggest that 1) the distribution of fitness effect-alpha method helps distinguish different scenarios of accelerated divergence (adaptation or relaxed selective constraints) and 2) the adaptive history of coding and noncoding sequences within the gene sets analyzed is decoupled.
chimpanzee; biochemical pathways; natural selection; distribution of fitness effects; fraction of adaptive substitution (α) and adaptive substitution rate (ωα); Alzheimer; Parkinson
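For readers unfamiliar with the quantity α named in the keywords above, the classical McDonald–Kreitman point estimate can be sketched as below. Note the assumption: this is the simple count-based estimator, α = 1 − (Ds·Pn)/(Dn·Ps), and not the distribution of fitness effect-alpha method used in the study, which additionally models slightly deleterious polymorphisms. The function name `mk_alpha` and the counts in the example are hypothetical.

```python
def mk_alpha(dn: int, ds: int, pn: int, ps: int) -> float:
    """Classical McDonald-Kreitman estimate of the fraction of
    adaptive substitutions.

    dn, ds: nonsynonymous and synonymous fixed differences (divergence)
    pn, ps: nonsynonymous and synonymous polymorphisms
    """
    if dn == 0 or ps == 0:
        raise ValueError("alpha is undefined when dn or ps is zero")
    # Under strict neutrality dn/ds equals pn/ps, so alpha is 0;
    # an excess of nonsynonymous divergence pushes alpha toward 1.
    return 1.0 - (ds * pn) / (dn * ps)


# Made-up counts for illustration only.
print(mk_alpha(dn=40, ds=80, pn=10, ps=40))  # 0.5
```

A negative value from this estimator usually signals segregating slightly deleterious variants inflating Pn, which is the bias that distribution-of-fitness-effects approaches are designed to address.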
The amino acid composition (AAC) of proteomes differs greatly between microorganisms and is associated with the environmental niche they inhabit, suggesting that these changes may be adaptive. Similarly, the oligonucleotide composition of genomes varies and may confer advantages at the DNA/RNA level. These influences overlap in protein-coding sequences, making it difficult to gauge their relative contributions. We disentangle these effects by systematically evaluating the correspondence between intergenic nucleotide composition, where protein-level selection is absent, the AAC, and ecological parameters of 909 prokaryotes. We find that G + C content, the most frequently used measure of genomic composition, cannot capture diversity in AAC and across ecological contexts. However, di-/trinucleotide composition in intergenic DNA predicts amino acid frequencies of proteomes to the point where very little cross-species variability remains unexplained (91% of variance accounted for). Qualitatively similar results were obtained for 49 fungal genomes, where 80% of the variability in AAC could be explained by the composition of introns and intergenic regions. Upon factoring out oligonucleotide composition and phylogenetic inertia, the residual AAC is poorly predictive of the microbes’ ecological preferences, in stark contrast with the original AAC. Moreover, highly expressed genes do not exhibit more prominent environment-related AAC signatures than lowly expressed genes, despite contributing more to the effective proteome. Thus, evolutionary shifts in overall AAC appear to occur almost exclusively through factors shaping the global oligonucleotide content of the genome. We discuss these results in light of contravening evidence from biophysical data and further reading frame-specific analyses that suggest that adaptation takes place at the protein level.
amino acid composition; oligonucleotide composition; intergenic DNA; ecological preferences; prokaryotic genome; fungal genome; support vector regression
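The two kinds of composition compared in the abstract above can be illustrated with a minimal sketch. It assumes that oligonucleotide composition means the relative frequency of overlapping k-mers in a DNA sequence and that AAC means the relative frequency of the 20 standard amino acids across a set of proteins. The function names `kmer_frequencies` and `amino_acid_composition` are hypothetical and this is not the authors' pipeline; the regression step named in the keywords is not shown.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def kmer_frequencies(seq, k):
    """Relative frequency of every A/C/G/T k-mer, using overlapping windows.

    Assumption: windows containing ambiguous bases (e.g. N) are ignored.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}


def amino_acid_composition(proteins):
    """Relative frequency of each standard amino acid over all proteins."""
    counts = Counter("".join(proteins).upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    return {a: (counts[a] / total if total else 0.0) for a in AMINO_ACIDS}


if __name__ == "__main__":
    # Toy inputs for illustration only, not real genomic data.
    intergenic = "ACGTTGCAACGTNACGT"
    dinuc = kmer_frequencies(intergenic, 2)    # 16 features
    trinuc = kmer_frequencies(intergenic, 3)   # 64 features
    aac = amino_acid_composition(["MKKLA", "MAVT"])
    print(len(dinuc), len(trinuc), len(aac))
```

In a setup like the one described, the di- and trinucleotide vectors from intergenic DNA would act as predictors and the per-genome AAC vector as the response, one row per genome.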
The extracellular matrix of scaly green flagellates is composed of small organic scales made of polysaccharides and scale-associated proteins (SAPs). Molecular phylogenies have shown that these organisms represent the ancestral stock of flagellates from which all green plants (Viridiplantae) evolved. Here, we present the molecular characterization of four different SAPs. Three SAPs are type-2 membrane proteins with a short arginine/alanine-rich cytoplasmic tail and an extracellular domain that is most likely of bacterial origin. The fourth protein is a filamin-like protein. In addition, we report the presence of proteins similar to the integrin-associated proteins α-actinin (in transcriptomes of glaucophytes and some viridiplants), LIM-domain proteins, and integrin-associated kinase in transcriptomes of viridiplants, glaucophytes, and rhodophytes. We propose that the membrane proteins identified are the predicted linkers between scales and the cytoskeleton. These proteins are present in many green algae but are apparently absent from embryophytes; they represent a new protein family that we have termed gralins, for green algal integrins. A model for the evolution of the cell surface proteins in Plantae is discussed.
gralin; filamin; actinin; Viridiplantae; Rhodophyta; Glaucophyta
As decomposers, fungi are key players in recycling plant material in global carbon cycles. We hypothesized that genomes of early diverging fungi may have inherited pectinases from an ancestral species that had been able to extract nutrients from pectin-containing land plants and their algal allies (Streptophytes). We aimed to infer, based on pectinase gene expansions and on the organismal phylogeny, the geological timing of the plant–fungus association. We analyzed 40 fungal genomes, three of which, including Gonapodya prolifera, were sequenced for this study. In the organismal phylogeny from 136 housekeeping loci, Rozella diverged first from all other fungi. Gonapodya prolifera was included among the flagellated, predominantly aquatic fungal species in Chytridiomycota. Sister to Chytridiomycota were the predominantly terrestrial fungi including zygomycota I and zygomycota II, along with the ascomycetes and basidiomycetes that comprise Dikarya. The Gonapodya genome has 27 genes representing five of the seven classes of pectin-specific enzymes known from fungi. Most of these share a common ancestry with pectinases from Dikarya. Consistent with this functional and sequence similarity, Gonapodya, like many Dikarya, can use pectin as a carbon source for growth in pure culture. Shared pectinases of Dikarya and Gonapodya provide evidence that even ancient aquatic fungi had adapted to extract nutrients from the plants in the green lineage. This implies that 750 million years, the estimated maximum age of origin of the pectin-containing streptophytes, represents a maximum age for the divergence of Chytridiomycota from the lineage including Dikarya.
carbohydrate active enzymes; evolution; fungal phylogeny; geological time; Gonapodya; pectinases; streptophytes