Insertions and deletions of DNA segments (indels) are, together with substitutions, the major mutational processes that generate genetic variation. Here we focus on recent DNA insertions and deletions in protein-coding regions of the human genome to investigate selective constraints on indels in protein evolution.
Frequencies of inserted and deleted amino acids differ from background amino acid frequencies in the human proteome. Small amino acids are overrepresented, while hydrophobic, aliphatic and aromatic amino acids are strongly suppressed. Indels are found to be preferentially located in protein regions that do not form important structural domains. Amino acid insertion and deletion rates in genes associated with elementary biochemical reactions (e.g., catalytic activity, ligase activity, electron transport, or catabolic process) are lower than those in other genes, suggesting that these genes are subject to stronger purifying selection.
Our analysis indicates that indels in human protein coding regions are subject to distinct levels of selective pressure with regard to their structural impact on the amino acid sequence, as well as to general properties of the genes they are located in. These findings confirm that many commonly accepted characteristics of selective constraints for substitutions are also valid for amino acid insertions and deletions.
Alternative splicing has been shown to be one of the major evolutionary mechanisms for protein diversification and proteome expansion, since a considerable fraction of alternative splicing events appears to be species- or lineage-specific. However, most studies were restricted to the analysis of cassette exons in pairs of genomes and did not analyze functionality of the alternative variants.
We analyzed conservation of human alternative splice sites and cassette exons in the mouse and dog genomes. Alternative exons, especially minor-isoform ones, were shown to be less conserved than constitutive exons. Frame-shifting alternatives in the protein-coding regions are less conserved than frame-preserving ones. Similarly, the conservation of alternative sites is highest for evenly used alternatives, and higher when the distance between the sites is divisible by three. The rate of alternative-exon and site loss in mouse is slightly higher than in dog, consistent with faster evolution of the former. The evolutionary dynamics of alternative sites were shown to be consistent with the model of random activation of cryptic sites.
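The frame criterion used above (a distance between alternative sites that is divisible by three preserves the reading frame) can be illustrated with a minimal sketch. The function names and the coordinate convention are hypothetical and are not taken from the study.

```python
def is_frame_preserving(length_nt: int) -> bool:
    """Return True if adding or skipping a segment of this length keeps the reading frame."""
    return length_nt % 3 == 0


def classify_alternative_sites(site_a: int, site_b: int) -> str:
    """Classify a pair of alternative splice sites by the distance between them.

    site_a and site_b are assumed to be coordinates (in nucleotides) on the same transcript.
    """
    distance = abs(site_a - site_b)
    return "frame-preserving" if is_frame_preserving(distance) else "frame-shifting"
```

For example, two sites six nucleotides apart would be classified as frame-preserving, whereas sites four nucleotides apart would be frame-shifting.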
Consistent with other studies, our results show that minor cassette exons are less conserved than major-alternative and constitutive exons. However, our study provides evidence that this is caused not only by exon birth, but also lineage-specific loss of alternative exons and sites, and it depends on exon functionality.
The forests of the upper Amazon basin harbour some of the world's highest anuran species richness, but to date we have only the sparsest understanding of the distribution of genetic diversity within and among species in this region. To quantify region-wide genealogical patterns and to test for the presence of deep intraspecific divergences that have been documented in some other neotropical anurans, we developed a molecular phylogeny of the wide-spread terrestrial leaflitter frog Eleutherodactylus ockendeni (Leptodactylidae) from 13 localities throughout its range in Ecuador using data from two mitochondrial genes (16S and cyt b; 1246 base pairs). We examined the relation between divergence of mtDNA and the nuclear genome, as sampled by five species-specific microsatellite loci, to evaluate indirectly whether lineages are reproductively isolated where they co-occur. Our extensive phylogeographic survey thus assesses the spatial distribution of E. ockendeni genetic diversity across eastern Ecuador.
We identified three distinct and well-supported clades within the Ecuadorean range of E. ockendeni: an uplands clade spanning north to south, a northeastern and central lowlands clade, and a central and southeastern clade, which is basal. Clades are separated by 12% to 15% net corrected p-distance for cytochrome b, with comparatively low sequence divergence within clades. Clades marginally overlap in some geographic areas (e.g., Napo River basin) but are reproductively isolated, evidenced by diagnostic differences in microsatellite PCR amplification profiles or DNA repeat number and coalescent analyses (in MDIV) best modelled without migration. Using Bayesian (BEAST) and net phylogenetic estimates, the Southeastern Clade diverged from the Upland/Lowland clades in the mid-Miocene or late Oligocene. Lowland and Upland clades speciated more recently, in the early or late Miocene.
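As an illustration of the between-clade distances reported above, the following sketch computes an uncorrected p-distance and a net between-group distance. It assumes the common definition of net divergence (mean between-group distance minus the average of the two mean within-group distances); the abstract does not state the exact formula or correction used, so this is not necessarily the authors' procedure, and all function names are hypothetical.

```python
from itertools import combinations, product


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites, ignoring positions with gaps or ambiguity codes."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            differing += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def mean_within(seqs: list[str]) -> float:
    """Mean pairwise p-distance within one group (0.0 for a single sequence)."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def net_distance(group_x: list[str], group_y: list[str]) -> float:
    """Net between-group distance: d_XY - (d_X + d_Y) / 2 (assumed definition)."""
    pairs = list(product(group_x, group_y))
    between = sum(p_distance(a, b) for a, b in pairs) / len(pairs)
    return between - (mean_within(group_x) + mean_within(group_y)) / 2
```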
Our findings uncover previously unsuspected cryptic species diversity within the common leaflitter frog E. ockendeni, with at least three different species in Ecuador. While these clades are clearly geographically circumscribed, they do not coincide with any existing landscape barriers. Divergences are ancient, from the Miocene, before the most dramatic mountain building in the Ecuadorean Andes. Therefore, this diversity is not a product of Pleistocene refuges. Our research coupled with other studies suggests that species richness in the upper Amazon is drastically underestimated by current inventories based on morphospecies.
Some of the most difficult phylogenetic questions in evolutionary biology involve identification of the free-living relatives of parasitic organisms, particularly those of parasitic flowering plants. Consequently, the number of origins of parasitism and the phylogenetic distribution of the heterotrophic lifestyle among angiosperm lineages is unclear.
Here we report the results of a phylogenetic analysis of 102 species of seed plants designed to infer the position of all haustorial parasitic angiosperm lineages using three mitochondrial genes: atp1, coxI, and matR. Overall, the mtDNA phylogeny agrees with independent studies in terms of non-parasitic plant relationships and reveals at least 11 independent origins of parasitism in angiosperms, eight of which consist entirely of holoparasitic species that lack photosynthetic ability. From these results, it can be inferred that modern-day parasites have disproportionately evolved in certain lineages and that the endoparasitic habit has arisen by convergence in four clades. In addition, reduced taxon, single gene analyses revealed multiple horizontal transfers of atp1 from host to parasite lineage, suggesting that parasites may be important vectors of horizontal gene transfer in angiosperms. Furthermore, in Pilostyles we show evidence for a recent host-to-parasite atp1 transfer based on a chimeric gene sequence that indicates multiple historical xenologous gene acquisitions have occurred in this endoparasite. Finally, the phylogenetic relationships inferred for parasites indicate that the origins of parasitism in angiosperms are strongly correlated with horizontal acquisitions of the invasive coxI group I intron.
Collectively, these results indicate that the parasitic lifestyle has arisen repeatedly in angiosperm evolutionary history and results in increasing parasite genomic chimerism over time.
The major lineages of eusocial insects, the ants, termites, stingless bees, honeybees and vespid wasps, all have ancient origins (≥ 65 mya) with no reversions to solitary behaviour. This has prompted the notion of a 'point of no return' whereby the evolutionary elaboration and integration of behavioural, genetic and morphological traits over a very long period of time leads to a situation where reversion to solitary living is no longer an evolutionary option.
We show that in another group of social insects, the allodapine bees, there was a single origin of sociality > 40 mya. We also provide data on the biology of a key allodapine species, Halterapis nigrinervis, showing that it is truly social. H. nigrinervis was thought to be the only allodapine that was not social, and our findings therefore indicate that there have been no losses of sociality among extant allodapine clades. Allodapine colony sizes rarely exceed 10 females per nest and all females in virtually all species are capable of nesting and reproducing independently, so these bees clearly do not fit the 'point of no return' concept.
We argue that allodapine sociality has been maintained by ecological constraints and the benefits of alloparental care, as opposed to behavioural, genetic or morphological constraints to independent living. Allodapine brood are highly vulnerable to predation because they are progressively reared in an open nest (not in sealed brood cells), which provides potentially large benefits for alloparental care and incentives for reproductives to tolerate potential alloparents. We argue that similar vulnerabilities may also help explain the lack of reversions to solitary living in other taxa with ancient social origins.
Among the long-standing conundrums of evolutionary theory, obligatory sex is one of the hardest. Current theory suggests multiple factors that might explain the benefits of sex when compared with complete asexuality, but no satisfactory explanation for the prevalence of obligatory sex in the face of facultative sexual reproduction.
We show that when sexual selection is present obligatory sex can evolve and be maintained even against facultative sex, under common scenarios of deleterious mutations and environmental changes.
Previous phylogenetic analyses of African elephants have included limited numbers of forest elephant samples. A large-scale assessment of mitochondrial DNA diversity in forest elephant populations here reveals a more complex evolutionary history in African elephants as a whole than two-taxon models assume.
We analysed hypervariable region 1 of the mitochondrial control region for 71 new central African forest elephants and the mitochondrial cytochrome b gene from 28 new samples, and compared these sequences to other African elephant data. We find that central African forest elephant populations fall into at least two lineages and that west African elephants (both forest and savannah) share their mitochondrial history almost exclusively with central African forest elephants. We also find that central African forest populations show lower genetic diversity than those in savannahs, and infer a recent population expansion.
Our data do not support the separation of African elephants into two evolutionary lineages. The demographic history of African elephants seems more complex, with a combination of multiple refugial mitochondrial lineages and recurrent hybridization among them rendering a simple forest/savannah elephant split inapplicable to modern African elephant populations.
Lipopolysaccharide (LPS) is a pathogen associated molecular pattern (PAMP) of animal and plant pathogenic bacteria. Variation at the interstrain level is common in LPS biosynthetic gene clusters of animal pathogenic bacteria. This variation has been proposed to play a role in evading the host immune system. Even though LPS is a modulator of plant defense responses, reports of interstrain variation in LPS gene clusters of plant pathogenic bacteria are rare.
In this study we report the complete sequence of a variant 19.9 kb LPS locus present in the BXO8 strain of Xanthomonas oryzae pv. oryzae (Xoo), the bacterial blight pathogen of rice. This region is completely different in size, number and organization of genes from the LPS locus present in most other strains of Xoo from India and Asia. Surprisingly, except for one ORF, all the other ORFs at the BXO8 LPS locus are orthologous to the genes present at this locus in a sequenced strain of X. axonopodis pv. citri (Xac; a pathogen of citrus plants). One end of the BXO8 LPS gene cluster, comprised of ten genes, is also present in the related rice pathogen, X. oryzae pv. oryzicola (Xoc). In Xoc, the remainder of the LPS gene cluster, consisting of seven genes, is novel and unrelated to LPS gene clusters of any of the sequenced xanthomonads. We also report substantial interstrain variation suggestive of very recent horizontal gene transfer (HGT) at the LPS biosynthetic locus of Xanthomonas campestris pv. campestris (Xcc), the black rot pathogen of crucifers.
Our analyses indicate that HGT has altered the LPS locus during the evolution of Xanthomonas oryzae pathovars and suggest that the ancestor of all Xanthomonas oryzae pathovars had an Xac type of LPS gene cluster. Our finding of interstrain variation in two major xanthomonad pathogens infecting different hosts suggests that the LPS locus in plant pathogenic bacteria, as in animal pathogens, is under intense diversifying selection.
The human chromosomes 2q, 7, 12q and 17q show extensive intra-genomic homology, containing duplicate, triplicate and quadruplicate paralogous regions centered on the HOX gene clusters. The fact that two or more representatives of different gene families are linked with HOX clusters is taken as evidence that these paralogous gene sets might have arisen from a single chromosomal segment through block or whole chromosome duplication events. This would imply that the constituent genes including the HOX clusters reflect the architecture of a single ancestral block (before vertebrate origin) where all of these genes were linked in a single copy.
In the present study we have employed the currently available set of protein data for a wide variety of vertebrate and invertebrate genomes to analyze the phylogenetic history of 11 multigene families with three or more of their representatives linked to human HOX clusters. A topology comparison approach revealed four discrete co-duplicated groups: group 1 involves the genes from GLI, HH, INHB, IGFBP (cluster-1), and SLC4A families; group 2 involves ERBB, ZNFN1A, and IGFBP (cluster-2) gene families; group 3 involves the HOX clusters and the SP gene family; group 4 involves the integrin beta chain and myosin light chain families. The distinct genes within each co-duplicated group share the same evolutionary history and are duplicated in concert with each other, while the constituent genes of two different co-duplicated groups may not share their evolutionary history and may not have duplicated simultaneously.
We conclude that co-duplicated groups may themselves be remnants of ancient small-scale duplications (involving chromosomal segments or gene-clusters) which occurred at different time points during chordate evolution. In contrast, the recent combination of genes from distinct co-duplicated groups on different chromosomal regions (human chromosomes 2q, 7, 12q, and 17q) is probably the outcome of subsequent rearrangement of genomic segments, including syntenic groups of genes.
Molecular sequence data have become the standard in modern-day phylogenetics. In particular, several long-standing questions of mammalian evolutionary history have been recently resolved thanks to the use of molecular characters. Yet, most studies have focused on only a handful of standard markers. The availability of an ever-increasing number of whole genome sequences is a gold mine for modern systematics. Genomic data now provide the opportunity to select new markers that are potentially relevant for further resolving branches of the mammalian phylogenetic tree at various taxonomic levels.
The EnsEMBL database was used to determine a set of orthologous genes from 12 available complete mammalian genomes. As targets for possible amplification and sequencing in additional taxa, more than 3,000 exons of length > 400 bp have been selected, among which 118, 368, 608, and 674 are respectively retrieved for 12, 11, 10, and 9 species. A bioinformatic pipeline has been developed to provide evolutionary descriptors for these candidate markers in order to assess their potential phylogenetic utility. The resulting OrthoMaM (Orthologous Mammalian Markers) database can be queried and alignments can be downloaded through a dedicated web interface.
The importance of marker choice in phylogenetic studies has long been stressed. Our database, centered on complete genome information, now makes it possible to select promising markers for a given phylogenetic question or systematic framework by querying a number of evolutionary descriptors. The usefulness of the database is illustrated with two biological examples. First, two potentially useful markers were identified for rodent systematics based on relevant evolutionary parameters and sequenced in additional species. Second, a complete, gapless 94 kb supermatrix of 118 orthologous exons was assembled for 12 mammals. Phylogenetic analyses using probabilistic methods unambiguously supported the new placental phylogeny by retrieving the monophyly of Glires, Euarchontoglires, Laurasiatheria, and Boreoeutheria. Muroid rodents thus do not represent a basal placental lineage, as was mistakenly reasserted in some recent phylogenomic analyses based on fewer taxa. We expect the OrthoMaM database to be useful for further resolving the phylogenetic tree of placental mammals and for better understanding the evolutionary dynamics of their genomes, i.e., the forces that shaped coding sequences in terms of selective constraints.
A major cornerstone of evolutionary biology theory is the explanation of the emergence of cooperation in communities of selfish individuals. There is an unexplained tendency in the plant and animal world – with examples from alpine plants, worms, fish, mole-rats, monkeys and humans – for cooperation to flourish where the environment is more adverse (harsher) or more unpredictable.
Using mathematical arguments and computer simulations we show that in more adverse environments individuals perceive their resources to be more unpredictable, and that this unpredictability favours cooperation. First, we show analytically that in a more adverse environment the individual experiences greater perceived uncertainty. Second, we show through a simulation study that more perceived uncertainty implies a higher level of cooperation in communities of selfish individuals.
This study captures the essential features of the natural examples: the positive impact of resource adversity or uncertainty on cooperation. These newly discovered connections between environmental adversity, uncertainty and cooperation help to explain the emergence and evolution of cooperation in animal and human societies.
Although the patterns of co-substitutions in RNA are now well characterized, detection of coevolving positions in proteins remains a difficult task. It has been recognized that the signal is typically weak, because (i) amino acids are characterized by various biochemical properties, so that distinct amino acid changes are not functionally equivalent, and (ii) a given mutation can be compensated by more than one mutation, at more than one position.
We present a new method based on phylogenetic substitution mapping. The two above-mentioned problems are addressed by (i) the introduction of a weighted mapping, which accounts for the biochemical effects (volume, polarity, charge) of amino-acid changes, (ii) the use of a clustering approach to detect groups of coevolving sites of virtually any size, and (iii) the distinction between biochemical compensation and other coevolutionary mechanisms. We apply this methodology to a previously studied data set of bacterial ribosomal RNA, and to three protein data sets (myoglobin of vertebrates, S-locus Receptor Kinase and Methionine Amino-Peptidase).
We succeed in detecting groups of sites which significantly depart from the null hypothesis of independence. Group sizes range from pairs to groups of size ≃ 10, depending on the substitution weights used. The structural and functional relevance of these groups of sites is assessed, and the various evolutionary processes potentially generating correlated substitution patterns are discussed.
Arthropods are infected by a wide diversity of maternally transmitted microbes. Some of these manipulate host reproduction to facilitate population invasion and persistence. Such parasites transmit vertically on an ecological timescale, but rare horizontal transmission events have permitted colonisation of new species. Here we report the first systematic investigation into the influence of the phylogenetic distance between arthropod species on the potential for reproductive parasite interspecific transfer.
We employed a well characterised reproductive parasite, a coccinellid beetle male-killer, and artificially injected the bacterium into a series of novel species. Genetic distances between native and novel hosts were ascertained by sequencing sections of the 16S and 12S mitochondrial rDNA genes. The bacterium colonised host tissues and transmitted vertically in all cases tested. However, whilst transmission efficiency was perfect within the native genus, this was reduced following some transfers of greater phylogenetic distance. The bacterium's ability to distort offspring sex ratios in novel hosts was negatively correlated with the genetic distance of transfers. Male-killing occurred with full penetrance following within-genus transfers; but whilst sex ratio distortion generally occurred, it was incomplete in more distantly related species.
This study indicates that the natural interspecific transmission of reproductive parasites might be constrained by their ability to tolerate the physiology or genetics of novel hosts. Our data suggest that horizontal transfers are more likely between closely related species. Successful bacterial transfer across large phylogenetic distances may require rapid adaptive evolution in the new species. This finding has applied relevance regarding selection of suitable bacteria to manipulate insect pest and vector populations by symbiont gene-drive systems.
Comparison of completely sequenced microbial genomes has revealed how fluid these genomes are. Detecting synteny blocks requires reliable methods for determining the orthologs among the whole set of homologs detected by exhaustive comparisons between each pair of completely sequenced genomes. This is a complex and difficult problem in the field of comparative genomics, but solving it will help to better understand the way prokaryotic genomes are evolving.
We have developed a suite of programs that automate three essential steps to study conservation of gene order, and validated them with a set of 107 bacteria and archaea that cover the majority of the prokaryotic taxonomic space. We identified the whole set of shared homologs between two or more species and computed the evolutionary distance separating each pair of homologs. We applied two strategies to extract from the set of homologs a collection of valid orthologs shared by at least two genomes. The first computes the Reciprocal Smallest Distance (RSD) using the PAM distances separating pairs of homologs. The second method groups homologs in families and reconstructs each family's evolutionary tree, distinguishing bona fide orthologs as well as paralogs created after the last speciation event. Although the phylogenetic tree method often succeeds where RSD fails, the reverse could occasionally be true. Accordingly, we used the data obtained with either method, or their intersection, to enumerate the orthologs that are adjacent in each pair of genomes, the Positional Orthologous Genes (POGs), and to further study their properties. Once all these synteny blocks had been detected, we showed that POGs are subject to more evolutionary constraints than orthologs outside synteny groups, whatever the taxonomic distance separating the compared organisms.
The suite of programs described in this paper allows a reliable detection of orthologs and is useful for evaluating gene order conservation in prokaryotes whatever their taxonomic distance. Thus, our approach will facilitate the rapid identification of POGs in the next few years, as we expect to be inundated with thousands of completely sequenced microbial genomes.
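The reciprocal step of the RSD strategy can be sketched as follows. This is a minimal illustration that assumes the pairwise evolutionary distances between homologs have already been computed; it is not the authors' program, and the function name, gene labels and input layout are hypothetical.

```python
def reciprocal_smallest_distance(
    distances: dict[tuple[str, str], float],
) -> list[tuple[str, str]]:
    """Return gene pairs that are each other's closest homolog across two genomes.

    distances maps (gene_in_genome_A, gene_in_genome_B) to an evolutionary distance
    (e.g. a PAM distance). Ties are resolved in favour of the first pair encountered.
    """
    best_for_a: dict[str, tuple[str, float]] = {}
    best_for_b: dict[str, tuple[str, float]] = {}
    for (gene_a, gene_b), dist in distances.items():
        if gene_a not in best_for_a or dist < best_for_a[gene_a][1]:
            best_for_a[gene_a] = (gene_b, dist)
        if gene_b not in best_for_b or dist < best_for_b[gene_b][1]:
            best_for_b[gene_b] = (gene_a, dist)
    # Keep a pair only when the smallest distance is reciprocal.
    return sorted(
        (gene_a, gene_b)
        for gene_a, (gene_b, _) in best_for_a.items()
        if best_for_b[gene_b][0] == gene_a
    )
```

In this sketch a gene whose closest homolog prefers a different partner is left unpaired, which is one situation where a tree-based method might still recover the orthology.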
Today it is widely accepted that plastids are of cyanobacterial origin. During their evolutionary integration into the metabolic and regulatory networks of the host cell, the engulfed cyanobacteria lost their independence. This process was paralleled by a massive gene transfer from symbiont to the host nucleus, which required the development of a retrograde protein translocation system to ensure plastid functionality. Such a system includes specific targeting signals of the proteins needed for the function of the plastid and membrane-bound machineries performing the transfer of these proteins across the envelope membranes. At present, most information on protein translocation is obtained by the analysis of land plants. However, the analysis of protein import into the primitive plastids of glaucocystophyte algae revealed distinct features, establishing this system as a tool to understand the evolutionary development of translocation systems. Here, bacterial outer membrane proteins of the Omp85 family have recently been discussed as evolutionary seeds for the development of translocation systems.
To further explore the initial mode of protein translocation, the observed phenylalanine dependence for protein translocation into glaucophyte plastids was pursued in detail. We document that the phenylalanine indeed has an impact on both lipid binding and binding to proteoliposomes hosting an Omp85 homologue. Comparison to established import experiments, however, unveiled a major importance of the phenylalanine for recognition by Omp85. This finding is placed into the context of the evolutionary development of the plastid translocon.
The phenylalanine in the N-terminal domain serves as a prerequisite for protein translocation across the outer membrane assisted by a "primitive" translocon. This amino acid appears to be optimized for specifically targeting the Omp85 protein without enforcing aggregation on the membrane surface. The phenylalanine has subsequently been lost in the transit sequence, but can be found at the C-terminal position of the translocating pore. Thereby, the current hypothesis of Omp85 being the prokaryotic contribution to the ancestral Toc translocon can be supported.
Theory and artificial selection experiments show that recombination can promote adaptation by enhancing the efficacy of natural selection, but the extent to which recombination affects levels of adaptation across the genome is still an open question. Because patterns of molecular evolution reflect long-term processes of mutation and selection in nature, interactions between recombination rate and genetic differentiation between species can be used to test the benefits of recombination. However, this approach faces a major difficulty: different evolutionary processes (i.e. negative versus positive selection) produce opposing relationships between recombination rate and genetic divergence, and obscure patterns predicted by individual benefits of recombination.
We use a combination of polymorphism and genomic data from the yeast Saccharomyces cerevisiae to infer the relative importance of nearly-neutral (i.e. slightly deleterious) evolution in different gene categories. For genes with high opportunities for slightly deleterious substitution, recombination substantially reduces the rate of molecular evolution, whereas divergence in genes with little opportunity for slightly deleterious substitution is not strongly affected by recombination.
These patterns indicate that adaptation throughout the genome can be strongly influenced by each gene's recombinational environment, and suggest substantial long-term fitness benefits of enhanced purifying selection associated with sexual recombination.
Bird genomes have very different compositional structure compared with other warm-blooded animals. The variation in the base skew rules in the vertebrate genomes remains puzzling, but it must relate somehow to large-scale genome evolution. Current research is inclined to relate base skew with mutations and their fixation. Here we wish to explore base skew correlations in bird genomes, to develop methods for displaying and quantifying such correlations at different scales, and to discuss possible explanations for the peculiarities of the bird genomes in skew correlation.
We have developed a method called Base Skew Double Triangle (BSDT) for exhibiting the genome-scale change of AT/CG skew as a two-dimensional square picture, showing base skews at many scales simultaneously in a single image. By this method we found that most chicken chromosomes have high AT/CG skew correlation (symmetry in 2D picture), except for some microchromosomes. No other organisms studied (18 species) show such high skew correlations. This visualized high correlation was validated by three kinds of quantitative calculations with overlapping and non-overlapping windows, all indicating that chicken and birds in general have a special genome structure. Similar features were also found in some of the mammal genomes, but clearly much weaker than in chickens. We presume that the skew correlation feature evolved near the time that birds separated from other vertebrate lineages. When we eliminated the repeat sequences from the genomes, the correlation between AT and CG skews increased for some mammal genomes, but was still clearly lower than in chickens.
Our results suggest that BSDT is an expressive visualization method for AT and CG skew and that it enabled the discovery of the very high skew correlation in bird genomes; this peculiarity is worth further study. Computational analysis indicated that this correlation might be a compositional characteristic, not only present in chickens, but also retained or developed in some mammals during evolution. Special aspects of bird metabolism related to, e.g., flight may be the reason why birds evolved or retained the skew correlation. Our analysis also indicated that repetitive DNA sequence elements need to be taken into account in studying the evolution of the correlation between AT and CG skews.
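A windowed skew correlation of the kind used for validation can be sketched as below. This is not the BSDT visualization itself; it assumes the conventional definitions AT skew = (A − T)/(A + T) and CG skew = (C − G)/(C + G), non-overlapping windows, and a Pearson correlation, none of which are specified in the text above. All function names are hypothetical.

```python
from math import sqrt


def skew(x: int, y: int) -> float:
    """Normalized difference (x - y) / (x + y); defined as 0.0 when both counts are zero."""
    return (x - y) / (x + y) if (x + y) else 0.0


def window_skews(sequence: str, window: int) -> tuple[list[float], list[float]]:
    """AT and CG skews in consecutive non-overlapping windows (a trailing partial window is dropped)."""
    seq = sequence.upper()
    at, cg = [], []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        at.append(skew(chunk.count("A"), chunk.count("T")))
        cg.append(skew(chunk.count("C"), chunk.count("G")))
    return at, cg


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two series of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for a constant series")
    return cov / sqrt(var_x * var_y)
```

Repeating the calculation over a range of window sizes would give a rough numerical counterpart to the multi-scale view that the image provides.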
At the last glacial maximum, Fennoscandia was covered by an ice sheet while the tundra occupied most of the rest of northern Eurasia. More or less disjunct refugial populations of plants were dispersed in southern Europe, often trapped between mountain ranges and seas. Genetic and paleobotanical evidences indicate that these populations have contributed much to Holocene recolonization of more northern latitudes. Less supportive evidence has been found for the existence of glacial populations located closer to the ice margin. Scots pine (Pinus sylvestris L.) is a nordic conifer with a wide natural range covering much of Eurasia. Fractures in its extant genetic structure might be indicative of glacial vicariance and how different refugia contributed to the current distribution at the continental level. The population structure of Scots pine was investigated on much of its Eurasian natural range using maternally inherited mitochondrial DNA polymorphisms.
A novel polymorphic region of the Scots pine mitochondrial genome has been identified, the intron 1 of nad7, with three variants caused by insertions-deletions. From 986 trees distributed among 54 populations, four distinct multi-locus mitochondrial haplotypes (mitotypes) were detected based on the three nad7 intron 1 haplotypes and two previously reported size variants for nad1 intron B/C. Population differentiation was high (GST = 0.657) and the distribution of the mitotypes was geographically highly structured, suggesting at least four genetically distinct ancestral lineages. A cosmopolitan lineage was widely distributed in much of Europe throughout eastern Asia. A previously reported lineage limited to the Iberian Peninsula was confirmed. A new geographically restricted lineage was found confined to Asia Minor. A new lineage was restricted to more northern latitudes in northeastern Europe and the Baltic region.
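For readers unfamiliar with the GST statistic quoted above, the sketch below computes it from haplotype counts using the basic definition GST = (HT − HS)/HT, where HS is the mean within-population gene diversity and HT is the diversity of the pooled (mean) haplotype frequencies. It assumes unweighted population means and applies no sample-size correction; the abstract does not say which estimator was used, and the function names are hypothetical.

```python
def gene_diversity(freqs: list[float]) -> float:
    """Gene diversity H = 1 - sum(p_i^2) for a set of haplotype frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def gst(populations: list[dict[str, int]]) -> float:
    """G_ST from per-population haplotype counts (unweighted, uncorrected sketch)."""
    if not populations:
        raise ValueError("at least one population is required")
    haplotypes = sorted({h for pop in populations for h in pop})
    freq_rows = []
    for pop in populations:
        total = sum(pop.values())
        if total == 0:
            raise ValueError("population with no sampled individuals")
        freq_rows.append([pop.get(h, 0) / total for h in haplotypes])
    # H_S: average diversity within populations.
    h_s = sum(gene_diversity(row) for row in freq_rows) / len(freq_rows)
    # H_T: diversity computed from the mean haplotype frequencies across populations.
    mean_freqs = [sum(col) / len(freq_rows) for col in zip(*freq_rows)]
    h_t = gene_diversity(mean_freqs)
    if h_t == 0:
        raise ValueError("G_ST undefined when total diversity is zero")
    return (h_t - h_s) / h_t
```

Under this definition, populations fixed for different mitotypes give a value of 1, and populations with identical mitotype frequencies give 0.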
The contribution of the various ancestral lineages to the current distribution of Scots pine was asymmetric and extant endemism reflected the presence of large geographic barriers to migration. The results suggest a complex biogeographical history with glacial refugia shared with temperate plant species in southern European peninsulas and Asia Minor, and a genetically distinct glacial population located farther north. These results confirm recent observations for cold-tolerant species about the possible existence of refugial populations at mid-northern latitudes contributing significantly to the recolonization of northern Europe. Thus, Eurasian populations of nordic plant species might not be as genetically homogeneous as assumed by simply considering them as offshoots of glacial populations located in southern peninsulas. As such, they might have evolved distinctive genetic adaptations during glacial vicariance, worth evaluating and considering for conservation.
Y-chromosomal haplogroup (Y-HG) Q is suggested to have originated in Asia and to represent a recent founder paternal Native American radiation into the Americas. This group is delineated into Q1, Q2 and Q3 subgroups defined by biallelic markers M120, M25/M143 and M3, respectively. Recently, a novel subgroup Q4, defined by biallelic marker M346, was identified; it represented HG Q (0.41%, 3/728) in the Indian population. Given the scant details of HG Q in Asia, especially India, it was pertinent to explore the status of Y-HG Q in the Indian population to gain insight into the extent of diversity within this region.
We observed 15/630 (2.38%) Y-HG Q individuals in India with an ancestral state at the M120, M25, M3 and M346 markers, indicating an absence of the already known Q1, Q2, Q3 and Q4 sub-haplogroups. Interestingly, we further observed a novel 4 bp deletion/insertion polymorphism (ss4 bp, rs41352448) at position 72,314 of the human arylsulfatase D pseudogene, defining a novel sub-lineage Q5 (in 5/15 individuals, i.e., 33.3% of the observed Y-HG Q) with distributions independent of the social, cultural, linguistic and geographical affiliations in India.
The study adds another sublineage, Q5, to the existing arrangement of Y-HG Q in the literature. It was quite interesting to observe an ancestral state Q* and a novel sub-branch Q5, not reported elsewhere, in the Indian subcontinent, though at low frequency. A novel subgroup Q4, identified recently, is also restricted to the Indian subcontinent. The most plausible explanation for these observations could be an ancestral migration of individuals bearing the ancestral lineage Q* to the Indian subcontinent, followed by a later autochthonous differentiation into the Q4 and Q5 sublineages. However, other explanations, such as the presence of both sub-haplogroups (Q4 and Q5) in the ancestral migrants or recent migrations from Central Asia, cannot be ruled out until the distribution and diversity of these subgroups are explored extensively in Central Asia and other regions.
Along the chromosome of the obligate intracellular bacterium Protochlamydia amoebophila UWE25, we recently described a genomic island, Pam100G. It contains a tra unit likely involved in conjugative DNA transfer and lgrE, a 5.6-kb gene similar to five others of P. amoebophila: lgrA to lgrD and lgrF. We describe here the structure, regulation and evolution of these proteins, termed LGRs because they are encoded by "Large G+C-Rich" genes.
No homologs to the whole protein sequence of LGRs were found in other organisms. Phylogenetic analyses suggest that serial duplications producing the six LGRs occurred relatively recently and nucleotide usage analyses show that lgrB, lgrE and lgrF were relocated on the chromosome. The C-terminal part of LGRs is homologous to Leucine-Rich Repeat domains (LRRs). Defined by a cumulative alignment score, the 5 to 18 concatenated octacosapeptidic (28-meric) LRRs of LGRs all present a predicted α-helix conformation. Their closest homologs are the 28-residue RI-like LRRs of mammalian NODs and the 24-mers of some Ralstonia and Legionella proteins. Interestingly, lgrE, which is present on Pam100G like the tra operon, exhibits Pfam domains related to DNA metabolism.
Comparison of the LRRs enables us to propose a parsimonious evolutionary scenario for these domains driven by adjacent concatenations of LRRs. Our model, established on bacterial LRRs, can be challenged in eukaryotic proteins carrying less conserved LRRs, such as NOD proteins and Toll-like receptors.
The hydrogenosomes of the anaerobic ciliate Nyctotherus ovalis show how mitochondria can evolve into hydrogenosomes because they possess a mitochondrial genome and parts of an electron-transport chain on the one hand, and a hydrogenase on the other hand. The hydrogenase permits direct reoxidation of NADH because it consists of a [FeFe] hydrogenase module that is fused to two modules, which are homologous to the 24 kDa and the 51 kDa subunits of a mitochondrial complex I. The [FeFe] hydrogenase belongs to a clade of hydrogenases that are different from well-known eukaryotic hydrogenases. The 24 kDa and the 51 kDa modules are most closely related to homologous modules that function in bacterial [NiFe] hydrogenases. Paralogous, mitochondrial 24 kDa and 51 kDa modules function in the mitochondrial complex I in N. ovalis. The different hydrogenase modules have been fused to form a polyprotein that is targeted into the hydrogenosome. The hydrogenase and its associated modules have most likely been acquired by independent lateral gene transfer from different sources. This scenario for a concerted lateral gene transfer is in agreement with the evolution of the hydrogenosome from a genuine ciliate mitochondrion by evolutionary tinkering.
Distance matrix methods constitute a major family of phylogenetic estimation methods, and the minimum evolution (ME) principle (aiming at recovering the phylogeny with shortest length) is one of the most commonly used optimality criteria for estimating phylogenetic trees. The major difficulty for its application is that the number of possible phylogenies grows exponentially with the number of taxa analyzed and the minimum evolution principle is known to belong to the NP-hard class of problems.
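To make the growth of the search space concrete, the short sketch below counts the unrooted, fully resolved (binary) tree topologies for n labeled taxa using the standard double-factorial formula (2n - 5)!! for n >= 3. The function name is hypothetical, and the restriction to unrooted binary trees is an assumption about which tree space is meant.

```python
def num_unrooted_binary_trees(n_taxa):
    """Number of unrooted binary tree topologies on n labeled taxa: (2n - 5)!!."""
    if n_taxa < 3:
        raise ValueError("the formula applies to three or more taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5 (empty product for n = 3).
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


if __name__ == "__main__":
    for n in (4, 5, 10, 20):
        print(n, num_unrooted_binary_trees(n))
```

Already at 10 taxa there are more than two million topologies, which is why exhaustive evaluation of tree length is impractical and heuristic searches are used instead.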
In this paper, we introduce an Ant Colony Optimization (ACO) algorithm to estimate phylogenies under the minimum evolution principle. ACO is an optimization technique inspired by the foraging behavior of real ant colonies. This behavior is exploited in artificial ant colonies for the search of approximate solutions to discrete optimization problems.
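The two generic ingredients of an ACO scheme, pheromone-weighted choice and pheromone update with evaporation, can be sketched as follows. This is a textbook-style illustration and not the algorithm of the paper: the function names, the parameters alpha, beta and rho, and the idea of using something like an inverse tree length as the deposited quality are all assumptions made for the example.

```python
def choice_probabilities(pheromone, heuristic, alpha=1.0, beta=2.0):
    """Probability of picking each candidate, proportional to
    pheromone**alpha * heuristic**beta (the usual ACO weighting)."""
    weights = [(t ** alpha) * (h ** beta) for t, h in zip(pheromone, heuristic)]
    total = sum(weights)
    return [w / total for w in weights]


def update_pheromone(pheromone, chosen, quality, rho=0.1):
    """Evaporate every trail by a fraction rho, then deposit `quality`
    on the candidates that were part of the evaluated solution."""
    updated = [(1.0 - rho) * t for t in pheromone]
    for i in chosen:
        updated[i] += quality
    return updated


if __name__ == "__main__":
    trails = [1.0, 1.0, 1.0]
    desirability = [1.0, 2.0, 1.0]  # hypothetical heuristic values
    print(choice_probabilities(trails, desirability))
    # Hypothetical: candidate 1 was used in a good solution of quality 0.5.
    print(update_pheromone(trails, chosen=[1], quality=0.5))
```

Evaporation keeps early, possibly poor choices from dominating, while deposits bias later ants toward components of shorter trees; how candidates and solution quality are defined for tree building is specific to each ACO design.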
We show that the ACO algorithm is potentially competitive in comparison with state-of-the-art algorithms for the minimum evolution principle. This is the first application of an ACO algorithm to the phylogenetic estimation problem.
The diversity of parasites attacking a host varies substantially among different host species. Understanding the factors that explain these patterns of parasite diversity is critical to identifying the ecological principles underlying biodiversity. Seabirds (Charadriiformes, Pelecaniformes and Procellariiformes) and their ectoparasitic lice (Insecta: Phthiraptera) are ideal model groups in which to study correlates of parasite species richness. We evaluated the relative importance of morphological (body size, body weight, wingspan, bill length), life-history (longevity, clutch size), ecological (population size, geographical range) and behavioural (diving versus non-diving) variables as predictors of louse diversity on 413 seabird host species. Diversity was measured at the level of louse suborder, genus, and species, and uneven sampling of hosts was controlled for using literature citations as a proxy for sampling effort.
The only variable consistently correlated with louse diversity was host population size and, to a lesser extent, geographic range. Other variables, such as clutch size, longevity, and morphological and behavioural variables including body mass, showed inconsistent patterns that depended on the method of analysis.
The comparative analysis presented herein is (to our knowledge) the first to test correlates of parasite species richness in seabirds. We believe that the comparative data and phylogeny provide a valuable framework for testing future evolutionary hypotheses relating to the diversity and distribution of parasites on seabirds.
A single measles vaccination provides lifelong protection. No antigenic variants that escape immunity have been observed. By contrast, influenza continually evolves new antigenic variants, and the vaccine has to be updated frequently with new strains. Both measles and influenza are RNA viruses with high mutation rates, so the mutation rate alone cannot explain the differences in antigenic variability.
We develop a new hypothesis to explain antigenic stasis versus change. We first note that the antigenically static viruses tend to have high reproductive rates and to concentrate infection in children, whereas antigenically variable viruses such as influenza tend to spread more widely across age classes. We argue that, for pathogens in a naive host population that spread more rapidly in younger individuals than in older individuals, natural selection weights more heavily a rise in reproductive rate. By contrast, pathogens that spread more readily among older individuals gain more by antigenic escape, so natural selection weights more heavily antigenic mutability.
These divergent selective pressures on reproductive rate and antigenic mutability may explain some of the observed differences between pathogens in age-class bias, reproductive rate, and antigenic variation.
Codon usage bias (CUB), the uneven use of synonymous codons, is a ubiquitous observation in virtually all organisms examined. The pattern of codon usage is generally similar among closely related species, but differs significantly among distantly related organisms, e.g., bacteria, yeast, and Drosophila. Several explanations for CUB have been offered and some have been supported by observations and experiments, although a thorough understanding of the evolutionary forces (random drift, mutation bias, and selection) and their relative importance remains to be determined. The recently available complete genome DNA sequences of twelve phylogenetically defined species of Drosophila offer a hitherto unprecedented opportunity to examine these problems. We report here the patterns of codon usage in the twelve species and offer insights on possible evolutionary forces involved.
(1) Codon usage is quite stable across 11/12 of the species: G- and especially C-ending codons are used most frequently, thus defining the preferred codons. (2) The only amino acid that changes in preferred codon is serine, with six species of the melanogaster group favoring TCC while the other species, particularly subgenus Drosophila species, favor AGC. (3) D. willistoni is an exception to these generalizations in having a shifted codon usage for seven amino acids toward A/T in the wobble position. (4) Amino acids differ in their contribution to overall CUB, Leu having the greatest and Asp the least. (5) Among two-fold degenerate amino acids, those with A/G-ending codons show more selection on codon usage than those with T/C-ending codons. (6) Among the different chromosome arms or elements, genes on the non-recombining element F (dot chromosome) have the least CUB, while genes on element A (X chromosome) have the most. (7) Introns indicate that mutation bias in all species is approximately 2:1, AT:GC, the opposite of codon usage bias. (8) There is also evidence for some overall regional bias in base composition that may influence codon usage.
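One common way to quantify which synonymous codon is preferred is relative synonymous codon usage (RSCU): the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below applies it to the six serine codons. The abstract does not state that RSCU was the measure used, so the choice of statistic, the function names, and the toy sequence are assumptions for illustration.

```python
from collections import Counter

# The six serine codons in the standard genetic code.
SERINE_CODONS = ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"]


def codon_counts(cds):
    """Count in-frame codons of a coding sequence, ignoring a trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def rscu(counts, family):
    """RSCU for each codon in a synonymous family: observed / (family total / family size)."""
    total = sum(counts.get(codon, 0) for codon in family)
    if total == 0:
        return {codon: 0.0 for codon in family}
    expected = total / len(family)
    return {codon: counts.get(codon, 0) / expected for codon in family}


if __name__ == "__main__":
    # Toy coding sequence, not taken from any Drosophila genome.
    counts = codon_counts("TCCTCCAGCGCT")
    print(rscu(counts, SERINE_CODONS))
```

An RSCU above 1 marks a codon used more often than expected under equal usage, so comparing the TCC and AGC values across species would show the kind of shift in preferred serine codon described in point (2).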
Overall, these results suggest that natural selection has acted on codon usage in the genus Drosophila, at least often enough to leave a footprint of selection in modern genomes. However, there is evidence in the data that random forces (drift and mutation) have also left patterns in the data, especially in genes under weak selection for codon usage, for example genes in regions of low recombination. The documentation of codon usage patterns in each of these twelve genomes also aids in ongoing annotation efforts.