1.  Evolution of DNA ligases of Nucleo-Cytoplasmic Large DNA viruses of eukaryotes: a case of hidden complexity 
Biology Direct  2009;4:51.
Eukaryotic Nucleo-Cytoplasmic Large DNA Viruses (NCLDV) encode most if not all of the enzymes involved in their DNA replication. It has been inferred that genes for these enzymes were already present in the last common ancestor of the NCLDV. However, the details of the evolution of these genes that bear on the complexity of the putative ancestral NCLDV and on the evolutionary relationships between viruses and their hosts are not well understood.
Phylogenetic analysis of the ATP-dependent and NAD-dependent DNA ligases encoded by the NCLDV reveals an unexpectedly complex evolutionary history. The NAD-dependent ligases are encoded only by a minority of NCLDV (including mimiviruses, some iridoviruses and entomopoxviruses) but phylogenetic analysis clearly indicated that all viral NAD-dependent ligases are monophyletic. Combined with the topology of the NCLDV tree derived by consensus of trees for universally conserved genes, this monophyly suggests that the enzyme was represented in the ancestral NCLDV. Phylogenetic analysis of ATP-dependent ligases that are encoded by chordopoxviruses, most of the phycodnaviruses and Marseillevirus failed to demonstrate monophyly and instead revealed an unexpectedly complex evolutionary trajectory. The ligases of the majority of phycodnaviruses and Marseillevirus seem to have evolved from bacteriophage or bacterial homologs; the ligase of one phycodnavirus, Emiliania huxleyi virus, belongs to the eukaryotic DNA ligase I branch; and ligases of chordopoxviruses unequivocally cluster with eukaryotic DNA ligase III.
Examination of phyletic patterns and phylogenetic analysis of DNA ligases of the NCLDV suggest that the common ancestor of the extant NCLDV encoded an NAD-dependent ligase that most likely was acquired from a bacteriophage at the early stages of evolution of eukaryotes. By contrast, ATP-dependent ligases from different prokaryotic and eukaryotic sources displaced the ancestral NAD-dependent ligase at different stages of subsequent evolution. These findings emphasize complex routes of viral evolution that become apparent through detailed phylogenomic analysis but not necessarily in reconstructions based on phyletic patterns of genes.
This article was reviewed by: Patrick Forterre, George V. Shpakovski, and Igor B. Zhulin.
PMCID: PMC2806865  PMID: 20021668
2.  Mass action models versus the Hill model: An analysis of tetrameric human thymidine kinase 1 positive cooperativity 
Biology Direct  2009;4:49.
The Hill coefficient characterizes the extent to which an enzyme exhibits positive or negative cooperativity, but it provides no information regarding the mechanism of cooperativity. In contrast, models based on the equilibrium concept of mass action can suggest mechanisms of cooperativity, but there are often many such models, and many of them have too many parameters.
Mass action models of tetrameric human thymidine kinase 1 (TK1) activity data were formed as pairs of plausible hypotheses that per site activities and binary dissociation constants are equal within contiguous stretches of the number of substrates bound. Of these, six 3-parameter models were fitted to 5 different datasets. Akaike's Information Criterion was then used to form model probability weighted averages. The literature average of the 5 model averages was K = (0.85, 0.69, 0.65, 0.51) μM and k = (3.3, 3.9, 4.1, 4.1) sec⁻¹ where K and k are per-site binary dissociation constants and activities indexed by the number of substrates bound to the tetrameric enzyme.
The TK1 model presented supports both K and k positive cooperativity. Three-parameter mass action models can and should replace the 3-parameter Hill model.
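The model-probability weighting mentioned in this abstract can be illustrated with the standard Akaike-weight formula. This is a minimal sketch, not the authors' code: the AIC scores and per-model estimates below are hypothetical placeholders, and the function names are introduced here for illustration only.

```python
import math

def akaike_weights(aic_values):
    """Convert AIC scores into model probabilities (Akaike weights)."""
    best = min(aic_values)
    # Relative likelihood of each model given its AIC difference from the best.
    relative = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(relative)
    return [r / total for r in relative]

def model_average(estimates, weights):
    """Weighted average of one parameter's estimates across models."""
    return sum(w * e for w, e in zip(weights, estimates))

# Hypothetical values (assumptions, not taken from the paper):
aics = [100.0, 102.0, 110.0]      # AIC of three candidate models
k_estimates = [0.80, 0.90, 1.20]  # each model's estimate of one parameter

weights = akaike_weights(aics)
averaged = model_average(k_estimates, weights)
```

Models with lower AIC receive more weight, so the averaged estimate is pulled towards the best-supported models without discarding the others.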
This article was reviewed by Philip Hahnfeldt, Fangping Mu (nominated by William Hlavacek) and Rainer Sachs.
PMCID: PMC2799445  PMID: 20003201
3.  Automated mass action model space generation and analysis methods for two-reactant combinatorially complex equilibriums: An analysis of ATP-induced ribonucleotide reductase R1 hexamerization data 
Biology Direct  2009;4:50.
Ribonucleotide reductase is the main control point of dNTP production. It has two subunits, R1, and R2 or p53R2. R1 has 5 possible catalytic site states (empty or filled with 1 of 4 NDPs), 5 possible s-site states (empty or filled with ATP, dATP, dTTP or dGTP), 3 possible a-site states (empty or filled with ATP or dATP), perhaps two possible h-site states (empty or filled with ATP), and all of this is folded into an R1 monomer-dimer-tetramer-hexamer equilibrium where R1 j-mers can be bound by variable numbers of R2 or p53R2 dimers. Trillions of RNR complexes are possible as a result. The problem is to determine which are needed in models to explain available data. This problem is intractable for 10 reactants, but it can be solved for 2 and is here for R1 and ATP.
Thousands of ATP-induced R1 hexamerization models with up to three (s, a and h) ATP binding sites per R1 subunit were automatically generated via hypotheses that complete dissociation constants are infinite and/or that binary dissociation constants are equal. To limit the model space size, it was assumed that s-sites are always filled in oligomers and never filled in monomers, and to interpret model terms it was assumed that a-sites fill before h-sites. The models were fitted to published dynamic light scattering data. As the lowest Akaike Information Criterion (AIC) of the 3-parameter models was greater than the lowest of the 2-parameter models, only models with up to 3 parameters were fitted. Models with sums of squared errors less than twice the minimum were then partitioned into two groups: those that contained no occupied h-site terms (508 models) and those that contained at least one (1580 models). Normalized AIC densities of these two groups of models differed significantly in favor of models that did not include an h-site term (Kolmogorov-Smirnov p < 1 × 10⁻¹⁵); consistent with this, 28 of the top 30 models (ranked by AICs) did not include an h-site term and 28/30 > 508/2088 with p < 2 × 10⁻¹⁵. Finally, 99 of the 2088 models did not have any terms with ATP/R1 ratios >1.5, but of the top 30, there were 14 such models (14/30 > 99/2088 with p < 3 × 10⁻¹⁶), i.e. the existence of R1 hexamers with >3 a-sites occupied by ATP is also not supported by this dataset.
The analysis presented suggests that three a-sites may not be occupied by ATP in R1 hexamers under the conditions of the data analyzed. If a-sites fill before h-sites, this implies that the dataset analyzed can be explained without the existence of an h-site.
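The abstract compares the share of h-site-free models among the top 30 (28/30) with their share in the whole model set (508/2088), but does not name the test behind the reported p-value. One simple way to examine such a comparison is an exact one-sided binomial tail; that choice is an assumption made here for illustration, and the function name is hypothetical.

```python
from math import comb

def binomial_upper_tail(successes, trials, p):
    """P(X >= successes) for X ~ Binomial(trials, p), summed exactly."""
    return sum(
        comb(trials, k) * p**k * (1.0 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# Counts taken from the abstract: 508 of 2088 models had no h-site term,
# and 28 of the 30 top-ranked models had none.
background_fraction = 508 / 2088
tail = binomial_upper_tail(28, 30, background_fraction)
```

A very small tail probability indicates that the top-ranked models are enriched for h-site-free models well beyond what the background fraction would produce by chance.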
This article was reviewed by Ossama Kashlan (nominated by Philip Hahnfeldt), Bin Hu (nominated by William Hlavacek) and Rainer Sachs.
PMCID: PMC2799446  PMID: 20003203
4.  Depauperate genetic variability detected in the American and European bison using genomic techniques 
Biology Direct  2009;4:48.
A total of 929 polymorphic SNPs in EB (out of 54,000 SNPs screened using a BovineSNP50 Illumina Genotyping BeadChip), and 1,524 and 1,403 polymorphic SNPs in WB and PB, respectively, were analysed. EB, WB and PB have all undergone recent drastic reductions in population size. Accordingly, they exhibited extremely depauperate genomes, deviations from genetic equilibrium and a genome organization consisting of a mosaic of haplotype blocks: regions with low haplotype diversity and high levels of linkage disequilibrium. No evidence for positive or stabilizing selection was found in EB, WB and PB, likely reflecting drift overwhelming selection. We suggest that utilization of genome-wide screening technologies, followed by utilization of less expensive techniques (e.g. VeraCode and Fluidigm EP1), holds large potential for genetic monitoring of populations. Additionally, these techniques will allow radical improvements of breeding practices in captive or managed populations, otherwise hampered by the limited availability of polymorphic markers. This results in improved possibilities for 1) estimating genetic relationships among individuals and 2) designing breeding strategies which attempt to preserve or reduce polymorphism in ecologically relevant genes and/or entire blocks.
This article was reviewed by: Fyodor Kondrashov and Shamil Sunyaev.
PMCID: PMC2793249  PMID: 19995416
5.  Human γδ T cell Recognition of lipid A is predominately presented by CD1b or CD1c on dendritic cells 
Biology Direct  2009;4:47.
The γδ T cells serve as early immune defense against certain encountered microbes. Only a few γδ T cell-recognized ligands from microbial antigens have been identified so far and the mechanisms by which γδ T cells recognize these ligands remain unknown. Here we explored the mechanism of interaction of human γδ T cells in peripheral blood with Lipid A (LA).
First, resting γδ T cells (mainly Vδ2 T cells) displayed a strong proliferative response to LA-pulsed monocyte-derived dendritic cells (moDC) and LA-pulsed paraformaldehyde-fixed moDC, but not to free LA, in a TCR γδ-dependent manner. Second, anti-CD1b or anti-CD1c antibodies could block the proliferative response of resting γδ T cells to LA-loaded moDC. Moreover, only LA-loaded CD1b/CD1c-transfected C1R lymphoblastoma cells (CD1b-/CD1c-C1R) were able to stimulate the proliferation of human γδ T cells. Third, the expression of both Toll-like receptor (TLR)2 and TLR4 on the surface of LA-activated γδ T cells was upregulated, whereas only anti-TLR4 antibody could partially block their response to LA. Finally, LA-loaded moDCs induced γδ T cells to produce Th1 cytokines, such as IFN-γ.
Taken together, we found a novel mechanism by which human γδ T cells recognize LA in a CD1b- or CD1c-restricted manner in the first response against Gram-negative bacteria, while the interaction between TLR4 on γδ T cells and LA might strengthen the subsequent response of γδ T cells.
This article was reviewed by Hao Shen, Youwen He (nominated by Dr. Laurence C Eisenlohr), Dr. Michael Lenardo and Dr. Pushpa Pandiyan.
PMCID: PMC3224963  PMID: 19948070
6.  Evolution by leaps: gene duplication in bacteria 
Biology Direct  2009;4:46.
Sequence-related families of genes and proteins are common in bacterial genomes. In Escherichia coli they constitute over half of the genome. The presence of families and superfamilies of proteins suggests a history of gene duplication and divergence during evolution. Genome-encoded protein families, their size and functional composition, reflect metabolic potentials of the organisms they are found in. Comparing protein families of different organisms gives insight into functional differences and similarities.
Equivalent enzyme families with metabolic functions were selected from the genomes of four experimentally characterized bacteria belonging to separate genera. Both similarities and differences were detected in the protein family memberships, with more similarities being detected among the more closely related organisms. Protein family memberships reflected known metabolic characteristics of the organisms. Differences in divergence of functionally characterized enzyme family members accounted for characteristics of taxa known to differ in those biochemical properties and capabilities. While some members of the gene families will have been acquired by lateral exchange and other former family members will have been lost over time, duplication and divergence of genes and functions appear to have been a significant contributor to the functional diversity of today’s microbes.
Protein families seem likely to have arisen during evolution by gene duplication and divergence where the gene copies that have been retained are the variants that have led to distinct bacterial physiologies and taxa. Thus divergence of the duplicate enzymes has been a major process in the generation of different kinds of bacteria.
This article was reviewed by Drs. Iyer Aravind, Arcady Mushegian, and Pierre Pontarotti.
PMCID: PMC2787491  PMID: 19930658
7.  In plants, expression breadth and expression level distinctly and non-linearly correlate with gene structure 
Biology Direct  2009;4:45.
Compactness of highly/broadly expressed genes in human has been explained as selection for efficiency, regional mutation biases or genomic design. However, highly expressed genes in flowering plants were shown to be less compact than lowly expressed ones. On the other hand, opposite observations have also been documented: pollen-expressed Arabidopsis genes tend to contain shorter introns, and highly expressed moss genes are compact. This issue is important because it provides a chance to compare the selectionist and neutralist views of genome evolution. Furthermore, it helps to understand the fates of introns from the angle of gene expression.
In this study, I used expression data covering more tissues and employed new analytical methods to reexamine the correlations between gene expression and gene structure for two flowering plants, Arabidopsis thaliana and Oryza sativa. It is shown that different aspects of expression pattern correlate with different parts of gene sequences in distinct ways. In detail, expression level is significantly negatively correlated with gene size, especially the size of non-coding regions, whereas expression breadth correlates with non-coding structural parameters positively and with coding region parameters negatively. Furthermore, the relationships between expression level and structural parameters seem to be non-linear, with the extremes of structural parameters possibly scaling as power laws or logarithmic functions of expression levels.
In plants, highly expressed genes are compact, especially in the non-coding regions. Broadly expressed genes tend to contain longer non-coding sequences, which may be necessary for complex regulations. In combination with previous studies about other plants and about animals, some common scenarios about the correlation between gene expression and gene structure begin to emerge. Based on the functional relationships between extreme values of structural characteristics and expression level, an effort was made to evaluate the relative effectiveness of the energy-cost hypothesis and the time-cost hypothesis.
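To make the distinction between power-law and logarithmic scaling concrete, the sketch below fits both forms by ordinary least squares after the appropriate transform. It is an illustration only: the data are synthetic, the variable names are hypothetical, and nothing here reproduces the analytical methods used in the article.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least squares y = intercept + slope * x; returns (slope, intercept, r2)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return slope, intercept, r2

def fit_power_law(xs, ys):
    """Fit y = a * x**b via a straight line in log-log space; returns (a, b, r2)."""
    slope, intercept, r2 = linear_fit([math.log(x) for x in xs],
                                      [math.log(y) for y in ys])
    return math.exp(intercept), slope, r2

def fit_logarithmic(xs, ys):
    """Fit y = a + b * ln(x); returns (a, b, r2)."""
    slope, intercept, r2 = linear_fit([math.log(x) for x in xs], ys)
    return intercept, slope, r2

# Synthetic data (an assumption): an exact power law with exponent -0.5.
expression = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
max_length = [3000.0 * e ** -0.5 for e in expression]

a, b, r2_power = fit_power_law(expression, max_length)
_, _, r2_log = fit_logarithmic(expression, max_length)
```

Because the two fits are scored on different scales (log y versus y), comparing their R² values is only a rough guide to which functional form describes the data better.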
This article was reviewed by Dr. I. King Jordan, Dr. Liran Carmel (nominated by Dr. Eugene V. Koonin) and Dr. Fyodor A. Kondrashov.
PMCID: PMC2794262  PMID: 19930585
8.  Exceptional error minimization in putative primordial genetic codes 
Biology Direct  2009;4:44.
The standard genetic code is redundant and has a highly non-random structure. Codons for the same amino acids typically differ only by the nucleotide in the third position, whereas similar amino acids are encoded, mostly, by codon series that differ by a single base substitution in the third or the first position. As a result, the code is highly albeit not optimally robust to errors of translation, a property that has been interpreted either as a product of selection directed at the minimization of errors or as a non-adaptive by-product of evolution of the code driven by other forces.
We investigated the error-minimization properties of putative primordial codes that consisted of 16 supercodons, with the third base being completely redundant, using a previously derived cost function and the error minimization percentage as the measure of a code's robustness to mistranslation. It is shown that, when the 16-supercodon table is populated with 10 putative primordial amino acids, inferred from the results of abiotic synthesis experiments and other evidence independent of the code's evolution, and with minimal assumptions used to assign the remaining supercodons, the resulting 2-letter codes are nearly optimal in terms of the error minimization level.
The results of the computational experiments with putative primordial genetic codes that contained only two meaningful letters in all codons and encoded 10 to 16 amino acids indicate that such codes are likely to have been nearly optimal with respect to the minimization of translation errors. This near-optimality could be the outcome of extensive early selection during the co-evolution of the code with the primordial, error-prone translation system, or a result of a unique, accidental event. Under this hypothesis, the subsequent expansion of the code resulted in a decrease of the error minimization level that became sustainable owing to the evolution of a high-fidelity translation system.
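The idea of scoring a 16-supercodon code by its robustness to misreading can be sketched as follows. This is a toy illustration under stated assumptions: the cost is an unweighted sum of squared property differences over single-base substitutions, the optimum is approximated by the best of a random sample of shuffled codes, and the amino acid labels and property values are hypothetical placeholders. The article's previously derived cost function is not reproduced here.

```python
import random

BASES = "UCAG"
SUPERCODONS = [a + b for a in BASES for b in BASES]  # 16 two-letter codons

def neighbors(codon):
    """Supercodons reachable by one substitution in the first or second position."""
    result = []
    for pos in range(2):
        for base in BASES:
            if base != codon[pos]:
                result.append(codon[:pos] + base + codon[pos + 1:])
    return result

def code_cost(code, prop):
    """Unweighted sum of squared property differences over all single-base misreadings."""
    return sum((prop[code[c]] - prop[code[n]]) ** 2
               for c in SUPERCODONS for n in neighbors(c))

def minimization_percentage(code, prop, samples=2000, seed=0):
    """100 * (mean random cost - code cost) / (mean random cost - best cost seen)."""
    rng = random.Random(seed)
    own = code_cost(code, prop)
    assigned = [code[c] for c in SUPERCODONS]
    costs = []
    for _ in range(samples):
        shuffled = assigned[:]
        rng.shuffle(shuffled)
        costs.append(code_cost(dict(zip(SUPERCODONS, shuffled)), prop))
    mean_cost = sum(costs) / len(costs)
    best_cost = min(min(costs), own)  # sampled stand-in for the true optimum
    if mean_cost == best_cost:
        return 0.0  # all codes cost the same, so nothing can be minimized
    return 100.0 * (mean_cost - own) / (mean_cost - best_cost)

# Hypothetical placeholders: 16 labels with made-up property values.
toy_prop = {f"aa{i}": float(i) for i in range(16)}
toy_code = {codon: f"aa{i}" for i, codon in enumerate(SUPERCODONS)}
toy_mp = minimization_percentage(toy_code, toy_prop)
```

A value near 100 means the code is close to the best arrangement found, while a value near 0 means it is no better than a random assignment of the same amino acids.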
This article was reviewed by Paul Higgs (nominated by Arcady Mushegian), Rob Knight, and Sandor Pongor. For the complete reports, go to the Reviewers' Reports section.
PMCID: PMC2785773  PMID: 19925661
9.  Trees and networks before and after Darwin 
Biology Direct  2009;4:43.
It is well-known that Charles Darwin sketched abstract trees of relationship in his 1837 notebook, and depicted a tree in the Origin of Species (1859). Here I attempt to place Darwin's trees in historical context. By the mid-Eighteenth century the Great Chain of Being was increasingly seen to be an inadequate description of order in nature, and by about 1780 it had been largely abandoned without a satisfactory alternative having been agreed upon. In 1750 Donati described aquatic and terrestrial organisms as forming a network, and a few years later Buffon depicted a network of genealogical relationships among breeds of dogs. In 1764 Bonnet asked whether the Chain might actually branch at certain points, and in 1766 Pallas proposed that the gradations among organisms resemble a tree with a compound trunk, perhaps not unlike the tree of animal life later depicted by Eichwald. Other trees were presented by Augier in 1801 and by Lamarck in 1809 and 1815, the latter two assuming a transmutation of species over time. Elaborate networks of affinities among plants and among animals were depicted in the late Eighteenth and very early Nineteenth centuries. In the two decades immediately prior to 1837, so-called affinities and/or analogies among organisms were represented by diverse geometric figures. Series of plant and animal fossils in successive geological strata were represented as trees in a popular textbook from 1840, while in 1858 Bronn presented a system of animals, as evidenced by the fossil record, in the form of a tree. Darwin's 1859 tree and its subsequent elaborations by Haeckel came to be accepted in many but not all areas of biological sciences, while network diagrams were used in others. Beginning in the early 1960s trees were inferred from protein and nucleic acid sequences, but networks were re-introduced in the mid-1990s to represent lateral genetic transfer, increasingly regarded as a fundamental mode of evolution at least for bacteria and archaea.
In historical context, then, the Network of Life preceded the Tree of Life and might again supersede it.
This article was reviewed by Eric Bapteste, Patrick Forterre and Dan Graur.
PMCID: PMC2793248  PMID: 19917100
10.  Is evolution Darwinian or/and Lamarckian? 
Biology Direct  2009;4:42.
The year 2009 is the 200th anniversary of the publication of Jean-Baptiste Lamarck's Philosophie Zoologique and the 150th anniversary of Charles Darwin's On the Origin of Species. Lamarck believed that evolution is driven primarily by non-randomly acquired, beneficial phenotypic changes, in particular, those directly affected by the use of organs, which Lamarck believed to be inheritable. In contrast, Darwin assigned a greater importance to random, undirected change that provided material for natural selection.
The concept
The classic Lamarckian scheme appears untenable owing to the non-existence of mechanisms for direct reverse engineering of adaptive phenotypic characters acquired by an individual during its life span into the genome. However, various evolutionary phenomena that came to the fore in the last few years seem to fit a more broadly interpreted (quasi)Lamarckian paradigm. The prokaryotic CRISPR-Cas system of defense against mobile elements seems to function via a bona fide Lamarckian mechanism, namely, by integrating small segments of viral or plasmid DNA into specific loci in the host prokaryote genome and then utilizing the respective transcripts to destroy the cognate mobile element DNA (or RNA). A similar principle seems to be employed in the piRNA branch of RNA interference which is involved in defense against transposable elements in the animal germ line. Horizontal gene transfer (HGT), a dominant evolutionary process, at least in prokaryotes, appears to be a form of (quasi)Lamarckian inheritance. The rate of HGT and the nature of acquired genes depend on the environment of the recipient organism and, in some cases, the transferred genes confer a selective advantage for growth in that environment, meeting the Lamarckian criteria. Various forms of stress-induced mutagenesis are tightly regulated and comprise a universal adaptive response to environmental stress in cellular life forms. Stress-induced mutagenesis can be construed as a quasi-Lamarckian phenomenon because the induced genomic changes, although random, are triggered by environmental factors and are beneficial to the organism.
Both Darwinian and Lamarckian modalities of evolution appear to be important, and reflect different aspects of the interaction between populations and the environment.
This article was reviewed by Juergen Brosius, Valerian Dolja, and Martijn Huynen. For complete reports, see the Reviewers' reports section.
PMCID: PMC2781790  PMID: 19906303
11.  Network dynamics of eukaryotic LTR retroelements beyond phylogenetic trees 
Biology Direct  2009;4:41.
Sequencing projects have allowed diverse retroviruses and LTR retrotransposons from different eukaryotic organisms to be characterized. It is known that retroviruses and other retro-transcribing viruses evolve from LTR retrotransposons and that this whole system clusters into five families: Ty3/Gypsy, Retroviridae, Ty1/Copia, Bel/Pao and Caulimoviridae. Phylogenetic analyses usually show that these split into multiple distinct lineages but what is yet to be understood is how deep evolution occurred in this system.
We combined phylogenetic and graph analyses to investigate the history of LTR retroelements both as a tree and as a network. We used 268 non-redundant LTR retroelements, many of them introduced for the first time in this work, to elucidate all possible LTR retroelement phylogenetic patterns. These were superimposed over the tree of eukaryotes to investigate the dynamics of the system, at distinct evolutionary times. Next, we investigated phenotypic features such as duplication and variability of amino acid motifs, and several differences in genomic ORF organization. Using this information we characterized eight reticulate evolution markers to construct phenotypic network models.
The evolutionary history of LTR retroelements can be traced as a time-evolving network that depends on phylogenetic patterns, epigenetic host-factors and phenotypic plasticity. The Ty1/Copia and the Ty3/Gypsy families represent the oldest patterns in this network, which we found mimics eukaryotic macroevolution. The emergence of the Bel/Pao, Retroviridae and Caulimoviridae families in this network can be related to distinct inflations of the Ty3/Gypsy family at distinct evolutionary times. This suggests that Ty3/Gypsy ancestors diversified much more than their Ty1/Copia counterparts in distinct geological eras. Consistent with the principle of preferential attachment, the connectivities among phenotypic markers, taken as network-represented combinations, are power-law distributed. This is evidence of an inflationary mode of evolution in which the system diversity 1) expands continuously, alternating vertical and gradual processes of phylogenetic divergence with episodes of modular, saltatory and reticulate evolution; and 2) is governed by the intrinsic capability of distinct LTR retroelement host-communities to self-organize their phenotypes according to emergent laws characteristic of complex systems.
This article was reviewed by Eugene V. Koonin, Eric Bapteste, and Emmanuelle Lerat (nominated by King Jordan).
PMCID: PMC2774666  PMID: 19883502
12.  CD44 expression positively correlates with Foxp3 expression and suppressive function of CD4+ Treg cells 
Biology Direct  2009;4:40.
CD4+CD25+ regulatory T (Treg) cells develop in the thymus and can suppress T cell proliferation, modulated by Foxp3 and cytokines; however, the relevance of CD44 in Treg cell development is less clear. To address this issue, we analyzed Foxp3 expression in CD44+ Treg cells by using multiple parameters, measured the levels of the immunoregulatory cytokine interleukin (IL)-10 in various thymocyte subsets, and determined the suppressor activity in different splenic Treg cell populations.
Within mouse thymocytes, we detected Treg cells with two novel phenotypes, namely the CD4+CD8-CD25+CD44+ and CD4+CD8-CD25+CD44- staining features. Additional multi-parameter analyses at the single-cell and molecular levels suggested to us that CD44 expression positively correlated with Foxp3 expression in thymocytes, the production of IL-10, and Treg activity in splenic CD4+CD25+ T cells. This suppressive effect of Treg cells on T cell proliferation could be blocked by using anti-IL-10 neutralizing antibodies. In addition, CD4+CD25+CD44+ Treg cells expressed higher levels of IL-10 and were more potent in suppressing effector T cell proliferation than were CD4+CD25+CD44- cells.
This study indicates the presence of two novel phenotypes of Treg cells in the thymus, the functional relevance of CD44 in defining Treg cell subsets, and the role of both IL-10 and Foxp3 in modulating the function of Treg cells.
This article was reviewed by Dr. M. Lenardo, Dr. L. Klein & G. Wirnsberger (nominated by Dr. JC Zúñiga-Pflücker), and Dr. E.M. Shevach.
PMCID: PMC2770033  PMID: 19852824
13.  Identification of an ortholog of the eukaryotic RNA polymerase III subunit RPC34 in Crenarchaeota and Thaumarchaeota suggests specialization of RNA polymerases for coding and non-coding RNAs in Archaea 
Biology Direct  2009;4:39.
One of the hallmarks of eukaryotic information processing is the co-existence of 3 distinct, multi-subunit RNA polymerase complexes that are dedicated to the transcription of specific classes of coding or non-coding RNAs. Archaea encode only one RNA polymerase that resembles the eukaryotic RNA polymerase II with respect to the subunit composition. Here we identify archaeal orthologs of the eukaryotic RNA polymerase III subunit RPC34. Genome context analysis supports a function of this archaeal protein in the transcription of non-coding RNAs. These findings suggest that functional separation of RNA polymerases for protein-coding genes and non-coding RNAs might predate the origin of the Eukaryotes.
Reviewers: This article was reviewed by Andrei Osterman and Patrick Forterre (nominated by Purificación López-García)
PMCID: PMC2770514  PMID: 19828044
14.  Strong association between pseudogenization mechanisms and gene sequence length 
Biology Direct  2009;4:38.
Pseudogenes arise from the decay of gene copies following either RNA-mediated duplication (processed pseudogenes) or DNA-mediated duplication (nonprocessed pseudogenes). Here, we show that long protein-coding genes tend to produce more nonprocessed pseudogenes than short genes, whereas the opposite is true for processed pseudogenes. Protein-coding genes longer than 3000 bp are 6 times more likely to produce nonprocessed pseudogenes than processed ones.
This article was reviewed by Dr. Dan Graur and Dr. Craig Nelson (nominated by Dr. J Peter Gogarten).
PMCID: PMC2768697  PMID: 19807910
15.  Epigenetic hereditary transcription profiles III, evidence for an epigenetic network resulting in gender, tissue and age-specific variation in overall transcription 
Biology Direct  2009;4:37.
We have previously shown that deviations from the average transcription profile of a group of functionally related genes are not only heritable, but also demonstrate specific patterns associated with age, gender and differentiation, thereby implicating genome-wide nuclear programming as the cause. To determine whether these results could be reproduced, a different micro-array database (obtained from two types of muscle tissue, derived from 81 human donors aged between 16 and 89 years) was studied.
This new database also revealed the existence of age, gender and tissue-specific features in a small group of functionally related genes. In order to further analyze this phenomenon, a method was developed for quantifying the contribution of different factors to the variability in gene expression, and for generating a database limited to residual values reflecting constitutional differences between individuals. These constitutional differences, presumably epigenetic in origin, contribute to about 50% of the observed residual variance which is connected with a network of interrelated changes in gene expression with some genes displaying a decrease or increase in residual variation with age.
Epigenetic variation in gene expression without a clear concomitant relation to gene function appears to be a widespread phenomenon. This variation is connected with interactions between genes, is gender and tissue specific and is related to cellular aging.
This finding, together with the method developed for analysis, might contribute to the elucidation of the role of nuclear programming in differentiation, aging and carcinogenesis.
This article was reviewed by Thiago M. Venancio (nominated by Aravind Iyer), Hua Li (nominated by Arcady Mushegian) and Arcady Mushegian and Magelhaes (nominated by G. Church).
PMCID: PMC2762993  PMID: 19796384
16.  Hypothesis for heritable, anti-viral immunity in crustaceans and insects 
Biology Direct  2009;4:36.
Correction to Flegel, TW: Hypothesis for heritable, anti-viral immunity in crustaceans and insects. Biology Direct 2009, 4:32.
PMCID: PMC2764569
17.  Inferring clocks when lacking rocks: the variable rates of molecular evolution in bacteria 
Biology Direct  2009;4:35.
Because bacteria do not have a robust fossil record, attempts to infer the timing of events in their evolutionary history require comparisons of molecular sequences. This use of molecular clocks is based on the assumptions that substitution rates for homologous genes or sites are fairly constant through time and across taxa. Violation of these conditions can lead to erroneous inferences and result in estimates that are off by orders of magnitude. In this study, we examine the consistency of substitution rates among a set of conserved genes in diverse bacterial lineages, and address questions regarding the validity of molecular dating.
By examining the evolution of the 16S rRNA gene in obligate endosymbionts, which can be calibrated by the fossil record of their hosts, we found that the rates are consistent within a clade but vary widely across different bacterial lineages. Genome-wide estimates of nonsynonymous and synonymous substitutions suggest that these two measures are highly variable in their rates across bacterial taxa. Genetic drift plays a fundamental role in determining the accumulation of substitutions in 16S rRNA genes and at nonsynonymous sites. Moreover, divergence estimates based on a set of universally conserved protein-coding genes also exhibit low correspondence to those based on 16S rRNA genes.
Our results document a wide range of substitution rates across genes and bacterial taxa. This high level of variation cautions against the assumption of a universal molecular clock for inferring divergence times in bacteria. However, by applying relative-rate tests to homologous genes, it is possible to derive reliable local clocks that can be used to calibrate bacterial evolution.
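The clock reasoning in this abstract rests on two simple calculations: converting a calibrated divergence into a substitution rate, and checking whether two lineages have accumulated changes at the same rate. The sketch below assumes a strict clock with pairwise distance d = 2rt and uses a Tajima-style one-degree-of-freedom chi-square as the relative-rate test. Both are illustrative choices, since the abstract does not specify the tests used, and all numbers are hypothetical.

```python
def substitution_rate(pairwise_distance, divergence_time):
    """Per-lineage rate r from d = 2 * r * t (two lineages diverging for time t)."""
    return pairwise_distance / (2.0 * divergence_time)

def divergence_time(pairwise_distance, rate):
    """Divergence time t from d = 2 * r * t, given a per-lineage rate."""
    return pairwise_distance / (2.0 * rate)

def relative_rate_chi2(unique_changes_1, unique_changes_2):
    """Chi-square (1 df) comparing lineage-specific changes, judged against an outgroup."""
    total = unique_changes_1 + unique_changes_2
    if total == 0:
        raise ValueError("no lineage-specific changes to compare")
    return (unique_changes_1 - unique_changes_2) ** 2 / total

CHI2_CRITICAL_1DF_5PCT = 3.841  # standard 5% critical value for 1 df

# Hypothetical inputs (assumptions, not data from the paper):
rate = substitution_rate(0.04, 100.0)    # 4% divergence over 100 Myr
time_estimate = divergence_time(0.10, rate)
chi2 = relative_rate_chi2(30, 12)
clocklike = chi2 < CHI2_CRITICAL_1DF_5PCT
```

When the relative-rate statistic exceeds the critical value, the two lineages are unlikely to share one rate, so a rate calibrated on one should not be transferred to the other.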
This article was reviewed by Adam Eyre-Walker, Simonetta Gribaldo and Tal Pupko (nominated by Dan Graur).
PMCID: PMC2760517  PMID: 19788732
18.  Prokaryotic evolution and the tree of life are two different things 
Biology Direct  2009;4:34.
The concept of a tree of life is prevalent in the evolutionary literature. It stems from attempting to obtain a grand unified natural system that reflects a recurrent process of species and lineage splittings for all forms of life. Traditionally, the discipline of systematics operates in a similar hierarchy of bifurcating (sometimes multifurcating) categories. The assumption of a universal tree of life hinges upon the process of evolution being tree-like throughout all forms of life and all of biological time. In multicellular eukaryotes, the molecular mechanisms and species-level population genetics of variation do indeed mainly cause a tree-like structure over time. In prokaryotes, they do not. Prokaryotic evolution and the tree of life are two different things, and we need to treat them as such, rather than extrapolating from macroscopic life to prokaryotes. In the following we will consider this circumstance from philosophical, scientific, and epistemological perspectives, surmising that phylogeny opted for a single model as a holdover from the Modern Synthesis of evolution.
It was far easier to envision and defend the concept of a universal tree of life before we had data from genomes. But the belief that prokaryotes are related by such a tree has now become stronger than the data to support it. The monistic concept of a single universal tree of life appears, in the face of genome data, increasingly obsolete. This traditional model to describe evolution is no longer the most scientifically productive position to hold, because of the plurality of evolutionary patterns and mechanisms involved. Forcing a single bifurcating scheme onto prokaryotic evolution disregards the non-tree-like nature of natural variation among prokaryotes and accounts for only a minority of observations from genomes.
Prokaryotic evolution and the tree of life are two different things. Hence we will briefly set out alternative models to the tree of life to study their evolution. Ultimately, the plurality of evolutionary patterns and mechanisms involved, such as the discontinuity of the process of evolution across the prokaryote-eukaryote divide, summons forth a pluralistic approach to studying evolution.
This article was reviewed by Ford Doolittle, John Logsdon and Nicolas Galtier.
PMCID: PMC2761302  PMID: 19788731
19.  The fundamental units, processes and patterns of evolution, and the Tree of Life conundrum 
Biology Direct  2009;4:33.
The elucidation of the dominant role of horizontal gene transfer (HGT) in the evolution of prokaryotes led to a severe crisis of the Tree of Life (TOL) concept and intense debates on this subject.
Prompted by the crisis of the TOL, we attempt to define the primary units and the fundamental patterns and processes of evolution. We posit that replication of the genetic material is the singular fundamental biological process and that replication with an error rate below a certain threshold both enables and necessitates evolution by drift and selection. Starting from this proposition, we outline a general concept of evolution that consists of three major precepts.
1. The primary agency of evolution consists of Fundamental Units of Evolution (FUEs), that is, units of genetic material that possess a substantial degree of evolutionary independence. The FUEs include both bona fide selfish elements such as viruses, viroids, transposons, and plasmids, which encode some of the information required for their own replication, and regular genes that possess quasi-independence owing to their distinct selective value that provides for their transfer between ensembles of FUEs (genomes) and preferential replication along with the rest of the recipient genome.
2. The history of replication of a genetic element without recombination is isomorphously represented by a directed tree graph (an arborescence, in the graph theory language). Recombination within a FUE is common between very closely related sequences where homologous recombination is feasible but becomes negligible for longer evolutionary distances. In contrast, shuffling of FUEs occurs at all evolutionary distances. Thus, a tree is a natural representation of the evolution of an individual FUE on the macro scale, but not of an ensemble of FUEs such as a genome.
3. The history of life is properly represented by the "forest" of evolutionary trees for individual FUEs (Forest of Life, or FOL). Search for trends and patterns in the FOL is a productive direction of study that leads to the delineation of ensembles of FUEs that evolve coherently for a certain time span owing to a shared history of vertical inheritance or horizontal gene transfer; these ensembles are commonly known as genomes, taxa, or clades, depending on the level of analysis. A small set of genes (the universal genetic core of life) might show a (mostly) coherent evolutionary trend that transcends the entire history of cellular life forms. However, it might not be useful to denote this trend "the tree of life", or organismal, or species tree because neither organisms nor species are fundamental units of life.
A logical analysis of the units and processes of biological evolution suggests that the natural fundamental unit of evolution is a FUE, that is, a genetic element with an independent evolutionary history. Evolution of a FUE on the macro scale is naturally represented by a tree. Only the full compendium of trees for individual FUEs (the FOL) is an adequate depiction of the evolution of life. Coherent evolution of FUEs over extended evolutionary intervals is a crucial aspect of the history of life but a "species" or "organismal" tree is not a fundamental concept.
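The distinction drawn in precept 2 can be made concrete with a small Python sketch. It assumes a replication history is stored as a mapping from each element to the list of its direct ancestors; the function name `is_arborescence` and the example histories are hypothetical and serve only to show why a single FUE yields a directed tree while an ensemble that receives material from two sources does not.

```python
def is_arborescence(parents):
    """Check whether an ancestry mapping forms a directed rooted tree.

    parents: dict mapping each element to a list of its direct ancestors
    (an assumed representation). An arborescence has exactly one root,
    every other element has exactly one ancestor, and following ancestors
    from any element leads to the root without revisiting an element.
    """
    roots = [n for n, ps in parents.items() if not ps]
    if len(roots) != 1:
        return False
    if any(len(ps) > 1 for ps in parents.values()):
        return False  # two ancestors: recombination or transfer, not a tree

    root = roots[0]
    for start in parents:
        node, seen = start, set()
        while node != root:
            if node in seen or node not in parents:
                return False  # cycle, or ancestor missing from the mapping
            seen.add(node)
            node = parents[node][0]
    return True


# Hypothetical history of a single gene replicated without recombination.
gene_history = {"g0": [], "g1": ["g0"], "g2": ["g0"], "g3": ["g1"]}

# Hypothetical genome-level history in which D draws on both B and C.
genome_history = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}

if __name__ == "__main__":
    print(is_arborescence(gene_history), is_arborescence(genome_history))
```

In this representation the single-gene history passes the check, whereas the genome-level history fails as soon as one element has two ancestors, which is the situation the abstract attributes to shuffling of FUEs.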
This article was reviewed by Valerian Dolja, W. Ford Doolittle, Nicholas Galtier, and William Martin.
PMCID: PMC2761301  PMID: 19788730
20.  Hypothesis for heritable, anti-viral immunity in crustaceans and insects 
Biology Direct  2009;4:32.
It is known that crustaceans and insects can persistently carry one or more viral pathogens at low levels, without signs of disease. They may transmit them to their offspring or to naïve individuals, often with lethal consequences. The underlying molecular mechanisms have not been elucidated, but the process has been called viral accommodation. Since tolerance to one virus does not confer tolerance to another, tolerance is pathogen-specific, so the requirement for a specific pathogen response mechanism (memory) was included in the original viral accommodation concept. Later, it was hypothesized that specific responses were based on the presence of viruses in persistent infections. However, recent developments suggest that specific responses may be based on viral sequences inserted into the host genome.
Presentation of the hypothesis
Non-retroviral fragments of both RNA and DNA viruses have been found in insect and crustacean genomes. In addition, reverse-transcriptase (RT) and integrase (IN) sequences are also common in their genomes. It is hypothesized that shrimp and other arthropods use these RT to recognize "foreign" mRNA of both RNA and DNA viruses and use the integrases (IN) to randomly insert short cDNA sequences into their genomes. By chance, some of these sequences result in production of immunospecific RNA (imRNA) capable of stimulating RNAi that suppresses viral propagation. Individuals with protective inserts would pass these on to the next generation, together with similar protective inserts for other viruses that could be amalgamated rapidly in individual offspring by random assortment of chromosomes. The most successful individuals would be environmentally selected from billions of offspring.
This hypothesis for immunity based on an imRNA generation mechanism fits with the general principle of invertebrate immunity based on a non-host, "pattern recognition" process. If proven correct, understanding the process would allow directed preparation of vaccines for selection of crustacean and insect lines applicable in commercial production species (e.g., shrimp and bees) or in control of insect-borne diseases. Arising from a natural host mechanism, the resulting animals would not be artificially, genetically modified (GMO).
This article was reviewed by Akira Shibuya, Eugene V. Koonin and L. Aravind.
PMCID: PMC2757015  PMID: 19725947
21.  Multipotent adult germ-line stem cells, like other pluripotent stem cells, can be killed by cytotoxic T lymphocytes despite low expression of major histocompatibility complex class I molecules 
Biology Direct  2009;4:31.
Multipotent adult germ-line stem cells (maGSCs) represent a new pluripotent cell type that can be derived without genetic manipulation from spermatogonial stem cells (SSCs) present in adult testis. Similarly to induced pluripotent stem cells (iPSCs), they could provide a source of cellular grafts for new transplantation therapies of a broad variety of diseases. To test whether these stem cells can be rejected by the recipients, we have analyzed whether maGSCs and iPSCs can become targets for cytotoxic T lymphocytes (CTL) or whether they are protected, as previously proposed for embryonic stem cells (ESCs).
We have observed that maGSCs can be maintained in prolonged culture with or without leukemia inhibitory factor and/or feeder cells and still retain the capacity to form teratomas in immunodeficient recipients. They were, however, rejected in immunocompetent allogeneic recipients, and the immune response controlled teratoma growth. We analyzed the susceptibility of three maGSC lines to CTL in comparison to ESCs, iPSCs, and F9 teratocarcinoma cells. Major histocompatibility complex (MHC) class I molecules were not detectable by flow cytometry on these stem cell lines, apart from low levels on one maGSC line (maGSC Stra8 SSC5). However, using a quantitative real-time PCR analysis, H2K and B2m transcripts were detected in all pluripotent stem cell lines. All pluripotent stem cell lines were killed in a peptide-dependent manner by activated CTLs derived from T cell receptor transgenic OT-I mice after pulsing of the targets with the SIINFEKL peptide.
Pluripotent stem cells, including maGSCs, ESCs, and iPSCs can become targets for CTLs, even if the expression level of MHC class I molecules is below the detection limit of flow cytometry. Thus they are not protected against CTL-mediated cytotoxicity. Therefore, pluripotent cells might be rejected after transplantation by this mechanism if specific antigens are presented and if specific activated CTLs are present. Our results show that the adaptive immune system has in principle the capacity to kill pluripotent and teratoma forming stem cells. This finding might help to develop new strategies to increase the safety of future transplantations of in vitro differentiated cells by exploiting a selective immune response against contaminating undifferentiated cells.
This article was reviewed by Bhagirath Singh, Etienne Joly and Lutz Walter.
PMCID: PMC2745366  PMID: 19715575
22.  Structural analysis of polarizing indels: an emerging consensus on the root of the tree of life 
Biology Direct  2009;4:30.
The root of the tree of life has been a holy grail ever since Darwin first used the tree as a metaphor for evolution. New methods seek to narrow down the location of the root by excluding it from branches of the tree of life. This is done by finding traits that must be derived, and excluding the root from the taxa those traits cover. However, the two most comprehensive attempts at this strategy, performed by Cavalier-Smith and Lake et al., have excluded each other's rootings.
The indel polarizations of Lake et al. rely on high quality alignments between paralogs that diverged before the last universal common ancestor (LUCA). Therefore, sequence alignment artifacts may skew their conclusions. We have reviewed their data using protein structure information where available. Several of the conclusions are quite different when viewed in the light of structure, which is conserved over longer evolutionary time scales than sequence. We argue there is no polarization that excludes the root from all Gram-negatives, and that polarizations robustly exclude the root from the Archaea.
We conclude that there is no contradiction between the polarization datasets. The combination of these datasets excludes the root from every possible position except near the Chloroflexi.
This article was reviewed by Greg Fournier (nominated by J. Peter Gogarten), Purificación López-García, and Eugene Koonin.
PMCID: PMC3224940  PMID: 19706177
23.  On the need for widespread horizontal gene transfers under genome size constraint 
Biology Direct  2009;4:28.
While eukaryotes primarily evolve by duplication-divergence expansion (and reduction) of their own gene repertoire with only rare horizontal gene transfers, prokaryotes appear to evolve under both gene duplications and widespread horizontal gene transfers over long evolutionary time scales. But the evolutionary origin of this striking difference in the importance of horizontal gene transfers remains by and large a mystery.
We propose that the abundance of horizontal gene transfers in free-living prokaryotes is a simple but necessary consequence of two opposite effects: i) their apparent genome size constraint compared to typical eukaryote genomes and ii) their underlying genome expansion dynamics through gene duplication-divergence evolution, as demonstrated by the presence of many tandem and block repeated genes. In principle, this combination of genome size constraint and underlying duplication expansion should lead to a coalescent-like process with extensive turnover of functional genes. This would, however, imply the unlikely, systematic reinvention of functions from discarded genes within independent phylogenetic lineages. Instead, we propose that the long-term evolutionary adaptation of free-living prokaryotes must have resulted in the emergence of efficient non-phylogenetic pathways to circumvent gene loss.
This need for widespread horizontal gene transfers due to genome size constraint implies, in particular, that prokaryotes must remain under strong selection pressure in order to maintain the long-term evolutionary adaptation of their "mutualized" gene pool, beyond the inevitable turnover of individual prokaryote species. By contrast, the absence of genome size constraint for typical eukaryotes has presumably relaxed their need for widespread horizontal gene transfers and strong selection pressure. Yet, the resulting loss of genetic functions, due to weak selection pressure and inefficient gene recovery mechanisms, must have ultimately favored the emergence of more complex life styles and ecological integration of many eukaryotes.
This article was reviewed by Pierre Pontarotti, Eugene V Koonin and Sergei Maslov.
PMCID: PMC2740843  PMID: 19703318
24.  Prokaryotic homologs of Argonaute proteins are predicted to function as key components of a novel system of defense against mobile genetic elements 
Biology Direct  2009;4:29.
In eukaryotes, RNA interference (RNAi) is a major mechanism of defense against viruses and transposable elements as well as of regulating translation of endogenous mRNAs. The RNAi systems recognize the target RNA molecules via small guide RNAs that are completely or partially complementary to a region of the target. Key components of the RNAi systems are proteins of the Argonaute-PIWI family, some of which function as slicers, the nucleases that cleave the target RNA that is base-paired to a guide RNA. Numerous prokaryotes possess the CRISPR-associated system (CASS) of defense against phages and plasmids that is, in part, mechanistically analogous but not homologous to eukaryotic RNAi systems. Many prokaryotes also encode homologs of Argonaute-PIWI proteins but their functions remain unknown.
We present a detailed analysis of Argonaute-PIWI protein sequences and the genomic neighborhoods of the respective genes in prokaryotes. Whereas eukaryotic Ago/PIWI proteins always contain PAZ (oligonucleotide binding) and PIWI (active or inactivated nuclease) domains, the prokaryotic Argonaute homologs (pAgos) fall into two major groups in which the PAZ domain is either present or absent. The monophyly of each group is supported by a phylogenetic analysis of the conserved PIWI-domains. Almost all pAgos that lack a PAZ domain appear to be inactivated, and the respective genes are associated with a variety of predicted nucleases in putative operons. An additional, uncharacterized domain that is fused to various nucleases appears to be a unique signature of operons encoding the short (lacking PAZ) pAgo form. By contrast, almost all PAZ-domain containing pAgos are predicted to be active nucleases. Some proteins of this group (e.g., that from Aquifex aeolicus) have been experimentally shown to possess nuclease activity, and are not typically associated with genes for other (putative) nucleases. Given these observations, the apparent extensive horizontal transfer of pAgo genes, and their common, statistically significant over-representation in genomic neighborhoods enriched in genes encoding proteins involved in the defense against phages and/or plasmids, we hypothesize that pAgos are key components of a novel class of defense systems. The PAZ-domain containing pAgos are predicted to directly destroy virus or plasmid nucleic acids via their nuclease activity, whereas the apparently inactivated, PAZ-lacking pAgos could be structural subunits of protein complexes that contain, as active moieties, the putative nucleases that we predict to be co-expressed with these pAgos. All these nucleases are predicted to be DNA endonucleases, so it seems most probable that the putative novel phage/plasmid-defense system targets phage DNA rather than mRNAs. 
Given that in eukaryotic RNAi systems, the PAZ domain binds a guide RNA and positions it on the complementary region of the target, we further speculate that pAgos function on a similar principle (the guide being either DNA or RNA), and that the uncharacterized domain found in putative operons with the short forms of pAgos is a functional substitute for the PAZ domain.
The hypothesis that pAgos are key components of a novel prokaryotic immune system that employs guide RNA or DNA molecules to degrade nucleic acids of invading mobile elements implies a functional analogy with the prokaryotic CASS and a direct evolutionary connection with eukaryotic RNAi. The predictions of the hypothesis including both the activities of pAgos and those of the associated endonucleases are readily amenable to experimental tests.
This article was reviewed by Daniel Haft, Martijn Huynen, and Chris Ponting.
PMCID: PMC2743648  PMID: 19706170
25.  On the origin of life in the Zinc world: 1. Photosynthesizing, porous edifices built of hydrothermally precipitated zinc sulfide as cradles of life on Earth 
Biology Direct  2009;4:26.
The complexity of the problem of the origin of life has spawned a large number of possible evolutionary scenarios. Their number, however, can be dramatically reduced by the simultaneous consideration of various bioenergetic, physical, and geological constraints.
This work puts forward an evolutionary scenario that satisfies the known constraints by proposing that life on Earth emerged, powered by UV-rich solar radiation, at photosynthetically active porous edifices made of precipitated zinc sulfide (ZnS) similar to those found around modern deep-sea hydrothermal vents. Under the high pressure of the primeval, carbon dioxide-dominated atmosphere ZnS could precipitate at the surface of the first continents, within reach of solar light. It is suggested that the ZnS surfaces (1) used the solar radiation to drive carbon dioxide reduction, yielding the building blocks for the first biopolymers, (2) served as templates for the synthesis of longer biopolymers from simpler building blocks, and (3) prevented the first biopolymers from photo-dissociation, by absorbing from them the excess radiation. In addition, the UV light may have favoured the selective enrichment of photostable, RNA-like polymers. Falsification tests of this hypothesis are described in the accompanying article (A.Y. Mulkidjanian, M.Y. Galperin, Biology Direct 2009, 4:27).
The suggested "Zn world" scenario identifies the geological conditions under which photosynthesizing ZnS edifices of hydrothermal origin could emerge and persist on primordial Earth, includes a mechanism of the transient storage and utilization of solar light for the production of diverse organic compounds, and identifies the driving forces and selective factors that could have promoted the transition from the first simple, photostable polymers to more complex living organisms.
This paper was reviewed by Arcady Mushegian, Simon Silver (nominated by Arcady Mushegian), Antoine Danchin (nominated by Eugene Koonin) and Dieter Braun (nominated by Sergey Maslov).
PMCID: PMC3152778  PMID: 19703272
