1.  CRISPR loci reveal networks of gene exchange in archaea 
Biology Direct  2011;6:65.
CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) loci provide prokaryotes with adaptive immunity against viruses and other mobile genetic elements. CRISPR arrays can be transcribed and processed into small crRNA molecules, which are then used by the cell to target the foreign nucleic acid. Since spacers are accumulated by active CRISPR/Cas systems, the sequences of these spacers provide a record of the past "infection history" of the organism.
Here we analyzed all currently known spacers present in archaeal genomes and identified their source by DNA similarity. While nearly 50% of archaeal spacers matched mobile genetic elements, such as plasmids or viruses, several others matched chromosomal genes of other organisms, primarily other archaea. Thus, networks of gene exchange between archaeal species were revealed by the spacer analysis, including many cases of inter-genus and inter-species gene transfer events. Spacers that recognize viral sequences tend to be located further away from the leader sequence, implying that there exists a selective pressure for their retention.
CRISPR spacers provide direct evidence for extensive gene exchange in archaea, especially within genera, and support the current dogma where the primary role of the CRISPR/Cas system is anti-viral and anti-plasmid defense.
Open peer review
This article was reviewed by: Profs. W. Ford Doolittle, John van der Oost, Christa Schleper (nominated by board member Prof. J Peter Gogarten)
PMCID: PMC3285040  PMID: 22188759
CRISPR; Lateral Gene transfer; Horizontal gene transfer; viruses; archaea; competence
2.  Accurate state estimation from uncertain data and models: an application of data assimilation to mathematical models of human brain tumors 
Biology Direct  2011;6:64.
Data assimilation refers to methods for updating the state vector (initial condition) of a complex spatiotemporal model (such as a numerical weather model) by combining new observations with one or more prior forecasts. We consider the potential feasibility of this approach for making short-term (60-day) forecasts of the growth and spread of a malignant brain cancer (glioblastoma multiforme) in individual patient cases, where the observations are synthetic magnetic resonance images of a hypothetical tumor.
We apply a modern state estimation algorithm (the Local Ensemble Transform Kalman Filter), previously developed for numerical weather prediction, to two different mathematical models of glioblastoma, taking into account likely errors in model parameters and measurement uncertainties in magnetic resonance imaging. The filter can accurately shadow the growth of a representative synthetic tumor for 360 days (six 60-day forecast/update cycles) in the presence of a moderate degree of systematic model error and measurement noise.
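The forecast/update cycle described above can be illustrated with a deliberately simplified sketch. This is a scalar Kalman filter, not the Local Ensemble Transform Kalman Filter used in the paper; the function names, the multiplicative growth model, and all numbers (tumor radius in mm, variances, observations) are hypothetical and chosen only to show how a forecast is blended with a noisy observation.

```python
def forecast(mean, var, growth=1.1, model_var=0.5):
    """Advance the state one cycle with a hypothetical multiplicative growth model.

    model_var is an assumed additive variance standing in for model error.
    """
    return mean * growth, var * growth ** 2 + model_var


def kalman_update(forecast_mean, forecast_var, observation, obs_var):
    """Scalar Kalman analysis step: weight forecast and observation by their variances."""
    gain = forecast_var / (forecast_var + obs_var)
    analysis_mean = forecast_mean + gain * (observation - forecast_mean)
    analysis_var = (1.0 - gain) * forecast_var
    return analysis_mean, analysis_var


# Six forecast/update cycles with made-up "synthetic MRI" radius observations (mm).
state_mean, state_var = 10.0, 4.0
for observed_radius in [11.5, 12.4, 13.9, 15.0, 16.8, 18.1]:
    state_mean, state_var = forecast(state_mean, state_var)
    state_mean, state_var = kalman_update(state_mean, state_var, observed_radius, obs_var=1.0)
    print(f"analysis radius = {state_mean:.2f} mm, variance = {state_var:.2f}")
```

The ensemble methods used for spatial models follow the same forecast-then-correct pattern, but estimate the forecast covariance from an ensemble of model runs rather than carrying a single variance.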
The mathematical methodology described here may prove useful for other modeling efforts in biology and oncology. An accurate forecast system for glioblastoma may prove useful in clinical settings for treatment planning and patient counseling.
This article was reviewed by Anthony Almudevar, Tomas Radivoyevitch, and Kristin Swanson (nominated by Georg Luebeck).
PMCID: PMC3340325  PMID: 22185645
State estimation; data assimilation; mathematical models; glioblastoma multiforme
3.  The existence of species rests on a metastable equilibrium between inbreeding and outbreeding. An essay on the close relationship between speciation, inbreeding and recessive mutations 
Biology Direct  2011;6:62.
Speciation corresponds to the progressive establishment of reproductive barriers between groups of individuals derived from an ancestral stock. Since Darwin did not believe that reproductive barriers could be selected for, he proposed that most events of speciation would occur through a process of separation and divergence, and this point of view is still shared by most evolutionary biologists today.
I do, however, contend that, if so much speciation occurs, the most likely explanation is that there must be conditions where reproductive barriers can be directly selected for. In other words, situations where it is advantageous for individuals to reproduce preferentially within a small group and reduce their breeding with the rest of the ancestral population. This leads me to propose a model whereby new species arise not by populations splitting into separate branches, but by small inbreeding groups "budding" from an ancestral stock. This would be driven by several advantages of inbreeding, and mainly by advantageous recessive phenotypes, which could only be retained in the context of inbreeding. Reproductive barriers would thus not arise as secondary consequences of divergent evolution in populations isolated from one another, but under the direct selective pressure of ancestral stocks. Many documented cases of speciation in natural populations appear to fit the model proposed, with more speciation occurring in populations with high inbreeding coefficients, and many recessive characters identified as central to the phenomenon of speciation, with these recessive mutations expected to be surrounded by patterns of limited genomic diversity.
Whilst adaptive evolution would correspond to gains of function that would, most of the time, be dominant, this type of speciation by budding would thus be driven by mutations resulting in the advantageous loss of certain functions since recessive mutations very often correspond to the inactivation of a gene. A very important further advantage of inbreeding is that it reduces the accumulation of recessive mutations in genomes. A consequence of the model proposed is that the existence of species would correspond to a metastable equilibrium between inbreeding and outbreeding, with excessive inbreeding promoting speciation, and excessive outbreeding resulting in irreversible accumulation of recessive mutations that could ultimately only lead to extinction.
Reviewer names
Eugene V. Koonin, Patrick Nosil (nominated by Dr Jerzy Jurka), Pierre Pontarotti
PMCID: PMC3275546  PMID: 22152499
speciation; inbreeding; saeptation; mutation load; extinction; evolution
4.  On universal common ancestry, sequence similarity, and phylogenetic structure: the sins of P-values and the virtues of Bayesian evidence 
Biology Direct  2011;6:60.
The universal common ancestry (UCA) of all known life is a fundamental component of modern evolutionary theory, supported by a wide range of qualitative molecular evidence. Nevertheless, both the status and nature of UCA have recently been questioned. In earlier work I presented a formal, quantitative test of UCA in which model selection criteria overwhelmingly choose common ancestry over independent ancestry, based on a dataset of universally conserved proteins. These model-based tests are founded in likelihoodist and Bayesian probability theory, in opposition to classical frequentist null hypothesis tests such as Karlin-Altschul E-values for sequence similarity. In a recent comment, Koonin and Wolf (K&W) claim that the model preference for UCA is "a trivial consequence of significant sequence similarity". They support this claim with a computational simulation, derived from universally conserved proteins, which produces similar sequences lacking phylogenetic structure. The model selection tests prefer common ancestry for this artificial data set.
For the real universal protein sequences, hierarchical phylogenetic structure (induced by genealogical history) is the overriding reason for why the tests choose UCA; sequence similarity is a relatively minor factor. First, for cases of conflicting phylogenetic structure, the tests choose independent ancestry even with highly similar sequences. Second, certain models, like star trees and K&W's profile model (corresponding to their simulation), readily explain sequence similarity yet lack phylogenetic structure. However, these are extremely poor models for the real proteins, even worse than independent ancestry models, though they explain K&W's artificial data well. Finally, K&W's simulation is an implementation of a well-known phylogenetic model, and it produces sequences that mimic homologous proteins. Therefore the model selection tests work appropriately with the artificial data.
For K&W's artificial protein data, sequence similarity is the predominant factor influencing the preference for common ancestry. In contrast, for the real proteins, model selection tests show that phylogenetic structure is much more important than sequence similarity. Hence, the model selection tests demonstrate that real universally conserved proteins are homologous, a conclusion based primarily on the specific nested patterns of correlations induced in genetically related protein sequences.
This article was reviewed by Rob Knight, Robert Beiko (nominated by Peter Gogarten), and Michael Gilchrist.
PMCID: PMC3314578  PMID: 22114984
5.  On the relationship between the load and the variance of relative fitness 
Biology Direct  2011;6:20.
Operation of natural selection can be characterized by a variety of quantities. Among them, variance of relative fitness V and load L are the most fundamental.
Among all modes of selection that produce a particular value V of the variance of relative fitness, the minimal value Lmin of load L is produced by a mode under which fitness takes only two values, 0 and some positive value, and is equal to V/(1+V).
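The two-valued case can be checked numerically. The sketch below assumes the conventional definition of load, L = (w_max - mean fitness) / w_max, which the abstract does not spell out; the function names are illustrative. If a fraction p of individuals has fitness w > 0 and the rest have fitness 0, relative fitness is 1/p for survivors and 0 otherwise, so V = 1/p - 1 and L = 1 - p, which gives L = V/(1+V).

```python
def two_valued_selection(survivor_fraction):
    """Return (V, L) when a fraction p has fitness w > 0 and everyone else has fitness 0.

    Assumes load is defined as (w_max - mean fitness) / w_max.
    """
    p = survivor_fraction
    # Relative fitness is 1/p for survivors and 0 otherwise, so its mean is 1.
    variance = p * (1.0 / p) ** 2 - 1.0  # E[w_rel^2] - (E[w_rel])^2
    load = 1.0 - p                       # (w - p*w) / w
    return variance, load


def minimal_load(variance):
    """Minimal load consistent with a given variance of relative fitness."""
    return variance / (1.0 + variance)


for p in (0.9, 0.5, 0.25):
    v, load = two_valued_selection(p)
    print(f"p={p}: V={v:.4f}, L={load:.4f}, V/(1+V)={minimal_load(v):.4f}")
```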
Although it is impossible to deduce the load from knowledge of the variance of relative fitness alone, it is possible to determine the minimal load consistent with a particular variance of relative fitness. The concept of minimal load consistent with a particular biological phenomenon may be applicable to studying several aspects of natural selection.
The manuscript was reviewed by Sergei Maslov, Alexander Gordon, and Eugene Koonin.
PMCID: PMC3094333  PMID: 21492441
6.  The struggle for life of the genome's selfish architects 
Biology Direct  2011;6:19.
Transposable elements (TEs) were first discovered more than 50 years ago, but were totally ignored for a long time. Over the last few decades they have gradually attracted increasing interest from research scientists. Initially they were viewed as totally marginal and anecdotal, but TEs have been revealed as potentially harmful parasitic entities, ubiquitous in genomes, and finally as unavoidable actors in the diversity, structure, and evolution of the genome. Since Darwin's theory of evolution, and the progress of molecular biology, transposable elements may be the discovery that has most influenced our vision of (genome) evolution. In this review, we provide a synopsis of what is known about the complex interactions that exist between transposable elements and the host genome. Numerous examples of these interactions are provided, first from the standpoint of the genome, and then from that of the transposable elements. We also explore the evolutionary aspects of TEs in the light of post-Darwinian theories of evolution.
This article was reviewed by Jerzy Jurka, Jürgen Brosius and I. King Jordan. For complete reports, see the Reviewers' reports section.
PMCID: PMC3072357  PMID: 21414203
7.  Functional analysis of archaeal MBF1 by complementation studies in yeast 
Biology Direct  2011;6:18.
Multiprotein-bridging factor 1 (MBF1) is a transcriptional co-activator that bridges a sequence-specific activator (basic-leucine zipper (bZIP) like proteins (e.g. Gcn4 in yeast) or steroid/nuclear-hormone receptor family (e.g. FTZ-F1 in insect)) and the TATA-box binding protein (TBP) in Eukaryotes. MBF1 is absent in Bacteria, but is well-conserved in Eukaryotes and Archaea and harbors a C-terminal Cro-like Helix Turn Helix (HTH) domain, which is the only highly conserved, classical HTH domain that is vertically inherited in all Eukaryotes and Archaea. The main structural difference between archaeal MBF1 (aMBF1) and eukaryotic MBF1 is the presence of a Zn ribbon motif in aMBF1. In addition, MBF1-interacting activators are absent in the archaeal domain. To study the function, and therefore the evolutionary conservation, of MBF1 and its single domains, complementation studies in yeast (mbf1Δ) as well as domain swap experiments between aMBF1 and yMbf1 were performed.
In contrast to previous reports for eukaryotic MBF1 (i.e. Arabidopsis thaliana, insect and human), the two archaeal MBF1 orthologs, TMBF1 from the hyperthermophile Thermoproteus tenax and MMBF1 from the mesophile Methanosarcina mazei, were not functional for complementation of a Saccharomyces cerevisiae mutant lacking Mbf1 (mbf1Δ). Of twelve chimeric proteins representing different combinations of the N-terminal, core domain, and the C-terminal extension from yeast and aMBF1, only the chimeric MBF1 comprising the yeast N-terminal and core domain fused to the archaeal C-terminal part was able to restore full wild-type activity of MBF1.
However, as reported previously for Bombyx mori, the C-terminal part of yeast Mbf1 was shown not to be essential for function. In addition, phylogenetic analyses revealed a common distribution of MBF1 in all Archaea with available genome sequence, except for two of the three Thaumarchaeota: Cenarchaeum symbiosum A and Nitrosopumilus maritimus SCM1.
The absence of MBF1-interacting activators in the archaeal domain, the presence of a Zn ribbon motif in the divergent N-terminal domain of aMBF1 and the complementation experiments using archaeal-yeast chimeric proteins presented here suggest that archaeal MBF1 is not able to functionally interact with the transcription machinery and/or Gcn4 of S. cerevisiae. Based on modeling and structural prediction it is tempting to speculate that aMBF1 might act as a single regulator or non-essential transcription factor, which directly interacts with DNA via the positively charged linker or with the basal transcription machinery via its Zn ribbon motif and the HTH domain. However, alternative functions in ribosome biosynthesis and/or functionality have also been discussed, and therefore further experiments are required to unravel the function of MBF1 in Archaea.
This article was reviewed by William Martin, Patrick Forterre, John van der Oost and Fabian Blombach (nominated by Eugene V Koonin (United States)). For the full reviews, please go to the Reviewer's Reports section.
PMCID: PMC3062615  PMID: 21392374
8.  Do clones degenerate over time? Explaining the genetic variability of asexuals through population genetic models 
Biology Direct  2011;6:17.
The quest to understand the mechanisms governing the life span of clonal organisms has lasted for several decades. Phylogenetic evidence for recent origins of most clones is usually interpreted as proof that clones suffer from gradual age-dependent fitness decay (e.g. Muller's ratchet). However, we have shown that neutral drift can also qualitatively explain the observed distribution of clonal ages. This finding was followed by several attempts to distinguish the effects of neutral and non-neutral processes. Most recently, Neiman et al. 2009 (Ann N Y Acad Sci. 1168:185-200) reviewed the distribution of asexual lineage ages estimated from a diverse array of taxa and concluded that neutral processes alone may not explain the observed data. Moreover, the authors inferred that similar types of mechanisms determine maximum asexual lineage ages in all asexual taxa. In this paper we review recent methods for distinguishing the effects of neutral and non-neutral processes and point out methodological problems associated with them.
Results and Discussion
We found that contemporary analyses based on phylogenetic data are inadequate to provide any clear-cut answer about the nature and generality of processes affecting evolution of clones. As an alternative approach, we demonstrate that sequence variability in asexual populations is suitable to detect age-dependent selection against clonal lineages. We found that asexual taxa with relatively old clonal lineages are characterised by progressively stronger deviations from neutrality.
Our results demonstrate that some type of age-dependent selection against clones is generally operational in asexual animals, which cover a wide taxonomic range spanning from flatworms to vertebrates. However, we also found a notable difference between the data distribution predicted by available models of sequence evolution and that observed in empirical data. These findings point to the possibility that processes affecting clonal evolution differ from those described in recent studies, suggesting that theoretical models of asexual populations must evolve to address this problem in detail.
This article was reviewed by Isa Schön (nominated by John Logsdon), Arcady Mushegian and Timothy G. Barraclough (nominated by Laurence Hurst).
PMCID: PMC3064643  PMID: 21371316
9.  Assessing quality and completeness of human transcriptional regulatory pathways on a genome-wide scale 
Biology Direct  2011;6:15.
Pathway databases are becoming increasingly important and almost omnipresent in most types of biological and translational research. However, little is known about the quality and completeness of pathways stored in these databases. The present study conducts a comprehensive assessment of transcriptional regulatory pathways in humans for seven well-studied transcription factors: MYC, NOTCH1, BCL6, TP53, AR, STAT1, and RELA. The employed benchmarking methodology first involves integrating genome-wide binding with functional gene expression data to derive direct targets of transcription factors. Then the lists of experimentally obtained direct targets are compared with relevant lists of transcriptional targets from 10 commonly used pathway databases.
The results of this study show that for the majority of pathway databases, the overlap between experimentally obtained target genes and targets reported in transcriptional regulatory pathway databases is surprisingly small and often is not statistically significant. The only exception is MetaCore pathway database which yields statistically significant intersection with experimental results in 84% cases. Additionally, we suggest that the lists of experimentally derived direct targets obtained in this study can be used to reveal new biological insight in transcriptional regulation and suggest novel putative therapeutic targets in cancer.
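Whether an overlap between two gene lists is larger than expected by chance is commonly assessed with a one-sided hypergeometric test. The abstract does not state which test the authors used, so the following is only a sketch of that common approach; the function name and the gene counts in the example are hypothetical.

```python
from math import comb


def hypergeom_overlap_pvalue(universe, n_experimental, n_database, n_overlap):
    """P(overlap >= n_overlap) if the database list were a random draw from the universe.

    universe       -- total number of genes considered
    n_experimental -- number of experimentally derived direct targets
    n_database     -- number of targets listed in the pathway database
    n_overlap      -- number of genes present in both lists
    """
    total = comb(universe, n_database)
    upper = min(n_experimental, n_database)
    tail = sum(
        comb(n_experimental, k) * comb(universe - n_experimental, n_database - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20000 genes, 300 experimental targets, 100 database targets, 5 shared.
print(hypergeom_overlap_pvalue(20000, 300, 100, 5))
```

A small p-value indicates that the database list shares more genes with the experimental targets than a random list of the same size would be expected to.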
Our study opens a debate on validity of using many popular pathway databases to obtain transcriptional regulatory targets. We conclude that the choice of pathway databases should be informed by solid scientific evidence and rigorous empirical evaluation.
This article was reviewed by Prof. Wing Hung Wong, Dr. Thiago Motta Venancio (nominated by Dr. L Aravind), and Prof. Geoff J McLachlan.
PMCID: PMC3055855  PMID: 21356087
10.  The origin of a derived superkingdom: how a gram-positive bacterium crossed the desert to become an archaeon 
Biology Direct  2011;6:16.
The tree of life is usually rooted between archaea and bacteria. We have previously presented three arguments that support placing the root of the tree of life in bacteria. The data have been dismissed because those who support the canonical rooting between the prokaryotic superkingdoms cannot imagine how the vast divide between the prokaryotic superkingdoms could be crossed.
We review the evidence that archaea are derived, as well as their biggest differences with bacteria. We argue that using novel data the gap between the superkingdoms is not insurmountable. We consider whether archaea are holophyletic or paraphyletic; essential to understanding their origin. Finally, we review several hypotheses on the origins of archaea and, where possible, evaluate each hypothesis using bioinformatics tools. As a result we argue for a firmicute ancestry for archaea over proposals for an actinobacterial ancestry.
We believe a synthesis of the hypotheses of Lake, Gupta, and Cavalier-Smith is possible where a combination of antibiotic warfare and viral endosymbiosis in the bacilli led to dramatic changes in a bacterium that resulted in the birth of archaea and eukaryotes.
This article was reviewed by Patrick Forterre, Eugene Koonin, and Gáspár Jékely
PMCID: PMC3056875  PMID: 21356104
11.  On origin of genetic code and tRNA before translation 
Biology Direct  2011;6:14.
Synthesis of proteins is based on the genetic code - a nearly universal assignment of codons to amino acids (aas). A major challenge to the understanding of the origins of this assignment is the archetypal "key-lock vs. frozen accident" dilemma. Here we re-examine this dilemma in light of 1) the fundamental veto on "foresight evolution", 2) modular structures of tRNAs and aminoacyl-tRNA synthetases, and 3) the updated library of aa-binding sites in RNA aptamers successfully selected in vitro for eight amino acids.
The aa-binding sites of arginine, isoleucine and tyrosine contain both their cognate triplets, anticodons and codons. We have noticed that these cases might be associated with palindrome-dinucleotides. For example, one-base shift to the left brings arginine codons CGN, with CG at 1-2 positions, to the respective anticodons NCG, with CG at 2-3 positions. Formally, the concomitant presence of codons and anticodons is also expected in the reverse situation, with codons containing palindrome-dinucleotides at their 2-3 positions, and anticodons exhibiting them at 1-2 positions. A closer analysis reveals that, surprisingly, RNA binding sites for Arg, Ile and Tyr "prefer" (exactly as in the actual genetic code) the anticodon(2-3)/codon(1-2) tetramers to their anticodon(1-2)/codon(2-3) counterparts, despite the seemingly perfect symmetry of the latter. However, since in vitro selection of aa-specific RNA aptamers apparently had nothing to do with translation, this striking preference provides a new strong support to the notion of the genetic code emerging before translation, in response to catalytic (and possibly other) needs of ancient RNA life. Consistently with the pre-translation origin of the code, we propose here a new model of tRNA origin by the gradual, Fibonacci process-like, elongation of a tRNA molecule from a primordial coding triplet and 5'DCCA3' quadruplet (D is a base-determinator) to the eventual 76 base-long cloverleaf-shaped molecule.
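The one-base shift between codon and anticodon follows from the fact that a palindrome-dinucleotide equals its own reverse complement. The short sketch below (function names are illustrative) lists the palindromic RNA dinucleotides and shows that the arginine codons CGN map to anticodons of the form NCG; the same holds for isoleucine (AU) and tyrosine (UA) codons from the standard genetic code.

```python
from itertools import product

COMPLEMENT = {"A": "U", "U": "A", "C": "G", "G": "C"}


def reverse_complement(rna):
    """Reverse complement of an RNA string, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def is_palindrome(dinucleotide):
    """True if the dinucleotide equals its own reverse complement."""
    return reverse_complement(dinucleotide) == dinucleotide


palindromes = sorted("".join(pair) for pair in product("ACGU", repeat=2)
                     if is_palindrome("".join(pair)))
print(palindromes)

# Codons with a palindrome at positions 1-2 give anticodons with it at positions 2-3.
for codon in ["CGU", "CGC", "CGA", "CGG", "AUU", "UAC"]:
    anticodon = reverse_complement(codon)
    print(codon, "->", anticodon, "| shared dinucleotide:", codon[:2] == anticodon[1:])
```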
Taken together, our findings necessarily imply that primordial tRNAs, tRNA aminoacylating ribozymes, and (later) the translation machinery in general have been co-evolving to "fit" the (likely already defined) genetic code, rather than the opposite way around. Coding triplets in this primal pre-translational code were likely similar to the anticodons, with second and third nucleotides being more important than the less specific first one. Later, when the code was expanding in co-evolution with the translation apparatus, the importance of 2-3 nucleotides of coding triplets "transferred" to the 1-2 nucleotides of their complements, thus distinguishing anticodons from codons. This evolutionary primacy of anticodons in genetic coding makes the hypothesis of primal stereo-chemical affinity between amino acids and cognate triplets, the hypothesis of coding coenzyme handles for amino acids, the hypothesis of tRNA-like genomic 3' tags suggesting that tRNAs originated in replication, and the hypothesis of ancient ribozymes-mediated operational code of tRNA aminoacylation not mutually contradicting but rather co-existing in harmony.
This article was reviewed by Eugene V. Koonin, Wentao Ma (nominated by Juergen Brosius) and Anthony Poole.
PMCID: PMC3050877  PMID: 21342520
12.  Nonsynonymous substitution rate (Ka) is a relatively consistent parameter for defining fast-evolving and slow-evolving protein-coding genes 
Biology Direct  2011;6:13.
Mammalian genome sequence data are being acquired in large quantities and at enormous speeds. We now have a tremendous opportunity to better understand which genes are the most variable or conserved, and what their particular functions and evolutionary dynamics are, through comparative genomics.
We chose human and eleven other high-coverage mammalian genome data–as well as an avian genome as an outgroup–to analyze orthologous protein-coding genes using nonsynonymous (Ka) and synonymous (Ks) substitution rates. After evaluating eight commonly-used methods of Ka and Ks calculation, we observed that these methods yielded a nearly uniform result when estimating Ka, but not Ks (or Ka/Ks). When sorting genes based on Ka, we noticed that fast-evolving and slow-evolving genes often belonged to different functional classes, with respect to species-specificity and lineage-specificity. In particular, we identified two functional classes of genes in the acquired immune system. Fast-evolving genes coded for signal-transducing proteins, such as receptors, ligands, cytokines, and CDs (cluster of differentiation, mostly surface proteins), whereas the slow-evolving genes were for function-modulating proteins, such as kinases and adaptor proteins. In addition, among slow-evolving genes that had functions related to the central nervous system, neurodegenerative disease-related pathways were enriched significantly in most mammalian species. We also confirmed that gene expression was negatively correlated with evolution rate, i.e. slow-evolving genes were expressed at higher levels than fast-evolving genes. Our results indicated that the functional specializations of the three major mammalian clades were: sensory perception and oncogenesis in primates, reproduction and hormone regulation in large mammals, and immunity and angiotensin in rodents.
Our study suggests that Ka calculation, which is less biased compared to Ks and Ka/Ks, can be used as a parameter to sort genes by evolution rate and can also provide a way to categorize common protein functions and define their interaction networks, either pair-wise or in defined lineages or subgroups. Evaluating gene evolution based on Ka and Ks calculations can be done with large datasets, such as mammalian genomes.
This article has been reviewed by Drs. Anamaria Necsulea (nominated by Nicolas Galtier), Subhajyoti De (nominated by Sarah Teichmann) and Claus O. Wilke.
PMCID: PMC3055854  PMID: 21342519
13.  Endosymbiont or host: who drove mitochondrial and plastid evolution? 
Biology Direct  2011;6:12.
The recognition that mitochondria and plastids are derived from alphaproteobacterial and cyanobacterial endosymbionts, respectively, was one of the greatest advances in modern evolutionary biology. Researchers have yet however to provide detailed cell biological descriptions of how these once free-living prokaryotes were transformed into intracellular organelles. A key area of study in this realm is elucidating the evolution of the molecular machines that control organelle protein topogenesis. Alcock et al. (Science 2010, 327 [5966]:649-650) suggest that evolutionary innovations that established the mitochondrial protein sorting system were driven by the alphaproteobacterial endosymbiont (an "insiders' perspective"). In contrast, here we argue that evolution of mitochondrial and plastid topogenesis may better be understood as an outcome of selective pressures acting on host cell chromosomes (the "outsiders' view").
This manuscript was reviewed by Gáspár Jékely, Martijn Huynen, and Purificación López-García.
PMCID: PMC3050876  PMID: 21333023
14.  The role of duplications in the evolution of genomes highlights the need for evolutionary-based approaches in comparative genomics 
Biology Direct  2011;6:11.
Understanding the evolutionary plasticity of the genome requires a global, comparative approach in which genetic events are considered both in a phylogenetic framework and with regard to population genetics and environmental variables. In the mechanisms that generate adaptive and non-adaptive changes in genomes, segmental duplications (duplication of individual genes or genomic regions) and polyploidization (whole genome duplications) are well-known driving forces. The probability of fixation and maintenance of duplicates depends on many variables, including population sizes and selection regimes experienced by the corresponding genes: a combination of stochastic and adaptive mechanisms has shaped all genomes. A survey of experimental work shows that the distinction made between fixation and maintenance of duplicates still needs to be conceptualized and mathematically modeled. Here we review the mechanisms that increase or decrease the probability of fixation or maintenance of duplicated genes, and examine the outcome of these events on the adaptation of the organisms.
This article was reviewed by Dr. Etienne Joly, Dr. Lutz Walter and Dr. W. Ford Doolittle.
PMCID: PMC3052240  PMID: 21333002
15.  A rebuttal to the comments on the genome order index and the Z-curve 
Biology Direct  2011;6:10.
Elhaik, Graur and Josic recently commented on the genome order index (S) and the Z-curve (Elhaik et al. Biol Direct 2010, 5: 10). S is a quantity defined as S = a² + c² + g² + t², where a, c, g and t denote corresponding base frequencies. The Z-curve is a three dimensional curve that represents a DNA sequence in the manner that each can be uniquely reconstructed given the other. Elhaik et al. made 4 major claims. 1) In the previous mapping system with the regular tetrahedron, calculation of the radius of the inscribed sphere is "a mathematical error". 2) S follows an exponential distribution and is narrowly distributed with a range of (0.25 - 0.33). 3) Based on Chargaff's second parity rule (PR2), "S is equivalent to H [Shannon entropy]" and they are derivable from each other. 4) Z-curve "suffers from over dimensionality", because based on the analysis of 235 bacterial genomes, x and y components contributed less than 1% of the variance and therefore "would be of little use".
1) Elhaik et al. mistakenly neglected the parameter 4/3 when calculating the radius of the inscribed sphere. 2) The exponential distribution of S is a restatement of our previous conclusion, and the range of (0.25 - 0.33) only paraphrases the previously suggested S range (0.25 - 1/3). 3) Elhaik et al. incorrectly disregard deviations from PR2 by treating the deviations as 0 altogether, reduce S and H, both having 4 variables, a, c, g and t, into functions of one single variable, a only, and apply this treatment to all DNA sequences as the basis of their "demonstration", which is therefore invalid. 4) Elhaik et al. confuse numerical smallness with biological insignificance, and disregard the distributions of purine/pyrimidine and amino/keto bases (x and y components), the variations of which, although they can be smaller than that of GC content, contain rich information that is important and useful, such as in locating replication origins of bacterial and archaeal genomes, and in studies of gene recognition in various species.
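As a minimal illustration of the definition of S as the sum of squared base frequencies, the sketch below computes it for a DNA string. The function name is illustrative, and ignoring characters other than A, C, G and T is an assumption made here for simplicity.

```python
from collections import Counter


def genome_order_index(sequence):
    """Sum of squared frequencies of A, C, G and T (other characters are ignored)."""
    counts = Counter(sequence.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return sum((counts[base] / total) ** 2 for base in "ACGT")


print(genome_order_index("ACGT"))      # equal base frequencies give the minimum, 0.25
print(genome_order_index("AACCGGTTAT"))
```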
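The difference between S (a single number) and the Z-curve (a series of 3D coordinates) is easier to see in code. The sketch below uses the commonly cited Z-curve construction, in which x tracks purine versus pyrimidine, y tracks amino versus keto, and z tracks weak versus strong hydrogen bonding; the abstract names only the x and y components, so the z component and the function names here are stated as assumptions. Because each base produces a distinct step, the sequence can be recovered from the curve.

```python
# Step taken along (x, y, z) for each base:
# x: purine (A, G) = +1, pyrimidine (C, T) = -1
# y: amino (A, C) = +1, keto (G, T) = -1
# z: weak H-bond (A, T) = +1, strong H-bond (G, C) = -1
STEPS = {"A": (1, 1, 1), "G": (1, -1, -1), "C": (-1, 1, -1), "T": (-1, -1, 1)}
BASE_FOR_STEP = {step: base for base, step in STEPS.items()}


def z_curve(sequence):
    """Return the list of cumulative (x, y, z) points for a DNA sequence."""
    x = y = z = 0
    points = []
    for base in sequence.upper():
        dx, dy, dz = STEPS[base]
        x, y, z = x + dx, y + dy, z + dz
        points.append((x, y, z))
    return points


def sequence_from_z_curve(points):
    """Recover the DNA sequence from successive differences of the curve."""
    previous = (0, 0, 0)
    bases = []
    for point in points:
        step = tuple(c - p for c, p in zip(point, previous))
        bases.append(BASE_FOR_STEP[step])
        previous = point
    return "".join(bases)


curve = z_curve("AGCTTA")
print(curve)
print(sequence_from_z_curve(curve))
```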
Elhaik et al. confuse S (a single number) with Z-curve (a series of 3D coordinates), which are distinct. To use S as a case study of Z-curve, by itself, is invalid. S and H are neither equivalent nor derivable from each other. The criticisms of Elhaik, Graur and Josic are wrong.
This article was reviewed by Erik van Nimwegen.
PMCID: PMC3046898  PMID: 21324187
16.  Evolutionary patterns of phosphorylated serines 
Biology Direct  2011;6:8.
Posttranslationally modified amino acids are chemically distinct types of amino acids and in terms of evolution they might behave differently from their non-modified counterparts. In order to check this possibility, we reconstructed the evolutionary history of phosphorylated serines in several groups of organisms. Comparisons of substitution vectors have revealed some significant differences in the evolution of modified and corresponding non-modified amino acids. In particular, phosphoserines are more frequently substituted to aspartate and glutamate, compared to non-phosphorylated serines.
This article was reviewed by Arcady Mushegian and Sandor Pongor.
PMCID: PMC3044110  PMID: 21306633
17.  Enhanced immunogenicity of pneumococcal surface adhesin A (PsaA) in mice via fusion to recombinant human B lymphocyte stimulator (BLyS) 
Biology Direct  2011;6:9.
B lymphocyte stimulator (BLyS) is a member of the tumor necrosis factor superfamily of ligands that mediates its action through three known receptors. BLyS has been shown to enhance the production of antibodies against heterologous antigens when present at elevated concentrations, supporting an immunostimulatory role for BLyS in vivo.
We constructed a fusion protein consisting of human BLyS and Pneumococcal Surface Adhesin A (PsaA) and used this molecule to immunize mice. The immunostimulatory attributes mediated by BLyS in vivo were evaluated by characterizing immune responses directed against PsaA.
The PsaA-BLyS fusion protein was able to act as a co-stimulant for murine spleen cell proliferation induced with F(ab')2 fragments of anti-IgM in vitro in a fashion similar to recombinant BLyS, and immunization of mice with the PsaA-BLyS fusion protein resulted in dramatically elevated serum antibodies specific for PsaA. Mice immunized with PsaA admixed with recombinant BLyS exhibited only modest elevations in PsaA-specific responses following two immunizations, while mice immunized twice with PsaA alone exhibited undetectable PsaA-specific serum antibody responses. Sera obtained from PsaA-BLyS immunized mice exhibited high titers of IgG1, IgG2a, IgG2b, and IgG3, but no IgA, while mice immunized with PsaA admixed with BLyS exhibited only elevated titers of IgG1 following two immunizations. Splenocytes from PsaA-BLyS immunized mice exhibited elevated levels of secretion of IL-2, IL-4 and IL-5, and a very modest but consistent elevation of IFN-γ following in vitro stimulation with PsaA. In contrast, mice immunized with either PsaA admixed with BLyS or PsaA alone exhibited modestly elevated to absent PsaA-specific recall responses for the same cytokines. Mice deficient for one of the three receptors for BLyS designated Transmembrane activator, calcium modulator, and cyclophilin ligand [CAML] interactor (TACI) exhibited attenuated PsaA-specific serum antibody responses following immunization with PsaA-BLyS relative to wild-type littermates. TACI-deficient mice also exhibited decreased responsiveness to a standard pneumococcal conjugate vaccine.
This study identifies covalent attachment of BLyS as a highly effective adjuvant strategy that may yield improved vaccines. In addition, this is the first report demonstrating an unexpected role for TACI in the elicitation of antibodies by the PsaA-BLyS fusion protein.
This article was reviewed by Jonathan Yewdell, Rachel Gerstein, and Michael Cancro (nominated by Andy Caton).
PMCID: PMC3055212  PMID: 21306646
18.  Gene gain and loss events in Rickettsia and Orientia species 
Biology Direct  2011;6:6.
Genome degradation is an ongoing process in all members of the Rickettsiales order, which makes these bacterial species an excellent model for studying reductive evolution through interspecies variation in genome size and gene content. In this study, we evaluated the degree to which gene loss shaped the content of some Rickettsiales genomes. We shed light on the role played by horizontal gene transfers in the genome evolution of Rickettsiales.
Our phylogenomic tree, based on whole-genome content, presented a topology distinct from that of the whole core gene concatenated phylogenetic tree, suggesting that the gene repertoires involved have different evolutionary histories. Indeed, we present evidence for 3 possible horizontal gene transfer events from various organisms to Orientia and 6 to Rickettsia spp., while we also identified 3 possible horizontal gene transfer events from Rickettsia and Orientia to other bacteria. We found 17 putative genes in Rickettsia spp. that are probably the result of de novo gene creation; 2 of these genes appear to be functional. On the basis of these results, we were able to reconstruct the gene repertoires of "proto-Rickettsiales" and "proto-Rickettsiaceae", which correspond to the ancestors of Rickettsiales and Rickettsiaceae, respectively. Finally, we found that 2,135 genes were lost during the evolution of the Rickettsiaceae to an intracellular lifestyle.
Our phylogenetic analysis allowed us to track the gene gain and loss events occurring in bacterial genomes during their evolution from a free-living to an intracellular lifestyle. We have shown that the primary mechanism of evolution and specialization in strictly intracellular bacteria is gene loss. Despite the intracellular habitat, we found several horizontal gene transfers between Rickettsiales species and various prokaryotic, viral and eukaryotic species.
Open peer review
Reviewed by Arcady Mushegian, Eugene V. Koonin and Patrick Forterre. For the full reviews please go to the Reviewers' comments section.
PMCID: PMC3055210  PMID: 21303508
19.  The multiple personalities of Watson and Crick strands 
Biology Direct  2011;6:7.
In genetics it is customary to refer to double-stranded DNA as containing a "Watson strand" and a "Crick strand." However, there seems to be no consensus in the literature on the exact meaning of these two terms, and the many usages contradict one another as well as the original definition. Here, we review the history of the terminology and suggest retaining a single sense that is currently the most useful and consistent.
The Saccharomyces Genome Database defines the Watson strand as the strand which has its 5'-end at the short-arm telomere and the Crick strand as its complement. The Watson strand is always used as the reference strand in their database. Using this as the basis of our standard, we recommend that Watson and Crick strand terminology only be used in the context of genomics. When possible, the centromere or other genomic feature should be used as a reference point, dividing the chromosome into two arms of unequal lengths. Under our proposal, the Watson strand is standardized as the strand whose 5'-end is on the short arm of the chromosome, and the Crick strand as the one whose 5'-end is on the long arm. Furthermore, the Watson strand should be retained as the reference (plus) strand in a genomic database. This usage not only makes the determination of Watson and Crick unambiguous, but also allows unambiguous selection of reference strands for genomics.
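The proposed rule can be sketched as a small Python function. This is only an illustration under stated assumptions: the strand is described by the chromosome length and the centromere position measured from that strand's 5'-end, and the function name and coordinate convention are hypothetical rather than part of the proposal.

```python
def classify_strand(chrom_length: int, centromere_pos: int) -> str:
    """Classify a strand as 'Watson' or 'Crick' under the short-arm rule.

    centromere_pos is the distance (in bases) from this strand's 5'-end
    to the centromere, so it equals the length of the arm holding the 5'-end.
    """
    if not 0 < centromere_pos < chrom_length:
        raise ValueError("centromere must lie strictly inside the chromosome")

    five_prime_arm = centromere_pos                  # arm carrying the 5'-end
    three_prime_arm = chrom_length - centromere_pos  # the opposite arm

    if five_prime_arm == three_prime_arm:
        # The rule relies on two arms of unequal length.
        raise ValueError("arms are equal in length; the rule is undefined")

    # Watson: 5'-end on the short arm. Crick: 5'-end on the long arm.
    return "Watson" if five_prime_arm < three_prime_arm else "Crick"
```

For example, `classify_strand(1000, 300)` returns `"Watson"`, while the complementary strand, seen from its own 5'-end as `classify_strand(1000, 700)`, returns `"Crick"`.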
This article was reviewed by John M. Logsdon, Igor B. Rogozin (nominated by Andrey Rzhetsky), and William Martin.
PMCID: PMC3055211  PMID: 21303550
20.  Consequences of cell-to-cell P-glycoprotein transfer on acquired multidrug resistance in breast cancer: a cell population dynamics model 
Biology Direct  2011;6:5.
Cancer is a proliferation disease affecting a genetically unstable cell population, in which molecular alterations can be somatically inherited by genetic, epigenetic or extragenetic transmission processes, leading to a cooperation of neoplastic cells within tumoural tissue. The efflux protein P-glycoprotein (P-gp) is overexpressed in many cancer cells and has a known capacity to confer multidrug resistance to cytotoxic therapies. Recently, cell-to-cell P-gp transfers have been shown. Herein, we combine experimental evidence and a mathematical model to examine the consequences of intercellular P-gp trafficking in the extragenetic transfer of multidrug resistance from resistant to sensitive cell subpopulations.
Methodology and Principal Findings
We report cell-to-cell transfers of functional P-gp in co-cultures of a P-gp overexpressing human breast cancer MCF-7 cell variant, selected for its resistance towards doxorubicin, with the parental sensitive cell line. We found that P-gp as well as efflux activity distribution are progressively reorganized over time in co-cultures analyzed by flow cytometry. A mathematical model based on a Boltzmann type integro-partial differential equation structured by a continuum variable corresponding to P-gp activity describes the cell populations in co-culture. The mathematical model elucidates the population elements in the experimental data, specifically, the initial proportions, the proliferative growth rates, and the transfer rates of P-gp in the sensitive and resistant subpopulations.
We confirmed cell-to-cell transfer of functional P-gp. The transfer process depends on the gradient of P-gp expression in the donor-recipient cell interactions, as they evolve over time. Extragenetically acquired drug resistance is an additional aptitude of neoplastic cells which has implications in the diagnostic value of P-gp expression and in the design of chemotherapy regimens.
This article was reviewed by Leonid Hanin, Anna Marciniak-Czochra and Marek Kimmel.
PMCID: PMC3038988  PMID: 21269489
21.  Issues associated with the use of phosphospecific antibodies to localise active and inactive pools of GSK-3 in cells 
Biology Direct  2011;6:4.
Glycogen synthase kinase-3 (GSK-3) is a ubiquitously expressed serine/threonine (Ser/Thr) kinase comprising two isoforms, GSK-3α and GSK-3β. Both enzymes are similarly inactivated by serine phosphorylation (GSK-3α at Ser21 and GSK-3β at Ser9) and activated by tyrosine phosphorylation (GSK-3α at Tyr279 and GSK-3β at Tyr216). Antibodies raised to phosphopeptides containing the sequences around these phosphorylation sites are frequently used to provide an indication of the activation state of GSK-3 in cell and tissue extracts. These antibodies have further been used to determine the subcellular localisation of active and inactive forms of GSK-3, and the results of those studies support roles for GSK-3 phosphorylation in diverse cellular processes. However, the specificity of these antibodies in immunocytochemistry has not been addressed in any detail.
Taking advantage of gene silencing technology, we examined the specificity of several commercially available anti-phosphorylated GSK-3 antibodies. We show that antibodies raised to peptides containing the phosphorylated Ser21/9 epitope crossreact with unidentified antigens that are highly expressed by mitotic cells and that mainly localise to spindle poles. In addition, two antibodies raised to peptides containing the phosphorylated Tyr279/216 epitope recognise an unidentified protein at focal contacts, and a third antibody recognises a protein found in Ki-67-positive cell nuclei. While the phosphorylated Ser9/21 GSK-3 antibodies also recognise other proteins whose levels increase in mitotic cells in western blots, the phosphorylated Tyr279/216 antibodies appear to be specific in western blotting. However, we cannot rule out the possibility that they recognise very large or very small proteins that might not be detected using a standard western blotting approach.
Our findings indicate that care should be taken when examining the subcellular localisation of active or inactive GSK-3 and, furthermore, suggest that the role of GSK-3 phosphorylation in some cellular processes be reassessed.
This article was reviewed by Dr. David Kaplan, Dr. Robert Murphy and Dr. Cara Gottardi (nominated by Dr. Avinash Bhandoola).
PMCID: PMC3039639  PMID: 21261990
22.  Modeling RNA polymerase competition: the effect of σ-subunit knockout and heat shock on gene transcription level 
Biology Direct  2011;6:3.
Modeling of a complex biological process can explain the results of experimental studies and help predict its characteristics. Among such processes is transcription in the presence of competing RNA polymerases. This process involves RNA polymerase collision followed by transcription termination.
A mathematical and computer simulation model is developed to describe the competition of RNA polymerases during gene transcription on complementary DNA strands. For example, in barley (Hordeum vulgare), polymerase competition occurs in the locus containing the plastome genes psbA, rpl23, rpl2 and four bacterial-type promoters. In heat shock experiments on isolated chloroplasts, a twofold decrease of psbA transcripts and an even larger increase of rpl23-rpl2 transcripts were observed, which is well reproduced in the model. The model predictions are in good agreement with virtually all relevant experimental data (knockout, heat shock, chromatogram data, etc.). The model allows one to hypothesize a mechanism of cell response to knockout and heat shock, as well as a mechanism of gene expression regulation in the presence of RNA polymerase competition. The model is implemented for multiprocessor platforms with MPI and supported on Linux and MS Windows. The source code written in C++ is available under the GNU General Public License from the laboratory website. A user-friendly GUI version is also provided at
The developed model is in good agreement with virtually all relevant experimental data. The model can be applied to estimate intensities of binding of the holoenzyme and phage type RNA polymerase to their promoters using data on gene transcription levels, as well as to predict characteristics of RNA polymerases and the transcription process that are difficult to measure directly, e.g., the intensity (frequency) of holoenzyme binding to the promoter in correlation to its nucleotide composition and the type of σ-subunit, the amount of transcription initiation aborts, etc. The model can be used to make functional predictions, e.g., heat shock response in isolated chloroplasts and changes of gene transcription levels under knockout of different σ-subunits or RNA polymerases or due to gene expression regulation.
This article was reviewed by Dr. Anthony Almudevar, Dr. Aniko Szabo, Dr. Yuri Wolf (nominated by Dr. Peter Olofsson) and Prof. Marek Kimmel.
PMCID: PMC3038987  PMID: 21255416
23.  Impact of Alu repeats on the evolution of human p53 binding sites 
Biology Direct  2011;6:2.
The p53 tumor suppressor protein is involved in a complicated regulatory network, mediating expression of ~1000 human genes. Recent studies have shown that many p53 in vivo binding sites (BSs) reside in transposable repeats. The relationship between these BSs and functional p53 response elements (REs) remains unknown, however. We sought to understand whether the p53 REs also reside in transposable elements and particularly in the most-abundant Alu repeats.
We have analyzed ~160 functional p53 REs identified so far and found that 24 of them occur in repeats. More than half of these repeat-associated REs reside in Alu elements. In addition, using a position weight matrix approach, we found ~400,000 potential p53 BSs in Alu elements genome-wide. Importantly, these putative BSs are located in the same regions of Alu repeats as the functional p53 REs - namely, in the vicinity of Boxes A/A' and B of the internal RNA polymerase III promoter. Earlier nucleosome-mapping experiments showed that the Boxes A/A' and B have a different chromatin environment, which is critical for the binding of p53 to DNA. Here, we compare the Alu-residing p53 sites with the corresponding Alu consensus sequences and conclude that the p53 sites likely evolved through two different mechanisms - the sites overlapping with the Boxes A/A' were generated by CG → TG mutations; the other sites apparently pre-existed in the progenitors of several Alu subfamilies, such as AluJo and AluSq. The binding affinity of p53 to the Alu-residing sites generally correlates with the age of Alu subfamilies, so that the strongest sites are embedded in the 'relatively young' Alu repeats.
The primate-specific Alu repeats play an important role in shaping the p53 regulatory network in the context of chromatin. One of the selective factors responsible for the frequent occurrence of Alu repeats in introns may be related to the p53-mediated regulation of Alu transcription, which, in turn, influences expression of the host genes.
This paper was reviewed by Igor B. Rogozin (nominated by Pavel A. Pevzner), Sandor Pongor, and I. King Jordan.
PMCID: PMC3032802  PMID: 21208455
24.  The advantages and disadvantages of horizontal gene transfer and the emergence of the first species 
Biology Direct  2011;6:1.
Horizontal Gene Transfer (HGT) is beneficial to a cell if the acquired gene confers a useful function, but is detrimental if the gene has no function, if it is incompatible with existing genes, or if it is a selfishly replicating mobile element. If the balance of these effects is beneficial on average, we would expect cells to evolve high rates of acceptance of horizontally transferred genes, whereas if it is detrimental, cells should reduce the rate of HGT as far as possible. It has been proposed that the rate of HGT was very high in the early stages of prokaryotic evolution, and hence there were no separate lineages of organisms. Only when the HGT rate began to fall, would lineages begin to emerge with their own distinct sets of genes. Evolution would then become more tree-like. This phenomenon has been called the Darwinian Threshold.
We study a model for genome evolution that incorporates both beneficial and detrimental effects of HGT. We show that if the rate of gene loss during genome replication is high, as was probably the case in the earliest genomes before the time of the last universal common ancestor, then a high rate of HGT is favourable. HGT leads to the rapid spread of new genes and allows the build-up of larger, fitter genomes than could be achieved by purely vertical inheritance. In contrast, if the gene loss rate is lower, as in modern prokaryotes, then HGT is, on average, unfavourable.
Modern cells should therefore evolve to reduce HGT if they can, although the prevalence of independently replicating mobile elements and viruses may mean that cells cannot avoid HGT in practice. In the model, natural selection leads to gradual improvement of the replication accuracy and gradual decrease in the optimal rate of HGT. By clustering genomes based on gene content, we show that there are no separate lineages of organisms when the rate of HGT is high; however, as the rate of HGT decreases, a tree-like structure emerges with well-defined lineages. The model therefore passes through a Darwinian Threshold.
This article was reviewed by Eugene V. Koonin, Anthony Poole and J. Peter Gogarten.
PMCID: PMC3043529  PMID: 21199581
