tRNA-derived RNA fragments (tRFs) are 19mer small RNAs that associate with Argonaute (AGO) proteins in humans. However, in plants, it is unknown whether tRFs bind to AGO proteins. Here, using public deep sequencing libraries of immunoprecipitated Argonaute proteins (AGO-IP) and bioinformatics approaches, we identified the Arabidopsis thaliana AGO-IP tRFs. Moreover, using three degradome deep sequencing libraries, we identified four putative tRF targets. The expression pattern of tRFs, based on deep sequencing data, was also analyzed under abiotic and biotic stresses. The results obtained here represent a useful starting point for future studies on tRFs in plants.
doi:10.1186/1745-6150-8-6
PMCID: PMC3574835
PMID: 23402430
tRNAs; Small RNA; tRFs; tRNA-derived RNA fragments; Argonaute; Arabidopsis
Abstract
Schizophrenia is a complex disease with uncertain aetiology. We suggest that GABBR1 (GABA receptor B1) is implicated in schizophrenia, based on a HERV-W LTR in the regulatory region of GABBR1. Our hypothesis is supported by the following: (i) GABBR1 is in the 6p22 genomic region most often implicated in schizophrenia; (ii) microarray studies found that only presynaptic pathway-related genes, including GABA receptors, have altered expression in schizophrenic patients; and (iii) it explains how HERV-W elements, expressed in schizophrenia, play a role in the disease: by altering the expression of GABBR1 via a long terminal repeat that is also a regulatory element of GABBR1.
Reviewers
This paper was reviewed by Sandor Pongor and Martijn Huynen.
doi:10.1186/1745-6150-8-5
PMCID: PMC3574838
PMID: 23391219
Schizophrenia; Human endogenous retrovirus; HERV-W; long terminal repeat; LTR; GABA; GABBR1; GABA receptor; Enhancer; Silencer
Twintrons represent a special intronic arrangement in which introns of two different types occupy the same gene position. Consequently, alternative splicing of these introns requires two different spliceosomes competing for the same RNA molecule. So far, only two twintrons have been described in insects. Surprisingly, we discovered several such arrangements in vertebrate genomes, which are quite conserved throughout the lineages.
Reviewers
This article was reviewed by Fyodor Kondrashov and Eugene Koonin.
doi:10.1186/1745-6150-8-4
PMCID: PMC3564746
PMID: 23356793
Twintrons; Vertebrate genomes; Gene expression
Background
Alvinella pompejana is an annelid worm that inhabits deep-sea hydrothermal vent sites in the Pacific Ocean. Living at a depth of approximately 2500 meters, these worms experience extreme environmental conditions, including high temperature and pressure as well as high levels of sulfide and heavy metals. A. pompejana is one of the most thermotolerant metazoans, making this animal a subject of great interest for studies of eukaryotic thermoadaptation.
Results
In order to complement existing EST resources we performed deep sequencing of the A. pompejana transcriptome. We identified several thousand novel protein-coding transcripts, nearly doubling the sequence data for this annelid. We then performed an extensive survey of previously established prokaryotic thermoadaptation measures to search for global signals of thermoadaptation in A. pompejana in comparison with mesophilic eukaryotes. In an orthologous set of 457 proteins, we found that the best indicator of thermoadaptation was the difference in frequency of charged versus polar residues (CvP-bias), which was highest in A. pompejana. CvP-bias robustly distinguished prokaryotic thermophiles from prokaryotic mesophiles, as well as the thermophilic fungus Chaetomium thermophilum from mesophilic eukaryotes. Experimental values for thermophilic proteins supported higher CvP-bias as a measure of thermal stability when compared to their mesophilic orthologs. Proteome-wide mean CvP-bias also correlated with the body temperatures of homeothermic birds and mammals.
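The CvP-bias described above can be illustrated with a minimal sketch. It assumes the commonly used residue classes, charged (D, E, K, R) and polar (N, Q, S, T), which the abstract itself does not spell out; the function name `cvp_bias` is hypothetical, and the result is expressed in percentage points over all residues in the input set.

```python
# Assumed residue classes (not stated in the abstract):
CHARGED = set("DEKR")  # Asp, Glu, Lys, Arg
POLAR = set("NQST")    # Asn, Gln, Ser, Thr


def cvp_bias(sequences):
    """Return the charged-minus-polar residue frequency, in percentage points.

    `sequences` is an iterable of protein sequences in one-letter code.
    Non-letter characters (gaps, stop symbols) are ignored.
    """
    charged = polar = total = 0
    for seq in sequences:
        for aa in seq.upper():
            if not aa.isalpha():
                continue
            total += 1
            if aa in CHARGED:
                charged += 1
            elif aa in POLAR:
                polar += 1
    if total == 0:
        raise ValueError("no residues found in input")
    return 100.0 * (charged - polar) / total
```

Under these assumptions, a proteome or ortholog set richer in charged than in polar residues gives a positive value, which is the direction the abstract associates with thermoadaptation.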
Conclusions
Our work extends the transcriptome resources for A. pompejana and identifies the CvP-bias as a robust and widely applicable measure of eukaryotic thermoadaptation.
Reviewers
This article was reviewed by Sándor Pongor, L. Aravind and Anthony M. Poole.
doi:10.1186/1745-6150-8-2
PMCID: PMC3564776
PMID: 23324115
Abstract
Interchromosomal chimeric RNA molecules are often transcription products from genomic rearrangement in cancerous cells. Here we report the computational detection of an interchromosomal RNA fusion between ZC3HAV1L and CHMP1A from RNA-seq data of normal human mammary epithelial cells, and experimental confirmation of the chimeric transcript in multiple human cells and tissues. Our experimental characterization also detected three variants of the ZC3HAV1L-CHMP1A chimeric RNA, suggesting that these genes are involved in complex splicing. The fusion sequence at the novel exon-exon boundary, and the absence of corresponding DNA rearrangement suggest that this chimeric RNA is likely produced by trans-splicing in human cells.
Reviewers
This article was reviewed by Rory Johnson (nominated by Fyodor Kondrashov); Gal Avital and Itai Yanai.
doi:10.1186/1745-6150-7-49
PMCID: PMC3538553
PMID: 23273016
Chimeric transcripts; RNA fusion; trans-splicing; Genome rearrangement
Background
A long recognized problem is the inference of the supertree S that amalgamates a given set {Gj} of trees Gj, with leaves in each Gj being assigned homologous elements.
We build on an approach that finds the tree S by minimizing the total cost of the mappings αj of individual gene trees Gj into S. Traditionally, this cost is defined as the sum of duplications and gaps in each αj. The classical problem is to minimize the total cost, where S runs over the set of all trees that contain an exhaustive non-redundant set of species from all input Gj.
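To make the classical cost concrete, the following is a minimal sketch of the textbook LCA mapping of a gene tree into a species tree, counting duplications only (gaps/losses are omitted for brevity). It is not the authors' algorithm or their mapping β. The nested-tuple tree encoding and the function names are assumptions made for illustration; gene-tree leaves are labelled directly with species names.

```python
def clades(tree):
    """Return (leaf set of `tree`, set of all clades in `tree`).

    A tree is either a species name (str) or a tuple of subtrees.
    """
    if isinstance(tree, str):
        leaves = frozenset([tree])
        return leaves, {leaves}
    all_leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        all_leaves |= child_leaves
        found |= child_clades
    found.add(all_leaves)
    return all_leaves, found


def lca_clade(species_clades, leaf_set):
    """Smallest clade of the species tree containing `leaf_set` (the LCA)."""
    return min((c for c in species_clades if leaf_set <= c), key=len)


def count_duplications(gene_tree, species_tree):
    """Count gene-tree nodes that the LCA mapping marks as duplications."""
    _, species_clades = clades(species_tree)
    dups = 0

    def visit(node):
        nonlocal dups
        if isinstance(node, str):
            return lca_clade(species_clades, frozenset([node]))
        child_maps = [visit(child) for child in node]
        here = lca_clade(species_clades, frozenset().union(*child_maps))
        # A node is a duplication if it maps to the same place as a child.
        if any(m == here for m in child_maps):
            dups += 1
        return here

    visit(gene_tree)
    return dups


def total_duplication_cost(gene_trees, species_tree):
    """Sum the duplication counts over all gene trees for a candidate S."""
    return sum(count_duplications(g, species_tree) for g in gene_trees)
```

In this sketch, the supertree problem corresponds to choosing the species tree that minimizes `total_duplication_cost` (plus a loss term in the full classical cost) over all admissible candidates.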
Results
We suggest a reformulation of the classical NP-hard problem of building a supertree in terms of the global minimization of the same cost functional but only over species trees S that consist of clades belonging to a fixed set P (e.g., an exhaustive set of clades in all Gj). We developed a deterministic solving algorithm with a low degree polynomial (typically cubic) time complexity with respect to the size of input data.
We define an extensive set of elementary evolutionary events and suggest an original definition of mapping β of tree G into tree S. We introduce the cost functional c(G, S, f ) and define the mapping β as the global minimum of this functional with respect to the variable f, in which sense it is a generalization of classical mapping α.
We suggest a reformulation of the classical NP-hard mapping (reconciliation) problem by introducing time slices into the species tree S and present a cubic time solving algorithm to compute the mapping β. We introduce two novel definitions of the evolutionary scenario based on mapping β or a random process of gene evolution along a species tree.
Conclusions
The developed algorithms are mathematically proven, which justifies the following statements. The supertree building algorithm finds exactly the global minimum of the total cost if only gene duplications and losses are allowed and the given set of gene trees satisfies a certain condition. The mapping algorithm finds exactly the minimal mapping β, the minimal total cost and the evolutionary scenario as a minimum over all possible distributions of elementary evolutionary events along the edges of tree S.
The algorithms and their effective software implementations provide useful tools for many biological studies. They facilitate processing of voluminous tree data in acceptable time while still largely avoiding heuristics. The performance of the tools was tested with artificial and prokaryotic tree data.
Reviewers
This article was reviewed by Prof. Anthony Almudevar, Prof. Alexander Bolshoy (nominated by Prof. Peter Olofsson), and Prof. Marek Kimmel.
doi:10.1186/1745-6150-7-48
PMCID: PMC3577452
PMID: 23259766
Phylogenetics; Fast algorithms; Tree inference; Species tree; Tree amalgamation; Tree reconciliation; Supertree; Evolutionary events; Gene duplication; Gene loss; Horizontal gene transfer; Gene gain; Time slices
Clusters of localized hypermutation in human breast cancer genomes, named “kataegis” (from the Greek for thunderstorm), are hypothesized to result from multiple cytosine deaminations catalyzed by AID/APOBEC proteins. However, a direct link between APOBECs and kataegis is still lacking. We have sequenced the genomes of yeast mutants induced in diploids by expression of the gene for PmCDA1, a hypermutagenic deaminase from sea lamprey. Analysis of the distribution of 5,138 induced mutations revealed localized clusters very similar to those found in tumors. Our data provide evidence that unleashed cytosine deaminase activity is an evolutionarily conserved, prominent source of genome-wide kataegis events.
Reviewers
This article was reviewed by: Professor Sandor Pongor, Professor Shamil R. Sunyaev, and Dr Vladimir Kuznetsov.
doi:10.1186/1745-6150-7-47
PMCID: PMC3542020
PMID: 23249472
APOBEC; Deaminase; Mutation; Kataegis; Cancer; Diploid yeast; Hypermutation
Background
Collections of Clusters of Orthologous Genes (COGs) provide indispensable tools for comparative genomic analysis, evolutionary reconstruction and functional annotation of new genomes. Initially, COGs were made for all complete genomes of cellular life forms that were available at the time. However, with the accumulation of thousands of complete genomes, construction of a comprehensive COG set has become extremely computationally demanding and prone to error propagation, necessitating the switch to taxon-specific COG collections. Previously, we reported the collection of COGs for 41 genomes of Archaea (arCOGs). Here we present a major update of the arCOGs and describe evolutionary reconstructions to reveal general trends in the evolution of Archaea.
Results
The updated version of the arCOG database incorporates 91% of the pangenome of 120 archaea (251,032 protein-coding genes altogether) into 10,335 arCOGs. Using this new set of arCOGs, we performed maximum likelihood reconstruction of the genome content of archaeal ancestral forms and gene gain and loss events in archaeal evolution. This reconstruction shows that the last Common Ancestor of the extant Archaea was an organism of greater complexity than most of the extant archaea, probably with over 2,500 protein-coding genes. The subsequent evolution of almost all archaeal lineages was apparently dominated by gene loss resulting in genome streamlining. Overall, in the evolution of Archaea as well as a representative set of bacteria that was similarly analyzed for comparison, gene losses are estimated to outnumber gene gains at least 4 to 1. Analysis of specific patterns of gene gain in Archaea shows that, although some groups, in particular Halobacteria, acquire substantially more genes than others, on the whole, gene exchange between major groups of Archaea appears to be largely random, with no major ‘highways’ of horizontal gene transfer.
Conclusions
The updated collection of arCOGs is expected to become a key resource for comparative genomics, evolutionary reconstruction and functional annotation of new archaeal genomes. Given that, in spite of the major increase in the number of genomes, the conserved core of archaeal genes appears to be stabilizing, the major evolutionary trends revealed here have a chance to stand the test of time.
Reviewers
This article was reviewed by (for complete reviews see the Reviewers’ Reports section): Dr. PLG, Prof. PF, Dr. PL (nominated by Prof. JPG).
doi:10.1186/1745-6150-7-46
PMCID: PMC3534625
PMID: 23241446
Archaea; Orthologs; Horizontal gene transfer
Abstract
Ethanolamine is used as an energy source by phylogenetically diverse bacteria including pathogens, by the concerted action of proteins from the eut-operon. Previous studies have revealed the presence of eutBC genes encoding ethanolamine-ammonia lyase, a key enzyme that breaks ethanolamine into acetaldehyde and ammonia, in about 100 bacterial genomes including members of gamma-proteobacteria. However, ethanolamine utilization has not been reported for any member of the Vibrio genus. Our comparative genomics study reveals the presence of genes that are involved in ethanolamine utilization in several Vibrio species. Using Vibrio alginolyticus as a model system we demonstrate that ethanolamine is better utilized as a nitrogen source than as a carbon source.
Reviewers
This article was reviewed by Dr. Lakshminarayan Iyer and Dr. Vivek Anantharaman (nominated by Dr. L Aravind).
doi:10.1186/1745-6150-7-45
PMCID: PMC3542024
PMID: 23234435
Pathogenesis; Ethanolamine; Vibrio; Eut operon; Metabolosome
Background
Life depends on biopolymer sequences as catalysts and as genetic material. A key step in the Origin of Life is the emergence of an autocatalytic system of biopolymers. Here we study computational models that address the way a living autocatalytic system could have emerged from a non-living chemical system, as envisaged in the RNA World hypothesis.
Results
We consider (i) a chemical reaction system describing RNA polymerization, and (ii) a simple model of catalytic replicators that we call the Two’s Company model. Both systems have two stable states: a non-living state, characterized by a slow spontaneous rate of RNA synthesis, and a living state, characterized by rapid autocatalytic RNA synthesis. The origin of life is a transition between these two stable states. The transition is driven by stochastic concentration fluctuations involving relatively small numbers of molecules in a localized region of space. These models are simulated on a two-dimensional lattice in which reactions occur locally on single sites and diffusion occurs by hopping of molecules to neighbouring sites.
Conclusions
If diffusion is very rapid, the system is well-mixed. The transition to life becomes increasingly difficult as the lattice size is increased because the concentration fluctuations that drive the transition become relatively smaller when larger numbers of molecules are involved. In contrast, when diffusion occurs at a finite rate, concentration fluctuations are local. The transition to life occurs in one local region and then spreads across the rest of the surface. The transition becomes easier with larger lattice sizes because there are more independent regions in which it could occur. The key observations that apply to our models and to the real world are that the origin of life is a rare stochastic event that is localized in one region of space due to the limited rate of diffusion of the molecules involved and that the subsequent spread across the surface is deterministic. It is likely that the time required for the deterministic spread is much shorter than the waiting time for the origin, in which case life evolves only once on a planet, and then rapidly occupies the whole surface.
Reviewers
Reviewed by Omer Markovitch (nominated by Doron Lancet), Claus Wilke, and Nobuto Takeuchi (nominated by Eugene Koonin).
doi:10.1186/1745-6150-7-42
PMCID: PMC3541068
PMID: 23176307
Abstract
It has recently been discovered that transposable elements show high activity in the brain of mammals; however, the magnitude of their influence on its functioning remains unclear. In this paper, I use flux balance analysis to examine the influence of somatic retrotransposition on brain metabolism and on the biosynthesis of its key metabolites, including neurotransmitters. The analysis shows that somatic transposition in the human brain can influence the biosynthesis of more than 250 metabolites, including dopamine, serotonin and glutamate, shows large inter-individual variability in metabolic effects, and may contribute to the development of Parkinson’s disease and schizophrenia.
Reviewers
This article was reviewed by Dr Kenji Kojima (nominated by Dr Jerzy Jurka) and Dr Eugene Koonin.
doi:10.1186/1745-6150-7-41
PMCID: PMC3534579
PMID: 23176288
Retrotransposition; Brain; Metabolic network; Parkinson’s disease; Schizophrenia
Background
The virus-host arms race is a major theater for evolutionary innovation. Archaea and bacteria have evolved diverse, elaborate antivirus defense systems that function on two general principles: i) immune systems that discriminate self DNA from nonself DNA and specifically destroy the foreign, in particular viral, genomes, whereas the host genome is protected, or ii) programmed cell suicide or dormancy induced by infection.
Presentation of the hypothesis
Almost all genomic loci encoding immunity systems such as CRISPR-Cas, restriction-modification and DNA phosphorothioation also encompass suicide genes, in particular those encoding known and predicted toxin nucleases, which do not appear to be directly involved in immunity. In contrast, the immunity systems do not appear to encode antitoxins found in typical toxin-antitoxin systems. This raises the possibility that components of the immunity system themselves act as reversible inhibitors of the associated toxin proteins or domains as has been demonstrated for the Escherichia coli anticodon nuclease PrrC that interacts with the PrrI restriction-modification system. We hypothesize that coupling of diverse immunity and suicide/dormancy systems in prokaryotes evolved under selective pressure to provide robustness to the antivirus response. We further propose that the involvement of suicide/dormancy systems in the coupled antivirus response could take two distinct forms:
1) induction of a dormancy-like state in the infected cell to ‘buy time’ for activation of adaptive immunity; 2) suicide or dormancy as the final recourse to prevent viral spread triggered by the failure of immunity.
Testing the hypothesis
This hypothesis entails many experimentally testable predictions. Specifically, we predict that the Cas2 protein, present in all cas operons, is an mRNA-cleaving nuclease (interferase) that might be activated at an early stage of virus infection to enable incorporation of virus-specific spacers into the CRISPR locus or to trigger cell suicide when the immune function of CRISPR-Cas systems fails. Similarly, toxin-like activity is predicted for components of numerous other defense loci.
Implications of the hypothesis
The hypothesis implies that antivirus response in prokaryotes involves key decision-making steps at which the cell chooses the path to follow by sensing the course of virus infection.
Reviewers
This article was reviewed by Arcady Mushegian, Etienne Joly and Nick Grishin. For complete reviews, go to the Reviewers’ reports section.
doi:10.1186/1745-6150-7-40
PMCID: PMC3506569
PMID: 23151069
Members of the Arabidopsis LSH1 and Oryza G1 (ALOG) family of proteins have been shown to function as key developmental regulators in land plants. However, their precise mode of action remains unclear. Using sensitive sequence and structure analysis, we show that the ALOG domains are a distinct version of the N-terminal DNA-binding domain shared by the XerC/D-like, protelomerase, topoisomerase-IA, and Flp tyrosine recombinases. ALOG domains are distinguished by the insertion of an additional zinc ribbon into this DNA-binding domain. In particular, we show that the ALOG domain is derived from the XerC/D-like recombinases of a novel class of DIRS-1-like retroposons. Copies of this element, which have been recently inactivated, are present in several marine metazoan lineages, whereas the stramenopile Ectocarpus retains an active copy. Thus, we predict that ALOG domains help establish organ identity and differentiation by binding specific DNA sequences and acting as transcription factors or recruiters of repressive chromatin. They are also found in certain plant defense proteins, where they are predicted to function as DNA sensors. The evolutionary history of the ALOG domain represents a unique instance of a domain, otherwise exclusively found in retroelements, being recruited as a specific transcription factor in the streptophyte lineage of plants. Hence, ALOG domains add to the growing evidence for derivation of DNA-binding domains of eukaryote-specific TFs from mobile and selfish elements.
doi:10.1186/1745-6150-7-39
PMCID: PMC3537659
PMID: 23146749
DIRS1; Tyrosine recombinase; Plant development; DNA-binding; Retroposon; Transcription factor; Chromatin protein; Plant defense
Background
Cellular life with complex metabolism probably evolved during the reign of RNA, when it served as both information carrier and enzyme. Jensen proposed that enzymes of primordial cells possessed broad specificities: they were generalist. When and under what conditions could primordial metabolism run by generalist enzymes evolve to contemporary-type metabolism run by specific enzymes?
Results
Here we show by numerical simulation of an enzyme-catalyzed reaction chain that specialist enzymes spread after the invention of the chromosome because protocells harbouring unlinked genes maintain largely non-specific enzymes to reduce their assortment load. When genes are linked on chromosomes, high enzyme specificity evolves because it increases biomass production, also by reducing taxation by side reactions.
Conclusion
The constitution of the genetic system has a profound influence on the limits of metabolic efficiency. The major evolutionary transition to chromosomes is thus proven to be a prerequisite for a complex metabolism. Furthermore, the appearance of specific enzymes opens the door for the evolution of their regulation.
Reviewers
This article was reviewed by Sándor Pongor, Gáspár Jékely, and Rob Knight.
doi:10.1186/1745-6150-7-38
PMCID: PMC3534232
PMID: 23114029
Origin of life; Chromosome; Metabolism; Ribozyme; Major transitions; Enzyme evolution
Wood, Derrick E | Lin, Henry | Levy-Moonshine, Ami | Swaminathan, Rajiswari | Chang, Yi-Chien | Anton, Brian P | Osmani, Lais | Steffen, Martin | Kasif, Simon | Salzberg, Steven L
Background
The dramatic reduction in the cost of sequencing has allowed many researchers to join in the effort of sequencing and annotating prokaryotic genomes. Annotation methods vary considerably and may fail to identify some genes. Here we draw attention to a large number of likely genes missing from annotations using common tools such as Glimmer and BLAST.
Results
By analyzing 1,474 prokaryotic genome annotations in GenBank, we identify 13,602 likely missed genes that are homologs to non-hypothetical proteins, and 11,792 likely missed genes that are homologs only to hypothetical proteins, yet have supporting evidence of their protein-coding nature from COMBREX, a newly created gene function database. We also estimate the likelihood that each potential missing gene found is a genuine protein-coding gene using COMBREX.
Conclusions
Our analysis of the causes of missed genes suggests that larger annotation centers tend to produce annotations with fewer missed genes than smaller centers, and many of the missed genes are short genes <300 bp. Over 1,000 of the likely missed genes could be associated with phenotype information available in COMBREX. 359 of these genes, found in pathogenic organisms, may be potential targets for pharmaceutical research. The newly identified genes are available on COMBREX’s website.
Reviewers
This article was reviewed by Daniel Haft, Arcady Mushegian, and M. Pilar Francino (nominated by David Ardell).
doi:10.1186/1745-6150-7-37
PMCID: PMC3534567
PMID: 23111013
Background
Mammalian genomes are repositories of repetitive DNA sequences derived from transposable elements (TEs). Typically, TEs generate multiple, mostly inactive copies of themselves, commonly known as repetitive families or families of repeats. Recently, we proposed that families of TEs originate in small populations by genetic drift and that the origin of small subpopulations from larger populations can be fueled by biological innovations.
Results
We report three distinct groups of repetitive families preserved in the human genome that expanded and declined during the three previously described periods of regulatory innovations in vertebrate genomes. The first group originated prior to the evolutionary separation of the mammalian and bird lineages and the second one during subsequent diversification of the mammalian lineages prior to the origin of eutherian lineages. The third group of families is primate-specific.
Conclusions
The observed correlation implies a relationship between regulatory innovations and the origin of repetitive families. Consistent with our previous hypothesis, it is proposed that regulatory innovations fueled the origin of new subpopulations in which new repetitive families became fixed by genetic drift.
Reviewers
Eugene Koonin, I. King Jordan, Jürgen Brosius.
doi:10.1186/1745-6150-7-36
PMCID: PMC3500645
PMID: 23098210
Transposable elements; Conserved repeats; Genetic drift; Evolution
Viruses with large genomes encode numerous proteins that do not directly participate in virus biogenesis but rather modify key functional systems of infected cells. We report that a distinct group of giant viruses infecting unicellular eukaryotes that includes Organic Lake Phycodnaviruses and Phaeocystis globosa virus encode predicted proteorhodopsins that have not been previously detected in viruses. Search of metagenomic sequence data shows that putative viral proteorhodopsins are extremely abundant in marine environments. Phylogenetic analysis suggests that giant viruses acquired proteorhodopsins via horizontal gene transfer from proteorhodopsin-encoding protists although the actual donor(s) could not be presently identified. The pattern of conservation of the predicted functionally important amino acid residues suggests that viral proteorhodopsin homologs function as sensory rhodopsins. We hypothesize that viral rhodopsins modulate light-dependent signaling, in particular phototaxis, in infected protists.
This article was reviewed by Igor B. Zhulin and Lakshminarayan M. Iyer. For the full reviews, see the Reviewers’ reports section.
doi:10.1186/1745-6150-7-34
PMCID: PMC3500653
PMID: 23036091
Background
The availability of over 3000 published genome sequences has enabled the use of comparative genomic approaches to drive the biological function discovery process. Classically, a gene was linked with its function by genetic or biochemical approaches, a lengthy process that often took years. Phylogenetic distribution profiles, physical clustering, gene fusion, co-expression profiles, structural information and other genomic or post-genomic derived associations can now be used to make very strong functional hypotheses. Here, we illustrate this shift with the analysis of the DUF71/COG2102 family, a subgroup of the PP-loop ATPase family.
Results
The DUF71 family contains at least two subfamilies, one of which was predicted to be the missing diphthine-ammonia ligase (EC 6.3.1.14), Dph6. This enzyme catalyzes the last ATP-dependent step in the synthesis of diphthamide, a complex modification of Elongation Factor 2 that can be ADP-ribosylated by bacterial toxins. Dph6 orthologs are found in nearly all sequenced Archaea and Eucarya, as expected from the distribution of the diphthamide modification. The DUF71 family appears to have originated in the Archaea/Eucarya ancestor and to have been subsequently horizontally transferred to Bacteria. Bacterial DUF71 members likely acquired a different function because the diphthamide modification is absent in this Domain of Life. In-depth investigations suggest that some archaeal and bacterial DUF71 proteins participate in B12 salvage.
Conclusions
This detailed analysis of the DUF71 family members provides an example of the power of integrated data-mining for solving important “missing genes” or “missing function” cases and illustrates the danger of functional annotation of protein families by homology alone.
Reviewers’ names
This article was reviewed by Arcady Mushegian, Michael Galperin and L. Aravind.
doi:10.1186/1745-6150-7-32
PMCID: PMC3541065
PMID: 23013770
Diphthamide; Vitamin B12; Amidotransferase; Comparative genomics
Background
In this work a mathematical model describing the growth of a solid tumour in the presence of an immune system response is presented. Specifically, attention is focused on the interactions between cytotoxic T-lymphocytes (CTLs) and tumour cells in a small, avascular multicellular tumour. At this stage of the disease the CTLs and the tumour cells are considered to be in a state of dynamic equilibrium or cancer dormancy. The precise biochemical and cellular mechanisms by which CTLs can control a cancer and keep it in a dormant state are still not completely understood from a biological and immunological point of view. The mathematical model focuses on the spatio-temporal dynamics of tumour cells, immune cells, chemokines and “chemorepellents” in an immunogenic tumour. The CTLs and tumour cells are assumed to migrate and interact with each other in such a way that lymphocyte-tumour cell complexes are formed. These complexes result in either the death of the tumour cells (the normal situation) or the inactivation of the lymphocytes and consequently the survival of the tumour cells. In the latter case, we assume that each tumour cell that survives its “brief encounter” with the CTLs undergoes certain beneficial phenotypic changes.
Results
We explore the dynamics of the model under these assumptions and show that the process of immuno-evasion can arise as a consequence of these encounters. We show that the proposed mechanism shapes not only the dynamics of the total number of tumour cells and of CTLs, but also the dynamics of their spatial distribution. We also briefly discuss the evolutionary features of our model by framing them in the context of recent quasi-Lamarckian theories.
Conclusions
Our findings might have interesting implications for clinical practice. Indeed, the immuno-editing process can be seen as an “involuntary” antagonistic process acting against immunotherapies, which aim at maintaining a tumour in a dormant state or at suppressing it.
Reviewers
This article was reviewed by G. Bocharov (nominated by V. Kuznetsov, member of the Editorial Board of Biology Direct), M. Kimmel and A. Marciniak-Czochra.
doi:10.1186/1745-6150-7-31
PMCID: PMC3582466
PMID: 23009638
Tumour growth; Immune response; Cytotoxic T-lymphocytes; Immuno-evasion; Mathematical models; Chemotaxis; Diffusion; Immuno-editing
Background
The evolution and genomic frequencies of stop codons have not been rigorously studied, with the exception of the coding of non-canonical amino acids. Here we study the rate of evolution and frequency distribution of stop codons in bacterial genomes.
Results
We show that in bacteria stop codons evolve more slowly than synonymous sites, suggesting the action of weak negative selection. However, the frequency of stop codons relative to genomic nucleotide content indicates that this selection regime is not straightforward. The frequencies of the TAA and TGA stop codons are GC-content dependent, with TAA decreasing and TGA increasing with GC-content, while TAG frequency is independent of GC-content. Applying a formal, analytical model to these data, we found that the relationship between stop codon frequencies and nucleotide content cannot be explained by mutational biases or selection on nucleotide content. However, with weak nucleotide content-dependent selection on TAG, -0.5 < Nes < 1.5, the model fits all of the data and recapitulates the relationship between TAG and nucleotide content. For biologically plausible rates of mutation we show that, in bacteria, the TAG stop codon is universally associated with lower fitness, with TAA being optimal for G-content < 16%, while for G-content > 16% TGA has a higher fitness than TAG.
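The two quantities compared in these results, genomic GC-content and the relative frequencies of the three stop codons, can be tabulated with a short sketch such as the one below. It is an illustrative assumption about how such counts might be gathered, not the authors' pipeline; the function names are hypothetical, and each coding sequence is assumed to be given 5' to 3' with its stop codon as the last three bases.

```python
from collections import Counter

STOP_CODONS = ("TAA", "TAG", "TGA")


def gc_content(sequence):
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return (seq.count("G") + seq.count("C")) / acgt


def stop_codon_frequencies(coding_sequences):
    """Relative usage of TAA, TAG and TGA among the terminal codons.

    Sequences whose last three bases are not a standard stop codon
    are skipped rather than counted.
    """
    counts = Counter()
    for cds in coding_sequences:
        codon = cds.upper()[-3:]
        if codon in STOP_CODONS:
            counts[codon] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no sequence ends in a standard stop codon")
    return {codon: counts[codon] / total for codon in STOP_CODONS}
```

Computing both values per genome and plotting each stop codon frequency against GC-content would give the kind of relationship the abstract describes for TAA, TGA and TAG.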
Conclusions
Our data indicate that the TAG codon is universally suboptimal in the bacterial lineage, such that TAA is likely to be the preferred stop codon at low GC content, while TGA is the preferred stop codon at high GC content. The optimization of stop codon usage may therefore be useful in genome engineering or gene expression optimization applications.
Reviewers
This article was reviewed by Michail Gelfand, Arcady Mushegian and Shamil Sunyaev. For the full reviews, please go to the Reviewers’ Comments section.
doi:10.1186/1745-6150-7-30
PMCID: PMC3549826
PMID: 22974057
Background
Acid Mine Drainages (AMDs) are extreme environments characterized by very acidic conditions and heavy metal contamination. In these ecosystems, the bacterial diversity is considered to be low. Previous culture-independent approaches performed in the AMD of Carnoulès (France) confirmed this low species richness. However, very little is known about the cultured bacteria in this ecosystem. The aims of the study were firstly to apply novel culture methods in order to access the largest cultured bacterial diversity, and secondly to better define the robustness of the community for 3 important functions: As(III) oxidation, cellulose degradation and cobalamin biosynthesis.
Results
Despite the oligotrophic and acidic conditions found in AMDs, the newly designed media covered a large range of nutrient concentrations and a pH range from 3.5 to 9.8, in order to target also non-acidophilic bacteria. These approaches generated 49 isolates representing 19 genera belonging to 4 different phyla. Importantly, overall diversity gained 16 extra genera never detected in Carnoulès. Among the 19 genera, 3 were previously uncultured, one of them being novel in databases. This strategy increased the overall diversity in the Carnoulès sediment by 70% when compared with previous culture-independent approaches, as specific phylogenetic groups (e.g. the subclass Actinobacteridae or the order Rhizobiales) were only detected by culture. Cobalamin auxotrophy, cellulose degradation and As(III)-oxidation are 3 crucial functions in this ecosystem, and a previous meta- and proteo-genomic work attributed each function to only one taxon. Here, we demonstrate that other members of this community can also assume these functions, thus increasing the overall community robustness.
Conclusions
This work highlights that bacterial diversity in AMDs is much higher than previously envisaged, suggesting that the AMD system is functionally more robust than expected. The isolated bacteria may be part of the rare biosphere, which previously remained undetected due to molecular biases. Whatever their current ecological relevance, exploring the full diversity remains crucial to deciphering the function and dynamics of any community. This work also underlines the importance of combining culture-dependent and -independent approaches to gain an integrative view of community function.
Reviewers
This paper was reviewed by Sándor Pongor, Eugene V. Koonin and Brett Baker (nominated by Purificacion Lopez-Garcia).
doi:10.1186/1745-6150-7-28
PMCID: PMC3443666
PMID: 22963335
Acid mine drainage (AMD); Alkaliphilic bacteria; Neutrophilic bacteria; Functional redundancy; Rare biosphere; Uncultured bacteria; Molecular biases; Culture-dependent approaches; Actinobacteria; Bacterial diversity
Background
In previous work, we introduced a concept, a mathematical model and its computer realization that describe the interaction between bacterial and phage type RNA polymerases, protein factors, DNA and RNA secondary structures during transcription, including transcription initiation and termination. The model accurately reproduces changes of gene transcription level observed in polymerase sigma-subunit knockout and heat shock experiments in plant plastids. The corresponding computer program and a user guide are available at http://lab6.iitp.ru/en/rivals. Here we apply the model to the analysis of transcription and (partially) translation processes in the mitochondria of frog, rat and human. Notably, mitochondria possess only phage-type polymerases. We consider the entire mitochondrial genome so that our model allows RNA polymerases to complete more than one circle on the DNA strand.
Results
Our model of RNA polymerase interaction during transcription initiation and elongation accurately reproduces experimental data obtained for plastids. It also reproduces evidence on bulk RNA concentrations and RNA half-lives in the mitochondria of frog, human with or without the MELAS mutation, and rat with normal (euthyroid) or reduced (hypothyroid) secretion of thyroid hormone. The transcription characteristics predicted by the model include: (i) the fraction of polymerases terminating at a protein-dependent terminator in both directions (the terminator polarization), (ii) the binding intensities of the regulatory protein factor (mTERF) with the termination site, and (iii) the transcription initiation intensities (initiation frequencies) of all promoters in all five conditions (frog, healthy human, human with MELAS syndrome, healthy rat, and hypothyroid rat with aberrant mtDNA methylation). Using the model, absolute transcription levels of all genes can be inferred from an arbitrary array of the three transcription characteristics, whereas only relative RNA concentrations have been experimentally determined, and only for selected genes. Conversely, these characteristics and absolute transcription levels can be obtained using relative RNA concentrations and RNA half-lives known from various experimental studies. In this case, the “inverse problem” is solved with multi-objective optimization.
Conclusions
In this study, we demonstrate that our model accurately reproduces all relevant experimental data available for plant plastids, as well as the mitochondria of chordates. Using experimental data, the model is applied to estimate binding intensities of phage-type RNA polymerases to their promoters as well as predicting terminator characteristics, including polarization. In addition, one can predict characteristics of phage-type RNA polymerases and the transcription process that are difficult to measure directly, e.g., the association between the promoter’s nucleotide composition and the intensity of polymerase binding. To illustrate the application of our model in functional predictions, we propose a possible mechanism for MELAS syndrome development in human involving a decrease of Phe-tRNA, Val-tRNA and rRNA concentrations in the cell. In addition, we describe how changes in methylation patterns of the mTERF binding site and three promoters in hypothyroid rat correlate with changes in intensities of the mTERF binding and transcription initiations. Finally, we introduce an auxiliary model to describe the interaction between polysomal mRNA and ribonucleases.
doi:10.1186/1745-6150-7-26
PMCID: PMC3583402
PMID: 22873568
RNA polymerase interaction; RNA polymerase competition; Transcription; Circular DNA; mtDNA in chordates; MELAS syndrome; Impact of DNA methylation; Hyposecretion of hormones; RNA interaction model; Polysome and ribonuclease interaction model
Background
The availability of sequencing technology has enabled understanding of transcriptomes through genome-wide approaches including RNA-sequencing. Contrary to the previous assumption that large tracts of eukaryotic genomes are not transcriptionally active, recent evidence from transcriptome sequencing approaches has revealed pervasive transcription in many genomes of higher eukaryotes. Many of these loci encode transcripts that have no obvious protein-coding potential and are designated as non-coding RNA (ncRNA). Non-coding RNAs are classified empirically as small and long non-coding RNAs based on the size of the functional RNAs. Each of these classes is further classified into functional subclasses. Although microRNAs (miRNA), one of the major subclasses of ncRNAs, have been extensively studied for their roles in regulation of gene expression and involvement in a large number of patho-physiological processes, the functions of a large proportion of long non-coding RNAs (lncRNA) remain elusive. We hypothesized that some lncRNAs could potentially be processed to small RNA and thus could have a dual regulatory output.
Results
Integration of large-scale independent experimental datasets in the public domain revealed that certain well-studied lncRNAs harbor small RNA clusters. Expression analysis of the small RNA clusters in different tissue and cell types reveals that they are differentially regulated, suggesting a regulated biogenesis mechanism.
Conclusions
Our analysis suggests the existence of a potentially novel pathway for lncRNA processing into small RNAs. Expression analysis further suggests that this pathway is regulated. We argue that this evidence supports our hypothesis, though limitations of the datasets and analysis cannot completely rule out alternate possibilities. Further in-depth experimental verification of the observation could potentially reveal a novel pathway for biogenesis.
Reviewers
This article was reviewed by Dr Rory Johnson (nominated by Fyodor Kondrashov), Dr Raya Khanin (nominated by Dr Yuriy Gusev) and Prof Neil Smalheiser. For full reviews, please go to the Reviewer’s comment section.
doi:10.1186/1745-6150-7-25
PMCID: PMC3477000
PMID: 22871084
Background
CRISPR/Cas (Clustered Regularly Interspaced Short Palindromic Repeats/CRISPR associated sequences) is a recently discovered prokaryotic defense system against foreign DNA, including viruses and plasmids. The CRISPR cassette is transcribed as a continuous transcript (pre-crRNA), which is processed by Cas proteins into small RNA molecules (crRNAs) that are responsible for defense against invading viruses. Experiments in E. coli report that overexpression of cas genes generates a large number of crRNAs from only a few pre-crRNAs.
Results
Here we develop a minimal model of CRISPR processing, which we parameterize based on available experimental data. From the model, we show that the system can generate a large number of crRNAs, based on only a small decrease in the amount of pre-crRNAs. The relationship between the decrease of pre-crRNAs and the increase of crRNAs corresponds to strong linear amplification. Interestingly, this strong amplification crucially depends on fast non-specific degradation of pre-crRNA by an unidentified nuclease. We show that overexpression of cas genes above a certain level does not result in a further increase of crRNA, but that this saturation can be relieved if the rate of CRISPR transcription is increased. We furthermore show that a small increase of the CRISPR transcription rate can substantially decrease the extent of cas gene activation necessary to achieve a desired amount of crRNA.
Conclusions
The simple mathematical model developed here is able to explain existing experimental observations on CRISPR transcript processing in Escherichia coli. The model shows that competition between specific pre-crRNA processing and non-specific degradation determines the steady-state levels of crRNA and is responsible for strong linear amplification of crRNAs when cas genes are overexpressed. The model further shows how the disappearance of only a few pre-crRNA molecules normally present in the cell can lead to a large (two orders of magnitude) increase of crRNAs upon cas overexpression. A crucial ingredient of this large increase is fast non-specific degradation by an unspecified nuclease, which suggests that one or more as yet unidentified nucleases are a major control element of the CRISPR response. Transcriptional regulation may be another important control mechanism, as it can either increase the amount of generated pre-crRNA, or alter the level of cas gene activity.
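The competition between specific processing and non-specific degradation described above can be illustrated with a two-variable steady-state sketch. This is not the authors' parameterized model. The function name, the rate symbols (phi for transcription, k for Cas-dependent processing, lam_pre and lam_cr for degradation) and all numerical values are assumptions chosen only for illustration, and each processed pre-crRNA is assumed to yield a single crRNA.

```python
def steady_state(phi, k, lam_pre, lam_cr):
    """Steady state of a hypothetical two-species processing scheme.

    d[pre]/dt = phi - (lam_pre + k) * pre   # transcription, degradation, processing
    d[cr]/dt  = k * pre - lam_cr * cr       # production by processing, slow decay

    Returns (pre-crRNA level, crRNA level). All rates are illustrative assumptions.
    """
    pre = phi / (lam_pre + k)
    cr = k * pre / lam_cr
    return pre, cr


if __name__ == "__main__":
    # Hypothetical rates: fast non-specific pre-crRNA decay, slow crRNA decay.
    phi, lam_pre, lam_cr = 10.0, 1.0, 0.01
    for k in (0.01, 0.1, 1.0, 100.0):
        pre, cr = steady_state(phi, k, lam_pre, lam_cr)
        print(f"k={k:g}  pre-crRNA={pre:.3f}  crRNA={cr:.1f}")
```

In this sketch, while k is much smaller than lam_pre, the crRNA level rises almost in proportion to k while the pre-crRNA level barely falls. Once k exceeds lam_pre, the crRNA level saturates near phi / lam_cr, and only a higher transcription rate phi raises that ceiling.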
Reviewers
This article was reviewed by Mikhail Gelfand, Eugene Koonin and L Aravind.
doi:10.1186/1745-6150-7-24
PMCID: PMC3537551
PMID: 22849651
CRISPR/Cas; Transcript processing; Small RNA; CRISPR expression regulation; CRISPR/Cas response
Background
The overwhelming majority of animal species exhibit bilateral symmetry. However, the precise evolutionary importance of bilateral symmetry is unknown, although elements of the understanding of the phenomenon have been present within the scientific community for decades.
Presentation of the hypothesis
Here we show, using very simple physical laws, that locomotion in three-dimensional macro-world space is itself sufficient to explain the maintenance of bilateral symmetry in animal evolution. The ability to change direction, a key element of locomotion, requires the generation of instantaneous “pushing” surfaces, from which the animal can obtain the necessary force to depart in the new direction. We show that bilateral symmetry is the only type of symmetry that can maximize this force; thus, an actively locomoting bilateral body can have maximal manoeuvrability compared to other symmetry types. This confers an obvious selective advantage on the bilateral animal.
Implications of the hypothesis
These considerations imply the view that animal evolution is a highly channelled process, in which bilateral and radial body symmetries seem to be inevitable.
Reviewers
This article was reviewed by Gáspár Jékely, L. Aravind and Eugene Koonin.
doi:10.1186/1745-6150-7-22
PMCID: PMC3438024
PMID: 22789130
Bilateral symmetry; Radial symmetry; Manoeuvrability; Drag; Drag coefficient