Accumulation of mitochondrial DNA (mtDNA) mutations has been implicated in a wide range of human pathologies, including neurodegenerative diseases, sarcopenia, and the aging process itself. In cells, mtDNA molecules are constantly turned over (i.e. replicated and degraded) and are also exchanged among mitochondria during the fusion and fission of these organelles. While the expansion of a mutant mtDNA population is believed to occur by random segregation of these molecules during turnover, the role of mitochondrial fusion-fission in this context is currently not well understood. In this study, an in silico modeling approach is taken to investigate the effects of mitochondrial fusion and fission dynamics on mutant mtDNA accumulation. Here we report model simulations suggesting that when mitochondrial fusion-fission rate is low, the slow mtDNA mixing can lead to an uneven distribution of mutant mtDNA among mitochondria in between two mitochondrial autophagic events leading to more stochasticity in the outcomes from a single random autophagic event. Consequently, slower mitochondrial fusion-fission results in higher variability in the mtDNA mutation burden among cells in a tissue over time, and mtDNA mutations have a higher propensity to clonally expand due to the increased stochasticity. When these mutations affect cellular energetics, nuclear retrograde signalling can upregulate mtDNA replication, which is expected to slow clonal expansion of these mutant mtDNA. However, our simulations suggest that the protective ability of retrograde signalling depends on the efficiency of fusion-fission process. Our results thus shed light on the interplay between mitochondrial fusion-fission and mtDNA turnover and may explain the mechanism underlying the experimentally observed increase in the accumulation of mtDNA mutations when either mitochondrial fusion or fission is inhibited.
Type III Secretion Systems (T3SSs) play important roles in the interaction between gram-negative bacteria and their hosts. T3SSs function by translocating a group of bacterial effector proteins into the host cytoplasm. The details of the type III secretion process have yet to be clarified. This research compared the amino acid composition within the N-terminal 100 amino acids of type III secretion (T3S) signal sequences and non-T3S proteins, specifically asking whether each residue exerts a constraint on the residues found in adjacent positions. We used these comparisons to build a statistical model that quantitatively describes and effectively distinguishes T3S effectors.
In this study, amino acid composition (Aac) probability profiles, conditional on the sequentially preceding position and the amino acid found there, were compared between N-terminal sequences of T3S and non-T3S proteins. The profiles are generally different. A Markov model, namely T3_MM, was consequently designed to calculate the total Aac conditional probability difference, i.e., the likelihood ratio of a sequence being a T3S or a non-T3S protein. With T3_MM, known T3S and non-T3S proteins were found to approximate two distinct normal distributions well. The model could distinguish validated T3S and non-T3S proteins with a 5-fold cross-validation sensitivity of 83.9% at a specificity of 90.3%. T3_MM was also shown to be more robust, accurate, simple, and statistically quantitative than other T3S protein prediction models. The high effectiveness of T3_MM also points to an overall Aac difference between the N-termini of T3S and non-T3S proteins, and to a constraint on Aac exerted by the preceding position and its amino acid.
An R package for T3_MM is freely downloadable from: http://biocomputer.bio.cuhk.edu.hk/softwares/T3_MM. T3_MM web server: http://biocomputer.bio.cuhk.edu.hk/T3DB/T3_MM.php.
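The likelihood-ratio idea behind a Markov model of this kind can be sketched as follows. This is a minimal illustration and not the T3_MM implementation: it assumes a single position-independent first-order transition table, add-one pseudocounts, and made-up toy sequences, and the function names `train_transitions` and `log_likelihood_ratio` are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def train_transitions(sequences, pseudocount=1.0):
    """Estimate P(current residue | preceding residue) from training sequences.

    Assumption: one transition table shared by all positions, smoothed
    with a pseudocount so that unseen pairs keep a non-zero probability.
    """
    counts = {a: {b: pseudocount for b in AMINO_ACIDS} for a in AMINO_ACIDS}
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            if prev in counts and cur in counts[prev]:
                counts[prev][cur] += 1
    model = {}
    for prev, row in counts.items():
        total = sum(row.values())
        model[prev] = {cur: c / total for cur, c in row.items()}
    return model


def log_likelihood_ratio(seq, positive_model, negative_model):
    """Sum of log P_positive / P_negative over adjacent residue pairs."""
    score = 0.0
    for prev, cur in zip(seq, seq[1:]):
        if prev in positive_model and cur in positive_model[prev]:
            score += math.log(positive_model[prev][cur] / negative_model[prev][cur])
    return score


# Toy training sets (invented, for illustration only).
positives = ["MSSSSNSS", "MSNSSSTS"]
negatives = ["MKKLLLAA", "MKLLAKLL"]
pos_model = train_transitions(positives)
neg_model = train_transitions(negatives)

print(log_likelihood_ratio("MSSS", pos_model, neg_model))  # positive score
print(log_likelihood_ratio("MKLL", pos_model, neg_model))  # negative score
```

A score above zero means the sequence is better explained by the positive-class table; a classifier would compare the score to a threshold chosen by cross-validation.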
The korAB operon in RK2 plasmids is a beautiful natural example of a negatively and cooperatively self-regulating operon. It has been particularly well characterized both experimentally and with mathematical models. We have carried out a detailed investigation of the role of the regulatory mechanism using a biologically grounded mechanistic multi-scale stochastic model that includes plasmid gene regulation and replication in the context of host growth and cell division. We use the model to compare four hypotheses for the action of the regulatory mechanism: increased robustness to extrinsic factors, decreased protein fluctuations, faster response-time of the operon and reduced host burden through improved efficiency of protein production. We find that the strongest impact of all elements of the regulatory architecture is on improving the efficiency of protein synthesis by reduction in the number of mRNA molecules needed to be produced, leading to a greater than ten-fold reduction in host energy required to express these plasmid proteins. A smaller but still significant role is seen for speeding response times, but this is not materially improved by the cooperativity. The self-regulating mechanisms have the least impact on protein fluctuations and robustness. While reduction of host burden is evident in a plasmid context, negative self-regulation is a widely seen motif for chromosomal genes. We propose that an important evolutionary driver for negatively self-regulated genes is to improve the efficiency of protein synthesis.
Our practice has long been concerned with the effects of display quality, including color accuracy and matching among paired color displays. Three years of data have been collected on the historical behavior of color stability on our clinical displays. This has permitted an analysis of the color-aging behavior of those displays over that time. The results of that analysis show that all displays tend to yellow over time, but that they do so together. That is, neither the intra- nor inter-display color variances observed at initial deployment diverge over time as measured by a mean radial distance metric in color space (Commission Internationale d’Eclairage L’, u’, v’ 1976). The consequence of this result is that color displays that are matched at deployment tend to remain matched over their lifetime even as they collectively yellow.
Digital display; Diagnostic image quality; Diagnostic display; Monitors
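As a rough illustration of a radial distance measure in CIE 1976 u', v' chromaticity space, the sketch below converts CIE XYZ tristimulus values to (u', v') with the standard formulas and then averages the Euclidean distance of each measurement from the centroid of the set. Defining the "mean radial distance" as the mean distance from the centroid is an assumption made here for illustration; the function names are hypothetical and the abstract does not specify the exact computation.

```python
import math


def xyz_to_uv(x, y, z):
    """Convert CIE XYZ tristimulus values to CIE 1976 (u', v') chromaticity."""
    denom = x + 15.0 * y + 3.0 * z
    return 4.0 * x / denom, 9.0 * y / denom


def mean_radial_distance(points):
    """Mean Euclidean distance of 2-D points from their centroid.

    Assumption: this is one plausible reading of a 'mean radial distance'
    metric, not necessarily the definition used in the study.
    """
    n = len(points)
    cu = sum(p[0] for p in points) / n
    cv = sum(p[1] for p in points) / n
    return sum(math.hypot(p[0] - cu, p[1] - cv) for p in points) / n


# D65 white point in XYZ (Y normalised to 100).
u_d65, v_d65 = xyz_to_uv(95.047, 100.0, 108.883)
print(round(u_d65, 4), round(v_d65, 4))

# Two hypothetical white-point measurements from a display pair.
pair = [(0.1978, 0.4683), (0.1990, 0.4701)]
print(mean_radial_distance(pair))
```

Tracking this value for each display pair at successive calibration dates would show whether matched displays drift apart or age together.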
Bacteria have elaborate signalling mechanisms to ensure a behavioural response that is most likely to enhance survival in a changing environment. It is becoming increasingly apparent that as part of this response, bacteria are capable of cell differentiation and can generate multiple, mutually exclusive co-existing cell states. These cell states are often associated with multicellular processes that bring benefit to the community as a whole but which may be, paradoxically, disadvantageous to an individual subpopulation. How this process of cell differentiation is controlled is intriguing and remains a largely open question. In this paper, we consider an important aspect of cell differentiation that is known to occur in the Gram-positive bacterium Bacillus subtilis: we investigate the role of two master regulators DegU and Spo0A in the control of extra-cellular protease production. Recent work in this area focussed on the role of DegU in this process and suggested that transient effects in protein production were the drivers of cell-response heterogeneity. Here, using a combination of mathematical modelling, analysis and stochastic simulations, we provide a complementary analysis of this regulatory system that investigates the roles of both DegU and Spo0A in extra-cellular protease production. In doing so, we present a mechanism for bimodality, or system heterogeneity, without the need for a bistable switch in the underlying regulatory network. Moreover, our analysis leads us to conclude that this heterogeneity is in fact a persistent, stable feature. Our results suggest that system response is divided into three zones: low and high signal levels induce a unimodal or undifferentiated response from the cell population with all cells OFF and ON, respectively, for exoprotease production. However, for intermediate levels of signal, a heterogeneous response is predicted with a spread of activity levels, representing typical “bet-hedging” behaviour.
We developed a sequencing assay for genotypic HIV-1 tropism determination. The assay allows examination of HIV RNA from plasma and HIV DNA from peripheral blood mononuclear cells (PBMC), including PBMC samples from patients with undetectable viral loads. Assessment of 100 pairs of plasma and PBMC samples showed a high concordance of 90%. With the limitations of population-based sequencing, the assay was found to be robust and suitable for the routine clinical laboratory.
Following successful completion of the Brassica rapa sequencing project, the next step is to investigate functions of individual genes/proteins. For Arabidopsis thaliana, large amounts of protein–protein interaction (PPI) data are available from the major PPI databases (DBs). It is known that Brassica crop species are closely related to A. thaliana. This provides an opportunity to infer the B. rapa interactome using PPI data available from A. thaliana. In this paper, we present an inferred B. rapa interactome that is based on the A. thaliana PPI data from two resources: (i) A. thaliana PPI data from three major DBs, BioGRID, IntAct, and TAIR, and (ii) ortholog-based A. thaliana PPI predictions. Linking between B. rapa and A. thaliana was accomplished in three complementary ways: (i) ortholog predictions, (ii) identification of gene duplication based on synteny and collinearity, and (iii) BLAST sequence similarity search. A complementary approach was also applied, which used known/predicted domain–domain interaction data. Specifically, since the two species are closely related, we used PPI data from A. thaliana to predict interacting domains that might be conserved between the two species. The predicted interactome was investigated for the component that contains known A. thaliana meiotic proteins to demonstrate its usability.
Brassica rapa; Arabidopsis thaliana; interactome; protein–protein interaction; domain–domain interaction; meiosis
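The ortholog-based transfer of interactions described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the gene identifiers are invented placeholders, the ortholog map is a plain dictionary from a reference-species gene to its target-species counterparts, and the function name `infer_interologs` is hypothetical. It does not reproduce the synteny, BLAST, or domain-based steps.

```python
from itertools import product


def infer_interologs(reference_ppis, ortholog_map):
    """Transfer reference-species interactions to a target species.

    For every reference pair (a, b), each ortholog of a is paired with
    each ortholog of b. Pairs are stored as sorted tuples so that
    (x, y) and (y, x) count as the same undirected interaction.
    """
    inferred = set()
    for a, b in reference_ppis:
        for x, y in product(ortholog_map.get(a, ()), ortholog_map.get(b, ())):
            inferred.add(tuple(sorted((x, y))))
    return inferred


# Placeholder identifiers, not real locus names.
reference_ppis = [("AtA", "AtB"), ("AtB", "AtC")]
ortholog_map = {
    "AtA": ["BrA1", "BrA2"],  # a duplicated gene in the target species
    "AtB": ["BrB1"],
    # "AtC" has no known ortholog, so its interactions are not transferred
}

predicted = infer_interologs(reference_ppis, ortholog_map)
print(sorted(predicted))
```

Because a duplicated gene contributes every copy, the inferred network can be larger than the reference network, which is one reason such predictions are usually treated as candidates for experimental follow-up.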
Word-based models have achieved promising results in sequence comparison. However, how best to use two important statistical properties of words in biological sequences, their overlapping structures and their background information, to improve sequence comparison remains an open problem. This paper proposes a new statistical method that integrates the overlapping structures and the background information of the words in biological sequences. To assess the effectiveness of this integration for sequence comparison, two sets of evaluation experiments were carried out. The first, performed via receiver operating characteristic curve analysis, applies the proposed method to discrimination between functionally related regulatory sequences and unrelated sequences, intron and exon. The second evaluates the performance of the proposed method with the F-measure for clustering Hepatitis E virus genotypes. The results demonstrate that the proposed method, by integrating the overlapping structures and the background information of words, significantly improves biological sequence comparison and outperforms the existing models.
IncP-1 plasmids are broad host range plasmids that have been found in clinical and environmental bacteria. They often carry genes for antibiotic resistance or catabolic pathways. The archetypal IncP-1 plasmid RK2 is a well-characterized biological system, with a fully sequenced and annotated genome and a wide range of experimental measurements. Its central control operon, encoding two global regulators KorA and KorB, is a natural example of a negatively self-regulated operon. To increase our understanding of the regulation of this operon, we have constructed a dynamical mathematical model using Ordinary Differential Equations, and employed a Bayesian inference scheme, Markov Chain Monte Carlo (MCMC) using the Metropolis-Hastings algorithm, as a way of integrating experimental measurements and a priori knowledge. We also compared MCMC and Metabolic Control Analysis (MCA) as approaches for determining the sensitivity of model parameters.
We identified two distinct sets of parameter values, with different biological interpretations, that fit and explain the experimental data. This allowed us to highlight the proportion of repressor protein as dimers as a key experimental measurement defining the dynamics of the system. Analysis of joint posterior distributions led to the identification of correlations between parameters for protein synthesis and partial repression by KorA or KorB dimers, indicating the necessary use of joint posteriors for correct parameter estimation. Using MCA, we demonstrated that the system is highly sensitive to the growth rate but insensitive to repressor monomerization rates in their selected value regions; the latter outcome was also confirmed by MCMC. Finally, by examining a series of different model refinements for partial repression by KorA or KorB dimers alone, we showed that a model including partial repression by KorA and KorB was most compatible with existing experimental data.
We have demonstrated that the combination of dynamical mathematical models with Bayesian inference is valuable in integrating diverse experimental data and identifying key determinants and parameters for the IncP-1 central control operon. Moreover, we have shown that Bayesian inference and MCA are complementary methods for identification of sensitive parameters. We propose that this demonstrates generic value in applying this combination of approaches to systems biology dynamical modelling.
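To make the inference scheme concrete, the following is a minimal random-walk Metropolis-Hastings sketch. It is not the model from the study: it assumes a toy one-parameter exponential decay, y(t) = exp(-k t), in place of the operon ODEs, a Gaussian error model with known standard deviation, a flat prior on k > 0, and noise-free synthetic observations. All names and numerical settings are illustrative choices.

```python
import math
import random


def log_posterior(k, times, observations, sigma):
    """Unnormalised log-posterior of decay rate k (flat prior on k > 0)."""
    if k <= 0:
        return -math.inf
    sse = sum((y - math.exp(-k * t)) ** 2 for t, y in zip(times, observations))
    return -sse / (2.0 * sigma ** 2)


def metropolis_hastings(log_post, start, n_steps, step_size, rng):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    samples = []
    current, current_lp = start, log_post(start)
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step_size)
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio); 1 - random() lies
        # in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


# Synthetic data generated from an assumed true rate of 0.5.
true_k = 0.5
times = list(range(8))
observations = [math.exp(-true_k * t) for t in times]

rng = random.Random(0)
chain = metropolis_hastings(
    lambda k: log_posterior(k, times, observations, sigma=0.05),
    start=1.0, n_steps=6000, step_size=0.05, rng=rng,
)
posterior = chain[1000:]  # discard burn-in
posterior_mean = sum(posterior) / len(posterior)
print(posterior_mean)
```

With several parameters, the same loop proposes a vector instead of a scalar, and the retained samples can be plotted pairwise to expose the kind of parameter correlations that joint posteriors reveal.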
It is now widely accepted that at an early stage in the evolution of life an RNA world arose, in which RNAs both served as the genetic material and catalyzed diverse biochemical reactions. Proteins then gradually replaced RNAs over time because of their superior catalytic properties. It is therefore important to investigate how primitive functional proteins emerged from the RNA world, which can shed light on the evolutionary pathway of life from the RNA world to the modern world. In this work, based on an analysis of RNA-protein complexes, we propose that the emergence of most primitive functional proteins was assisted by early nucleotide cofactors, while only a minority were induced directly by RNAs. Furthermore, the present findings have significant implications for exploring the composition of primitive RNA, i.e., adenine bases as the principal building blocks.
Background and Aims
Dalbergia nigra is one of the most valuable timber species of its genus, having been traded for over 300 years. Due to over-exploitation it is facing extinction and trade has been banned under CITES Appendix I since 1992. Current methods, primarily comparative wood anatomy, are inadequate for conclusive species identification. This study aims to find a set of anatomical characters that distinguish the wood of D. nigra from other commercially important species of Dalbergia from Latin America.
Qualitative and quantitative wood anatomy, principal components analysis and naïve Bayes classification were conducted on 43 specimens of Dalbergia, eight D. nigra and 35 from six other Latin American species.
Dalbergia cearensis and D. miscolobium can be distinguished from D. nigra on the basis of vessel frequency for the former, and ray frequency for the latter. Principal components analysis was unable to provide any further basis for separating the species. Naïve Bayes classification using the four characters: minimum vessel diameter; frequency of solitary vessels; mean ray width; and frequency of axially fused rays, classified all eight D. nigra correctly with no false negatives, but there was a false positive rate of 36·36 %.
Wood anatomy alone cannot distinguish D. nigra from all other commercially important Dalbergia species likely to be encountered by customs officials, but can be used to reduce the number of specimens that would need further study.
Dalbergia nigra; Brazilian rosewood; CITES; wood anatomy; PCA; naïve Bayes analysis
Pancreatic adenocarcinoma (PAC) is one of the most intractable malignancies. In order to search for potential new therapeutic targets, we relied on computational methods aimed at identifying transcription factor binding sites (TFBSs) over-represented in the promoter regions of genes differentially expressed in PAC. Though many computational methods have been implemented to accomplish this, none has gained overall acceptance or produced proven novel targets in PAC. To this end we have developed DEMON, a novel method for motif detection.
DEMON relies on a hidden Markov model to score the appearance of sequence motifs, taking into account all potential sites in a promoter of potentially varying binding affinities. We demonstrate DEMON's accuracy on simulated and real data sets. Applying DEMON to PAC-related data sets identifies the RUNX family as highly enriched in PAC-related genes. Using a novel experimental paradigm to distinguish between normal and PAC cells, we find that RUNX3 mRNA (but not RUNX1 or RUNX2 mRNAs) exhibits time-dependent increases in normal but not in PAC cells. These increases are accompanied by changes in mRNA levels of putative RUNX gene targets.
The integrated application of DEMON and a novel differentiation system led to the identification of a single family member, RUNX3, which together with four of its putative targets showed a robust response to a differentiation stimulus in healthy cells, whereas this regulatory mechanism was absent in PAC cells, emphasizing RUNX3 as a promising target for further studies.
In spite of extensive research on the effect of mutation and selection on codon usage, a general model of codon usage bias due to mutational bias has been lacking. Because most amino acids allow synonymous GC content-changing substitutions in the third codon position, the overall GC bias of a genome or genomic region is highly correlated with GC3, a measure of third position GC content. For individual amino acids as well, usage of G/C-ending codons generally increases with increasing GC bias and decreases with increasing AT bias. Arginine and leucine, amino acids that allow GC-changing synonymous substitutions in the first and third codon positions, have codons which may be expected to show different usage patterns.
In analyzing codon usage bias in hundreds of prokaryotic and plant genomes and in human genes, we find that two G-ending codons, AGG (arginine) and TTG (leucine), unlike all other G/C-ending codons, show overall usage that decreases with increasing GC bias, contrary to the usual expectation that G/C-ending codon usage should increase with increasing genomic GC bias. Moreover, the usage of some codons appears nonlinear, even nonmonotone, as a function of GC bias. To explain these observations, we propose a continuous-time Markov chain model of GC-biased synonymous substitution. This model correctly predicts the qualitative usage patterns of all codons, including nonlinear codon usage in isoleucine, arginine and leucine. The model accounts for 72%, 64% and 52% of the observed variability of codon usage in prokaryotes, plants and human respectively. When codons are grouped based on common GC content, 87%, 80% and 68% of the variation in usage is explained for prokaryotes, plants and human respectively.
The model clarifies the sometimes-counterintuitive effects that GC mutational bias can have on codon usage, quantifies the influence of GC mutational bias and provides a natural null model relative to which other influences on codon bias may be measured.
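A toy continuous-time Markov chain helps show why a G-ending codon need not become more common as GC bias rises. The sketch below is an assumption-laden illustration, not the model of the study: it restricts the chain to the six arginine codons, allows only single-nucleotide synonymous changes, and sets the rate of mutating to G or C to theta/2 and to A or T to (1 - theta)/2, where theta is a hypothetical GC-bias parameter between 0 and 1. The stationary distribution is found by power iteration on the uniformised chain.

```python
ARG_CODONS = ["CGA", "CGC", "CGG", "CGT", "AGA", "AGG"]


def nucleotide_rate(nucleotide, gc_bias):
    """Rate of mutating to a given nucleotide under the assumed GC bias."""
    return gc_bias / 2.0 if nucleotide in "GC" else (1.0 - gc_bias) / 2.0


def build_generator(codons, gc_bias):
    """Generator matrix Q over synonymous codons (single-site changes only)."""
    n = len(codons)
    q = [[0.0] * n for _ in range(n)]
    for i, src in enumerate(codons):
        for j, dst in enumerate(codons):
            diffs = [p for p in range(3) if src[p] != dst[p]]
            if len(diffs) == 1:
                q[i][j] = nucleotide_rate(dst[diffs[0]], gc_bias)
        q[i][i] = -sum(q[i])  # diagonal makes each row sum to zero
    return q


def stationary_distribution(q, tol=1e-12, max_iter=100000):
    """Solve pi Q = 0 by iterating pi <- pi (I + Q / lam)."""
    n = len(q)
    lam = 2.0 * max(-q[i][i] for i in range(n))
    p = [[(1.0 if i == j else 0.0) + q[i][j] / lam for j in range(n)]
         for i in range(n)]
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        new = [sum(pi[i] * p[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


def arginine_usage(gc_bias):
    """Stationary codon usage among arginine codons for a given GC bias."""
    pi = stationary_distribution(build_generator(ARG_CODONS, gc_bias))
    return dict(zip(ARG_CODONS, pi))


for theta in (0.2, 0.5, 0.8):
    usage = arginine_usage(theta)
    print(theta, round(usage["AGG"], 4), round(usage["CGG"], 4))
```

In this toy chain, AGG usage rises and then falls as theta increases, because at high GC bias the codons with G or C in the first position absorb most of the probability mass, while CGG usage keeps increasing.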
The Saccharopolyspora erythraea genome sequence was released in 2007. To examine gene regulation at the whole-transcriptome level, an expression microarray was designed specifically from the S. erythraea strain NRRL 2338 genome sequence. Based on these data, we set out to investigate the potential transcriptional regulatory networks and their organization.
In view of the hierarchical structure of bacterial transcriptional regulation, we constructed a hierarchical coexpression network at the whole-transcriptome level. A total of 27 modules were identified from 1255 differentially expressed transcript units (TUs) across the time course, which were further classified into four groups. Functional enrichment analysis indicated the biological significance of our hierarchical network. The analysis indicated that primary metabolism is activated in the first rapid growth phase (phase A), and secondary metabolism is induced when growth slows down (phase B). Among the 27 modules, two are highly correlated with erythromycin production. One contains all genes in the erythromycin-biosynthetic (ery) gene cluster and the other seems to be associated with erythromycin production by sharing common intermediate metabolites. Non-concomitant correlation between production and expression regulation was observed. In particular, by calculating the partial correlation coefficients and building the network based on a Gaussian graphical model, intrinsic associations between modules were found, and the association between those two erythromycin production-correlated modules was included as expected.
This work created a hierarchical model clustering transcriptome data into coordinated modules, and modules into groups across the time course, giving insight into concerted transcriptional regulation, especially the regulation associated with erythromycin production in S. erythraea. This strategy may be extendable to studies on other prokaryotic microorganisms.
Gene regulatory networks exhibit complex, hierarchical features such as global regulation and network motifs. There is much debate about whether the evolutionary origins of such features are the results of adaptation, or the by-products of non-adaptive processes of DNA replication. The lack of availability of gene regulatory networks of ancestor species on evolutionary timescales makes this a particularly difficult problem to resolve. Digital organisms, however, can be used to provide a complete evolutionary record of lineages. We use a biologically realistic evolutionary model that includes gene expression, regulation, metabolism and biosynthesis, to investigate the evolution of complex function in gene regulatory networks. We discover that: (i) network architecture and complexity evolve in response to environmental complexity, (ii) global gene regulation is selected for in complex environments, (iii) complex, inter-connected, hierarchical structures evolve in stages, with energy regulation preceding stress responses, and stress responses preceding growth rate adaptations and (iv) robustness of evolved models to mutations depends on hierarchical level: energy regulation and stress responses tend not to be robust to mutations, whereas growth rate adaptations are more robust and non-lethal when mutated. These results highlight the adaptive and incremental evolution of complex biological networks, and the value and potential of studying realistic in silico evolutionary systems as a way of understanding living systems.
Electronic supplementary material
The online version of this article (doi:10.1007/s00239-010-9369-4) contains supplementary material, which is available to authorized users.
Complexity; Gene regulatory network; Evolution; Hierarchical; Computer model; In silico
Chemical reaction networks (CRNs) are amenable to mathematical modelling. The dynamic behavior of CRNs can be investigated by solving the polynomial equations derived from their structure. However, even simple CRNs give rise to non-linear polynomials that are difficult to solve. Here we propose a procedure to locate the steady states of CRNs from a formula derived through algebraic geometry methods. We have applied this procedure to define the steady states of a classic CRN that exhibits instability, and to a model of programmed cell death.
Prediction of transcription factor binding sites is an important challenge in genome analysis. The advent of next generation genome sequencing technologies makes the development of effective computational approaches particularly imperative. We have developed a novel training-based methodology intended for prokaryotic transcription factor binding site prediction. Our methodology extends existing models by taking into account base interdependencies between neighbouring positions using conditional probabilities and includes genomic background weighting. This has been tested against other existing and novel methodologies including position-specific weight matrices, first-order Hidden Markov Models and joint probability models. We have also tested the use of gapped and ungapped alignments and the inclusion or exclusion of background weighting. We show that our best method enhances binding site prediction for all of the 22 Escherichia coli transcription factors with at least 20 known binding sites, with many showing substantial improvements. We highlight the advantage of using block alignments of binding sites over gapped alignments to capture neighbouring position interdependencies. We also show that combining these methods with ChIP-on-chip data has the potential to further improve binding site prediction. Finally we have developed the ungapped likelihood under positional background platform: a user friendly website that gives access to the prediction method devised in this work.
Heidenreich et al. (Risk Anal 1997 17 391–399) considered parameter identifiability in the context of the two-mutation cancer model and demonstrated that combinations of all but two of the model parameters are identifiable. We consider the problem of identifiability in the recently developed carcinogenesis models of Little and Wright (Math Biosci 2003 183 111–134) and Little et al. (J Theoret Biol 2008 254 229–238). These models, which incorporate genomic instability, generalize a large number of other quasi-biological cancer models, in particular those of Armitage and Doll (Br J Cancer 1954 8 1–12), the two-mutation model (Moolgavkar et al. Math Biosci 1979 47 55–77), the generalized multistage model of Little (Biometrics 1995 51 1278–1291), and a recently developed cancer model of Nowak et al. (PNAS 2002 99 16226–16231).
We show that in the simpler model proposed by Little and Wright (Math Biosci 2003 183 111–134) the number of identifiable combinations of parameters is at most two less than the number of biological parameters, thereby generalizing previous results of Heidenreich et al. (Risk Anal 1997 17 391–399) for the two-mutation model. For the more general model of Little et al. (J Theoret Biol 2008 254 229–238) the number of identifiable combinations of parameters is at most r + 1 less than the number of biological parameters, where r is the number of destabilization types, thereby also generalizing all these results. Numerical evaluations suggest that these bounds are sharp. We also identify particular combinations of identifiable parameters.
We have shown that the previous results on parameter identifiability can be generalized to much larger classes of quasi-biological carcinogenesis models, and have also identified particular combinations of identifiable parameters. These results are of theoretical interest, but also of practical significance to anyone attempting to estimate parameters for this large class of cancer models.
Cancer cells interact with surrounding stromal fibroblasts during tumorigenesis, but the complex molecular rules that govern these interactions remain poorly understood, thus hindering the development of therapeutic strategies to target cancer stroma. We have taken a mathematical approach to begin defining these rules by performing the first large-scale quantitative analysis of fibroblast effects on cancer cell proliferation across more than four hundred heterotypic cell line pairings. Systems-level modeling of this complex dataset using singular value decomposition revealed that normal tissue fibroblasts variably express at least two functionally distinct activities, one of which reflects transcriptional programs associated with activated mesenchymal cells, that act either coordinately or at cross-purposes to modulate cancer cell proliferation. These findings suggest that quantitative approaches may prove useful for identifying organizational principles that govern complex heterotypic cell-cell interactions in cancer and other contexts.
Genes required for infection of mice by Salmonella Typhimurium can be identified by the interrogation of random transposon mutant libraries for mutants that cannot survive in vivo. Inactivation of such genes produces attenuated S. Typhimurium strains that have potential for use as live attenuated vaccines. A quantitative screen, Transposon Mediated Differential Hybridisation (TMDH), has been developed that identifies those members of a large library of transposon mutants that are attenuated. TMDH employs custom transposons with outward-facing T7 and SP6 promoters. Fluorescently-labelled transcripts from the promoters are hybridised to whole-genome tiling microarrays, to allow the position of the transposon insertions to be determined. Comparison of microarray data from the mutant library grown in vitro (input) with equivalent data produced after passage of the library through mice (output) enables an attenuation score to be determined for each transposon mutant. These scores are significantly correlated with bacterial counts obtained during infection of mice using mutants with individual defined deletions of the same genes. Defined deletion mutants of several novel targets identified in the TMDH screen are effective live vaccines.
Salmonella Typhimurium infection of mice is an established model of systemic typhoid fever in humans. Mutations that inactivate genes that are important for virulence produce attenuated S. Typhimurium bacteria that have potential for use as live vaccines. To investigate the infection process we have produced a large pool of random insertion mutants, and developed a novel microarray-based technology, Transposon Mediated Differential Hybridisation (TMDH), that allows us to determine the gene disrupted by each insertion. Comparison of data obtained from the mutant pool grown in laboratory culture (input) with equivalent data produced after passage of the pool through mice (output) enables genes that are important for the infection process to be determined, since they are absent or less prevalent in the output pool. We have constructed defined deletion mutants of several of the candidate genes identified in the TMDH screen, and have shown that they are attenuated for virulence and effective live vaccines.
In April 2009, novel swine-origin influenza viruses (S-OIV) were identified in patients from Mexico and the United States. The viruses were genetically characterized as a novel influenza A (H1N1) strain originating in swine, and within a very short time the S-OIV strain spread across the globe via human-to-human contact.
We conducted a comprehensive computational search of all available sequences of the surface proteins of H1N1 swine influenza isolates and found that a strain similar to S-OIV appeared in Thailand in 2000. The earlier isolates caused infections in pigs but only one sequenced human case, A/Thailand/271/2005 (H1N1).
Differences between the Thai cases and S-OIV may help shed light on the ability of the current outbreak strain to spread rapidly among humans.
A sequencing assay for detection of mutations conferring resistance to human immunodeficiency virus type 1 (HIV-1) integrase inhibitors raltegravir and elvitegravir was developed using the automated TruGene sequencing system. The assay returned clear sequencing results for samples with ≥500 RNA copies/ml for mutation detection and HIV-1 subtyping across a spectrum of HIV-1 subtypes.
We demonstrate how a single-celled organism could undertake associative learning. Although to date only one previous study has found experimental evidence for such learning, there is no reason in principle why it should not occur. We propose a gene regulatory network that is capable of associative learning between any pre-specified set of chemical signals, in a Hebbian manner, within a single cell. A mathematical model is developed, and simulations show a clear learned response. A preliminary design for implementing this model using plasmids within Escherichia coli is presented, along with an alternative approach, based on double-phosphorylated protein kinases.
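As a rough illustration of Hebbian associative learning, the sketch below integrates a single association weight that grows only while a conditioned signal (CS) and an unconditioned signal (US) are present together. This is a generic toy model, not the gene regulatory network proposed in the abstract; the equation, parameter values, and names are assumptions made for the example.

```python
def simulate_hebbian(schedule, dt=0.1, learn_rate=0.5, decay=0.01):
    """Euler-integrate dw/dt = learn_rate * cs * us - decay * w.

    `schedule` is a list of (cs, us, duration) phases. Returns the final
    weight and the response (us + w * cs) at the end of each phase.
    """
    w = 0.0
    responses = []
    for cs, us, duration in schedule:
        steps = int(round(duration / dt))
        for _ in range(steps):
            # Hebbian term: the weight grows only when both signals coincide
            w += dt * (learn_rate * cs * us - decay * w)
        responses.append(us + w * cs)
    return w, responses

# Hypothetical protocol: CS alone, CS paired with US, then CS alone again
schedule = [
    (1.0, 0.0, 10.0),  # before pairing
    (1.0, 1.0, 10.0),  # pairing phase
    (1.0, 0.0, 10.0),  # after pairing
]
w, responses = simulate_hebbian(schedule)
```

Before pairing, the CS alone produces no response; after pairing, the same CS produces a nonzero response because the weight has grown, which is the qualitative signature of a learned association.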
associative learning; Hebbian learning; single-celled organism; plasmid; synthetic biology
Choanoflagellates are unicellular filter-feeding protozoa distributed universally in aquatic habitats. Cells are ovoid in shape with a single anterior flagellum encircled by a funnel-shaped collar of microvilli. Movement of the flagellum creates water currents from which food particles are entrapped on the outer surface of the collar and ingested by pseudopodia. One group of marine choanoflagellates has evolved an elaborate basket-like exoskeleton, the lorica, comprising two layers of siliceous costae made up of costal strips. A computer graphic model has been developed for generating three-dimensional images of choanoflagellate loricae based on a universal set of ‘rules’ derived from electron microscopical observations. This model has proved seminal in understanding how complex costal patterns can be assembled in a single continuous movement. The lorica, which provides a rigid framework around the cell, is multifunctional. It resists the locomotory forces generated by flagellar movement, directs and enhances water flow over the collar and, for planktonic species, contributes towards maintaining cells in suspension. Since the functional morphology of choanoflagellate cells is so effective and has been highly conserved within the group, the ecological and evolutionary radiation of choanoflagellates is almost entirely dependent on the ability of the external coverings, particularly the lorica, to diversify.
computer graphic model; choanoflagellates; lorica construction; lorica assembly; cell rotation; lorica function
In this study, differential gene expression analysis using complementary DNA (cDNA) libraries has been improved in two ways: firstly, by introducing an accurate method of assigning Expressed Sequence Tags (ESTs) to genes and, secondly, by using a novel likelihood ratio statistical scoring of differential gene expression between two pools of cDNA libraries. These methods were applied to the latest available cell line and bulk tissue cDNA libraries in a two-step screen to predict novel tumour endothelial markers. Initially, endothelial cell lines were subtracted in silico from non-endothelial cell lines to identify endothelial genes. Subsequently, a second bulk tumour versus normal tissue subtraction was employed to predict tumour endothelial markers.
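The idea of likelihood ratio scoring between two pools can be illustrated with a generic sketch. It assumes that, for one gene, each pool is summarised by the number of ESTs assigned to that gene and the total number of ESTs in the pool, and it compares a binomial model with separate expression rates against one with a shared rate. The exact statistic used in the study is not given here, so this textbook form, the function names, and the example counts are assumptions.

```python
import math

def _binom_loglik(k, n, p):
    """Binomial log-likelihood up to a constant, taking 0 * log(0) as 0."""
    ll = 0.0
    if k > 0:
        ll += k * math.log(p)
    if n - k > 0:
        ll += (n - k) * math.log(1.0 - p)
    return ll

def likelihood_ratio_score(k1, n1, k2, n2):
    """Return 2 * (log-lik with separate rates - log-lik with a shared rate)."""
    p1, p2 = k1 / n1, k2 / n2
    p0 = (k1 + k2) / (n1 + n2)  # pooled rate under the null hypothesis
    stat = 2.0 * (
        _binom_loglik(k1, n1, p1) + _binom_loglik(k2, n2, p2)
        - _binom_loglik(k1, n1, p0) - _binom_loglik(k2, n2, p0)
    )
    return max(stat, 0.0)  # guard against tiny negative rounding errors

def chi2_1df_pvalue(stat):
    """Upper-tail p-value of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

# Hypothetical counts: ESTs for one gene out of all ESTs in each pool
strong = likelihood_ratio_score(30, 1000, 3, 1000)
weak = likelihood_ratio_score(12, 1000, 8, 1000)
equal = likelihood_ratio_score(10, 1000, 10, 1000)
p_strong = chi2_1df_pvalue(strong)
```

A larger score indicates stronger evidence that the gene is expressed at different rates in the two pools; identical proportions give a score of zero.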
From an endothelial cDNA library analysis, 431 genes were significantly upregulated in endothelial cells with a False Discovery Rate adjusted q-value of 0.01 or less, and 104 of these were expressed only in endothelial cells. Combining the cDNA library data with the latest Serial Analysis of Gene Expression (SAGE) library data yielded a complete list of 459 genes preferentially expressed in endothelium. Twenty-seven genes were predicted to be tumour endothelial markers in multiple tissues based on the second bulk tissue screen.
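To make the q-value threshold concrete, the sketch below converts per-gene p-values into False Discovery Rate adjusted q-values. The abstract does not state which FDR procedure was used; the Benjamini-Hochberg step-up procedure is assumed here, and the p-values are invented for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotonic in rank
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values

# Hypothetical p-values for four genes
p_values = [0.01, 0.04, 0.03, 0.005]
q_values = benjamini_hochberg(p_values)
significant = [i for i, q in enumerate(q_values) if q <= 0.025]
```

Genes are then retained when their q-value is at or below the chosen threshold (0.01 in the screen described above; 0.025 is used here only so that the toy data yield a non-empty result).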
This approach represents a significant advance on earlier work in its ability to accurately assign an EST to a gene, statistically measure differential expression between two pools of cDNA libraries, and predict putative tumour endothelial markers before entering the laboratory. These methods are of value and available to researchers who are interested in the analysis of transcriptomic data.