1.  Origins of Transcriptional Transition: Balance between Upstream and Downstream Regulatory Gene Sequences 
mBio  2015;6(1):e02182-14.
By measuring individual mRNA production at the single-cell level, we investigated the lac promoter’s transcriptional transition during cell growth phases. In exponential phase, variation in transition rates generates two mixed phenotypes, low and high numbers of mRNAs, by modulating their burst frequency and sizes. Independent activation of the regulatory-gene sequence does not produce bimodal populations at the mRNA level, but bimodal populations are produced when the regulatory gene is activated coordinately with the upstream and downstream region promoter sequence (URS and DRS, respectively). Time-lapse microscopy of mRNAs for lac and a variant lac promoter confirm this observation. Activation of the URS/DRS elements of the promoter reveals a counterplay behavior during cell phases. The promoter transition rate coupled with cell phases determines the mRNA and transcriptional noise. We further show that bias in partitioning of RNA does not lead to phenotypic switching. Our results demonstrate that the balance between the URS and the DRS in transcriptional regulation determines population diversity.
PMCID: PMC4324307  PMID: 25626902
2.  Quantitative analysis of colony morphology in yeast 
BioTechniques  2014;56(1):18-27.
Microorganisms often form multicellular structures such as biofilms and structured colonies that can influence the organism’s virulence, drug resistance, and adherence to medical devices. Phenotypic classification of these structures has traditionally relied on qualitative scoring systems that limit detailed phenotypic comparisons between strains. Automated imaging and quantitative analysis have the potential to improve the speed and accuracy of experiments designed to study the genetic and molecular networks underlying different morphological traits. For this reason, we have developed a platform that uses automated image analysis and pattern recognition to quantify phenotypic signatures of yeast colonies. Our strategy enables quantitative analysis of individual colonies, measured at a single time point or over a series of time-lapse images, as well as the classification of distinct colony shapes based on image-derived features. Phenotypic changes in colony morphology can be expressed as changes in feature space trajectories over time, thereby enabling the visualization and quantitative analysis of morphological development. To facilitate data exploration, results are plotted dynamically through an interactive Yeast Image Analysis web application (YIMAA) that integrates the raw and processed images across all time points, allowing exploration of the image-based features and principal components associated with morphological development.
PMCID: PMC3996921  PMID: 24447135
colony morphology; image analysis; software; yeast; phenotype; time-lapse
3.  Transcriptome and Small RNA Deep Sequencing Reveals Deregulation of miRNA Biogenesis in Human Glioma 
The Journal of pathology  2013;229(3):10.1002/path.4109.
Altered expression of oncogenic and tumor-suppressing microRNAs (miRNAs) is widely associated with tumorigenesis. However, the regulatory mechanisms underlying these alterations are poorly understood. We sought to shed light on the deregulation of miRNA biogenesis promoting the aberrant miRNA expression profiles identified in these tumors. Using sequencing technology to perform both whole-transcriptome and small RNA sequencing of glioma patient samples, we examined precursor and mature miRNAs to directly evaluate the miRNA maturation process, and interrogated expression profiles for genes involved in the major steps of miRNA biogenesis. We found that ratios of mature to precursor forms of a large number of miRNAs increased with the progression from normal brain to low-grade and then to high-grade gliomas. The expression levels of genes involved in each of the three major steps of miRNA biogenesis (nuclear processing, nucleo-cytoplasmic transport, and cytoplasmic processing) were systematically altered in glioma tissues. Survival analysis of an independent data set demonstrated that the alteration of genes involved in miRNA maturation correlates with survival in glioma patients. Direct quantification of miRNA maturation with deep sequencing demonstrated that deregulation of the miRNA biogenesis pathway is a hallmark for glioma genesis and progression.
PMCID: PMC3857031  PMID: 23007860
microRNA; biogenesis; glioma
4.  ZebIAT, an image analysis tool for registering zebrafish embryos and quantifying cancer metastasis 
BMC Bioinformatics  2013;14(Suppl 10):S5.
Zebrafish embryos have recently been established as a xenotransplantation model of the metastatic behaviour of primary human tumours. Current tools for automated data extraction from the microscope images are restrictive concerning the developmental stage of the embryos, usually require laborious manual image preprocessing, and, in general, cannot characterize the metastasis as a function of the internal organs.
We present a tool, ZebIAT, that allows either automatic or semi-automatic registration of the outer contour and inner organs of zebrafish embryos. ZebIAT provides registration at different stages of development and an automatic analysis of cancer metastasis per organ, thus enabling the study of cancer progression. The semi-automatic mode relies on a graphical user interface.
We quantified the performance of the registration method and found it to be accurate, except in some of the smallest organs. Our results show that the accuracy of registering small organs can be improved by introducing a few manual corrections. We also demonstrate the applicability of the tool to studies of cancer progression.
ZebIAT offers major improvement relative to previous tools by allowing for an analysis on a per-organ or region basis. It should be of use in high-throughput studies of cancer metastasis in zebrafish embryos.
PMCID: PMC3750475  PMID: 24267347
5.  Cell segmentation by multi-resolution analysis and maximum likelihood estimation (MAMLE) 
BMC Bioinformatics  2013;14(Suppl 10):S8.
Cell imaging is becoming an indispensable tool for cell and molecular biology research. However, most processes studied are stochastic in nature, and require the observation of many cells and events. Ideally, extraction of information from these images ought to rely on automatic methods. Here, we propose a novel segmentation method, MAMLE, for detecting cells within dense clusters.
MAMLE executes cell segmentation in two stages. The first relies on a state-of-the-art filtering technique, multi-resolution edge detection with a morphological operator, and threshold decomposition for adaptive thresholding. From this result, a correction procedure is applied that uses the maximum likelihood estimate as an objective function. It also acquires morphological features from the initial segmentation to construct the likelihood parameter, after which the final segmentation is obtained.
We performed an empirical evaluation that includes sample images from different imaging modalities and diverse cell types. The new method attained very high (above 90%) cell segmentation accuracy in all cases. Finally, its accuracy was compared to several existing methods, and in all tests, MAMLE outperformed them in segmentation accuracy.
PMCID: PMC3750476  PMID: 24267594
6.  Multi-scale Gaussian representation and outline-learning based cell image segmentation 
BMC Bioinformatics  2013;14(Suppl 10):S6.
High-throughput genome-wide screening to study gene-specific functions, e.g. for drug discovery, demands fast automated image analysis methods to assist in unraveling the full potential of such studies. Image segmentation is typically at the forefront of such analysis as the performance of the subsequent steps, for example, cell classification, cell tracking etc., often relies on the results of segmentation.
We present a cell cytoplasm segmentation framework which first separates cell cytoplasm from the image background using a novel approach based on image enhancement and the coefficient of variation of a multi-scale Gaussian scale-space representation. A novel outline-learning based classification method is developed using regularized logistic regression with embedded feature selection, which classifies image pixels as outline/non-outline to give cytoplasm outlines. Refinement of the detected outlines to separate cells from each other is performed in a post-processing step where the nuclei segmentation is used as contextual information.
Results and conclusions
We evaluate the proposed segmentation methodology using two challenging test cases, presenting images with completely different characteristics, with cells of varying size, shape, texture and degrees of overlap. The feature selection and classification framework for outline detection produces very simple sparse models which use only a small subset of the large, generic feature set, that is, only 7 and 5 features for the two cases. Quantitative comparison of the results for the two test cases against state-of-the-art methods shows that our methodology outperforms them with an increase of 4-9% in segmentation accuracy and a maximum accuracy of 93%. Finally, the results obtained for diverse datasets demonstrate that our framework not only produces accurate segmentation but also generalizes well to different segmentation tasks.
PMCID: PMC3750482  PMID: 24267488
7.  Effects of multimerization on the temporal variability of protein complex abundance 
BMC Systems Biology  2013;7(Suppl 1):S3.
We explore whether the process of multimerization can be used as a means to regulate noise in the abundance of functional protein complexes. Additionally, we analyze how this process affects the mean level of these functional units, the response time of a gene, and the temporal correlation between the numbers of expressed proteins and of the functional multimers. We show that, although multimerization increases noise by reducing the mean number of functional complexes, it can reduce noise in comparison with a monomer when the abundances of the functional proteins are comparable. Alternatively, reduction in noise occurs if both monomeric and multimeric forms of the protein are functional. Moreover, we find that multimerization either increases the response time to external signals or decreases the correlation between the number of functional complexes and protein production kinetics. Finally, we show that the results are in agreement with recent genome-wide assessments of cell-to-cell variability in protein numbers and of multimerization in essential and non-essential genes in Escherichia coli, and that the effects of multimerization are tangible at the level of genetic circuits.
PMCID: PMC3750523  PMID: 24267954
8.  Characterization of aberrant pathways across human cancers 
BMC Systems Biology  2013;7(Suppl 1):S1.
Cancer is a broad group of genetic diseases which account for millions of deaths worldwide each year. Cancers are classified by various clinical, pathological and molecular methods, but even within a well-characterized disease, there is significant inter-patient variability in survival, response to treatment, and other parameters. Especially at the molecular level, tumours of the same category can appear significantly dissimilar due to complex combinations of genetic aberrations leading to a similar malignancy. We extended the current classification methods by studying tumour heterogeneity at the pathway level.
We computed the rate of alterations in 1994 pathways and 2210 tumours consisting of eight different cancers. Using gene set enrichment analysis, a pathway aberration profile reflecting its molecular state was computed for each sample. The profiles were analysed together to infer the characteristic aberration rates for each pathway within each cancer. Subgroups of tumours defined by similar pathway aberrations were identified using clustering analyses. The pathway aberration and gene expression profiles of the subgroups were subsequently compared across all eight cancer types to search for similar tumours crossing the standard classification.
We identified pathways and processes that were common to all cancers as well as traits that are unique to a cancer type or closely related cancers. Studying the gene expression patterns within the pathway context suggested potential alteration mechanisms. Clustering analysis revealed five clinically relevant subgroups of tumours in four cancers that exhibited significant differences in survival compared to others. The cross-cancer analysis of the subgroups resulted in the identification of tumours that shared potentially significant alterations.
This study represents the first effort to extend the molecular characterizations towards pathway level descriptions across the family of cancers. In addition to providing a proof-of-concept for single sample pathway aberration analysis in this context, we present a comprehensive pathway aberration dataset that can be used to study pathway aberration patterns within or across cancers. Significant similarities between subgroups of different cancers on pathway and gene expression levels provide interesting hypotheses for understanding variable drug response, or transferring treatments across diseases by identifying common druggable pathways or genes, for example.
PMCID: PMC3750561  PMID: 24267866
9.  Effects of Rate-Limiting Steps in Transcription Initiation on Genetic Filter Motifs 
PLoS ONE  2013;8(8):e70439.
The behavior of genetic motifs is determined not only by the gene-gene interactions, but also by the expression patterns of the constituent genes. Live single-molecule measurements have provided evidence that transcription initiation is a sequential process, whose kinetics plays a key role in the dynamics of mRNA and protein numbers. The extent to which it affects the behavior of cellular motifs is unknown. Here, we examine how the kinetics of transcription initiation affects the behavior of motifs performing filtering in amplitude and frequency domain. We find that the performance of each filter is degraded as transcript levels are lowered. This effect can be reduced by having a transcription process with more steps. In addition, we show that the kinetics of the stepwise transcription initiation process affects features such as filter cutoffs. These results constitute an assessment of the range of behaviors of genetic motifs as a function of the kinetics of transcription initiation, and thus will aid in tuning of synthetic motifs to attain specific characteristics without affecting their protein products.
PMCID: PMC3734270  PMID: 23940576
10.  NanoMiner — Integrative Human Transcriptomics Data Resource for Nanoparticle Research 
PLoS ONE  2013;8(7):e68414.
The potential impact of nanoparticles on the environment and on human health has attracted considerable interest worldwide. The amount of transcriptomics data, in which tissues and cell lines are exposed to nanoparticles, increases year by year. In addition to the importance of the original findings, these data can have value in a broader context when combined with other previously acquired and published results. In order to facilitate the efficient usage of the data, we have developed the NanoMiner web resource, which contains 404 human transcriptome samples exposed to various types of nanoparticles. All the samples in NanoMiner have been annotated, preprocessed and normalized using standard methods that ensure the quality of the data analyses and enable the users to utilize the database systematically across the different experimental setups and platforms. With NanoMiner it is possible to 1) search and plot the expression profiles of one or several genes of interest, 2) cluster the samples within the datasets, 3) find differentially expressed genes in various nanoparticle studies, 4) detect the nanoparticles causing differential expression of selected genes, 5) analyze enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways and Gene Ontology (GO) terms for the detected genes and 6) search the expression values and differential expressions of the genes belonging to a specific KEGG pathway or Gene Ontology. In sum, the NanoMiner database is a valuable collection of microarray data which can also be used as a data repository for future analyses.
PMCID: PMC3709991  PMID: 23874618
11.  In vivo single-molecule kinetics of activation and subsequent activity of the arabinose promoter 
Nucleic Acids Research  2013;41(13):6544-6552.
Using a single-RNA detection technique in live Escherichia coli cells, we measure, for each cell, the waiting time for the production of the first RNA under the control of PBAD promoter after induction by arabinose, and subsequent intervals between transcription events. We find that the kinetics of the arabinose intake system affect mean and diversity in RNA numbers, long after induction. We observed the same effect on Plac/ara-1 promoter, which is inducible by arabinose or by IPTG. Importantly, the distribution of waiting times of Plac/ara-1 is indistinguishable from that of PBAD, if and only if induced by arabinose alone. Finally, RNA production under the control of PBAD is found to be a sub-Poissonian process. We conclude that inducer-dependent waiting times affect mean and cell-to-cell diversity in RNA numbers long after induction, suggesting that intake mechanisms have non-negligible effects on the phenotypic diversity of cell populations in natural, fluctuating environments.
PMCID: PMC3711423  PMID: 23644285
12.  The tumorigenic FGFR3-TACC3 gene fusion escapes miR-99a regulation in glioblastoma  
Fusion genes are chromosomal aberrations that are found in many cancers and can be used as prognostic markers and drug targets in clinical practice. Fusions can lead to production of oncogenic fusion proteins or to enhanced expression of oncogenes. Several recent studies have reported that some fusion genes can escape microRNA regulation via 3′–untranslated region (3′-UTR) deletion. We performed whole transcriptome sequencing to identify fusion genes in glioma and discovered FGFR3-TACC3 fusions in 4 of 48 glioblastoma samples from patients both of mixed European and of Asian descent, but not in any of 43 low-grade glioma samples tested. The fusion, caused by tandem duplication on 4p16.3, led to the loss of the 3′-UTR of FGFR3, blocking gene regulation of miR-99a and enhancing expression of the fusion gene. The fusion gene was mutually exclusive with EGFR, PDGFR, or MET amplification. Using cultured glioblastoma cells and a mouse xenograft model, we found that fusion protein expression promoted cell proliferation and tumor progression, while WT FGFR3 protein was not tumorigenic, even under forced overexpression. These results demonstrated that the FGFR3-TACC3 gene fusion is expressed in human cancer and generates an oncogenic protein that promotes tumorigenesis in glioblastoma.
PMCID: PMC3561838  PMID: 23298836
13.  Asymmetric Disposal of Individual Protein Aggregates in Escherichia coli, One Aggregate at a Time 
Journal of Bacteriology  2012;194(7):1747-1752.
Escherichia coli cells employ an asymmetric strategy at division, segregating unwanted substances to older poles, which has been associated with aging in these organisms. The kinetics of this process is still poorly understood. Using the MS2 coat protein fused to green fluorescent protein (GFP) and a reporter construct with multiple MS2 binding sites, we tracked individual RNA-MS2-GFP complexes in E. coli cells from the time when they were produced. Analyses of the kinetics and brightness of the spots showed that these spots appear in the midcell region, are composed of a single RNA-MS2-GFP complex, and reach a pole before another target RNA is formed, typically remaining there thereafter. The choice of pole is probabilistic and heavily biased toward one pole, similar to what was observed by previous studies regarding protein aggregates. Additionally, this mechanism was found to act independently on each disposed molecule. Finally, while the RNA-MS2-GFP complexes were disposed of, the MS2-GFP tagging molecules alone were not. We conclude that this asymmetric mechanism to segregate damage at the expense of aging individuals acts probabilistically on individual molecules and is capable of the accurate classification of molecules for disposal.
PMCID: PMC3302457  PMID: 22287517
14.  Dynamics of transcription driven by the tetA promoter, one event at a time, in live Escherichia coli cells 
Nucleic Acids Research  2012;40(17):8472-8483.
In Escherichia coli, tetracycline prevents translation. When subject to tetracycline, E. coli express TetA to pump it out by a mechanism that is sensitive yet fairly independent of cellular metabolism. We constructed a target gene, PtetA-mRFP1-96BS, with a 96 MS2-GFP binding site array in a single-copy BAC vector, whose expression is controlled by the tetA promoter. We measured the in vivo kinetics of production of individual RNA molecules of the target gene as a function of inducer concentration and temperature. From the distributions of intervals between transcription events, we find that RNA production by PtetA is a sub-Poissonian process. Next, we infer the number and duration of the prominent sequential steps in transcription initiation by maximum likelihood estimation. Under full induction and at optimal temperature, we observe three major steps. We find that the kinetics of RNA production under the control of PtetA, including the number and duration of the steps, vary with induction strength and temperature. The results are supported by a set of logical pairwise Kolmogorov-Smirnov tests. We conclude that the expression of TetA is controlled by a sequential mechanism that is robust yet sensitive to external signals.
PMCID: PMC3458540  PMID: 22730294
15.  In vivo kinetics of transcription initiation of the lar promoter in Escherichia coli. Evidence for a sequential mechanism with two rate-limiting steps 
BMC Systems Biology  2011;5:149.
In Escherichia coli the mean and cell-to-cell diversity in RNA numbers of different genes vary widely. This is likely due to different kinetics of transcription initiation, a complex process with multiple rate-limiting steps that affect RNA production.
We measured the in vivo kinetics of production of individual RNA molecules under the control of the lar promoter in E. coli. From the analysis of the distributions of intervals between transcription events in the regimes of weak and medium induction, we find that the process of transcription initiation of this promoter involves a sequential mechanism with two main rate-limiting steps, each lasting hundreds of seconds. Both steps become faster with increasing induction by IPTG and Arabinose.
The two rate-limiting steps in initiation are found to be important regulators of the dynamics of RNA production under the control of the lar promoter in the regimes of weak and medium induction. Variability in the intervals between consecutive RNA productions is much lower than if there was only one rate-limiting step with a duration following an exponential distribution. The methodology proposed here to analyze the in vivo dynamics of transcription may be applicable at a genome-wide scale and provide valuable insight into the dynamics of prokaryotic genetic networks.
PMCID: PMC3191489  PMID: 21943372
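The statement in the entry above, that intervals between RNA productions are less variable with two rate-limiting steps than with a single exponentially distributed step, can be illustrated with a short calculation. This is a minimal sketch and not the authors' analysis: it assumes each step has an independent, exponentially distributed duration, and the function name and the example durations are hypothetical.

```python
def interval_cv2(mean_step_durations):
    """Squared coefficient of variation (variance / mean**2) of an interval
    made of sequential, independent, exponentially distributed steps.

    An exponential step with mean tau has variance tau**2, so the summed
    interval has mean sum(tau) and variance sum(tau**2).
    """
    total_mean = sum(mean_step_durations)
    total_var = sum(tau ** 2 for tau in mean_step_durations)
    return total_var / total_mean ** 2


# Hypothetical step durations in seconds, chosen only for illustration.
one_step = interval_cv2([600.0])          # a single exponential step
two_steps = interval_cv2([300.0, 300.0])  # two equal sequential steps
print(one_step, two_steps)
```

Under these assumptions a single step always gives a value of 1, and any split into two or more steps gives a value below 1, which is the sense in which such a process is called sub-Poissonian.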
16.  Differential Gene Expression in Adipose Stem Cells Cultured in Allogeneic Human Serum Versus Fetal Bovine Serum 
Tissue Engineering. Part A  2010;16(7):2281-2294.
In preclinical studies, human adipose stem cells (ASCs) have been shown to have therapeutic applicability, but standard expansion methods for clinical applications remain to be established. ASCs are typically expanded in medium containing fetal bovine serum (FBS). However, sera and other animal-derived culture reagents pose safety issues in clinical therapy, including possible infections and severe immune reactions. By expanding ASCs in medium containing human serum (HS), the problem can be eliminated. To define how allogeneic HS (alloHS) performs in ASC expansion compared to FBS, a comparative in vitro study in both serum supplements was performed. The choice of serum had a significant effect on ASCs. First, to reach cell proliferation levels comparable with 10% FBS, at least 15% alloHS was required. Second, while genes of the cell cycle pathway were overexpressed in alloHS, genes of the bone morphogenetic protein receptor–mediated signaling on the transforming growth factor beta signaling pathway regulating, for example, osteoblast differentiation, were overexpressed in FBS. The result was further supported by differentiation analysis, where early osteogenic differentiation was significantly enhanced in FBS. The data presented here underscore the importance of thorough investigation of ASCs for utilization in cell therapies. This study is a step forward in the understanding of these potential cells.
PMCID: PMC2928709  PMID: 20184435
17.  A Beta-mixture model for dimensionality reduction, sample classification and analysis 
BMC Bioinformatics  2011;12:215.
Patterns of genome-wide methylation vary between tissue types. For example, cancer tissue shows markedly different patterns from those of normal tissue. In this paper we propose a beta-mixture model to describe genome-wide methylation patterns based on probe data from methylation microarrays. The model takes dependencies between neighbour probe pairs into account and assumes three broad categories of methylation, low, medium and high. The model is described by 37 parameters, which reduces the dimensionality of a typical methylation microarray significantly. We used methylation microarray data from 42 colon cancer samples to assess the model.
Based on data from colon cancer samples we show that our model captures genome-wide characteristics of methylation patterns. We estimate the parameters of the model and show that they vary between different tissue types. Further, for each methylation probe the posterior probability of a methylation state (low, medium or high) is calculated and the probability that the state is correctly predicted is assessed. We demonstrate that the model can be applied to classify cancer tissue types accurately and that the model provides accessible and easily interpretable data summaries.
We have developed a beta-mixture model for methylation microarray data. The model substantially reduces the dimensionality of the data. It can be used for further analysis, such as sample classification or to detect changes in methylation status between different samples and tissues.
PMCID: PMC3126746  PMID: 21619656
18.  Cell-to-cell diversity in protein levels of a gene driven by a tetracycline inducible promoter 
BMC Molecular Biology  2011;12:21.
Gene expression in Escherichia coli is regulated by several mechanisms. We measured in single cells the expression level of a single copy gene coding for green fluorescent protein (GFP), integrated into the genome and driven by a tetracycline inducible promoter, for varying induction strengths. Also, we measured the transcriptional activity of a tetracycline inducible promoter controlling the transcription of an RNA with 96 binding sites for MS2-GFP.
The distribution of GFP levels in single cells is found to change significantly as induction reaches high levels, causing the Fano factor of the cells' protein levels to increase with mean level, beyond what would be expected from a Poisson-like process of RNA transcription. In agreement, the Fano factor of the cells' number of RNA molecules targeted by MS2-GFP follows a similar trend. The results provide evidence that the dynamics of the promoter complex formation, namely, the variability in its duration from one transcription event to the next, explains the change in the distribution of expression levels in the cell population with induction strength.
The results suggest that the open complex formation of the tetracycline inducible promoter, in the regime of strong induction, affects significantly the dynamics of RNA production due to the variability of its duration from one event to the next.
PMCID: PMC3120693  PMID: 21569576
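For readers unfamiliar with the Fano factor used in the entry above, it is the variance of a quantity divided by its mean, and it equals 1 for a Poisson distribution. The following is a minimal sketch of that definition only; the function name and the sample values are hypothetical and are not data from the study.

```python
from statistics import mean, pvariance


def fano_factor(counts):
    """Return the Fano factor (population variance / mean) of per-cell counts."""
    m = mean(counts)
    if m == 0:
        raise ValueError("Fano factor is undefined when the mean is zero")
    return pvariance(counts) / m


# Hypothetical per-cell molecule counts, for illustration only.
example_counts = [1, 3]
print(fano_factor(example_counts))
```

Values above 1 indicate more cell-to-cell variability than a Poisson process with the same mean would produce.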
19.  Stochastic sequence-level model of coupled transcription and translation in prokaryotes 
BMC Bioinformatics  2011;12:121.
In prokaryotes, transcription and translation are dynamically coupled, as the latter starts before the former is complete. Also, from one transcript, several translation events occur in parallel. To study how events in transcription elongation affect translation elongation and fluctuations in protein levels, we propose a delayed stochastic model of prokaryotic transcription and translation at the nucleotide and codon level that includes the promoter open complex formation and alternative pathways to elongation, namely pausing, arrests, editing, pyrophosphorolysis, RNA polymerase traffic, and premature termination. Stepwise translation can start after the ribosome binding site is formed and accounts for variable codon translation rates, ribosome traffic, back-translocation, drop-off, and trans-translation.
First, we show that the model accurately matches measurements of sequence-dependent translation elongation dynamics. Next, we characterize the degree of coupling between fluctuations in RNA and protein levels, and its dependence on the rates of transcription and translation initiation. Finally, modeling sequence-specific transcriptional pauses, we find that these affect protein noise levels.
For parameter values within realistic intervals, transcription and translation are found to be tightly coupled in Escherichia coli, as the noise in protein levels is mostly determined by the underlying noise in RNA levels. Sequence-dependent events in transcription elongation, e.g. pauses, are found to cause tangible effects in the degree of fluctuations in protein levels.
PMCID: PMC3113936  PMID: 21521517
20.  Cancer systems biology: signal processing for cancer research 
Chinese Journal of Cancer  2011;30(4):221-225.
In this editorial we introduce the research paradigms of signal processing in the era of systems biology. Signal processing is a field of science traditionally focused on modeling electronic and communications systems, but recently it has turned to biological applications with astounding results. The essence of signal processing is to describe the natural world by mathematical models and then, based on these models, develop efficient computational tools for solving engineering problems. Here, we underline, with examples, the endless possibilities which arise when the battle-hardened tools of engineering are applied to solve the problems that have tormented cancer researchers. Based on this approach, a new field has emerged, called cancer systems biology. Despite its short history, cancer systems biology has already produced several success stories tackling previously impracticable problems. Perhaps most importantly, it has been accepted as an integral part of the major endeavors of cancer research, such as analyzing the genomic and epigenomic data produced by The Cancer Genome Atlas (TCGA) project. Finally, we show that signal processing and cancer research, two fields that are seemingly distant from each other, have merged into a field that is indeed more than the sum of its parts.
PMCID: PMC4013347  PMID: 21439242
Systems biology; signal processing; gene regulation; methylation; glioblastoma
21.  Information Diversity in Structure and Dynamics of Simulated Neuronal Networks 
Neuronal networks exhibit a wide diversity of structures, which contributes to the diversity of the dynamics therein. The presented work applies an information theoretic framework to simultaneously analyze structure and dynamics in neuronal networks. Information diversity within the structure and dynamics of a neuronal network is studied using the normalized compression distance. To describe the structure, a scheme for generating distance-dependent networks with identical in-degree distribution but variable strength of dependence on distance is presented. The resulting network structure classes possess differing path length and clustering coefficient distributions. In parallel, comparable realistic neuronal networks are generated with the NETMORPH simulator, and a similar analysis is performed on them. To describe the dynamics, network spike trains are simulated using different network structures and their bursting behaviors are analyzed. For the simulation of the network activity, the Izhikevich model of spiking neurons is used together with the Tsodyks model of dynamical synapses. We show that the structure of the simulated neuronal networks affects the spontaneous bursting activity when measured with bursting frequency and a set of intraburst measures: the more locally connected networks produce more and longer bursts than the more random networks. The information diversity of the structure of a network is greatest in the most locally connected networks, smallest in random networks, and somewhere in between in the networks between order and disorder. As for the dynamics, the most locally connected networks and some of the in-between networks produce the most complex intraburst spike trains. The same result also holds for the sparser of the two considered network densities in the case of full spike trains.
PMCID: PMC3151619  PMID: 21852970
information diversity; neuronal network; structure-dynamics relationship; complexity
22.  A systems biological approach to identify key transcription factors and their genomic neighborhoods in human sarcomas 
Chinese Journal of Cancer  2011;30(1):27-40.
Identification of genetic signatures is the main objective for many computational oncology studies. The signature usually consists of numerous genes that are differentially expressed between two clinically distinct groups of samples, such as tumor subtypes. Prospectively, many signatures have been found to generalize poorly to other datasets and, thus, have rarely been accepted into clinical use. Recognizing the limited success of traditionally generated signatures, we developed a systems biology-based framework for robust identification of key transcription factors and their genomic regulatory neighborhoods. Application of the framework to study the differences between gastrointestinal stromal tumor (GIST) and leiomyosarcoma (LMS) resulted in the identification of nine transcription factors (SRF, NKX2-5, CCDC6, LEF1, VDR, ZNF250, TRIM63, MAF, and MYC). Functional annotations of the obtained neighborhoods identified the biological processes which the key transcription factors regulate differently between the tumor types. Analyzing the differences in the expression patterns using our approach resulted in a more robust genetic signature and more biological insight into the diseases compared to a traditional genetic signature.
PMCID: PMC4012261  PMID: 21192842
Systems biology; transcription factor; gene regulation; binding motif; sarcoma
23.  Inference of Kinetic Parameters of Delayed Stochastic Models of Gene Expression Using a Markov Chain Approximation 
We propose a Markov chain approximation of the delayed stochastic simulation algorithm to infer properties of the mechanisms in prokaryote transcription from the dynamics of RNA levels. We model transcription using the delayed stochastic modelling strategy and realistic parameter values for rate of transcription initiation and RNA degradation. From the model, we generate time series of RNA levels at the single molecule level, from which we use the method to infer the duration of the promoter open complex formation. This is found to be possible even when adding external Gaussian noise to the RNA levels.
PMCID: PMC3171302  PMID: 21234243
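The kind of RNA time series described in the abstract above can be sketched with a small delayed stochastic simulation. This is only a forward simulation under simplifying assumptions, not the Markov chain approximation or the inference method of the paper: a single promoter is assumed, it stays occupied for a fixed delay `tau_oc` (standing in for open complex formation) after each initiation, one RNA is released when the delay ends, and RNA degrades as a first-order process. All names and parameters are hypothetical.

```python
import heapq
import random


def simulate_delayed_transcription(k_init, k_deg, tau_oc, t_end, seed=0):
    """Delayed SSA sketch for: Pro -> Pro(tau_oc) + RNA(tau_oc), RNA -> (nothing).

    Returns a list of (time, rna_count) pairs, one per event.
    """
    rng = random.Random(seed)
    t, promoter_free, rna = 0.0, 1, 0
    wait_list = []  # heap of times at which delayed products are released
    trajectory = [(0.0, 0)]
    while t < t_end:
        a_init = k_init * promoter_free  # propensity of initiation
        a_deg = k_deg * rna              # propensity of RNA degradation
        a_total = a_init + a_deg
        dt = rng.expovariate(a_total) if a_total > 0 else float("inf")
        next_release = wait_list[0] if wait_list else float("inf")
        if min(t + dt, next_release) > t_end:
            break
        if next_release <= t + dt:
            # A delayed release comes first: free the promoter and emit the RNA.
            t = heapq.heappop(wait_list)
            promoter_free = 1
            rna += 1
        elif rng.random() * a_total < a_init:
            # Initiation: the promoter is occupied until the delay elapses.
            t += dt
            promoter_free = 0
            heapq.heappush(wait_list, t + tau_oc)
        else:
            # Degradation of one RNA molecule.
            t += dt
            rna -= 1
        trajectory.append((t, rna))
    return trajectory
```

Because the waiting times between reactions are exponential, discarding the sampled `dt` and resampling after a delayed release does not bias the simulation.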
24.  Information propagation within the Genetic Network of Saccharomyces cerevisiae 
BMC Systems Biology  2010;4:143.
A gene network's capacity to process information, so as to bind past events to future actions, depends on its structure and logic. From previous and new microarray measurements in Saccharomyces cerevisiae following gene deletions and overexpressions, we identify a core gene regulatory network (GRN) of functional interactions between 328 genes and the transfer functions of each gene. Inferred connections are verified by gene enrichment.
We find that this core network has a generalized clustering coefficient that is much higher than chance. The inferred Boolean transfer functions have a mean p-bias of 0.41, and thus similar amounts of activation and repression interactions. However, the distribution of p-biases differs significantly from what is expected by chance, which, along with the high mean connectivity, is found to cause the core GRN of S. cerevisiae to have an overall sensitivity similar to critical Boolean networks. In agreement, we find that the amount of information propagated between nodes in finite time series is much higher in the inferred core GRN of S. cerevisiae than what is expected by chance.
We suggest that S. cerevisiae is likely to have evolved a core GRN with enhanced information propagation among its genes.
PMCID: PMC2975643  PMID: 20977725
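To make the terms p-bias and sensitivity in the abstract above concrete, here is a minimal sketch for a single Boolean transfer function given as a truth table. The function names and the truth-table encoding (entry `x` holds the output for the input whose bits are the binary digits of `x`) are assumptions for illustration. The closed-form expression 2Kp(1-p) is the standard mean-field estimate for random Boolean functions with K inputs and bias p; since the abstract reports that the p-bias distribution differs from chance, it should be read only as a rough reference point.

```python
def p_bias(truth_table):
    """Fraction of input combinations for which the function outputs 1."""
    return sum(truth_table) / len(truth_table)


def average_sensitivity(truth_table, k):
    """Mean number of single-input flips that change the output,
    averaged over all 2**k input combinations."""
    changes = 0
    for x in range(2 ** k):
        for i in range(k):
            # x ^ (1 << i) flips input i while keeping the others fixed.
            if truth_table[x] != truth_table[x ^ (1 << i)]:
                changes += 1
    return changes / 2 ** k


def expected_sensitivity(k, p):
    """Mean-field sensitivity 2*K*p*(1-p) of a random function with bias p."""
    return 2 * k * p * (1 - p)
```

In this mean-field picture a network is critical when the sensitivity equals 1. With p = 0.41 the factor 2p(1-p) is about 0.48, so criticality would correspond to a mean connectivity of roughly 2.07 under these assumptions.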
25.  Evaluation of methods for detection of fluorescence labeled subcellular objects in microscope images 
BMC Bioinformatics  2010;11:248.
Several algorithms have been proposed for detecting fluorescently labeled subcellular objects in microscope images. Many of these algorithms have been designed for specific tasks and validated with limited image data. Although extensive comparisons between algorithms could provide useful information to guide method selection and thus yield more accurate results, relatively few such studies have been performed.
To better understand algorithm performance under different conditions, we have carried out a comparative study including eleven spot detection or segmentation algorithms from various application fields. We used microscope images from well plate experiments with a human osteosarcoma cell line and frames from image stacks of yeast cells in different focal planes. These experimentally derived images permit a comparison of method performance in realistic situations where the number of objects varies within an image set. We also used simulated microscope images in order to compare the methods and validate them against a ground truth reference result. Our study finds major differences in the performance of different algorithms, in terms of both object counts and segmentation accuracies.
These results suggest that the selection of detection algorithms for image based screens should be done carefully and take into account different conditions, such as the possibility of acquiring empty images or images with very few spots. Our inclusion of methods that have not been used before in this context broadens the set of available detection methods and compares them against the current state-of-the-art methods for subcellular particle detection.
PMCID: PMC3098061  PMID: 20465797
