Populus is a model woody plant and a promising feedstock for lignocellulosic biofuel production. However, its lengthy life cycle impedes rapid characterization of gene function.
We optimized a Populus leaf mesophyll protoplast isolation protocol and established a Populus protoplast transient expression system. We demonstrated that Populus protoplasts are able to respond to hormonal stimuli and that a series of organelle markers are correctly localized in the Populus protoplasts. Furthermore, we showed that the Populus protoplast transient expression system is suitable for studying protein-protein interaction, gene activation, and cellular signaling events.
This study established a method for efficient isolation of protoplasts from Populus leaf and demonstrated the efficacy of using Populus protoplast transient expression assays as an in vivo system to characterize genes and pathways.
A microarray has been created from 36,354 ESTs from Populus species and used to study autumn senescence in the leaves of the aspen tree Populus tremula.
We have developed genomic tools to allow the genus Populus (aspens and cottonwoods) to be exploited as a full-featured model for investigating fundamental aspects of tree biology. We have undertaken large-scale expressed sequence tag (EST) sequencing programs and created Populus microarrays with significant gene coverage. One of the important aspects of plant biology that cannot be studied in annual plants is the gene activity involved in the induction of autumn leaf senescence.
On the basis of 36,354 Populus ESTs, obtained from seven cDNA libraries, we have created a DNA microarray consisting of 13,490 clones, spotted in duplicate. Of these clones, 12,376 (92%) were confirmed by resequencing and all sequences were annotated and functionally classified. Here we have used the microarray to study transcript abundance in leaves of a free-growing aspen tree (Populus tremula) in northern Sweden during natural autumn senescence. Of the 13,490 spotted clones, 3,792 represented genes with significant expression in all leaf samples from the seven studied dates.
We observed a major shift in gene expression, coinciding with massive chlorophyll degradation, that reflected a shift from photosynthetic competence to energy generation by mitochondrial respiration, oxidation of fatty acids and nutrient mobilization. Autumn senescence had much in common with senescence in annual plants; for example many proteases were induced. We also found evidence for increased transcriptional activity before the appearance of visible signs of senescence, presumably preparing the leaf for degradation of its components.
An Arabidopsis thaliana transcriptional network reveals regulatory mechanisms for the control of genes related to stress adaptation.
Understanding the molecular mechanisms plants have evolved to adapt their biological activities to a constantly changing environment is an intriguing question and one that requires a systems biology approach. Here we present a network analysis of genome-wide expression data combined with reverse-engineering network modeling to dissect the transcriptional control of Arabidopsis thaliana. The regulatory network is inferred by using an assembly of microarray data containing steady-state RNA expression levels from several growth conditions, developmental stages, biotic and abiotic stresses, and a variety of mutant genotypes.
We show that the A. thaliana regulatory network has the characteristic properties of hierarchical networks. We successfully applied our quantitative network model to predict the full transcriptome of the plant for a set of microarray experiments not included in the training dataset. We also used our model to analyze the robustness in expression levels conferred by network motifs such as the coherent feed-forward loop. In addition, the meta-analysis presented here has allowed us to identify regulatory and robust genetic structures.
These data suggest that A. thaliana has evolved high connectivity in terms of transcriptional regulation among cellular functions involved in response and adaptation to changing environments, while gene networks constitutively expressed or less related to stress response are characterized by a lower connectivity. Taken together, these findings suggest conserved regulatory strategies that have been selected during the evolutionary history of this eukaryote.
The genus Populus includes poplars, aspens and cottonwoods, which will be collectively referred to as poplars hereafter unless otherwise specified. Poplars are the dominant tree species in many forest ecosystems in the Northern Hemisphere and are of substantial economic value in plantation forestry. Poplar has been established as a model system for genomics studies of growth, development, and adaptation of woody perennial plants including secondary xylem formation, dormancy, adaptation to local environments, and biotic interactions.
As part of the poplar genome sequencing project and the development of genomic resources for poplar, we have generated a full-length (FL)-cDNA collection using the biotinylated CAP trapper method. We constructed four FLcDNA libraries using RNA from xylem, phloem and cambium, and green shoot tips and leaves from the P. trichocarpa Nisqually-1 genotype, as well as insect-attacked leaves of the P. trichocarpa × P. deltoides hybrid. Following careful selection of candidate cDNA clones, we used a combined strategy of paired end reads and primer walking to generate a set of 4,664 high-accuracy, sequence-verified FLcDNAs, which clustered into 3,990 putative unique genes. Mapping FLcDNAs to the poplar genome sequence combined with BLAST comparisons to previously predicted protein coding sequences in the poplar genome identified 39 FLcDNAs that likely localize to gaps in the current genome sequence assembly. Another 173 FLcDNAs mapped to the genome sequence but were not included among the previously predicted genes in the poplar genome. Comparative sequence analysis against Arabidopsis thaliana and other species in the non-redundant database of GenBank revealed that 11.5% of the poplar FLcDNAs display no significant sequence similarity to other plant proteins. By mapping the poplar FLcDNAs against transcriptome data previously obtained with a 15.5 K cDNA microarray, we identified 153 FLcDNA clones for genes that were differentially expressed in poplar leaves attacked by forest tent caterpillars.
This study has generated a high-quality FLcDNA resource for poplar and the third largest FLcDNA collection published to date for any plant species. We successfully used the FLcDNA sequences to reassess gene prediction in the poplar genome sequence, perform comparative sequence annotation, and identify differentially expressed transcripts associated with defense against insects. The FLcDNA sequences will be essential to the ongoing curation and annotation of the poplar genome, in particular for targeting gaps in the current genome assembly and further improvement of gene predictions. The physical FLcDNA clones will serve as useful reagents for functional genomics research in areas such as analysis of gene functions in defense against insects and perennial growth. Sequences from this study have been deposited in NCBI GenBank under the accession numbers EF144175 to EF148838.
A Populus EST dataset was used for in silico transcript profiling of the programmed death of the xylem fibres in woody tissues of Populus stem. The analysis suggests the involvement of two novel extracellular serine proteases, nodulin-like proteins and an AtOST1 (Arabidopsis thaliana OPEN STOMATA 1) homolog in signaling fiber-cell death.
Poplar (Populus sp.) has emerged as the main model system for molecular and genetic studies of forest trees. A Populus expressed sequence tag (EST) database (POPULUSDB) was previously created from 19 cDNA libraries each originating from different Populus tree tissues, and opened to the public in September 2004. We used this dataset for in silico transcript profiling of a particular process in the woody tissues of the Populus stem: the programmed death of xylem fibers.
One EST library in POPULUSDB originates from woody tissues of the Populus stem where xylem fibers undergo cell death. Analysis of EST abundances and library distribution within the POPULUSDB revealed a large number of previously uncharacterized transcripts that were unique in this library and possibly related to the death of xylem fibers. The in silico analysis was complemented by a microarray analysis utilizing a novel Populus cDNA array with a unigene set of 25,000 sequences.
In silico analysis, combined with the microarray analysis, revealed the usefulness of non-normalized EST libraries in elucidating transcriptional regulation of previously uncharacterized physiological processes. The data suggested the involvement of two novel extracellular serine proteases, nodulin-like proteins and an Arabidopsis thaliana OPEN STOMATA 1 (AtOST1) homolog in signaling fiber-cell death, as well as mechanisms responsible for hormonal control, nutrient remobilization, regulation of vacuolar integrity and autolysis of the dying fibers.
In Systems Biology, iterations of wet-lab experiments followed by modelling approaches and model-inspired experiments describe a cyclic workflow. This approach is especially useful for the inference of gene regulatory networks based on high-throughput gene expression data. Experiments can verify or falsify the predicted interactions, allowing further refinement of the network model. Aspergillus fumigatus is a major human fungal pathogen. One important virulence trait is its ability to acquire sufficient amounts of iron during the infection process. Even though some regulatory interactions are known, we are still far from a complete understanding of the way iron homeostasis is regulated.
In this study, we make use of a reverse engineering strategy to infer a regulatory network controlling iron homeostasis in A. fumigatus. The inference approach utilizes the temporal change in expression data after a change from iron depleted to iron replete conditions. The modelling strategy is based on a set of linear differential equations and offers the possibility to integrate known regulatory interactions as prior knowledge. Moreover, it makes use of important selection criteria, such as sparseness and robustness. By compiling a list of known regulatory interactions for iron homeostasis in A. fumigatus and softly integrating them during network inference, we are able to predict new interactions between transcription factors and target genes. The proposed activation of the gene expression of hapX by the transcriptional regulator SrbA constitutes a so far unknown way of regulating iron homeostasis based on the amount of metabolically available iron. This interaction has been verified by Northern blots in a recent experimental study. In order to improve the reliability of the predicted network, the results of this experimental study have been added to the set of prior knowledge. The final network includes three SrbA target genes. Based on motif searching within the regulatory regions of these genes, we identify potential DNA-binding sites for SrbA. Our wet-lab experiments demonstrate high-affinity binding capacity of SrbA to the promoters of hapX, hemA and srbA.
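The idea of fitting linear differential equations to expression time series while softly favouring known interactions can be illustrated with a minimal sketch. It is not the inference tool used in the study: the function name, the forward-difference derivative estimate, and the per-edge ridge penalty (weaker on edges marked as prior knowledge) are all assumptions made for illustration, and a ridge penalty only shrinks weights rather than enforcing true sparseness.

```python
import numpy as np

def infer_linear_network(X, dt, prior_mask, lam=1.0, prior_discount=0.1):
    """Fit dx/dt ~ W x from a time series X (genes x timepoints).

    prior_mask[i, j] is True where regulation of gene i by gene j is
    assumed known; those edges receive a weaker penalty (soft prior).
    """
    X = np.asarray(X, dtype=float)
    dXdt = (X[:, 1:] - X[:, :-1]) / dt      # forward-difference derivatives
    S = X[:, :-1]                           # states matching each derivative
    gram = S @ S.T
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        # Per-edge ridge penalty: discounted where prior knowledge exists.
        penalty = np.where(prior_mask[i], lam * prior_discount, lam)
        W[i] = np.linalg.solve(gram + np.diag(penalty), S @ dXdt[i])
    return W

# Toy data: a hypothetical two-gene system simulated with forward Euler steps.
W_true = np.array([[-1.0, 0.0], [1.0, -0.5]])
dt = 0.1
X = np.zeros((2, 20))
X[:, 0] = [1.0, 0.0]
for t in range(19):
    X[:, t + 1] = X[:, t] + dt * (W_true @ X[:, t])

prior_mask = np.array([[False, False], [True, False]])  # gene 0 -> gene 1 assumed known
W_hat = infer_linear_network(X, dt, prior_mask, lam=1e-8)
```

With noise-free toy data and a negligible penalty the fitted matrix matches the simulated one; on real data the penalty strength and the discount for prior edges would have to be chosen by a criterion such as cross-validation.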
This study presents an application of the typical Systems Biology circle and is based on cooperation between wet-lab experimentalists and in silico modellers. The results underline that using prior knowledge during network inference helps to predict biologically important interactions. Together with the experimental results, we indicate a novel iron homeostasis regulating system sensing the amount of metabolically available iron and identify the binding site of iron-related SrbA target genes. It will be of high interest to study whether these regulatory interactions are also important for close relatives of A. fumigatus and other pathogenic fungi, such as Candida albicans.
Understanding the complex regulatory networks underlying development and evolution of multi-cellular organisms is a major problem in biology. Computational models can be used as tools to extract the regulatory structure and dynamics of such networks from gene expression data. This approach is called reverse engineering. It has been successfully applied to many gene networks in various biological systems. However, to reconstitute the structure and non-linear dynamics of a developmental gene network in its spatial context remains a considerable challenge. Here, we address this challenge using a case study: the gap gene network involved in segment determination during early development of Drosophila melanogaster. A major problem for reverse-engineering pattern-forming networks is the significant amount of time and effort required to acquire and quantify spatial gene expression data. We have developed a simplified data processing pipeline that considerably increases the throughput of the method, but results in data of reduced accuracy compared to those previously used for gap gene network inference. We demonstrate that we can infer the correct network structure using our reduced data set, and investigate minimal data requirements for successful reverse engineering. Our results show that timing and position of expression domain boundaries are the crucial features for determining regulatory network structure from data, while it is less important to precisely measure expression levels. Based on this, we define minimal data requirements for gap gene network inference. Our results demonstrate the feasibility of reverse-engineering with much reduced experimental effort. This enables more widespread use of the method in different developmental contexts and organisms. Such systematic application of data-driven models to real-world networks has enormous potential. 
Only the quantitative investigation of a large number of developmental gene regulatory networks will allow us to discover whether there are rules or regularities governing development and evolution of complex multi-cellular organisms.
To better understand multi-cellular organisms we need a better and more systematic understanding of the complex regulatory networks that govern their development and evolution. However, this problem is far from trivial. Regulatory networks involve many factors interacting in a non-linear manner, which makes it difficult to study them without the help of computers. Here, we investigate a computational method, reverse engineering, which allows us to reconstitute real-world regulatory networks in silico. As a case study, we investigate the gap gene network involved in determining the position of body segments during early development of Drosophila. We visualise spatial gap gene expression patterns using in situ hybridisation and microscopy. The resulting embryo images are quantified to measure the position of expression domain boundaries. We then use computational models as tools to extract regulatory information from the data. We investigate what kind, and how much data are required for successful network inference. Our results reveal that much less effort is required for reverse-engineering networks than previously thought. This opens the possibility of investigating a large number of developmental networks using this approach, which in turn will lead to a more general understanding of the rules and principles underlying development in animals and plants.
The Populus sucrose (Suc) transporter 4 (PtaSUT4), like its orthologs in other plant taxa, is tonoplast localized and thought to mediate Suc export from the vacuole into the cytosol. In source leaves of Populus, SUT4 is the predominantly expressed gene family member, with transcript levels several times higher than those of plasma membrane SUTs. A hypothesis is advanced that SUT4-mediated tonoplast sucrose fluxes contribute to the regulation of osmotic gradients between cellular compartments, with the potential to mediate both sink provisioning and drought tolerance in Populus. Here, we describe the effects of PtaSUT4-RNA interference (RNAi) on sucrose levels and raffinose family oligosaccharides (RFO) induction, photosynthesis, and water uptake, retention and loss during acute and chronic drought stresses. Under normal water-replete growing conditions, SUT4-RNAi plants had generally higher shoot water contents than wild-type plants. In response to soil drying during a short-term, acute drought, RNAi plants exhibited reduced rates of water uptake and delayed wilting relative to wild-type plants. SUT4-RNAi plants had larger leaf areas and lower photosynthesis rates than wild-type plants under well-watered, but not under chronic water-limiting conditions. Moreover, the magnitude of shoot water content, height growth, and photosynthesis responses to contrasting soil moisture regimes was greater in RNAi than wild-type plants. The concentrations of stress-responsive RFOs increased in wild-type plants but were unaffected in SUT4-RNAi plants under chronically dry conditions. We discuss a model in which the subcellular compartmentalization of sucrose mediated by PtaSUT4 is regulated in response to both sink demand and plant water status in Populus.
Poplar (Populus spp.) is a widely distributed tree genus of significant economic and ecological importance. Poplar trees accumulate proanthocyanidins (PAs) in leaves, roots, and a variety of other tissues. Damage to leaves by insects causes a rapid accumulation of PAs, both at the site of damage and distally in undamaged leaves. This rapid PA accumulation is mediated by the activation of genes encoding enzymes involved in PA synthesis. PAs have been hypothesized to deter insect feeding and reduce the nutritive value of poplar leaf tissue, but experimental evidence supporting a role for PAs as an effective inducible defense against herbivores is lacking. Our recent paper described the identification of a MYB gene that regulates the PA pathway under multiple stress conditions, and we used this gene to constitutively activate the PA pathway in poplar. Here we describe observations that suggest that poplar PAs may have roles besides insect defense, for example, responses to UV light. The PA-modified trees will be a useful tool for analyzing the biological roles of PAs in this important model tree.
tannins; herbivory; flavonoid; UV light; light stress
A Boolean network is a graphical model for representing and analyzing the behavior of gene regulatory networks (GRN). In this context, the accurate and efficient reconstruction of a Boolean network is essential for understanding the gene regulation mechanism and the complex relations that exist therein. In this paper we introduce an elegant and efficient algorithm for the reverse engineering of Boolean networks from a time series of multivariate binary data corresponding to gene expression data. We call our method ReBMM, i.e., reverse engineering based on Bernoulli mixture models. The time complexity of most of the existing reverse engineering techniques is quite high and depends upon the indegree of a node in the network. Due to the high complexity of these methods, they can only be applied to sparsely connected networks of small sizes. ReBMM has a time complexity that is independent of the indegree of a node and quadratic in the number of nodes in the network, a big improvement over other techniques, with little or no compromise in accuracy. We have tested ReBMM on a number of artificial datasets along with simulated data derived from a plant signaling network. We also used this method to reconstruct a network from real experimental observations of microarray data of the yeast cell cycle. Our method provides a natural framework for generating rules from a probabilistic model. It is simple and intuitive, and it yields excellent empirical results.
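To make the setting concrete, the sketch below simulates a small Boolean network to produce the kind of binary time series such methods take as input, and then runs a naive consistency check on candidate regulator sets. It is not ReBMM: the three-gene rules and both function names are hypothetical, and the check shown is the simple enumeration-style baseline whose cost grows with the indegree.

```python
# Hypothetical three-gene network: each rule maps the current state to the
# gene's value at the next time step (synchronous update).
RULES = {
    "A": lambda s: not s["C"],
    "B": lambda s: s["A"],
    "C": lambda s: s["A"] and s["B"],
}

def simulate(state, n_steps):
    """Return the list of states visited, starting from `state`."""
    series = [dict(state)]
    for _ in range(n_steps):
        state = {gene: bool(rule(state)) for gene, rule in RULES.items()}
        series.append(state)
    return series

def is_consistent(series, target, parents):
    """True if every observed pattern of `parents` at time t always maps to
    the same value of `target` at time t + 1."""
    seen = {}
    for now, nxt in zip(series, series[1:]):
        key = tuple(now[p] for p in parents)
        if seen.setdefault(key, nxt[target]) != nxt[target]:
            return False
    return True

start = {"A": True, "B": False, "C": False}
series = simulate(start, 5)
```

In this toy series, `is_consistent(series, "C", ("A", "B"))` holds while `is_consistent(series, "C", ("A",))` does not, which is how a brute-force search would reject an incomplete regulator set.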
Genomic studies are routinely performed on young plants in controlled environments, which are very different from natural conditions. In reality, plants in temperate countries are exposed to large fluctuations in environmental conditions, in the case of perennials over several years. We have studied gene expression in leaves of a free-growing aspen (Populus tremula) throughout multiple growing seasons.
We show that gene expression during the first month of leaf development was largely determined by a developmental program, although leaf expansion, chlorophyll accumulation and the speed of progression through this program were regulated by temperature. We were also able to define "transcriptional signatures" for four different substages of leaf development. In mature leaves, weather factors were important for gene regulation.
This study shows that multivariate methods together with high-throughput transcriptional methods in the field can provide additional, novel information about plant status under changing environmental conditions that is impossible to mimic in laboratory conditions. We have generated a dataset that could be used, for example, to identify marker genes for certain developmental stages or treatments, as well as to assess natural variation in gene expression.
Quantitative PCR (qPCR) is a widely used technique for gene expression analysis. A common normalization method for accurate qPCR data analysis involves stable reference genes to determine relative gene expression. Despite extensive research in the forest tree species Populus, there is no resource for reference genes that meets the Minimum Information for Publication of Quantitative Real-Time PCR Experiments (MIQE) standards for qPCR techniques and analysis. Since Populus is a woody perennial species, studies of seasonal changes in gene expression are important for advancing knowledge of this important developmental and physiological trait. The objective of this study was to evaluate reference gene expression stability in various tissues and growth conditions in two important Populus genotypes (P. trichocarpa “Nisqually 1” and P. tremula x P. alba 717 1-B4) following MIQE guidelines.
We evaluated gene expression stability in shoot tips, young leaves, mature leaves and bark tissues from P. trichocarpa and P. tremula x P. alba grown under long-day (LD), short-day (SD) or SD plus low-temperature conditions. Gene expression data were analyzed for stable reference genes among 18S rRNA, ACT2, CDC2, CYC063, TIP4-like, UBQ7, PT1 and ANT using two software packages, geNormPLUS and BestKeeper. GeNormPLUS ranked TIP4-like and PT1 among the most stable genes in most genotype/tissue combinations while BestKeeper ranked CDC2 and ACT2 among the most stable genes.
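As an illustration of how a stability ranking of this kind can be computed, the sketch below implements the classic geNorm stability measure M: for each candidate gene, the mean over all other candidates of the standard deviation of their log2 expression ratios across samples. It is a simplified stand-in rather than the geNormPLUS software itself, the function name and the toy quantities are invented, and the input is assumed to be relative quantities on a linear scale rather than raw Cq values.

```python
import numpy as np

def genorm_m(quantities):
    """Stability measure M per gene; lower M means more stable.

    quantities: samples x genes array of relative expression (> 0).
    """
    logq = np.log2(np.asarray(quantities, dtype=float))
    n_genes = logq.shape[1]
    m = np.zeros(n_genes)
    for j in range(n_genes):
        # Pairwise variation: SD of the log2 ratio to every other gene.
        sds = [np.std(logq[:, j] - logq[:, k], ddof=1)
               for k in range(n_genes) if k != j]
        m[j] = np.mean(sds)
    return m

# Invented data: gene 1 is always twice gene 0, gene 2 varies independently.
toy = np.array([[1, 2, 1],
                [2, 4, 8],
                [4, 8, 2],
                [8, 16, 16]])
m_values = genorm_m(toy)
```

Here the two co-varying genes share the lowest M and the erratic gene has the highest, so it would be the first candidate excluded.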
This is the first comprehensive evaluation of reference genes in two important Populus genotypes and the only study in Populus that meets MIQE standards. Both analysis programs identified stable reference genes in both genotypes and all tissues grown under different photoperiods. This set of reference genes was found to be suitable for either genotype considered here and may potentially be suitable for other Populus species and genotypes. These results provide a valuable resource for the Populus research community.
RT-qPCR; Reference gene validation; Populus trichocarpa; Populus tremula x Populus alba
Starch serves as a temporary store of carbohydrates in plant leaves during day/night cycles. To study transcriptional regulatory modules of this dynamic metabolic process, we conducted gene regulation network analysis based on small-sample inference of graphical Gaussian model (GGM).
Time-series significance analysis was applied to Arabidopsis leaf transcriptome data to obtain a set of genes that are highly regulated under a diurnal cycle. A total of 1,480 diurnally regulated genes included 21 starch metabolic enzymes, 6 clock-associated genes, and 106 transcription factors (TF). A starch-clock-TF gene regulation network comprising 117 nodes and 266 edges was constructed by GGM from these 133 significant genes that are potentially related to the diurnal control of starch metabolism. From this network, we found that β-amylase 3 (b-amy3: At4g17090), which participates in starch degradation in the chloroplast, is the most frequently connected gene (a hub gene). The robustness of the gene-to-gene regulatory network was further analyzed by TF binding site prediction and by evaluating global co-expression of TFs and target starch metabolic enzymes. As a result, two TFs, indeterminate domain 5 (AtIDD5: At2g02070) and constans-like (COL: At2g21320), were identified as positive regulators of starch synthase 4 (SS4: At4g18240). The inference model of AtIDD5-dependent positive regulation of SS4 gene expression was experimentally supported by decreased SS4 mRNA accumulation in Atidd5 mutant plants during the light period of both short and long day conditions. COL was also shown to positively control SS4 mRNA accumulation. Furthermore, the knockout of AtIDD5 and COL led to deformation of chloroplasts and the starch granules they contain. This deformity also affected the number of starch granules per chloroplast, which increased significantly in both knockout mutant lines.
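The core quantity behind a graphical Gaussian model is the partial correlation between two genes given all others, obtained from the inverse of the correlation matrix. The sketch below shows that calculation with a simple shrinkage towards the identity matrix to keep the inverse stable when samples are few; the shrinkage form, its default value and the function name are assumptions for illustration, not the small-sample estimator used in the study.

```python
import numpy as np

def partial_correlations(data, shrinkage=0.1):
    """Partial correlation matrix from data (samples x genes)."""
    data = np.asarray(data, dtype=float)
    corr = np.corrcoef(data, rowvar=False)
    p = corr.shape[0]
    # Shrink towards the identity so the matrix stays invertible.
    shrunk = (1.0 - shrinkage) * corr + shrinkage * np.eye(p)
    precision = np.linalg.inv(shrunk)
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

# Simulated chain x -> y -> z: x and z are correlated only through y.
rng = np.random.default_rng(0)
x = rng.normal(size=2000)
y = x + 0.5 * rng.normal(size=2000)
z = y + 0.5 * rng.normal(size=2000)
pcor = partial_correlations(np.column_stack([x, y, z]), shrinkage=0.0)
```

In the simulated chain the x-z partial correlation is close to zero even though the two variables are strongly correlated, which is why a GGM edge is read as a direct association; an edge list would then be obtained by thresholding or testing the off-diagonal entries.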
In this study, we utilized a systematic approach of microarray analysis to discover the transcriptional regulatory network of starch metabolism in Arabidopsis leaves. With this inference method, the starch regulatory network of Arabidopsis was found to be strongly associated with clock genes and TFs, of which AtIDD5 and COL were evidenced to control SS4 gene expression and starch granule formation in chloroplasts.
Arabidopsis thaliana; Constans-like; Indeterminate domain 5; Graphical Gaussian model; Starch synthase 4; Transcriptional regulation
Identifying the connections between molecular and physiological processes underlying the diversity of drought stress responses in plants is key for basic and applied science. Drought stress response involves a large number of molecular pathways and subsequent physiological processes. Therefore, it constitutes an archetypical systems biology model. We first inferred a gene-phenotype network exploiting differences in drought responses of eight sunflower (Helianthus annuus) genotypes to two drought stress scenarios. Large transcriptomic data were obtained with the sunflower Affymetrix microarray, comprising 32423 probesets, and were associated to nine morpho-physiological traits (integrated transpired water, leaf transpiration rate, osmotic potential, relative water content, leaf mass per area, carbon isotope discrimination, plant height, number of leaves and collar diameter) using sPLS regression. Overall, we could associate the expression patterns of 1263 probesets to six phenotypic traits and identify if correlations were due to treatment, genotype and/or their interaction. We also identified genes whose expression is affected at moderate and/or intense drought stress together with genes whose expression variation could explain phenotypic and drought tolerance variability among our genetic material. We then used the network model to study phenotypic changes in less tractable agronomical conditions, i.e. sunflower hybrids subjected to different watering regimes in field trials. Mapping this new dataset in the gene-phenotype network allowed us to identify genes whose expression was robustly affected by water deprivation in both controlled and field conditions. The enrichment in genes correlated to relative water content and osmotic potential provides evidence of the importance of these traits in agronomical conditions.
Poplar is a model organism for high in vitro regeneration in woody plants. We chose the hybrid poplar Populus davidiana Dode × Populus bollena Lauche. By optimizing the Murashige and Skoog medium with 6-benzylaminopurine (0.3 mg/L) and naphthaleneacetic acid (0.08 mg/L), we achieved the highest frequency (90%) of shoot regeneration from poplar leaves. It was also important to improve the transformation efficiency of poplar for genetic breeding and other applications. In this study, we found a significant improvement of the transformation frequency by controlling the leaf age. Transformation efficiency was enhanced by optimizing the Agrobacterium concentration (OD600 = 0.8–1.0) and the infection time (20–30 min). According to transmission electron microscopy observations, there were more Agrobacterium invasions in the 30-day-old leaf explants than in 60-day-old and 90-day-old explants. Using the green fluorescent protein (GFP) marker, the expression of MD–GFP fusion proteins in the leaf, shoot, and root of hybrid poplar P. davidiana Dode × P. bollena Lauche was visualized for confirmation of transgene integration. Southern and Northern blot analysis also showed the integration of T-DNA into the genome and gene expression of transgenic plants. Our results suggest that younger leaves had higher transformation efficiency (~30%) than older leaves (10%).
regeneration; transformation; leaf age; poplar; Agrobacterium
The genus Populus is accepted as a model system for molecular tree biology. To investigate gene functions in Populus spp. trees, generating stable transgenic lines is the common technique for functional genetic studies. However, a limited number of genes have been targeted due to the lengthy transgenic process. Transient transformation assays complementing stable transformation have significant advantages for rapid in vivo assessment of gene function. The aim of this study is to develop a simple and efficient transient transformation assay for hybrid aspen and to demonstrate its potential applications for functional genomic approaches.
We developed an in planta transient transformation assay for young hybrid aspen cuttings using Agrobacterium-mediated vacuum infiltration. The transformation conditions such as the infiltration medium, the presence of a surfactant, the phase of bacterial growth and bacterial density were optimized to achieve a higher transformation efficiency in young aspen leaves. The Agrobacterium infiltration assay successfully transformed various cell types in leaf tissues. Intracellular localization of four aspen genes was confirmed in homologous Populus spp. using fusion constructs with the green fluorescent protein. Protein-protein interaction was detected in transiently co-transformed cells using the bimolecular fluorescence complementation technique. In vivo promoter activity was monitored over a few days in aspen cuttings that were transformed with a luciferase reporter gene driven by a circadian clock promoter.
The Agrobacterium infiltration assay developed here is a simple, enhanced-throughput method that requires minimal handling and a short transgenic process. This method will facilitate functional analyses of Populus genes in a homologous plant system.
Populus; Agrobacterium-mediated vacuum infiltration; Transient expression; Subcellular localization; Co-localization; Luciferase reporter assay
Reverse-engineering regulatory networks is one of the central challenges for computational biology. Many techniques have been developed to accomplish this by utilizing transcription factor binding data in conjunction with expression data. Of these approaches, several have focused on the reconstruction of the cell cycle regulatory network of Saccharomyces cerevisiae. The emphasis of these studies has been to model the relationships between transcription factors and their target genes. In contrast, here we focus on reverse-engineering the network of relationships among transcription factors that regulate the cell cycle in S. cerevisiae.
We have developed a technique to reverse-engineer networks of the time-dependent activities of transcription factors that regulate the cell cycle in S. cerevisiae. The model utilizes linear regression to first estimate the activities of transcription factors from expression time series and genome-wide transcription factor binding data. We then use least squares to construct a model of the time evolution of the activities. We validate our approach in two ways: by demonstrating that it accurately models expression data and by demonstrating that our reconstructed model is similar to previously-published models of transcriptional regulation of the cell cycle.
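The two-stage scheme described here, regression for transcription factor activities followed by a least-squares model of their time evolution, can be sketched as follows. The discrete-time linear form of the dynamics, the function names and the synthetic data are assumptions made for illustration; the published model may differ in how binding data are encoded and how the dynamics are parameterised.

```python
import numpy as np

def estimate_tf_activities(expression, binding):
    """Solve expression ~ binding @ activities for activities.

    expression: genes x timepoints; binding: genes x TFs.
    Returns a TFs x timepoints array.
    """
    activities, *_ = np.linalg.lstsq(binding, expression, rcond=None)
    return activities

def fit_activity_dynamics(activities):
    """Least-squares fit of M in activities[:, t + 1] ~ M @ activities[:, t]."""
    past, future = activities[:, :-1], activities[:, 1:]
    m_transposed, *_ = np.linalg.lstsq(past.T, future.T, rcond=None)
    return m_transposed.T

# Synthetic check: two hypothetical TFs whose activities rotate over time.
rng = np.random.default_rng(0)
theta = 0.5
M_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta), np.cos(theta)]])
A_true = np.zeros((2, 10))
A_true[:, 0] = [1.0, 0.0]
for t in range(9):
    A_true[:, t + 1] = M_true @ A_true[:, t]

binding = rng.normal(size=(30, 2))   # stand-in for genome-wide binding strengths
expression = binding @ A_true        # noise-free synthetic expression
A_hat = estimate_tf_activities(expression, binding)
M_hat = fit_activity_dynamics(A_hat)
```

Off-diagonal entries of the fitted matrix play the role of couplings among transcription factors; with noisy real data these estimates would need regularisation or significance testing before being read as interactions.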
Our regression-based approach allows us to build a general model of transcriptional regulation of the yeast cell cycle that includes additional factors and couplings not reported in previously-published models. Our model could serve as a starting point for targeted experiments that test the predicted interactions. In the future, we plan to apply our technique to reverse-engineer other systems where both genome-wide time series expression data and transcription factor binding data are available.
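The two-step regression described above can be illustrated with a minimal sketch. It assumes a linear model in which expression is approximated by the binding matrix multiplied by the transcription factor activities, and first-order linear dynamics for the activities. Both model forms, the function names, and the synthetic data are assumptions made for illustration and are not taken from the original work.

```python
import numpy as np

def estimate_activities(binding, expression):
    """Least-squares solution of expression ≈ binding @ activities.

    binding:    genes x TFs matrix (assumed binding strengths)
    expression: genes x timepoints matrix
    returns:    TFs x timepoints matrix of estimated activities
    """
    activities, *_ = np.linalg.lstsq(binding, expression, rcond=None)
    return activities

def fit_dynamics(activities):
    """Least-squares fit of activities[:, t+1] ≈ M @ activities[:, t]."""
    prev, nxt = activities[:, :-1], activities[:, 1:]
    # Solve prev.T @ M.T ≈ nxt.T, then transpose back to get M.
    m_transposed, *_ = np.linalg.lstsq(prev.T, nxt.T, rcond=None)
    return m_transposed.T

# Synthetic, noise-free example (hypothetical numbers, not real yeast data).
rng = np.random.default_rng(0)
n_genes, n_tfs, n_times = 50, 3, 12
true_m = np.array([[0.9, -0.3, 0.0],
                   [0.3,  0.9, 0.0],
                   [0.0,  0.2, 0.8]])
true_a = np.empty((n_tfs, n_times))
true_a[:, 0] = [1.0, 0.0, 0.5]
for t in range(1, n_times):
    true_a[:, t] = true_m @ true_a[:, t - 1]

binding = rng.normal(size=(n_genes, n_tfs))
expression = binding @ true_a

activities = estimate_activities(binding, expression)
m_hat = fit_dynamics(activities)
```

In this sketch, the off-diagonal entries of `m_hat` play the role of couplings among transcription factors. With real data, noise and collinearity between binding profiles would call for regularization or significance testing, which are omitted here.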
Gene expression profiles can be used to infer previously unknown transcriptional regulatory interactions among thousands of genes, via systems biology ‘reverse engineering’ approaches. We ‘reverse engineered’ an embryonic stem (ES)-specific transcriptional network from 171 gene expression profiles, measured in ES cells, to identify master regulators of gene expression (‘hubs’). We discovered that E130012A19Rik (E13), highly expressed in mouse ES cells as compared with differentiated cells, was a central ‘hub’ of the network. We demonstrated that E13 is a protein-coding gene implicated in regulating the commitment towards the different neuronal subtypes and glia cells. The overexpression and knock-down of E13 in ES cell lines, undergoing differentiation into neurons and glia cells, caused a strong up-regulation of the glutamatergic neuron marker Vglut2 and a strong down-regulation of the GABAergic neuron marker GAD65 and of the radial glia marker Blbp. We confirmed E13 expression in the cerebral cortex of adult mice and during development. By immuno-based affinity purification, we characterized protein partners of E13, involved in the Polycomb complex. Our results suggest a role of E13 in regulating the division between glutamatergic projection neurons and GABAergic interneurons and glia cells, possibly by epigenetic-mediated transcriptional regulation.
Motivation: Inferring the underlying regulatory pathways within a gene interaction network is a fundamental problem in Systems Biology to help understand the complex interactions and the regulation and flow of information within a system-of-interest. Given a weighted gene network and a gene in this network, the goal of an inference algorithm is to identify the potential regulatory pathways passing through this gene.
Results: In a departure from previous approaches that largely rely on the random walk model, we propose a novel single-source k-shortest paths based algorithm to address this inference problem. An important element of our approach is to explicitly account for and enhance the diversity of paths discovered by our algorithm. The intuition here is that diversity in paths can help enrich different functions and thereby better position one to understand the underlying system-of-interest. Results on the yeast gene network demonstrate the utility of the proposed approach over extant state-of-the-art inference algorithms. Beyond utility, our algorithm achieves a significant speedup over these baselines.
Availability: All data and code are freely available upon request.
Supplementary data are available at Bioinformatics online.
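The idea of enumerating k shortest paths from a source gene while favoring diverse paths can be sketched as follows. This is a generic illustration and not the published algorithm: it assumes non-negative edge weights, enumerates simple paths to one chosen target by best-first search, and uses a Jaccard overlap threshold on intermediate nodes as a stand-in diversity criterion. The graph, the function names, and the `max_overlap` parameter are all hypothetical.

```python
import heapq

def shortest_simple_paths(graph, source, target):
    """Yield (cost, path) for simple paths in non-decreasing cost order.

    graph maps node -> {neighbor: non-negative weight}.
    """
    heap = [(0.0, [source])]
    while heap:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == target:
            yield cost, path
            continue
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in path:  # keep paths loop-free
                heapq.heappush(heap, (cost + weight, path + [neighbor]))

def jaccard_overlap(path_a, path_b):
    """Jaccard similarity of the intermediate nodes of two paths."""
    inner_a, inner_b = set(path_a[1:-1]), set(path_b[1:-1])
    union = inner_a | inner_b
    return len(inner_a & inner_b) / len(union) if union else 1.0

def diverse_k_shortest(graph, source, target, k, max_overlap=0.4):
    """Pick up to k cheapest paths whose pairwise overlap stays within max_overlap."""
    chosen = []
    for cost, path in shortest_simple_paths(graph, source, target):
        if all(jaccard_overlap(path, p) <= max_overlap for _, p in chosen):
            chosen.append((cost, path))
            if len(chosen) == k:
                break
    return chosen

# Toy weighted gene network (hypothetical).
toy_graph = {
    "A": {"B": 1, "C": 2},
    "B": {"C": 1, "D": 1},
    "C": {"D": 1},
    "D": {},
}
diverse_paths = diverse_k_shortest(toy_graph, "A", "D", k=2)
```

In the toy network, the path A, B, C, D ties for second place on cost but shares half of its intermediate nodes with the cheapest path, so the filter skips it in favor of A, C, D. Best-first enumeration is exponential in the worst case, so a practical implementation would need a more efficient k-shortest-paths routine.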
The lack of understanding of stem cell differentiation and proliferation is a fundamental problem in developmental biology. Although gene regulatory networks (GRNs) for stem cell differentiation have been partially identified, the nature of differentiation dynamics and their regulation leading to robust development remain unclear. Herein, using a dynamical-systems approach to cell modeling, we performed simulations of the developmental process using all possible GRNs with a few genes, and screened for GRNs that could generate cell type diversity through cell-cell interactions. We found that model stem cells that both proliferated and differentiated always exhibited oscillatory expression dynamics, and the differentiation frequency of such stem cells was regulated, resulting in a robust number distribution. Moreover, we uncovered the common regulatory motifs for stem cell differentiation, in which a combination of regulatory motifs that generated oscillatory expression dynamics and stabilized distinct cellular states played an essential role. These findings may explain the recently observed heterogeneity and dynamic equilibrium in cellular states of stem cells, and can be used to predict regulatory networks responsible for differentiation in stem cell systems.
The mechanisms of the control and activity of the autophagy-lysosomal protein degradation machinery are emerging as an important theme for neurodevelopment and neurodegeneration. However, the underlying regulatory and functional networks of known genes controlling autophagy and lysosomal function and their role in disease are relatively unexplored. We performed a systems biology-based integrative computational analysis to study the interactions between molecular components and to develop models for regulation and function of genes involved in autophagy and lysosomal function. Specifically, we analyzed transcriptional and microRNA-based post-transcriptional regulation of these genes and performed functional enrichment analyses to understand their involvement in nervous system-related diseases and phenotypes. Transcriptional regulatory network analysis showed that binding sites for transcription factors, SREBP1, USF, AP-1 and NFE2, are common among autophagy and lysosomal genes. MicroRNA enrichment analysis revealed miR-130, 98, 124, 204 and 142 as the putative post-transcriptional regulators of the autophagy-lysosomal pathway genes. Pathway enrichment analyses revealed that the mTOR and insulin signaling pathways are important in the regulation of genes involved in autophagy. In addition, we found that glycosaminoglycan and glycosphingolipid pathways also make a major contribution to lysosomal gene regulation. The analysis confirmed the known contribution of the autophagy-lysosomal genes to Alzheimer and Parkinson diseases and also revealed potential involvement in tuberous sclerosis, neuronal ceroidlipofuscinoses, sepsis and lung, liver and prostatic neoplasms. To further probe the impact of autophagy-lysosomal gene deficits on neurologically-linked phenotypes, we also mined the mouse knockout phenotype data for the autophagy-lysosomal genes and found them to be highly predictive of nervous system dysfunction. 
Overall this study demonstrates the utility of systems biology-based approaches for understanding the autophagy-lysosomal pathways and gaining additional insights into the potential impact of defects in these complex biological processes.
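Enrichment analyses of the kind mentioned above are often based on an over-representation test. The sketch below uses a one-sided hypergeometric test, which is a common choice but an assumption here, since the abstract does not state which statistic was used. The function name and the example counts are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(population, annotated, sample, hits):
    """One-sided p-value P(X >= hits) for over-representation.

    population: total number of genes considered
    annotated:  genes in the population carrying the annotation
                (e.g. a binding site, miRNA target, or pathway)
    sample:     size of the gene set of interest
    hits:       annotated genes observed in the gene set
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(hits, upper + 1)
    )
    return tail / total

# Hypothetical counts: 20 genes, 5 annotated, a 5-gene set that contains all 5.
p_all_hits = hypergeom_enrichment_p(20, 5, 5, 5)
```

When many annotations are tested, as with transcription factor sites, microRNAs and pathways together, the resulting p-values would normally be corrected for multiple testing.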
systems biology; autophagy; lysosome; transcription factors; microRNA
Perennial woody species, such as poplar (Populus spp.), must acquire necessary heavy metals like zinc (Zn) while avoiding potential toxicity. Poplar contains genes with sequence homology to the genes HMA4 and PCS1 from other species, which are involved in heavy metal regulation. While basic genomic conservation exists, poplar does not have a hyperaccumulating phenotype. Poplar has a common indicator phenotype in which heavy metal accumulation is proportional to environmental concentrations but excesses are prevented. This phenotype is partly affected by regulation of HMA4 and PCS1 transcript abundance. Wild-type poplar down-regulates several transcripts in its Zn-interacting pathway at high Zn levels. Also, overexpressed PtHMA4 and PtPCS1 genes result in varying Zn phenotypes in poplar; specifically, there is a doubling of Zn accumulation in leaf tissues in an overexpressed PtPCS1 line. The genomic complement and regulation of poplar highlighted in this study support a role of HMA4 and PCS1 in Zn regulation dictating its phenotype. These genes can be altered in poplar to change its interaction with Zn. However, other poplar genes in the surrounding pathway may maintain the phenotype by inhibiting drastic changes in heavy metal accumulation with a single gene transformation.
Heavy metal; heavy metal transporter; phytochelatin synthase; poplar nutrition
The elucidation of mammalian transcriptional regulatory networks holds great promise for both basic and translational research and remains one of the greatest challenges to systems biology. Recent reverse engineering methods deduce regulatory interactions from large-scale mRNA expression profiles and cross-species conserved regulatory regions in DNA. Technical challenges faced by these methods include distinguishing between direct and indirect interactions, associating transcription regulators with predicted transcription factor binding sites (TFBSs), identifying non-linearly conserved binding sites across species, and providing realistic accuracy estimates.
We address these challenges by closely integrating proven methods for regulatory network reverse engineering from mRNA expression data, linearly and non-linearly conserved regulatory region discovery, and TFBS evaluation and discovery. Using an extensive test set of high-likelihood interactions, which we collected in order to provide realistic prediction-accuracy estimates, we show that a careful integration of these methods leads to significant improvements in prediction accuracy. To verify our methods, we biochemically validated TFBS predictions made for both transcription factors (TFs) and co-factors; we validated binding site predictions made using a known E2F1 DNA-binding motif on E2F1 predicted promoter targets, known E2F1 and JUND motifs on JUND predicted promoter targets, and a de novo discovered motif for BCL6 on BCL6 predicted promoter targets. Finally, to demonstrate accuracy of prediction using an external dataset, we showed that sites matching predicted motifs for ZNF263 are significantly enriched in recent ZNF263 ChIP-seq data.
Using an integrative framework, we were able to address technical challenges faced by state of the art network reverse engineering methods, leading to significant improvement in direct-interaction detection and TFBS-discovery accuracy. We estimated the accuracy of our framework on a human B-cell specific test set, which may help guide future methodological development.
With the increasing accumulation of omics data, a key goal of systems biology is to construct networks at different cellular levels to investigate the cellular machinery. However, there is currently no satisfactory method to construct an integrated cellular network that combines the gene regulatory network and the signaling regulatory pathway.
In this study, we integrated different kinds of omics data and developed a systematic method to construct the integrated cellular network based on coupling dynamic models and statistical assessments. The proposed method was applied to S. cerevisiae stress responses, elucidating the stress response mechanism of the yeast. From the resulting integrated cellular network under hyperosmotic stress, the highly connected hubs which are functionally relevant to the stress response were identified. Beyond hyperosmotic stress, the integrated networks under heat shock and oxidative stress were also constructed and the crosstalk among these networks was analyzed, specifying the significance of some transcription factors that serve as decision-making devices at the center of the bow-tie structure and their crucial role in the rapid adaptation scheme for responding to stress. In addition, the predictive power of the proposed method was also demonstrated.
We successfully constructed the integrated cellular network, which is validated by evidence from the literature. The integration of transcription regulations and protein-protein interactions gives more insight into the actual biological network and is more predictive than networks without integration. The method is shown to be powerful and flexible and can be used under different conditions and for different species. The coupling dynamic models of the whole integrated cellular network are very useful for theoretical analyses and for further experiments in the fields of network biology and synthetic biology.
Understanding gene interactions is a fundamental question in systems biology. Currently, modeling of gene regulations using the Bayesian Network (BN) formalism assumes that genes interact either instantaneously or with a certain amount of time delay. In reality, however, instantaneous and time-delayed biological regulations occur simultaneously. A framework that can detect and model both types of interactions simultaneously would represent gene regulatory networks more accurately.
In this paper, we introduce a framework based on the Bayesian Network (BN) formalism that can represent both instantaneous and time-delayed interactions between genes simultaneously. A novel scoring metric having firm mathematical underpinnings is also proposed that, unlike other recent methods, can score both interactions concurrently and takes into account the reality that multiple regulators can regulate a gene jointly, rather than in an isolated pair-wise manner. Further, a gene regulatory network (GRN) inference method employing an evolutionary search that makes use of the framework and the scoring metric is also presented.
By taking into consideration the biological fact that both instantaneous and time-delayed regulations can occur among genes, our approach models gene interactions with greater accuracy. The proposed framework is efficient and can be used to infer gene networks having multiple orders of instantaneous and time-delayed regulations simultaneously. Experiments are carried out using three different synthetic networks (with three different mechanisms for generating synthetic data) as well as real life networks of Saccharomyces cerevisiae, E. coli and cyanobacteria gene expression data. The results show the effectiveness of our approach.