Metabolic engineering modifies cellular function to address various biochemical applications. Underlying metabolic engineering efforts are a host of tools and knowledge that are integrated to enable successful outcomes. Concurrent development of computational and experimental tools has enabled different approaches to metabolic engineering. One approach is to leverage knowledge and computational tools to prospectively predict designs to achieve the desired outcome. An alternative approach is to utilize combinatorial experimental tools to empirically explore the range of cellular function and to screen for desired traits. This mini-review focuses on computational systems biology and synthetic biology tools that can be used in combination for prospective in silico strain design.
Metabolic engineering; Genome-scale modeling; Synthetic biology; Computational design; Biotechnology
Thermobifida fusca is a cellulolytic bacterium with potential to be used as a platform organism for sustainable industrial production of biofuels and pharmaceutical ingredients, and for other bioprocesses, due to its capability to convert plant biomass to value-added chemicals. To best develop T. fusca as a bioprocess organism, it is important to understand its native cellular processes. In the current study, we characterize the metabolic network of T. fusca through reconstruction of a genome-scale metabolic model and proteomics data. The overall goal of this study was to use multiple metabolic models generated by different methods, and to compare them to experimental data, to gain a high-confidence understanding of the T. fusca metabolic network.
We report the generation of three versions of a metabolic model of Thermobifida fusca sp. XY developed using three different approaches (automated, semi-automated, and proteomics-derived). The model closest to in vivo growth was the proteomics-derived model, which consists of 975 reactions involving 1382 metabolites and accounts for 316 EC numbers (296 genes). The model was optimized for biomass production, with an optimal flux of 0.48 doublings per hour when grown on cellobiose with a substrate uptake rate of 0.25 mmol/h. In vivo activity of the DXP pathway for terpenoid biosynthesis was also confirmed using real-time PCR.
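As an illustration of what "optimized for biomass production" means in flux balance analysis, the following is a minimal sketch on a hypothetical four-reaction toy network. It is not the T. fusca model: the reaction names, bounds, and arbitrary flux units are assumptions chosen only to show the linear program (maximize the biomass flux subject to steady-state mass balance and flux bounds). It assumes NumPy and SciPy are available.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (illustrative only, not iTfu296):
#   uptake:  -> A     substrate uptake, capped at 10 (arbitrary units)
#   convert: A -> B
#   secrete: A ->     by-product secretion
#   biomass: B ->     growth objective
reactions = ["uptake", "convert", "secrete", "biomass"]

# Stoichiometric matrix: rows are metabolites, columns are reactions.
S = np.array([
    [1, -1, -1,  0],   # metabolite A
    [0,  1,  0, -1],   # metabolite B
])

# Lower and upper flux bounds for each reaction.
bounds = [(0, 10), (0, 1000), (0, 1000), (0, 1000)]

# linprog minimizes, so negate the biomass coefficient to maximize it.
objective = np.zeros(len(reactions))
objective[reactions.index("biomass")] = -1.0

# Steady state: S v = 0 for all internal metabolites.
result = linprog(objective, A_eq=S, b_eq=np.zeros(S.shape[0]),
                 bounds=bounds, method="highs")

fluxes = dict(zip(reactions, result.x))
growth = fluxes["biomass"]
print("optimal biomass flux:", growth)
print("flux distribution:", fluxes)
```

In this toy case the uptake bound is the only active limit, so all substrate is routed to biomass. A genome-scale model has the same structure with hundreds of reactions and a biomass reaction that drains many precursors at once.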
iTfu296 provides a platform to understand and explore the metabolic capabilities of the actinomycete T. fusca for potential use in bioprocess industries for the production of biofuel and pharmaceutical ingredients. In a comparison of different model reconstruction methods, using high-throughput proteomics data as a starting point gave the model that most accurately reflected in vivo growth.
Metabolic Modeling; Flux Balance Analysis; Constraint Based Modeling; Actinomycete; Thermobifida fusca; Proteomics Profiling; Terpenoids Biosynthesis Pathway; DXP Pathway; Mevalonate Pathway; Biofuel
As with other somatic tissues, isolation of a pure population of stem cells has been a primary goal in epidermal biology. We isolated discrete populations of freshly obtained human neonatal keratinocytes (HNKs) using previously untested candidate stem cell markers aldehyde dehydrogenase (ALDH) and CD44 as well as the previously studied combination of integrin α6 and CD71. An in vivo transplantation assay combined with limiting dilution analysis was used to quantify enrichment for long-term repopulating cells in the isolated populations. The ALDH+CD44+ population was enriched 12.6-fold for long-term repopulating epidermal stem cells (EpiSCs) and the integrin α6hiCD71lo population was enriched 5.6-fold, over unfractionated cells. In addition to long-term repopulation, CD44+ALDH+ keratinocytes exhibited other stem cell properties. CD44+ALDH+ keratinocytes had self-renewal ability, demonstrated by increased numbers of cells expressing nuclear Bmi-1, serial transplantation of CD44+ALDH+ cells, and holoclone formation in vitro. CD44+ALDH+ cells were multipotent, producing greater numbers of hair follicle-like structures than CD44−ALDH− cells. Furthermore, 58% ± 7% of CD44+ALDH+ cells exhibited label-retention. In vitro, CD44+ALDH+ cells showed enhanced colony formation, in both keratinocyte and embryonic stem cell growth media. In summary, the CD44+ALDH+ population exhibits stem cell properties including long-term epidermal regeneration, multipotency, label retention, and holoclone formation. This study shows that it is possible to quantify the relative number of EpiSCs in human keratinocyte populations using long-term repopulation as a functional test of stem cell nature. Future studies will combine isolation strategies as dictated by the results of quantitative transplantation assays, in order to achieve a nearly pure population of EpiSCs.
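For readers unfamiliar with limiting dilution analysis, the sketch below shows one common way such data can be turned into a fold-enrichment figure. It assumes the standard single-hit Poisson model (a graft is positive if it received at least one repopulating cell), which the abstract does not state explicitly, and the dose and outcome numbers are invented for illustration; they are not the study's data.

```python
import math

def neg_log_likelihood(freq, doses, n_tested, n_positive):
    """Negative log-likelihood under the single-hit Poisson model:
    P(positive graft | dose) = 1 - exp(-freq * dose)."""
    nll = 0.0
    for dose, n, k in zip(doses, n_tested, n_positive):
        p_pos = 1.0 - math.exp(-freq * dose)
        # Clip to avoid log(0) when all or no grafts are positive.
        p_pos = min(max(p_pos, 1e-12), 1.0 - 1e-12)
        nll -= k * math.log(p_pos) + (n - k) * math.log(1.0 - p_pos)
    return nll

def estimate_frequency(doses, n_tested, n_positive, iters=200):
    """Maximum-likelihood frequency of repopulating cells, found by
    golden-section search over log10(frequency) in [1e-8, 1]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = -8.0, 0.0

    def f(x):
        return neg_log_likelihood(10.0 ** x, doses, n_tested, n_positive)

    for _ in range(iters):
        a = hi - inv_phi * (hi - lo)
        b = lo + inv_phi * (hi - lo)
        if f(a) <= f(b):
            hi = b   # minimum lies in [lo, b]
        else:
            lo = a   # minimum lies in [a, hi]
    return 10.0 ** ((lo + hi) / 2.0)

# Hypothetical data: cells grafted per site, grafts tested, grafts positive.
doses = [100, 1000, 10000]
tested = [8, 8, 8]
positive_unfractionated = [0, 1, 6]
positive_sorted = [1, 6, 8]

freq_unfractionated = estimate_frequency(doses, tested, positive_unfractionated)
freq_sorted = estimate_frequency(doses, tested, positive_sorted)
enrichment = freq_sorted / freq_unfractionated
print("estimated fold enrichment:", enrichment)
```

The fold enrichment is simply the ratio of the two estimated frequencies; a full analysis would also report confidence intervals and test the single-hit assumption.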
Keratinocyte; Stem cell; Epidermis; Human; Aldehyde dehydrogenase; CD44
Even with decreasing DNA synthesis costs there remains a need for inexpensive, rapid, and reliable methods for assembling synthetic DNA into larger constructs or combinatorial libraries. Advances in cloning techniques have resulted in powerful in vitro and in vivo assembly of DNA. However, monetary and time costs have limited these approaches. Here, we report an ex vivo DNA assembly method that uses cellular lysates derived from a commonly used laboratory strain of Escherichia coli for joining double-stranded DNA with short end homologies embedded within inexpensive primers. This method concurrently shortens the time and decreases costs associated with current DNA assembly methods.
DNA assembly; ex vivo; end joining; cellular lysates; colorimetric screen; synthetic biology; genetic engineering
Constraint-based metabolic models are currently the most comprehensive system-wide models of cellular metabolism. Several challenges arise when building an in silico constraint-based model of an organism that need to be addressed before flux balance analysis (FBA) can be applied for simulations. An algorithm called FBA-Gap is presented here that aids the construction of a working model based on plausible modifications to a given list of reactions that are known to occur in the organism. When applied to a working model, the algorithm gives a hypothesis concerning a minimal medium for sustaining the cell in culture. The utility of the algorithm is demonstrated in creating a new model organism and is applied to four existing working models for generating hypotheses about culture media. In modifying a partial metabolic reconstruction so that biomass may be produced using FBA, the proposed method is more efficient than a previously proposed method in that fewer new reactions are added to complete the model. The proposed method is also more accurate than other approaches in that only biologically plausible reactions and exchange reactions are used.
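The gap-filling idea can be sketched as a search for the smallest set of candidate reactions that makes the biomass precursors producible. The example below is a simplified stand-in and not the FBA-Gap algorithm itself: it checks producibility by network expansion (reachability) rather than by flux balance analysis, and the metabolite names, reaction names, and candidate pool are hypothetical.

```python
from itertools import combinations

def producible(reactions, seeds):
    """Network expansion: repeatedly fire any reaction whose substrates
    are all available, starting from the medium (seed) metabolites."""
    available = set(seeds)
    changed = True
    while changed:
        changed = False
        for _name, substrates, products in reactions:
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return available

def gap_fill(known, candidates, seeds, targets):
    """Return the smallest tuple of candidate reaction names whose addition
    makes every target producible, or None if no subset works.
    Exhaustive search; only practical for small candidate pools."""
    for size in range(len(candidates) + 1):
        for subset in combinations(candidates, size):
            if targets <= producible(known + list(subset), seeds):
                return tuple(name for name, _, _ in subset)
    return None

# Hypothetical partial reconstruction: (name, substrates, products).
known = [
    ("hex", {"glc"}, {"g6p"}),
    ("glycolysis", {"f6p"}, {"pyr"}),
    ("pdh", {"pyr"}, {"accoa"}),
]
# Hypothetical pool of biologically plausible reactions to draw from.
candidates = [
    ("pgi", {"g6p"}, {"f6p"}),
    ("alt1", {"glc"}, {"x"}),
    ("alt2", {"x"}, {"pyr"}),
]

added = gap_fill(known, candidates, seeds={"glc"}, targets={"accoa"})
print("reactions added:", added)
```

Here one added reaction closes the gap, although a two-reaction bypass also exists; preferring the smaller set mirrors the abstract's emphasis on adding as few new reactions as possible.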
Thermobifida fusca is a high-G+C-content, thermophilic, Gram-positive soil actinobacterium with high cellulolytic activity. In T. fusca, CelR is thought to act as the primary regulator of cellulase gene expression by binding to a 14-bp inverted repeat [5′-(T)GGGAGCGCTCCC(A)] that is upstream of many known cellulase genes. Previously, the ability to study the roles and regulation of cellulase genes in T. fusca has been limited largely by a lack of established genetic engineering methods for T. fusca. In this study, we developed an efficient procedure for creating precise chromosomal gene disruptions and demonstrated this procedure by generating a celR deletion strain. The celR deletion strain was then characterized using measurements for growth behavior, cellulase activity, and gene expression. The celR deletion strain of T. fusca exhibited a severely crippled growth phenotype with a prolonged lag phase and decreased cell yields for growth on both glucose and cellobiose. While the maximum endoglucanase activity and cellulase activity were not significantly changed, the endoglucanase activity and cellulase activity per cell were highly elevated. Measurements of mRNA transcript levels in both the celR deletion strain and the wild-type strain indicated that the CelR protein potentially acts as a repressor for some genes and as an activator for other genes. Overall, we established and demonstrated a method for manipulating chromosomal DNA in T. fusca that can be used to study the cellulolytic capabilities of this organism. Components of this method may be useful in developing genetic engineering methods for other currently intractable organisms.
The field of synthetic biology has made rapid progress in a number of areas including method development, novel applications, and community building. In seeking to make biology “engineerable,” synthetic biology is increasing the accessibility of biological research to researchers of all experience levels and backgrounds. One of the underlying strengths of synthetic biology is that it may establish the framework for a rigorous bottom-up approach to studying biology starting at the DNA level. Building upon the existing framework established largely by the Registry of Standard Biological Parts, careful consideration of future goals may lead to integrated multi-scale approaches to biology. Here we describe some of the current challenges that need to be addressed or considered in detail to continue the development of synthetic biology. Specifically, we discuss the areas of elucidating biological principles, computational methods, and experimental construction methodologies.
synthetic biology; multi-scale; DNA synthesis; biological standards; metabolic engineering
HOXA9-mediated up-regulation of miR-155 was noted during an array-based analysis of microRNA expression in Hoxa9−/− bone marrow (BM) cells. HOXA9 induction of miR-155 was confirmed in these samples, as well as in wild-type versus Hoxa9-deficient marrow, using northern analysis and qRT–PCR. Infection of wild-type BM with HOXA9-expressing or GFP+ control virus further confirmed HOXA9-mediated regulation of miR-155. miR-155 expression paralleled Hoxa9 mRNA expression in fractionated BM progenitors, being highest in the stem cell-enriched pools. The capacity of HOXA9 to induce myeloid colony formation was blunted in miR-155-deficient BM cells, indicating that miR-155 is a downstream mediator of HOXA9 function in blood cells. Pu.1, an important regulator of myelopoiesis, was identified as a putative downstream target for miR-155. Although miR-155 was shown to down-regulate the Pu.1 protein, HOXA9 did not appear to modulate Pu.1 expression in murine BM cells.
Microorganisms possess diverse metabolic capabilities that can potentially be leveraged for efficient production of biofuels. Clostridium thermocellum (ATCC 27405) is a thermophilic anaerobe that is both cellulolytic and ethanologenic, meaning that it can directly use the plant sugar, cellulose, and biochemically convert it to ethanol. A major challenge in using microorganisms for chemical production is the need to modify the organism to increase production efficiency. The process of properly engineering an organism is typically arduous.
Here we present a genome-scale model of C. thermocellum metabolism, iSR432, for the purpose of establishing a computational tool to study the metabolic network of C. thermocellum and facilitate efforts to engineer C. thermocellum for biofuel production. The model consists of 577 reactions involving 525 intracellular metabolites, 432 genes, and a proteomic-based representation of a cellulosome. The process of constructing this metabolic model led to suggested annotation refinements for 27 genes and identification of areas of metabolism requiring further study. The accuracy of the iSR432 model was tested using experimental growth and by-product secretion data for growth on cellobiose and fructose. Analysis using this model captures the relationship between the reduction-oxidation state of the cell and ethanol secretion and allowed for prediction of gene deletions and environmental conditions that would increase ethanol production.
By incorporating genomic sequence data, network topology, and experimental measurements of enzyme activities and metabolite fluxes, we have generated a model that is reasonably accurate at predicting the cellular phenotype of C. thermocellum and that establishes a strong foundation for rational strain design. In addition, we are able to draw some important conclusions regarding the underlying metabolic mechanisms for observed behaviors of C. thermocellum and highlight remaining gaps in the existing genome annotations.
The generation of well-characterized parts and the formulation of biological design principles in synthetic biology are laying the foundation for more complex and advanced microbial metabolic engineering. Improvements in de novo DNA synthesis and codon-optimization alone are already contributing to the manufacturing of pathway enzymes with improved or novel function. Further development of analytical and computer-aided design tools should accelerate the forward engineering of precisely regulated synthetic pathways by providing a standard framework for the predictable design of biological systems from well-characterized parts. In this review we discuss the current state of synthetic biology within a four-stage framework (design, modeling, synthesis, analysis) and highlight areas requiring further advancement to facilitate true engineering of synthetic microbial metabolism.
Feed-forward motifs are important functional modules in biological and other complex networks. The functionality of feed-forward motifs and other network motifs is largely dictated by the connectivity of the individual network components. While studies on the dynamics of motifs and networks are usually devoted to the temporal or spatial description of processes, this study focuses on the relationship between the specific architecture and the overall rate of the processes of the feed-forward family of motifs, including double and triple feed-forward loops. The search for the most efficient network architecture could be of particular interest for regulatory or signaling pathways in biology, as well as in computational and communication systems.
Feed-forward motif dynamics were studied using cellular automata and compared with differential equation modeling. The number of cellular automata iterations needed for a 100% conversion of a substrate into a target product was used as an inverse measure of the transformation rate. Several basic topological patterns were identified that order the specific feed-forward constructions according to the rate of dynamics they enable. At the same number of network nodes and constant other parameters, the bi-parallel and tri-parallel motifs provide higher network efficacy than single feed-forward motifs. Additionally, a topological property of isodynamicity was identified for feed-forward motifs where different network architectures resulted in the same overall rate of the target production.
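The iteration-count measure can be illustrated with a much simpler model than the one used in the study. The sketch below is a deterministic discrete-time mass-transfer iteration, not the paper's cellular automata; the per-edge transfer fraction (0.1), the 99% conversion threshold (used in place of 100%, which a deterministic model only approaches asymptotically), and the node names are all assumptions made for illustration.

```python
def iterations_to_convert(edges, source, target, rate=0.1,
                          threshold=0.99, max_steps=10_000):
    """Count synchronous update steps until `threshold` of the material
    that started in `source` has accumulated in `target`.
    Each step moves a fraction `rate` of a node's content along every
    outgoing edge, so rate * out_degree must not exceed 1."""
    nodes = {n for edge in edges for n in edge}
    amount = {n: 0.0 for n in nodes}
    amount[source] = 1.0
    for step in range(1, max_steps + 1):
        delta = {n: 0.0 for n in nodes}
        for src, dst in edges:
            moved = rate * amount[src]
            delta[src] -= moved
            delta[dst] += moved
        for n in nodes:
            amount[n] += delta[n]
        if amount[target] >= threshold:
            return step
    raise RuntimeError("threshold not reached within max_steps")

# Linear chain: S -> I -> P
chain = [("S", "I"), ("I", "P")]
# Single feed-forward motif: the chain plus a direct S -> P edge
feed_forward = chain + [("S", "P")]

steps_chain = iterations_to_convert(chain, "S", "P")
steps_ff = iterations_to_convert(feed_forward, "S", "P")
print("chain:", steps_chain, "iterations")
print("feed-forward:", steps_ff, "iterations")
```

Fewer iterations correspond to a higher overall transformation rate, so the count serves as an inverse rate measure in the same spirit as the abstract. Other topologies, such as bi-parallel motifs, can be compared by changing only the edge list.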
It was shown for classes of structural motifs with feed-forward architecture that network topology affects the overall rate of a process in a quantitatively predictable manner. These fundamental results can be used as a basis for simulating larger networks as combinations of smaller network modules with implications on studying synthetic gene circuits, small regulatory systems, and eventually dynamic whole-cell models.
In comparison with intensive studies of genetic mechanisms related to biological evolutionary systems, much less analysis has been conducted on metabolic network responses to adaptive evolution that are directly associated with evolved metabolic phenotypes. Metabolic mechanisms involved in laboratory evolution of Escherichia coli on gluconeogenic carbon sources, such as lactate, were studied based on intracellular flux states determined from 13C tracer experiments and 13C-constrained flux analysis. At the end point of laboratory evolution, strains exhibited a more than doubling of the average growth rate and a 50% increase in the average biomass yield. Despite different evolutionary trajectories among parallel evolved populations, most improvements were obtained within the first 250 generations of evolution and were generally characterized by a significant increase in pathway capacity. Partitioning between gluconeogenic and pyruvate catabolic flux at the pyruvate node remained almost unchanged, while flux distributions around the key metabolites phosphoenolpyruvate, oxaloacetate, and acetyl-coenzyme A were relatively flexible over the course of evolution on lactate to meet energetic and anabolic demands during rapid growth on this gluconeogenic carbon substrate. There were no clear qualitative correlations between most transcriptional expression and metabolic flux changes, suggesting complex regulatory mechanisms at multiple levels of genetics and molecular biology. Moreover, higher fitness gains for cell growth on both evolutionary and alternative carbon sources were found for strains that adaptively evolved on gluconeogenic carbon sources compared to those that evolved on glucose. These results provide a novel systematic view of the mechanisms underlying microbial adaptation to growth on a gluconeogenic substrate.
Genome-scale metabolic network models can be reconstructed for well-characterized organisms using genomic annotation and literature information. However, there are many instances in which model predictions of metabolic fluxes are not entirely consistent with experimental data, indicating that the reactions in the model do not match the active reactions in the in vivo system. We introduce a method for determining the active reactions in a genome-scale metabolic network based on a limited number of experimentally measured fluxes. This method, called optimal metabolic network identification (OMNI), allows efficient identification of the set of reactions that results in the best agreement between in silico predicted and experimentally measured flux distributions. We applied the method to intracellular flux data for evolved Escherichia coli mutant strains with lower than predicted growth rates in order to identify reactions that act as flux bottlenecks in these strains. The expression of the genes corresponding to these bottleneck reactions was often found to be downregulated in the evolved strains relative to the wild-type strain. We also demonstrate the ability of the OMNI method to diagnose problems in E. coli strains engineered for metabolite overproduction that have not reached their predicted production potential. The OMNI method applied to flux data for evolved strains can be used to provide insights into mechanisms that limit the ability of microbial strains to evolve towards their predicted optimal growth phenotypes. When applied to industrial production strains, the OMNI method can also be used to suggest metabolic engineering strategies to improve byproduct secretion. In addition to these applications, the method should prove to be useful in general for reconstructing metabolic networks of ill-characterized microbial organisms based on limited amounts of experimental data.
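The idea behind OMNI can be sketched on a toy network: search over sets of reactions to switch off, re-run flux balance analysis for each, and keep the set whose predicted fluxes best match the measurements. The example below is only an illustration under stated assumptions. It replaces the method's optimization formulation with brute-force enumeration of single removals, and the four-reaction network, flux bounds, and "measured" fluxes are hypothetical. It assumes NumPy and SciPy are available.

```python
from itertools import combinations

import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network with two routes from A to B:
#   uptake:  -> A
#   fast:    A -> B       (efficient route)
#   slow:    2 A -> B     (inefficient route)
#   biomass: B ->
reactions = ["uptake", "fast", "slow", "biomass"]
S = np.array([
    [1, -1, -2,  0],   # metabolite A
    [0,  1,  1, -1],   # metabolite B
])
upper = {"uptake": 10.0, "fast": 1000.0, "slow": 1000.0, "biomass": 1000.0}

# Hypothetical measurements: growth is lower than the full model predicts.
measured = {"uptake": 10.0, "biomass": 5.0}
removable = ["fast", "slow"]

def fba(removed):
    """Maximize biomass flux with the reactions in `removed` forced to zero."""
    bounds = [(0.0, 0.0 if r in removed else upper[r]) for r in reactions]
    objective = np.zeros(len(reactions))
    objective[reactions.index("biomass")] = -1.0  # linprog minimizes
    result = linprog(objective, A_eq=S, b_eq=np.zeros(S.shape[0]),
                     bounds=bounds, method="highs")
    return dict(zip(reactions, result.x))

def misfit(removed):
    """Sum of squared differences between predicted and measured fluxes."""
    predicted = fba(removed)
    return sum((predicted[r] - value) ** 2 for r, value in measured.items())

# Enumerate the intact model plus every single-reaction removal.
candidate_sets = [set(combo) for size in range(2)
                  for combo in combinations(removable, size)]
best = min(candidate_sets, key=misfit)
print("reactions inferred to be inactive:", best)
```

In this toy case the intact model over-predicts growth, and switching off the efficient route reproduces the measured fluxes, flagging that reaction as the candidate bottleneck. Enumeration scales poorly, which is why a dedicated optimization formulation is needed at genome scale.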
One of the major uses of in silico models in biology is to identify discrepancies between model predictions and experimental data and use these discrepancies to drive discovery of novel biological mechanisms. However, models only allow for identification of the discrepancies; they do not necessarily provide any assistance in discovering which missing or incorrect functionalities in the model cause these discrepancies. Herrgård et al. describe a new in silico method, optimal metabolic network identification, or OMNI, that performs this discovery process in an efficient and systematic manner for genome-scale metabolic networks. Given a preliminary metabolic network model and experimentally determined metabolic flux data, OMNI finds the changes that need to be made to the model so that its predictions match the experimental data as well as possible. Herrgård et al. apply the method to identify metabolic bottlenecks in experimentally evolved Escherichia coli strains and to diagnose problems in strains designed through metabolic engineering strategies to overproduce specific desirable byproducts. The OMNI method can also be adapted to a number of other settings, including identification of novel biochemical pathways in ill-characterized organisms based on limited amounts of experimental data.
Genome-scale in silico metabolic networks of Escherichia coli have been reconstructed. By using a constraint-based in silico model of a reconstructed network, the range of phenotypes exhibited by E. coli under different growth conditions can be computed, and optimal growth phenotypes can be predicted. We hypothesized that the end point of adaptive evolution of E. coli could be accurately described a priori by our in silico model since adaptive evolution should lead to an optimal phenotype. Adaptive evolution of E. coli during prolonged exponential growth was performed with M9 minimal medium supplemented with 2 g of α-ketoglutarate per liter, 2 g of lactate per liter, or 2 g of pyruvate per liter at both 30 and 37°C, which produced seven distinct strains. The growth rates, substrate uptake rates, oxygen uptake rates, by-product secretion patterns, and growth rates on alternative substrates were measured for each strain as a function of evolutionary time. Three major conclusions were drawn from the experimental results. First, adaptive evolution leads to a phenotype characterized by maximized growth rates that may not correspond to the highest biomass yield. Second, metabolic phenotypes resulting from adaptive evolution can be described and predicted computationally. Third, adaptive evolution on a single substrate leads to changes in growth characteristics on other substrates that could signify parallel or opposing growth objectives. Together, the results show that genome-scale in silico metabolic models can describe the end point of adaptive evolution a priori and can be used to gain insight into the adaptive evolutionary process for E. coli.