|Home | About | Journals | Submit | Contact Us | Français|
The capacity of microorganisms to respond to variable external conditions requires a coordination of environment-sensing mechanisms and decision-making regulatory circuits. Here, we seek to understand the interplay between these two processes by combining high-throughput measurement of time-dependent mRNA profiles with a novel computational approach that searches for key genetic triggers of transcriptional changes. Our approach helped us understand the regulatory strategies of a respiratorily versatile bacterium with promising bioenergy and bioremediation applications, Shewanella oneidensis, in minimal and rich media. By comparing expression profiles across these two conditions, we unveiled components of the transcriptional program that depend mainly on the growth phase. Conversely, by integrating our time-dependent data with a previously available large compendium of static perturbation responses, we identified transcriptional changes that cannot be explained solely by internal network dynamics, but are rather triggered by specific genes acting as key mediators of an environment-dependent response. These transcriptional triggers include known and novel regulators that respond to carbon, nitrogen and oxygen limitation. Our analysis suggests a sequence of physiological responses, including a coupling between nitrogen depletion and glycogen storage, partially recapitulated through dynamic flux balance analysis, and experimentally confirmed by metabolite measurements. Our approach is broadly applicable to other systems.
The transcriptional dynamics of microbial growth depends both on the physico-chemical properties of the external environment and on the internal state of the cell (e.g. its growth rate) (1–3). The time-dependent array of environmental stimuli and the wiring of the underlying regulatory network jointly determine the dynamical changes in gene expression (4, 5). Understanding the interplay between these two elements is an important open challenge, especially relevant for metabolically versatile microbes that may rapidly cycle through different environments. Given that a microbial cell needs to dynamically adapt to changing environmental conditions, transient changes in protein expression must be triggered by condition-specific external factors, which interact with the intracellular genetic network. However, other transcriptional programs, such as those involved in cellular division, must be robust to influences external to the cell. Thus, it may be possible to discriminate between environment-independent circuits (e.g. dedicated to controlling core cellular functions, including growth) and portions of the genetic networks that have specifically evolved to mediate environmental sensing (6–8). The interplay between genetic circuitry and external variables is particularly important during growth phase transitions. Microorganisms are able to sense the depletion of environmental resources and respond with a variety of condition-specific strategies, such as sporulation (9,10), increased motility (11) and the formation or dispersal of biofilms (12,13). DNA microarrays have been widely used to monitor the time-dependent response of various microorganisms to environmental change, showing that their fast physiological adaptation is typically accompanied by a massive transcriptional rearrangement, where selected genes dynamically adjust their level of expression as a function of the external stimuli (14–17).
Here, we combine experimental and computational approaches to study the transcriptional response upon growth phase transitions for the versatile gram-negative bacterium Shewanella oneidensis MR-1. Shewanella oneidensis is found mainly in nutrient-rich environments, such as marine sediments, often in association with fermentative communities (18–20). Shewanella species have the characteristic ability to reduce a broad spectrum of metals and organic compounds (in addition to oxygen), making them very relevant for environmental and bioenergy-related applications, such as bioremediation of metal-contaminated waters, biofuel production and microbial fuel cells (18,19,21–24). Several prior studies on S. oneidensis cultures have characterized transcriptional and proteomic responses of this microbe under various media and stress conditions (25–38). These comparative studies have mostly focused on ‘binary’ changes, i.e. on comparing transcriptional states before and after a given perturbation or environmental shift. However, to our knowledge, there has been no detailed measurement and systems level analysis of transcript levels at multiple time points as S. oneidensis transitions between exponential and stationary phase of growth.
Our results are organized as follows: first, we introduce our measurements of time-dependent mRNA expression levels during batch growth of S. oneidensis MR-1 under two radically different growth media compositions (minimal lactate and rich LB, i.e. Luria and Bertani’s Lysogenic Broth) and identify global transcriptional trends. We then implement a new growth derivative mapping (GDM) approach to compare transcriptional profiles across different time-course experiments and to discriminate genes (and processes) most likely controlled by growth-associated functions. Next, we dive into a detailed system-level analysis of the genetic response to growth phase transitions under the lactate-minimal medium. Specifically, we use a new mathematical approach (dynamic detection of transcriptional triggers or D2T2) to identify the genes through which environmental stimuli are expected to affect the internal dynamics. Our analysis highlights the importance of some specific pathways, whose metabolic relevance is confirmed by dynamic flux balance analysis (dFBA) calculations. In particular, we characterize some aspects of the transcriptional response to oxygen limitation, detecting the activation of genes previously shown to be relevant for anaerobic respiration. Moreover, we find that nitrogen limitation is coupled to storage of glycogen. Both observations are corroborated by measurement of relevant intracellular and extracellular metabolites, as well as by complementary analyses of literature information and competitive fitness assay data.
dl-Lactate (60% solution) and ammonia analysis kit were procured from Sigma-Aldrich (St Louis, MO, USA). All other components of the M4 minimal medium (Supplementary Table S1 in Supplementary text) were of highest purity grade and were also procured either from Sigma-Aldrich or Thermo-Fisher Scientific (Pittsburgh, PA, USA). Qiagen Inc. supplied the RNA protect reagent, RNAse easy kit for isolation of RNA, cDNA purification kit and RNAse-free DNAse enzyme. Other reagents and chemicals used during isolation and purification of RNA and during various steps of Shewanella chips hybridization (Affymetrix Inc.) were purchased from several different vendors: Superscript II reverse transcriptase, DTT, random hexamers and BSA from Invitrogen Inc.; Gene chip labeling reagent, One-phor-all buffer and B2 oligo from Affymetrix; DNAse from Pierce Biochemicals; MES stock, lysozyme, Goat IgG and 200 proof ethanol from Sigma-Aldrich; Terminal transferase, Herring sperm DNA and dNTPs from Promega; 0.5M EDTA solution from Gibco; Biotinylated Anti-Streptavidin antibody from Vector laboratories; SSPE, Streptavidin, SAPE, 10% Tween-20, NaOH and HCl from Thermo-Fisher Scientific; and TE Buffer (pH 8.0), Superase 1n, 5M NaCl and nuclease-free water were from Ambion.
Shewanella oneidensis MR-1 ATCC 700550 was used in this work. The strain was revived from –80°C glycerol stocks by overnight growth in LB medium. One hundred microliters of an overnight S. oneidensis MR-1 pre-culture was inoculated in M4 minimal medium containing lactate (LAC) (Supplementary Table S1) and rich LB medium separately for the inoculum, which was used for inoculation in 1.3l working volume Bioreactor vessel (Bioflo110, New Brunswick Scientific Company) for M4-lactate and LB media runs, respectively. Various growth parameters, viz., temperature (30°C), pH (7.2), aeration (1l/m) and agitation were controlled using microprocessor probes. The pH was maintained at 7.2 using automatic additions of 2N NaOH and 10% H3PO4 using peristaltic pumps attached to the bioreactor. The dissolved oxygen (dO2) probe was calibrated before the inoculation and dO2 in the vessel was maintained at 20% air saturation level using automatic control of O2 cascade throughout the experiment. We started collecting biomass samples for RNA isolation after the optical density of culture was at least above 0.150. In lactate-minimal medium (LAC), the time interval for most of the biomass samples between collections was between 1 and 6h, except after 36h only two samples were collected at 48 and 50h. For LB medium, the biomass samples were collected every 30min between 1.5 and 6h of growth (exponential phase); however, after late exponential and into stationary phase, the biomass samples were collected at time intervals of 1–4h until 36h of growth. Three more samples at 48, 50 and 55h near the end of bioreactor run were also collected. We collected two biomass samples during each collection, and both the samples were processed for RNA extraction to provide statistical significance in subsequent microarray results.
High pressure liquid chromatography (HPLC) was used for quantifying residual lactate and other organic acids (acetate, pyruvate) produced during growth of S. oneidensis MR-1 on lactate. The BioRad column (HPX 87H 300mm×7.8mm) was used on a HPLC (HP-Agilent 1090 Series II). The quantification was carried out by injecting 20µl of sample using 0.008N H2SO4 in mobile phase at 35°C. The flow rate during quantification was kept at 0.6ml/h and detection was done at 210nm. The regression equations from individual standard curve of all three compounds were used to calculate the amount of individual compound from the test sample. The retention times for lactate, acetate and pyruvate were 12.5, 14.9 and 8.9min, respectively. Residual lactate was also measured using enzymatic analysis kit (R-Biopharm). Residual ammonium in the cell-free supernatant and intracellular glycogen were measured using commercially available kits for ammonium (Sigma-Aldrich, St Louis, MO, USA) and glycogen (BioAssay Systems, Hayward, CA, USA), respectively.
At each time point of interest, 2–30ml (depending on phase of growth and cell concentration) of live bacterial culture were sampled, immediately mixed with two volumes of Qiagen RNA protect solution, vortexed and centrifuged, and RNA was isolated from biomass pellet using Qiagen’s RNA easy isolation kit according to manufacturer’s recommendation. During isolation procedure, the RNA sample was treated with Qiagen’s DNAase to get rid of any genomic DNA contamination. RNA yields were quantified on Nanodrop and UV 260/280 ratios were calculated to check purity of each RNA sample. The previously described protocol (23,31) was used for microarrays on S. oneidensis chips from Affymetrix. Briefly, ~10µg of each RNA sample were used for cDNA synthesis, followed by reverse transcription, cDNA purification and cDNA fragmentation. This was followed by labeling of cDNA and 16h of hybridization at 45°C on S. oneidensis arrays. The labeled and hybridized arrays were subjected to several cycles of washing and staining using Affymetrix Wash buffers A and B, Goat IgG, Streptavidin, Anti-streptavidin and SAPE according to Affymetrix protocol for prokaryotic arrays. This was followed by scanning of stained arrays on Affymetrix Gene Chip Scanner Model 3000. The S. oneidensis MR-1 platform and microarray data has been submitted to Gene Expression Omnibus (accession numbers: GPL8434 and GSE25821 for platform and data, respectively).
We developed the GDM method to compare gene expression profiles during growth under two nutrient conditions. The details of GDM are available in the Supplementary Text. GDM maps gene expression profiles from a time-dependent to a growth rate-dependent domain, through the following steps: first, we fit the observed OD curves to a logistic growth model of bacterial growth. Then, by matching time points corresponding to similar instantaneous growth rates in the two conditions, we generate a non-linear mapping between the rich and the minimal medium profiles, which we apply systematically to each individual gene. Finally, profiles across conditions can be compared, with the goal of finding genes whose time-dependent expression levels display significant correlation or anti-correlation across the two conditions.
The D2T2 approach is aimed at integrating gene expression time courses with steady-state profiles to identify genes that are primarily involved in the response to external perturbations. Complete method details can be found in the Supplementary Text. Here, we provide a brief summary.
The main input of the D2T2 algorithm, which ultimately provides information about the trigger genes, is a set of time course data of gene expression. However, the algorithm requires a preliminary step, which yields an influence gene network based on steady-state profiles. In this first step, the influence network model is trained on an independent large compendium of gene expression measurements taken under the assumption of steady-state conditions (39), in analogy to previous reverse engineering approaches (40–43). This initial procedure, which we recently developed and applied to the study of prion infection (44) is further shown here to provide accurate predictions on an Escherichia coli multiple perturbation dataset (see Supplementary text). The fundamental simplifying assumption of this model is that the rate of change in abundance of gene i (i.e. ) can be expressed as a linear combination of all other gene expression levels. Under steady state, this assumption reduces our model to a system of algebraic equations Ax=0, where A is a sparse n×n matrix (n is the number of genes) and each non-zero element (i.e. ) in A represents the influence of gene j on gene i. Solution to this first preliminary step is based on a relevance network approach, where pairs of genes showing a high linear proportionality across multiple conditions are first selected in order to make the influence matrix (i.e. A) sufficiently sparse. These relationships are further weighted through a multiple linear regression scheme, which evaluates the influence of a gene node in driving expression changes of the putative interacting ones. A gene network model is then selected on the basis of a Bayesian criterion, which identifies the best compromise between the model complexity and its predictive ability (See Supplementary text). It is worth noting that the solution of the regression problems is a weighted asymmetric matrix, with all diagonal elements identical to 1 resulting from rescaling each row of A by an undetermined scaling factor (i.e. the diagonal term aii). We named this rescaled matrix Ac.
In a second step (the core of the D2T2 procedure), we go beyond the static influence matrix Ac computed earlier and explicitly incorporate the measured dynamic changes in the model. This is achieved by modeling the rate of change in abundance of each gene as a linear combination of the ‘influencing-gene’ expression levels plus an additional external input (i.e. u). Here, we approximate the external function u(t) as a unique step function [u(t)=1], leading to the following formulation of the problem: . Here, the matrix B represents the influence that the perturbation u has in driving changes in the transcriptional rate of each gene. Hence, the condition-specific evolution in time of the system is described by a large-scale system of linear differential equations (ODEs) for the time-dependent gene expression profiles. Typically, the complexity of this type of model is limited by the sampling frequency and number of time points. In order to overcome these limitations, we take advantage of the initial steady-state model to reduce the fitting of the gene expression profiles to only two parameters (i.e. for gene xi we fit cii and bi, see Supplementary text for full details), and hence we write Δx=CAc x+B, where Δx is the approximation of the continuous derivative in time using the Euler scheme and C is a diagonal matrix (n×n) consisting of normalizing factors cii.
The identification of these two parameters for each single transcript aims at describing the portion of dynamical changes caused by external factors (i.e. B) from rearrangements induced by internal transcriptional regulatory mechanisms (Ac). In particular, the genes showing an incoherent behavior relative to the initial model (e.g. transient evolution) are chosen as the best candidates for direct targets of environmental changes. The underlying network and the significantly perturbed genes are identified through a rigorous statistical analysis. Once the distribution of empirical P-values has been generated, the Q-value is used to correct the index of significance for multiple testing. Categories associated to a Q-value in the 1% confidence interval are considered to be significant (similar results are obtained for the 5% confidence threshold).
dFBA was implemented as described previously (45), using a static optimization method. The detailed dFBA optimization method is described in Supplementary Text. Here, we provide a brief summary. dFBA performs iterations of flux balance analysis (46,47) with flux bounds computed based on external metabolite concentrations updated at each time step. This approach assumes that biomass and environmental metabolite changes happen at a slower scale relative to the time necessary for the achievement of intracellular steady state of metabolite levels. The S. oneidensis MR-1 genome-scale metabolic model iSO783 (48) was used to predict growth of the bacteria in the LAC experimental condition. This model contains 774 reactions and 634 unique metabolites. Initial medium composition was modeled by assuming an initial concentration of nutrients that reflects the known M4 medium components (Supplementary Table S1) with supplemented d- and l-lactate. The simulated level of external O2 was maintained throughout the dFBA simulation. To test the effect of O2 on overall growth dynamics, we run the simulation under different constant concentration values, between 2 and 8mM/h. In order to simulate the production of glycogen, we implemented a secondary objective function which maximizes a glycogen sink flux (to what could be thought of as a glycogen storage compartment) when biomass cannot be produced. This is done by imposing an objective function which is a linear combination of biomass production and glycogen production, in proportions of 99 to 1. When this objective is maximized, biomass is always favored, if the appropriate resources are available. Otherwise, excess glycogen can be observed to accumulate over time, simulating effectively the internal sequestration that happens in vivo.
Shewanella oneidensis MR-1 was cultured in a batch bioreactor using nutritionally rich medium (LB) and minimal M4 medium containing dl-lactate (LAC) (see ‘Materials and Methods’ section for details). Time-dependent mRNA expression profiles were obtained using S. oneidensis Affymetrix microarrays (BU_Shewanella_100k_v1.0, Gene Expression Omnibus platform reference number GPL8434). We collected biomass samples and measured expression at 21 time points in LB and 19 time points in LAC, covering the growth curves from early log phase to a few hours after reaching stationary phase (Figure 1A). These data sets provide snapshots of the regulatory programs employed by the organism throughout the transitions from abundantly available to depleted resources in two metabolically very different conditions (Supplementary Data set 1). An enrichment analysis for main functional categories in each cluster across the two growth conditions is presented in Supplementary Data set 2. Some initial insights into the specific transcriptional changes occurring throughout the two experiments can be obtained using standard K-means clustering (Supplementary Figure S1 in Supplementary text). In particular, in the LAC growth experiment, this analysis reveals how major events in the adaptive response at the level of the gene transcription are mainly occurring during the late-exponential phase.
To illustrate the global trends in gene expression dynamics throughout our experiments, we visualized the correlation between genome-wide transcriptional profiles at any two time points (Figure 1B–D). During growth on LAC medium, while the first consecutive time points are highly similar, a sudden drastic change marks the transition from late exponential phase into stationary phase, between 26 and 30h (Figure 1B). This sharp transition suggests a fast, highly coordinated system-level response, likely triggered by environmental changes as explored in detail later. In contrast, during growth on the LB medium, the transcriptional dynamics follows a more gradual and smoother rearrangement (Figure 1C). This is likely a consequence of the much higher number of degrees of freedom that the complex regulatory network can span while switching between the multiple carbon and nitrogen sources as they are gradually used up (49,50). Despite the clear difference between the LAC and LB transcriptional dynamics profiles, it is possible to identify weak correlations across the two experiments. As shown in Figure 1D, early log phase gene states in the LAC and LB growth conditions slightly correlate with each other, and so do transcription profiles during entry in stationary phase. This result can be interpreted as a suggestion that some components of the growth transcriptional program are environment dependent, while others are environment independent, and likely driven by the growth process itself. It is important to note that absolute times in the two experiments are different, suggesting that the LAC versus LB correlation observed holds with respect to some rescaled timeline as discussed in the following section.
In order to perform a systematic and quantitative comparative analysis of the LAC and LB time-dependent courses despite their different timescales, we devised an approach (GDM) to rescale the transcriptional profiles based on an analytical estimation of growth-rate (see ‘Materials and Methods’ section and Supplementary Text for more details) (51), inspired by previous assessments of the dependence of gene expression on growth rates (52,53). As opposed to a previous method that compared gene expression across conditions based on equal optical densities, GDM is based on an analytical estimation of growth-rate changes. Using the GDM approach, we compared rescaled transcriptional profiles for each gene across the two nutrient conditions, and identified three classes of genes (Supplementary Data set 3): a first class (Figure 2A) comprising those genes that have significant positive correlation between transcriptional profiles across the two conditions (891 genes with a Q≤0.01); a second class (Figure 2B) including 175 genes that are significantly anti-correlated; and a third class, (the remaining 3164 genes) displaying no significant correlation or anti-correlation across the two growth conditions.
Among the genes that are most significantly positively correlated between the two growth conditions, we found many genes involved in growth-dependent activities, such as ATP and aminoacyl-tRNA biosynthesis, amino acid metabolic processes, ribosomal genes, DNA replication and aerobic respiration. A notable example of correlated expression profiles in LAC and LB was found for the transcriptional regulator (TR) rpoD, the primary σD factor for exponential growth (Figure 2C). Conversely, rpoS (the σS factor and master regulator of the general stress response) displays non-correlated patterns across the two growth conditions (Supplementary Data set 3). It is known that the relative concentrations of σS and σD play a crucial role in dictating the growth strategy of a bacterium, balancing optimal growth and survival to stress events (54,55). Our data support the possibility that rpoD transcription is associated with growth-rate processes irrespective of media composition, while other regulators (e.g. rpoS) or post-transcriptional control may modulate environment-specific changes (55,56). As a further hint to the complex regulation of these two factors, relA (57), a key enzyme in the biosynthesis of the global stringent response regulatory molecule (p)ppGpp (responding to nutrient and energy starvation), shows positively correlated profiles in the LAC and LB conditions (Figure 2E). Also, rsd, an anti-sigma factor that sequesters RpoD from the RNA polymerase (58,59) displays anti-correlated expression patterns across the two conditions (Figure 2D), revealing an overall dichotomous behavior of regulators. An additional anti-correlated gene worth highlighting is csrA (Figure 2F), which is upregulated in the LAC data set and encodes a known regulator of carbon and glycogen metabolism (20,60). As evidenced later, this finding agrees with the glycogen-related transcriptional changes we observe during LAC growth (see section on Metabolite measurements and dFBA observations).
Thus, using GDM, we showed that it is possible to identify genes whose expression changes are likely associated with the growth process, and are only weakly affected by the drastically different nutritional parameters in the LAC and LB conditions. Conversely, in the next section, we ask what genes are influenced the most by environmental changes, and could be considered the main entry points for downstream transcriptional regulation.
Is it possible to use the measured expression profiles to understand environment-dependent genetic interactions (61,62), and in particular to identify the gene nodes (triggers) that mediate the response to changing environmental conditions? A lot of work has been devoted to clustering microarray data and reconstructing regulatory maps (5,42,43,63–65). Here, we focus on a rather different and complementary aspect, specifically on using time-course data to infer key genes that serve as interface between the internal network and the external environment. To overcome issues typically encountered in time-course analyses (such as small number of time samples compared to a large number of transcripts), we propose a new method, which capitalizes on the availability of a large compendium of S. oneidensis transcriptional response data to a variety of environmental perturbations (M3D) (39) in addition to our time-dependent profiles. Specifically, our approach, named D2T2 (see Supplementary text for complete details), performs reverse engineering of gene expression time courses through integration with steady-state profiles.
Briefly, the D2T2 algorithm operates in two main steps (Figure 3). A gene network model is first trained on a large compendium of gene expression measurements taken under the assumption of steady state, using a relevance network approach (see ‘Materials and Methods’ section and Supplementary text for full details). These relationships are further weighted through a multiple linear regression scheme, and refined to reach the best compromise between the model complexity and its predictive ability. The time-dependent gene expression profile, describing the condition-specific dynamic behavior of the system, is then integrated with the previously determined static network, leading to the predictions of the key responsive genes. More specifically, the genes showing an incoherent behavior relative to the initial model (e.g. transient response) are chosen as the best candidates for being the mediators (or triggers) of the overall transcriptional rearrangement. Both the underlying network and the trigger genes are determined based on statistical inference, with confidence quantifiable through Q-values (Supplementary Data set 4).
Before applying the D2T2 algorithm to the new S. oneidensis data, we tested its performance on the time-dependent transcriptional profiles in E. coli under the effect of a quinolone antibiotic (Norfloxacin) (43), and we asked whether D2T2 could recover the molecular targets and the mechanisms of action of this antibiotic, which had been well characterized before (66–68). In Table 1, we compare the performance of D2T2 with the approach of Time Series Network Identification (TSNI) (43). Full details of the analysis of the results from applying D2T2 to E. coli can be found in Supplementary text.
Our ranking of confirmed Norfloxacin targets was consistently and significantly improved relative to the TSNI results, giving e.g. recA and gyrA among the top hits (see Table 1 for details). Our method also identified two genes, tisA and tisB, so far not suspected to be implicated in quinolone action, but which could explain the interplay between quinolone antibiotics and the observed membrane depolarization (69). Our finding is in agreement with a recent report linking TisB to proton motive force and ATP levels and highlighting the role of this gene in persistency and drug tolerance (70).
In applying the D2T2 approach to our S. oneidensis transcriptional data, we had no discrete perturbation events (as opposed to the E. coli antibiotics case described above). However, we envisaged that nutrient availability might be similarly linked to transcriptional triggers whose signals propagate through the internal genetic networks. The D2T2 approach allowed us to focus on what we refer to as perturbed, or ‘trigger’ genes (290 in the LAC and 430 in LB condition, at Q≤0.01, Supplementary Data set 4). Among these trigger genes, 17 are known or putative TRs in the LAC and 24 in the LB medium (Table 2). These regulators can be thought of as the responsive triggers that dictate the overall response to the environmental changes during both growth conditions.
In Table 2 and Figure 4A, we showed 17 significantly perturbed TRs that are identified by D2T2 as the major environmental triggers during growth in LAC minimal medium. Based on available literature, we confirmed that most of these identified TRs are known to have an important role as environmental sensors in different organisms (Table 3). Many of these TRs are related to carbon and nitrogen metabolism or depletion, most notably lldR (SO3460), a gene recently shown to regulate the transcription of the L-LDH operon (74), and rpoN, a well-characterized sigma factor associated with nitrogen limitation (92). Their perturbed signals seem to propagate to the downstream nodes: lldR is likely responsible for the subsequent repression of SO0827 (l-lactate permease of the LctP family) and lldE (subunit of l-lactate dehydrogenase) (74,88). It is worth noting that both lldE and SO0827 show a significant reduction in expression near the end of the exponential growth phase, while lldR tends to increase, suggesting that lldR acts as a repressor of lldE and SO0827 (Figure 5A). Alternatively, these two genes might be coregulated by other TRs or other signalling circuits, as corroborated by the fact that both score very high in the D2T2 algorithm (Supplementary Data set 4). The potential complexity of the regulatory circuit determining the fate of lactate utilization is supported by the fact that in E. coli, lldR acts both as a repressor and an activator of the lldPRD operon (73).
Other TRs that appear to act as interfaces for environmental sensing include EtrA/Fnr (electron transport regulator) and ArcA (TR for aerobic respiration). In our data, both TRs are up-regulated during mid-log phase and down-regulated upon entry into late-log phase (Figure 5B). EtrA, a homolog of FNR in E. coli (18,81), has recently been shown to be involved in fine tuning the expression of anaerobic metabolism genes in S. oneidensis (82). Another report (83) showed that the expression of either ArcA or EtrA/Fnr was not influenced by growth conditions. Our D2T2 analysis suggests that both regulators play an important role during aerobic growth, responding to high oxygen demand during fast growth (as supported by data in the section on metabolite profiles and dFBA discussed later), and possibly to the initial sensing of carbon and nitrogen starvation.
A third example of TRs whose transcriptional changes are best explained by external factors using our D2T2 algorithm is the pair of nitrogen-related global regulators lrp (leucine responsive protein) and rpoN (σ54). These regulators affect a number of genes relevant to nitrogen utilization (ntrB, ntrC, amtB_1, amtB_2, glnA, glnK_1, glnK_2). In addition, the network of interdependencies obtained by D2T2 shows that these downstream genes are also highly connected to several glycogen-related genes (glgA, glgB, glgC, glgP, glgX, see Figure 5C and Supplementary Figure S8). Both RpoN and Lrp, as well as the two-component glutamine regulatory system (NtrBC) are known to be involved in a complex regulatory network that plays a central role during growth on limited nitrogen in E. coli (92, 93). One interesting pattern that emerges from our data is a tight relationship between genes associated with the nitrogen starvation response and genes involved in glycogen metabolism, which are up-regulated between 28 and 34h, with highest peak at ~31h (Figure 5C). This pattern can be visually observed by analyzing the synchronized expression profiles of the two categories. In addition, the network of gene interdependencies obtained from the D2T2 algorithm places the nitrogen and glycogen related genes into a single highly interconnected cluster (Supplementary Figure S8). Detection of the gluconeogenesis genes (pckA, phosphoenolpyruvate carboxykinase and ppc, phosphoenolpyruvate carboxylase) by D2T2 further supports the notion that the upstream flow of carbon from lactate to glycogen via gluconeogenesis is regulated by the cell through environmental signals (Figure 5D). These observations are reported for the first time in Shewanella species and are consistent with prior knowledge of glycogen production as a storage compound in other bacterial species during exponential and/or stationary phase growth (94–97). Glycogen biosynthesis has been reported to be enhanced during acid stress in S. oneidensis (34). A correlation between glycogen accumulation and low nitrogen in the actinomycete, Micromonospora echinospora, was explained to be a result of reduced carbon flux (98). The use of lipid as a storage compound under nitrogen-limiting conditions has also been observed (99). Our study suggests a possible link between glycogen biosynthesis and nitrogen starvation in S. oneidensis, though the timing of these effects relative to the carbon starvation response in our experiment remains to be understood.
While we have been focusing here on genes that mediate the response to metabolically relevant processes, we should emphasize that D2T2 also identified other genes associated with the entry into stationary phase, such as TRs and target genes associated with flagellar biosynthesis, pili, chemotactic functions and biofilm formation (see Supplementary Figures S9 and S10; Supplementary Data set 4).
Another subtle aspect emerging from the GDM and D2T2 analyses is that the cells may respond to transcriptional triggers that are internal (i.e. associated to the status of the cell) rather than external (i.e. associated with environmental stimuli). Our initial expectation was that the D2T2 approach would only identify extracellular (i.e. environmental) entry points into the regulatory network. Our results suggest, however, that some intracellular growth-phase sensing may feed back into the system as well, effectively ‘communicating’ to the cell its growth status, and triggering a specific environment-independent transcriptional response. This phenomenon is specifically suggested by the fact that some genes (e.g. rpoD, which is regulated by cAMP/CRP in E. coli) appear both as correlated in the GDM analysis (i.e. it is regulated similarly across the two conditions, and hence is weakly dependent on nutrient availability), and as major potential trigger from the D2T2 analysis (i.e. its transcriptional changes cannot be explained simply by the internal regulatory network architecture). These genes (a total of 107) are listed in Supplementary Data set S6. Our results suggest that post-transcriptional regulation may supplement core regulatory processes providing a feed-forward control of environment on growth strategies (6). An intimate and delicate interplay between growth and environmental control might become a crucial requisite to balance the internal state of the cell and achieve optimal conditions for growth or long-term survival.
The high-throughput mRNA profile analyses using GDM and D2T2 have provided us with insight into genes whose expression is apparently associated with growth phase changes, as well as genes whose regulation is likely influenced by environmental factors, such as changes in nutrient availability. The microarray analyses presented earlier indicate that at different times along the growth curve, changes in the availability of resources (carbon, nitrogen, oxygen) trigger major transcriptional events. In order to better understand the interplay between growth-dependent processes and the effects of resource limitations on the physiology of S. oneidensis, we used a time-dependent version of the widespread genome-scale stoichiometric approach of FBA. Specifically, we set to simulate the experimentally measured growth curve and metabolite depletion using a dFBA approach (45) (see ‘Materials and Methods’ section for details), based on a recently published stoichiometric reconstruction of the S. oneidensis metabolic network (48). We initialized dFBA by using the known initial concentrations of available nutrients, and applied consecutive rounds of flux balance updates at equally spaced time intervals. As shown in Figure 6, the dFBA simulation predicts a number of important features of the growth curve consistent with the transcriptional signals described earlier. In particular, the dFBA predicts that: (i) the major nitrogen source (ammonium) should be depleted from the minimal medium approximately 2h before lactate depletion. This would explain the large number of nitrogen utilization genes identified in the D2T2 analysis, including the TRs Lrp, RpoN and ProQ (Figure 5C and Supplementary Figure S8); (ii) abundant oxygen availability (8mM) may not be enough for satisfying the demand of log-phase growing cells. Hence, it is conceivable that during log phase, the cells sense a reduction in oxygen availability. This would be compatible with the transcriptional signals identified with D2T2, in particular TRs EtrA/FNR and ArcA (Figure 5B). Notably, the dFBA model predicts (Figure 6) that under oxygen limitation, S. oneidensis will produce acetate at approximately t=12h (O2=2mM) or t=18h (O2=6mM); and (iii) if indeed ammonium runs out before lactate, there is potentially a time window during which cells would not be able to produce biomass, but would potentially be able to store the available carbon for survival during starvation periods. This hypothesis may be related to the spike of glycogen-associated genes (Figure 5C) that we observed in conjunction with major changes in nitrogen utilization processes. To verify whether the global stoichiometry of the system would be compatible with the storage of glycogen, we modified the dFBA model to include an outflow (‘storage’) of glycogen as a potential flux in the objective function. Upon this modification, the dFBA model predicts that glycogen production is compatible with other constraints in the time window between nitrogen and carbon depletion (Figure 6).
Both the gene expression analyses (Figures 4A and and5)5) and the dFBA calculations (Figure 6) indicate that carbon, nitrogen and oxygen limitations seem to underlie specific physiological transitions, to which the cells respond with transcriptional control of relevant processes. In order to verify the dFBA-based interpretation of the observed transcriptional changes, we measured at different time points the supernatant concentrations of lactate, acetate and ammonium, and the intracellular concentration of glycogen (Figure 4B–D and ‘Materials and Methods’ section). Since pyruvate secretion had been reported in previous growth experiments (100,101), we included pyruvate measurement in our experiment as well. In addition, to assess potential oxygen limitation, we also analyzed the time-dependent rate of external oxygen feeding monitored throughout the growth experiment (Figure 4E).
Consistent with the array data and with the onset of stationary phase (Figure 1B), lactate becomes depleted from the culture vessel between 30 and 32h (Figure 4D). It turns out that the concentration of ammonium drops to zero at ~30h (Figure 4C), placing the cells into a small (up to 2h) window in which carbon is still available, but cells are experiencing depletion in nitrogen levels. The nitrogen depletion is consistent with the observed increase in transcript levels of genes involved in nitrogen starvation response (Figure 5C). As suggested also by the dFBA model, this window of excess carbon coupled with nitrogen limitation could explain the observed activation of glycogen-related genes (Figure 5C), similar to previous reports in yeast (102,103). The cells, unable to produce biomass because of the lack of nitrogen, instead turn their metabolic activity to storing the available carbon. Indeed, upon measuring intracellular glycogen over time, we observed an increase from zero up to ~3µg/ml (corresponding to ~5.5µg/g dried biomass), peaking at ~30h, exactly when cells enter the window of ammonium depletion in presence of carbon (Figure 4B–D).
The transcriptional profiles and flux balance predictions also suggested that oxygen limitation occurs during fast exponential growth. In addition, the dFBA model predicted that upon oxygen limitation, acetate would be secreted as a byproduct. While the level of dO2 in our LAC growth experiment was kept constant (~20%), the rate of external oxygen feed used to maintain such level of dO2 in the bioreactor vessel underwent significant changes (Figure 4E). In particular, one can see a steep increase in O2 demand at different time intervals between mid-exponential to stationary phase of growth. At ~24h, one can observe the occurrence of a sharp increase in the oxygen uptake signal. This corresponds to the time window (24–27h) where acetate is excreted into the medium and subsequently consumed (Figure 4B) and is consistent with the up- and down-regulation of genes that convert acetaldehyde to acetate (aldA) and acetate to acetyl-CoA, respectively (Figure 5A and Supplementary Data set 4 of perturbed genes in LAC growth condition). Another sudden peak in the demand of oxygen (~27–28h) corresponds to the onset of nitrogen limitation in the culture, and corresponds to a major peak in the transcription of the oxygen sensors ArcA and EtrA/FNR (Figures 4A and and5B).5B). Note that an alternative explanation for the secretion of acetate into the external environment between 23 and 26h of growth (Figure 4B) is the sudden change in metabolic strategy occurring when nitrogen starts becoming limiting, at ~22–23h (Figure 4C). The rate of nitrogen utilization decreases sharply at this time, potentially causing a sudden overflow of carbon flux, that is relieved into the medium as acetate before the glycogen biosynthesis mechanism is ready to dissipate it.
In addition to the accumulation and consumption of acetate, we also detected accumulation and re-consumption of pyruvate (Figure 4B). A possible explanation is that the rate of lactate uptake and subsequent conversion to pyruvate outpaces the cells ability to process the pyruvate, leading to its secretion. Alternatively, the accumulation of this and other metabolites in the supernatant may be due to the break-down of lactate as a by-product of extracellular detoxification of peroxides in the environment by the bacteria. This aspect of the physiology of S. oneidensis is not caught in a straightforward manner by the dFBA model. This discrepancy may be attributed to limitations of the stoichiometric model (e.g. missing or unusually regulated reactions) or of the flux balance framework (e.g. the use of an inadequate objective function), and will require further investigation.
Complementary evidence that the D2T2 trigger genes are indeed involved in sensing external perturbations related to the three main environmental signals relevant to our experimental setup (lactate, nitrogen and oxygen), can be obtained from the analysis of recently published phenotype data (104) (Supplementary Text). This data set consists of fitness assays for 3355S. oneidensis nonessential mutants (including several of the D2T2-identified genes) in 121 diverse conditions. For each mutant, we used a two-sample t-test to test whether fitness changes were preferentially observed upon the perturbations of nutritional parameters associated with the aforementioned environmental signals (i.e. anaerobic versus aerobic, lactate versus other carbon source, diverse nitrogen sources). Several of the genes identified by D2T2 under LAC growth show a significant (P≤0.01) phenotypic signature in at least one of the tested condition (Supplementary Table S2).
In addition, in order to test whether our list of trigger genes was significantly enriched for transcripts showing phenotypic patterns upon oxygen, lactate or nitrogen signals, we performed a permutation test, where we iteratively randomly selected an equal subset of putative trigger genes and counted how many times we could find a significant phenotypic pattern in each of these conditions. The results of this analysis indicate that there is a significant enrichment in our predicted gene list for transcripts showing phenotypes under these three environmental conditions (P=0.068, 0.091 and 0.025, respectively). In particular, there seems to be a prevalence of genes responding to nitrogen availability (last row of Supplementary Table S2), pointing again to the fundamental role of this nutrient depletion in our analysis.
We have presented a new set of high-throughput transcriptional data useful for understanding growth phase physiology in S. oneidensis MR-1. In order to decode the information embedded in such data, we devised a mapping procedure (GDM) that allows us to compare transcriptional programs across different growth media. Most importantly, we developed a novel method (D2T2) for interrogating time-dependent gene expression about the entry points in the genetic network, and interpreted some of our results in light of flux balance model predictions and metabolite measurements.
Earlier studies on transcriptional regulation in S. oneidensis have focused on mRNA changes upon several discrete environmental or genetic perturbations, with no time resolution (25–30,32–35,37,38). Similar time-dependent processes in Shewanella have been studied before, but without probing transcriptional profiles (100,101). Our data, which spans two fundamentally different environmental conditions, a minimal lactate and a rich medium (both aerobic), adds a new dimension to the existing knowledge of S. oneidensis physiology (39,88,89,105). Our new data are particularly timely, given the increasingly relevant role of Shewanella species for studying microbial sediment communities and in bioenergy-related research (18,19), such as bioremediation (106), microbial fuel cells and biohydrogen production (107,108).
A large portion of the computational methods presented in this work is dedicated to the analysis of time-dependent gene expression. In contrast to other analyses of gene expression data, our focus here is intentionally not on obtaining a putative regulatory network. Rather, we sought to take advantage of our unique data sets to propose a novel way of interpreting dynamic gene expression. Our goal was to infer the genes whose dynamical changes cannot be explained by internal gene–gene influences, but likely to depend on other factors such as changes in metabolites in the environment.
By developing a time-rescaling approach (GDM), based on growth-rate estimates derived from measured optical density data, we formulated a direct comparison between time-course profiles for individual genes across the two conditions, revealing transcripts mainly responding to growth-rate variations and weakly dependent on whether simple carbon (lactate) or a complex mixture of nutritional resources are available in the medium. Our current analysis can only tell us which genes, upon rescaling, behave similarly in the two conditions considered. In principle, the same approach could be applied to multiple conditions and can help identify genes and pathways whose regulatory dynamics is truly independent of environmental conditions and changes mainly as a function of the growth phase of the cell.
Although the GDM approach allowed us to tease out transcriptional pattern that are correlated between the two conditions, a different method was necessary in order to highlight the specific environmental cues and transcriptional gates allowing cells to respond to changing nutrient availability. To address this question, we took advantage of the availability of a unique large compendium of S. oneidensis microarray data for response to multiple perturbations (39). Specifically, we devised a new algorithm (D2T2) that integrates our new time-series data presented in this manuscript with this compendium’s static measurements to decipher how environmental signals feed into the regulation of gene expression and metabolic dynamics.
Our metabolite measurements and literature analyses (Figure 4, Table 3 and Supplementary Table S2) support the D2T2 predictions. However, a direct testing of the ‘trigger’ role of predicted genes will require complementary experimental approaches. One possible avenue would be the measurement of the response of mutant S. oneidensis strains under different conditions. For example the deletion of a TR predicted to mediate the response to a given external metabolite should lead to higher phenotypic changes under conditions in which such metabolite is present than in all other conditions. Along this line, we performed a statistical analysis of recently published data on gene deletion phenotypes under multiple conditions (104), showing overall agreement with our predictions (Supplementary Table S2 and Supplementary Text). In future, more targeted experiments could further help to validate specific D2T2 predictions. For example the activity of a TR could be monitored in a high time-resolution manner using GFP constructs (109), reporting the expression level of its targets upon pulses of the specific putative sensed signal. In addition, one could directly evaluate the sensing role of a TR through protein–metabolite binding assays (110).
Interpreted through the lens of a dFBA simulation, the key transcriptional changes and entry points detected with D2T2 point to specific metabolic events that apparently dictate some of the major decisions taken by S. oneidensis throughout the growth curve. Our analysis reveals that both ammonium (nitrogen) and lactate (carbon) limitation constitute key transcriptional triggers. As suggested by transcriptional spikes and dFBA predictions, nitrogen limitation turns out to be a major metabolic determinant in our analysis, and ammonium is predicted to run out slightly before lactate does in dFBA simulations. This is also confirmed by measurements of lactate and ammonium in the culture supernatant. At the same time, the transcriptional response, as well as a modified dFBA calculation, point to a glycogen-related activity that seemed tightly coupled with the nitrogen starvation response. As verified through direct measurement of intracellular glycogen, this pattern indicates that S. oneidensis integrates information about carbon and nitrogen availability in a decision-making process that leads to the accumulation of storage carbon rather than biomass due to lack of nitrogen. This type of behavior, previously observed in other organisms, is to our knowledge reported here for the first time in S. oneidensis, suggesting how this lake sediment organism is able to survive under different modes of nutrient availabilities. The dFBA simulations presented in this work allowed us to interpret transcriptional data in light of global resource allocation constraints inherent to metabolism. In this case, the metabolic objective function, a key parameter of constraint-based models of metabolism, was assumed to be the same throughout the growth process. Future research in constraint-based models could take better advantage of data from dynamic processes to try and infer time- or condition-dependent objective functions, such as a shift from maximal growth to minimal energy turnover, in an attempt to mimic the global metabolite management decisions of the cell. Moreover, in integrated models, transcriptional data could be used to directly constrain metabolic fluxes.
Finally, the pipeline of experimental and computational approaches applied and developed for this work could be extended to other microbes and additional conditions. In Shewanella, for example, it would be highly interesting to use similar methods to explore what transcriptional changes and environmental triggers accompany the switching between different electron acceptors, or other physical factors e.g. gradual switching between aerobic and anaerobic conditions, as well as between different anaerobic acceptors including metals/metal oxides.
Supplementary Data are available at NAR Online: Supplementary Tables 1–2, Supplementary Figures 1–10, Supplementary Data sets 1–6 and Supplementary References [111–120].
Office of Science (BER), U.S. Department of Energy [DE-FG02-07ER64388 to D.S. and DE-FG02-08ER64511 to M.H.S.]; National Aeronautics and Space Administration, NASA Astrobiology Institute [NNA08CN84A to D.S.]. Funding for open access charge: Boston University.
Conflict of interest statement. None declared.
The authors are grateful to members of the Segrè lab for helpful feedback and discussions, to Timothy S. Gardner for providing laboratory facilities during the initial phase of this project, to Jennifer Reed for sharing early stoichiometric models of S. oneidensis and to Norman Lee, Director, CIC facility at Boston University, for use of HPLC.