|Home | About | Journals | Submit | Contact Us | Français|
Eukaryotic cell proliferation involves DNA replication, a tightly regulated process mediated by a multitude of protein factors. In budding yeast, the initiation of replication is facilitated by the heterohexameric origin recognition complex (ORC). ORC binds to specific origins of replication and then serves as a scaffold for the recruitment of other factors such as Cdt1, Cdc6, the Mcm2-7 complex, Cdc45 and the Dbf4-Cdc7 kinase complex. While many of the mechanisms controlling these associations are well documented, mathematical models are needed to explore the network’s dynamic behaviour. We have developed an ordinary differential equation-based model of the protein-protein interaction network describing replication initiation.
The model was validated against quantified levels of protein factors over a range of cell cycle timepoints. Using chromatin extracts from synchronized Saccharomyces cerevisiae cell cultures, we were able to monitor the in vivo fluctuations of several of the aforementioned proteins, with additional data obtained from the literature. The model behaviour conforms to perturbation trials previously reported in the literature, and accurately predicts the results of our own knockdown experiments. Furthermore, we successfully incorporated our replication initiation model into an established model of the entire yeast cell cycle, thus providing a comprehensive description of these processes.
This study establishes a robust model of the processes driving DNA replication initiation. The model was validated against observed cell concentrations of the driving factors, and characterizes the interactions between factors implicated in eukaryotic DNA replication. Finally, this model can serve as a guide in efforts to generate a comprehensive model of the mammalian cell cycle in order to explore cancer-related phenotypes.
The machinery of the eukaryotic cell cycle has been extensively dissected and described, in both simple and complex organisms. Proliferation hinges on the cell’s ability to replicate the genome with high fidelity, segregate the chromosomes equally, and ultimately divide into two genetically identical cells. A fundamental process in the regulation of DNA replication is the step-wise assembly of the pre-replicative complex (pre-RC) at origins of replication. The pre-RC is a congregation of proteins each performing a specific role. Its formation is facilitated by the six-subunit origin recognition complex (ORC), which, in the budding yeast Saccharomyces cerevisiae, binds an 11bp consensus sequence [1-3]. ORC then recruits Cdc6, which, like ORC, exhibits ATPase activity [4-6]. The co-import of Cdt1 and the Mcm2-7 complex (MCM) into the nucleus follows , and the MCM·Cdt1 heptamer is then targeted to origins by an interaction between Cdt1 and Orc6 [8,9]. Initial loading of an MCM ring at the origin requires Cdc6 ATP-hydrolysis. Reiterative loading of an additional MCM molecule occurs via ORC ATP-hydrolysis , resulting in two rings at each origin [11-13]. At this point origins are said to be licensed. In late G1 phase, a burst of Dbf4 synthesis activates the Dbf4-dependent kinase Cdc7 (DDK), which then phosphorylates multiple MCM subunits [14-18], bringing about a conformational change that stimulates MCM helicase activity. Dbf4 levels decrease over the course of S-phase and, starting at the metaphase/anaphase transition, Dbf4 is actively degraded by the anaphase promoting complex (APC) and its activating co-factor, Cdc20 [19-23]. In this way, Dbf4 levels are prevented from rising until the next G1/S transition.
The phosphorylation of MCM by DDK is coincident with the phosphorylation of the protein factors Sld2 and Sld3 by Clb5-Cdc28, a cyclin-dependent kinase (CDK) complex, the activity of which rises just prior to S-phase entry. The Sld proteins, once phosphorylated, are stabilized as a complex with the adaptor protein Dpb11 and the tetrameric GINS complex, forming a module that interacts with Cdc45. The latter acts as a scaffold for this module, which is then competent to associate with the pre-RC and attract DNA polymerase [15,24-26]. A recent study shows that the end result is the tight association of Cdc45, MCM and GINS (collectively known as CMG) with origins, allowing the unwinding of DNA and processive replication by DNA polymerase . This represents the essential role of CDK in stabilizing polymerase at the moving replication fork and switching the system from a pre-replicative state to a replicative one. From this point until late in mitosis, CDK levels remain high. This continued CDK activity prevents re-establishment of pre-RCs at origins that have already fired through a number of mechanisms. Firstly, CDK phosphorylates Cdc6, thus causing the SCFcdc4 complex to target Cdc6 to the proteasome for degradation [28-31]. Secondly, Orc2 and Orc6 are phosphorylated by CDK [32-34], with the phosphorylation of Orc6 rendering it refractory to interaction with Cdt1 , thereby preventing further MCM loading. Finally, CDK facilitates the nuclear export of both MCM and Cdt1, at different time points. Just prior to initiation, Cdt1 exits via a CDK-dependent mechanism, while MCM proteins fall off the DNA upon fork termination and are then exported in a CDK-dependent manner [7,36-38]. Thus, while CDK initiates replication, it subsequently prevents pre-RC reassembly. This illustrates its dual role in triggering initiation through formation of CMG, then preventing re-initiation by inhibiting pre-RC reformation.
Mathematical modeling has been successfully used in the past to address various aspects of the cell cycle. Early models (e.g. ) did not incorporate specific biochemical mechanisms; they were hypothetical representations of periodic cellular activity. As the molecular mechanisms driving the cell cycle were revealed, models appeared that incorporated these findings (e.g. [40-43]). For S. cerevisiae in particular, multiple modeling approaches have been applied, based both on network descriptions  and on specific molecular details such as gene expression and biochemical kinetics [45-47] (reviewed in ). Some modeling efforts have been comprehensive, such as the Tyson group’s ordinary differential equation (ODE)-based models [45,46], while others address specific cell-cycle phenomena, such as the links between cell size and cycle progression [49,50]. Spiesser et al.  developed a model of chromosomal replication, which reproduced the spatio-temporal replication profile of yeast chromosomes. Origin firing was also described in [52,53] wherein the authors used a stochastic model to describe these origin-specific features of replication.
A recent report  presented an ODE-based model describing the initiation of DNA replication, incorporating origin licensing, firing and the network of regulatory phosphorylation events. The model parameters were partly calibrated against experimental data, but largely selected through an optimization routine designed to attain an idealized function, resulting in a model that is particularly suited to exploring events specifically at the G1/S transition.
Here, we present a new model of the initiation of DNA replication. In contrast to the work of Brümmer et al. , we took a ‘bottom-up’ approach and began by gathering in vivo data for precise protein levels at specific cell cycle time-points, then calibrated our model against these values. Rather than limiting ourselves to the observation of firing near the G1/S transition and fitting to DNA-specific replication profiles, we validated our model against the behaviour of the constituent protein complexes throughout the entire cell cycle. To facilitate the use of our model in a comprehensive description of the cell cycle, we designed it to integrate easily with the model of Chen et al. . Finally, we validated the model by comparing in silico predictions to experimental observations, using both our own knockdown experiments and results from the literature. The model presented here consolidates the known interactions between DNA replication initiation proteins and the mechanisms that allow them to drive genome duplication. Additionally, regulatory aspects of the system, which ensure that re-replication does not occur, have been modeled. The model’s behaviour provides a falsifiable hypothesis regarding the dynamics of DNA replication initiation. Furthermore, it accurately predicts the phenotypes of known experimental cell cycle mutants as well as those arising through in vivo perturbations to proteins in the network. Because our model is constrained only by the fluctuating levels of replication factors, it provides a unique understanding of the kinetics governing the reactions between them. Successful integration into a whole cell cycle model allows the initiation of DNA replication to be explored in a broader quantitative context.
We began construction of our model by identifying the important players in replication initiation and establishing an interaction network, as shown in Figure1. After selecting appropriate descriptions of reaction kinetics, we generated an ODE-based model and calibrated the model parameters to in vivo data.
The model describes sixteen molecular species (twelve of which are dynamically independent) and depends on twenty-four parameters, which characterize the rates of seventeen biochemical processes (protein expression and degradation, complex association/dissociation, and transport across the nuclear membrane). The model describes the following molecular species (Figure1).
RC1 (Replication complex, state 1): origin-bound ORC
RC2: origin-bound ORC associated with CDC6
RC3: origin-bound ORC associated with CDC6, with MCM loaded
RC4: origin-bound ORC, with MCM loaded
RC5: origin-bound ORC, with MCM loaded and DBF4 associated
RC6: origin-bound ORC, with MCM loaded and DBF4 and CDC45 associated
RC7: origin-bound phosphorylated ORC
FORK: the elongation fork, with MCM and CDC45 associated
CDC6N: non-chromatin associated nuclear Cdc6
DBF4N: non-chromatin associated nuclear Dbf4
CDC45N: non-chromatin associated nuclear Cdc45
MCMC: cytosolic MCM
CDT1C: cytosolic CDT1
MCM·CDT1N: non-chromatin associated nuclear MCM bound to Cdt1
CDT1N: non-chromatin associated nuclear Cdt1
MCMN: non-chromatin associated nuclear MCM
The MCM species corresponds to dimers of Mcm2-7 heterohexamers, as two complexes are loaded at each origin. Similarly, the CDC45 species corresponds to a dimer, as described by Bowers et al. . We describe concentrations in units of molecules per cell.
The seventeen processes that make up the model are shown in Table1. Their rates depend on the species concentrations, the model parameters, and on two fixed, time-varying input functions describing the abundance of Clb5 (representing activated CDK) and of Cdc20.
In choosing reaction kinetics, we balanced the complexity of the model against its ability to adequately describe the behaviour of the overall system. We limited our description of initiation to the interactions between the pre-RC and replisome proteins that we found to be the essential core of the network (e.g. Dbf4 representing the Dbf4-Cdc7 complex, discussed below). As a result, certain processes were combined into single events, some reactions were presumed irreversible, and only some reaction rates were presumed to have non-linear kinetics.
Except for RC7, phosphorylation states are not explicitly described, as we have no data for the individual phosphorylation events. This is acceptable for our purposes as the lumped function of CDK in each case is consistent with a scenario where the effect of CDK is proportional to its concentration (i.e. [Clb5]). Additionally, processes that involve multi-protein complexes are represented by a single member – one CMG (Cdc45·Mcm2-7·GINS) complex stabilizes DNA polymerase at each replication fork. Of the three protein factors it is comprised of, Cdc45 is limiting. Although MCM is also included in the GINS complex, we model both MCM and Cdc45 as separate species. Dbf4 represents the Dbf4-Cdc7 kinase complex and Mcm2 represents the Mcm2-7 helicase. Although the protein factors Cdc45, Dbp11, Sld2, Sld3 and GINS interact to facilitate formation of the pre-initiation complex at origins, we model only Cdc45, which is the limiting factor in the CMG complex ; ultimately the number of forks fired (described by our model) is dependent on the Cdc45 concentration. We take Dbf4, which is the limiting regulatory subunit of Cdc7, as representative of active DDK, which is one of the limiting factors in replication initiation . Mcm2 is used to represent MCM complexes; the Mcm2 concentration has been reported to approximate the number of total complexes per cell in an asynchronous population [57,58]. The replication complexes in our models exist only on chromatin and therefore represent the activity of these proteins at the DNA as opposed to soluble complexes.
The network shown in Figure1 includes both reversible and irreversible reactions as indicated. Association/dissociation reactions are considered reversible, in accordance with a dynamic pre-RC/pre-IC loading mechanism as described above. In most cases, phosphorylation events are modeled as irreversible, in the absence of identified countervailing enzymes. We found that it was sufficient to describe most reaction rates by mass action kinetics. In cases where saturation occurs (the nuclear import of MCM·Cdt1, v6, and the association of Cdc45 with ORC, v13), we employed Michaelis-Menten kinetics. To simplify the description of the phosphorylation of ORC by CDK (RC7), we do not describe phosporylation and dephosphorylation explicitly, but combine them into a single dephosphorylation event whose rate is inversely proportional to the level of CDK (v17). We introduce cooperativity in this mechanism to account for multiple phosphorylation events  or an additional inhibitory CDK-Orc6 binding mechanism [34,35].
The establishment of replication complexes in our model reflects the sequential binding of proteins that constitute the pre-replicative complex. In some cases the association and dissociation of pre-RC components is reversible. We treat the loading and maintenance of Mcm2-7 helicase complexes as a dynamic process, which is dependent on the concentrations of the factors ORC, Cdc6, Cdt1 and Mcm2-7 itself. A mechanistic model for the dynamic assembly of pre-RCs was first described by the Bell lab . The requirement of pre-RC factors for maintenance of helicase-loaded origins in late G1 has been further demonstrated by work from these researchers as well as our group [8,9,60].
The remaining state variables are constrained by the following conservations:
where RCTotal, CDT1Total, MCMTotal, and CDC45Total are the fixed total number of origins, Cdt1 molecules, Mcm2-7 complexes, and Cdc45 dimers, respectively. These four factors have been shown to be present at constant levels throughout the cell cycle [7,61-64]. The value for RCTotal used in our model is 332, as described in .
The biological network responsible for the initiation of DNA replication does not oscillate autonomously; it displays periodic behaviour when driven by periodic signals from the cell cycle. Likewise, our model displays oscillations only when driven by periodic forcing input. In order to facilitate the combination of our model with the cell cycle model of Chen et al. , we used the simulated profiles of Clb5 and Cdc20 from their model as periodic inputs to ours. Cdc20 mediates the degradation of Dbf4 (reaction v4). Clb5 is responsible for Cdc6 degradation (v2), loading of Cdc45 (v13), nuclear export of free MCM (v16) and Cdt1 (v8), and phosphorylation of Orc2 and Orc6 (v17). We converted the time-varying profiles of Clb5 from the Chen et al.  model to molecules-per-cell units using the genome-wide GFP tagging and localization experiments described in [66,67]. The profile of Cdc20 was similarly obtained by scaling to cellular abundance levels reported in another study – while Cdc20 has been determined to peak at 2200 copies in a haploid cell, the functional APCCdc20 level can be estimated by considering the APC cyclosome subunit Cdc27 [68,69]. This was reported in different studies to be 593mol/cell in an asynchronous population  and at its maximal value of 750mol/cell in metaphase .
Data for Cdc45 and Cdc6 levels were obtained from individual isogenic strains in which the open reading frame of the corresponding gene was fused to a sequence encoding a 13Myc epitope tag . In Figure2, a representative western blot for Cdc45-Myc is shown (panel A), with the corresponding FACS analysis (panel B). The levels of Mcm2 were determined using an anti-Mcm2 antibody. In each of our time course experiments, cells were first arrested in late G1 phase with the mating pheromone α-factor and then released synchronously into the cell cycle, as described in Methods. From the literature, we used time course data for chromatin-bound and soluble Dbf4 and Mcm2 (to supplement our in vivo data) from Pasero et al.  and quantitation of the nuclear fraction of Cdt1 from Tanaka and Diffley . In order to convert relative measures of protein abundance to molecule-per-cell numbers, we used scaling factors obtained from the database provided by Ghaemmaghami et al. . The data is shown along with a best-fit simulation in Figure3. Raw timecourse data can be found in Additional files 12 and 3. Our calculation of molecule-per-cell estimates is demonstrated in Additional file 4. The fits in Figure3 represent our best solution to a trade-off between quality of fit and model complexity. We explored the effect of adding additional species and parameters. These additional features could, in some cases, provide minor improvements to the fit, but our confidence in the parameter estimates (discussed in the following section) suffered as the complexity of the model grew.
While our model reports proteins and complex concentrations in absolute units of molecules/cell, our accuracy regarding these values is limited to by the literature-reported cellular abundances for the various protein factors. We have used discretion when inconsistencies arise, choosing the reported values that are best justified by multiple studies. The network dynamics are based on the relative protein abundances over the time-course (indeed, many modelling efforts describe concentration changes in arbitrary units). Consequently, changes in the global protein level for a particular factor do not affect the dynamics; instead, they impact our predictions of absolute molecule/cell counts and parameter values. This is an important consideration regarding the conversion of densitometry readings to absolute values, as the overall levels are ultimately determined by a literature-derived scaling factor. Future experiments will undoubtedly result in improved estimates of protein abundances. These can be easily incorporated into the model by scaling the parameter values accordingly (with no direct impact on system dynamics).
The model parameters were calibrated using a weighted least-squares comparison with the data described above. We used a combination of global optimization (adaptive simulated annealing) and local search (Nelder-Mead simplex method) to find the best-fit parameter set shown in Table2. The table also shows the percent error associated with each parameter estimate. The percent error is the relative size of a 95% confidence interval for the estimate, calculated via the Fisher information matrix and the Cramer-Rao bound . The percentage errors show that some parameters are estimated with high confidence while others are represented with less accuracy.
Parameter values that were well constrained by the data include those specifying the rates of production, degradation and association of Cdc6 (k1, k2, k5) and Dbf4 (k3, k4, k12) as well as the rate of origin firing (k14). This reflects the strong reliability of our data for these two protein factors as well as for the proteins that form the replication complex (RC6) that gives rise to active forks.
Parameters values in which we have low confidence include those that govern the loading of MCM by Cdt1 (v7), Cdc6 dissociation from RC3 (k5r), and the phosphorylation of ORC (k17). The reversible dissociation of Cdc6 is needed to accurately fit the data and there is no evidence suggesting that ORC-Cdc6 binding is irreversible. Nevertheless, it is clear that experimental observations specific to this process are required to more precisely estimate this parameter value. The reaction whereby the MCM·Cdt1 species loads the MCM complex (v7) is extremely transient . Provided parameter k7 is sufficiently large, the kinetics of this reaction will be rapid enough to fit the data. Consequently, the data cannot support a precise estimate of the parameter value. This observation suggests that MCM loading is an extremely rapid biochemical step in pre-RC assembly. It may point to a role for Cdt1 in repeatedly targeting MCM complexes to origins throughout G1. Such a phenomenon is consistent with the requirement for a dynamic loading mechanism that ensures pre-RC fidelity up until the G1/S transition. Finally, the phosphorylation of ORC (characterized by k17) contributes to the prevention of repeated origin firing. However, this mechanism has not been well characterized, and our data is unable to accurately constrain the specifics of this process.
The kinetic rates in this network have not been the subject of prior experiments, but previous reports of protein half-lives are consistent with our predicted parameter values. Drury et al.  estimated that Cdc6 is reduced below the point of detection within 5 minutes of S-phase entry, corresponding to a half-life no longer than 1.5min. Similarly, Cheng et al.  reported that Dbf4 is reduced below visible levels within 10 minutes by the APC-dependent pathway, indicating a half-life no longer than 3min. Our model-based predictions of degradation rates correspond to half-lives of 1min. and 2.5min. for Cdc6 and Dbf4 respectively, in good agreement with these earlier findings.
Figure4 shows the simulated model behaviour for the best-fit parameter set. Some replication complex species – RC2, RC4 and RC5 – are extremely transient. Their low levels of abundance are shown separately from other RCs, on an appropriate scale. Simulations were carried out in Matlab (code available from the authors upon request).
Our initial explorations of the model revealed that the network’s behaviour is particularly sensitive to the abundance of Dbf4 and Cdc6 and relatively insensitive to the level of Cdt1. We investigated the effects of perturbations by simulating reductions in Dbf4, Cdt1 and Cdc6 (Figure5) in our model. When the Cdc6 production rate (v1) was reduced to 10% of its nominal (wild-type) value, persistence of the RC1 complex was observed. Similarly, when the Dbf4 production rate (v3) is reduced by the same relative amount, an accumulation of RC3 occurs. In both cases, the perturbation interferes with pre-initiation complex assembly and blocks the system at the nearest previous persistent RC state; RC4 is not persistent since the unloading of MCM causes a rapid transition back to RC1. It is worth noting that because MCM can dissociate from ORC (v22), RC4 represents a complex containing MCMs that will be functionally incorporated into replication forks as opposed to those that loosely associate with origins. Because the timing of our model is fixed, the various state concentrations (RC levels) indicate the progression from licensing to firing. A reduction in the FORK species compared to the wild-type case suggests a slow-down in S-phase because fewer origins are firing within the prescribed time. Using the peak abundance of the FORK species as a measure of replicative efficiency, we saw significant reductions in both simulated knock-downs (by 68% for Dbf4 and 73% for Cdc6, Figure5B, C). Conversely when we simulated the reduction of Cdt1 abundance to 10% of nominal values, origin firing was only reduced by 23%, suggesting that the network is relatively refractory to depletion of Cdt1 (Figure5D). Additional file 5 shows the levels of the various model components for each perturbation.
To investigate the accuracy of these mathematical predictions, we carried out corresponding wet lab depletion experiments. Reducing Dbf4 or Cdc6 concentrations in yeast cells to roughly 90% below normal endogenous levels resulted in a rapid G1 phase arrest, evident after 2 hours of depletion, as judged by FACS analysis indicating the accumulation of cells with 1C (unreplicated) DNA content (Figure6). In contrast, a corresponding depletion of Cdt1 had no appreciable effect, and DNA replication defects were still only minimally evident after 8 hours of further reduction. Thus, our in silico simulations using our nominal parameter set were predictive of in vivo perturbations. These experiments were used to validate the model; they were not used for calibration.
The insensitivity to perturbations in Cdt1 levels is consistent with its apparent excess relative to origins , although the number of Cdt1 molecules that act at each origin has not yet been characterized. Moreover, the mechanism by which Cdt1 aids in recruiting the helicase molecules to pre-RCs is extremely transient .
While many factors are limiting, the system appears to be highly sensitive to the levels of Cdc6. Due to its low abundance relative to MCM and Cdt1, even a moderate depletion of Cdc6 significantly alters the dynamics of pre-RC loading. The same is true for Dbf4, although in this case its role in activation of the Cdc7 kinase renders the system highly sensitive to its concentration; firing cannot occur without the Dbf4-Cdc7 complex. Since Dbf4 is, like Cdc6, limiting, flow through the network is blocked when the kinase does not reach a threshold level. Additionally at limiting levels, the number of replication forks produced by our model is significantly reduced, consistent with in vivo reports from the literature showing a lengthening of S-phase .
Further validation of our model comes from comparison with additional in vivo experiments reported in the literature. Jones et al.  showed that the interaction between the MCM complex and Dbf4 was reduced to half its wild-type level when a Dbf4 domain that binds Mcm2 was mutated, impairing S-phase progression. We mimicked this effect by reducing the rate of association of Dbf4 with RC4 (k12) by 50%, leading to a similar result (compare Figure7 panels A and B). Similarly, Zou et al.  reported that the cdc45-1 mutant shows an aberrant growth phenotype at the non-permissive temperature. This is thought to be due to a disruption of Cdc45’s ability to interact with MCM and ORC (RC6). As shown in Figure7C, by reducing the rate of Cdc45 interaction with RC6 (k13) by 50%, a marked reduction in the peak abundance of the FORK species results, indicative of a slower S-phase, as observed when the mutant was grown at the non-permissive temperature. The actual reduction in Cdc45’s association with the pre-RC due to conformational changes in the mutant might be even more pronounced than a 2-fold reduction. In any case, our simulation is consistent with Cdc45’s origin-initiation role being compromised by impairing its ability to interact with its ligands to form the CMG complex.
Our model of the initiation of DNA replication only displays oscillatory behaviour when forced with periodic signals from the cell cycle. By choosing to incorporate signals that correspond to species in the cell cycle model of Chen et al. , we were able to merge the two models in a straightforward manner. In the Chen model, the initiation of DNA replication is represented by a single lumped state variable, called ORI. At the beginning of the cell cycle, ORI has value zero. Its rate of growth depends linearly on Clb5. When it reaches a threshold value, DNA synthesis is presumed to have begun, and triggers an increase in the value of the parameter kmad2 (activity level of the Mad2 protein) leading to an inactivation of Cdc20, which is required for mitotic exit. This Mad2-dependent inhibition of Cdc20 represents the spindle assembly checkpoint , ensuring that cells with replicated DNA do not complete mitosis without properly aligning the chromosomes. When chromosomes have properly aligned on the metaphase plane kmad2 drops and Cdc20 promoted exit from mitosis. In our model, the level of DNA synthesis is represented by the FORK species. To merge the two models, we removed the ORI state from the Chen model, and instead used the FORK species to trigger the change in kmad2, as detailed in Methods.
Besides “closing the loop” between the two models by incorporating two-way inter-model signalling (involving Clb5, Cdc20, and FORK), we had to deal with a single shared species: both models describe the dynamics of Cdc6. We arrived at a merged description of Cdc6 behaviour by incorporating the dynamics of recognition complex association and dissociation into the Chen model’s formulation of Cdc6 behaviour (details are described in Methods). The resulting combined model behaves only marginally differently from either model in isolation, as shown in Additional file 6: Figure S6 and Additional file 7: Figure S7).
The recently published model of Brümmer et al.  also describes the network responsible for the initiation of DNA replication. The 51 free parameters of that model were chosen by a combination of fitting and optimization. The authors used literature-derived data to fix 28 of the kinetic parameters. The remaining 23 free parameters were not fit to data, but were selected through a procedure that optimized the coherence of origin firing and minimized re-replication (selected as hypothetical goals of evolutionary ‘design’). While it is impossible to assess the accuracy of the parameter values obtained from this procedure, the resulting idealized model provides a useful starting point for examining how the network structure constrains the system behaviour.
The model of Brümmer et al. focuses on early origin firing and so represents the mechanics of firing at the start of S-phase. In contrast, our model describes firing dynamics throughout S-phase in order to fit into the broader context of the cell cycle [76-79]. The parameter set driving our system is not filtered to retain only those that produce replication dynamics consistent with coherent firing just at the G1/S boundary. Rather, the parameters are specified by the actual cellular concentrations of the active protein factors generating replication forks. While both models incorporate the important role of CDK, Brümmer et al. emphasize the multi-site phosphorylations of several factors involved in mechanisms that minimize potential re-replication. To this end, Brümmer et al. employed a metric to assess re-replication. Their idealized model exhibits 0.0028 re-replication events per cycle. Applying the same measure to our bottom-up model yields 0.36 re-replication events per cycle (although that can readily be reduced by modifying our parameters from their best-fit values). The near-zero value obtained by Brümmer et al. is close to their idealized target of zero. Both estimates are consistent with the belief that re-replication occurs in wildtype cells, but at an extremely low rate . Because the nature of the dephosphylation of ORC (RC7→RC1 transition) remains uncharacterized, we use a conservative estimate of the number of ORC phosphorylation sites. Increasing this number by twofold, consistent with the number of CDK target residues on ORC  reduces re-replication to a value on the same order of magnitude as Brümmer’s value. Thus, both models effectively deal with representing control and prevention of rereplication.
While our model provides a sound description of the initiation of DNA replication, a number of aspects of the network remain unresolved: for example, the kinetics of MCM loading, the mechanism by which CDK phosphorylates ORC, and the details of the association of the GINS complex, Sld2, and Sld3. While modeling Cdc45 captures the events regarding CMG formation at origins, being its limiting factor, a future version of the model could better distinguish the initial Cdc45 association at origins from subsequent CMG formation. While this has minimal effect on our network dynamics and no effect on our blocking of re-replication, it would provide a better resolution of events at origins just prior to the G1/S transition. Incorporating timecourse experiment data for levels of a GINS complex member would aid in this analysis. Nevertheless, the assumptions made allow us to approximate the aforementioned processes, simplifying the network without losing information about system behaviour at the level we intend to model. With our nominal parameter set, we observe the system to behave as the ordered accumulation of proteins forming a loading complex at origins throughout the genome. Activation by increasing S-CDK levels and the concentration of Dbf4 (regulating kinase activity of Cdc7) increase linearly the number of replication forks established as a result of successive forward transition of the various replication complexes (RCs). It should be noted that as is found in vivo, not all origins fire as a consequence of being furnished with MCM-containing pre-RCs. Replication is maximal at the G1/S transition, but continues into S-phase as origin firing is temporally spaced. This is thought to ensure sufficient time to address any defects in replication and is mediated by the limiting nature of one of the initiation activators, DDK (reviewed in ). Our in vivo perturbation of Dbf4 levels reproduces this consequence and points to other system observations: Cdc6 levels are intimately controlled by CDK levels to avoid re-replication, however this mechanism is tightly regulated such that Clb5 levels rising too soon would prevent the assembly of the pre-RC in G1, a feature of the system well documented. Additionally, Cdt1 appears to act catalytically rather than stoichiometrically given the system is relatively impervious to a reduction of this factor to 10 % of its wildtype level. This might play into its role in chaperoning Mcm2-7 hexamers to origins, where they are loaded subsequently leading to the release of Cdt1, which may then be recycled to mediate the loading of other MCMs. This aspect of the system has not yet been investigated experimentally and would be of future interest.
Many human orthologs of the yeast proteins described in our network have been associated with cellular pathologies. Our model is specific to the replication machinery in budding yeast, but the mechanisms driving this process are highly conserved throughout Eukarya. Efforts to develop an analogous model in mammalian cells would be useful in understanding and dissecting cell proliferation in humans. A number of models of the mammalian cell cycle have been proposed [80-84]. Incorrect pre-RC formation has been linked to impaired DNA damage repair pathways in humans , while both Orc6  and members of the Mcm2-7 complex (reviewed in ) have been shown to be reliable cancer biomarkers. Recent work by Bicknell et al. [88,89] has shown that point mutations in the human ORC1, ORC4, ORC6, CDT1 and CDC6 genes are associated with Meier-Gorlin syndrome, a form of primordial dwarfism, and several of these mutations were determined to interfere with proper pre-RC formation. These findings highlight the potential utility of in silico mammalian models in further exploring the molecular basis of such disorders. Given that our model shows good predictive capability, it serves not only as an informational tool for yeast biology, but also as a proof of principle for higher order system models. Despite the requirement for a mammalian model to comprehensively verify specific mechanisms, the system of DNA replication initiation is conserved well enough that perturbations to proteins such as those described above can, in fact be preliminarily examined.
Although previously established replication models [51,52,54] consider the ordered timing of origin firing based on genomic replication profiles, our goal was to represent the temporal organization of origin firing as a function of the concentration of active replication species. A focus on using real protein levels as a determinant of replication dynamics is a novel approach. When used in concert with models describing genome-level origin characteristics and/or combining the findings with models exploring other cell-cycle modules, future efforts will generate a well-rounded picture of DNA replication initiation.
Details of the strains used in this study are listed in Table3. All strains are isogenic and derived from the wild-type strain DY-26.
Cells were grown in YPD medium (1% Bacto-yeast extract, 2% Bacto-peptone, 2% dextrose) to exponential phase at 30°C, washed with dH20 and resuspended in fresh YPD at a concentration of 1 x 107 cells/ml. Cultures were subsequently arrested in late G1 phase with the addition of 5μg/ml alpha-factor peptide (Sigma-Aldrich) for 2.5 hours. Cells were monitored for G1 phase arrest through microscope observation of the percentage of unbudded cells. Approximately 1.5×107 cells were collected and treated with 0.1% sodium azide then kept on ice at 4°C. The remainder of the arrested culture was then centrifuged at 200g and the pellet was washed twice with dH20. The cells were all BAR+ and thus secrete the Bar1 protein, which degrades the alpha-factor pheromone used. Medium was collected and saved during centrifugation of the original logarithmic culture containing this protein and was subsequently used to resuspend the cells for the release from G1 phase. Additionally, Pronase E (Sigma-Aldrich), an enzyme that also hydrolyzes alpha-factor, was added at a concentration of 10μg/ml to facilitate a synchronous release. Roughly the same number of cells collected in the G1-arrested samples were collected at the other time points following release and similarly treated with sodium azide and kept on ice until the completion of the time course.
To assess cell synchrony, 1.5×106 cells were removed from each time point sample. They were immediately centrifuged, re-suspended in 1ml of ice-cold 70% ethanol and stored overnight at 4°C. Cells were then re-suspended in 500μl of 50mM Tris-HCl pH 8 containing 10mg/ml of RNase A and incubated for 2 hours at 37°C. This was followed by centrifugation and re-suspension in 500μl 50mM Tris-HCl 7.5 with 2mg/ml Proteinase K. Incubation at 50°C for an hour was performed prior to final resuspension in 100μl FACS buffer (200mM Tris-HCl 7.5, 200mM NaCl and 78mM MgCl2). Cells were stained with SYTOX Green dye (5μM; Molecular Probes) for at least one hour and then analyzed using a Becton-Dickinson FACScan.
Chromatin fractionation was performed as described by Semple et al.  with some modifications. Approximately 1×107 cells collected from each time point were incubated in 7.5ml pre- spheroplasting buffer (100mM EDTA-KOH pH 8, 10mM DTT), after washing once with dH2O. They were then incubated at 30°C for 10min with gentle shaking. Cells were centrifuged and re-suspended in 7.5ml spheroplasting buffer (1 X YPD, 1.1M sorbitol) containing 0.5mg/ml Zymolyase 20T (Seikagaku Corp., Japan) and 0.1mg/ml Oxalyticase (Sigma), followed by shaking at 30°C for 30-45min with gentle mixing. Cells were then washed once with 20ml spheroplasting buffer containing 0.5mM PMSF followed by resuspension in 1ml ice-cold wash buffer (5mM Tris–HCl pH 7.4, 20mM KCl, 2mM EDTA-KOH pH 7.4, 1M sorbitol, 1% thiodiglycol, 125mM spermidine, 50mM spermine). Wash, Breakage and Lysis buffers all contained 1 tablet/10ml of EDTA-free protease inhibitors (Roche) and were supplemented with 0.5mM PMSF. Cells were centrifuged at 400g for 1min at 4°C, washed twice with 1ml of Wash buffer and then re-suspended in 800μl of Breakage buffer (5mM Tris–HCl pH 7.4, 20mM KCl, 2mM EDTA-KOH pH 7.4, 0.4M sorbitol, 1% thiodiglycol, 125mM spermidine, 50mM spermine). To these cells, 1ml of Lysis buffer (Breakage buffer supplemented with 1% Triton X-100) was added and after repeated inversion (until the solution turned clear), cells were pelleted at 16,000g for 10min. This separated the proteins bound to the chromatin (residing in the pellet, referred to as PEL) from those solubilized in the non-chromatin fraction (supernatant or SUP). After removal of the supernatant, an additional 1min spin at 16,000g was performed to isolate any residual supernatant and the pellet was re-suspended in 100μl Breakage buffer. MgCl2 (5mM) and DNase I (2μg/ml) were added to the PEL fractions to solubilize the chromatin and associated proteins. After 10min, the reaction was quenched with the addition of 2μl 0.5M EDTA. 10μl of each SUP and PEL sample were collected for protein quantification. To the remaining of each sample, ½ volume of loading buffer (270mM DTT, 9.9% SDS, 26% Glycerol and 10% Bromophenol blue) was added. Samples were subsequently boiled and equal concentrations of total protein were loaded on 7.5% SDS-PAGE gels. For an equal volume, PEL samples were 20-fold more concentrated than SUP samples due to the fact that approximately 5% of proteins are chromatin bound.
Chromatin fractionation samples were assayed for protein concentration using the Bradford assay (Biorad). Equal amounts of protein were loaded into each lane of a 10% SDS-polyacrylamide gel for each set of time course samples. Following transfer, nitrocellulose membranes were stained with Ponceau S dye (Sigma). Membranes were destained using 1 X TEN buffer for 10min. Mouse anti-Myc (Sigma, 1:5000) and Alexa Fluor 488 goat anti-mouse IgG (Invitrogen, 1:3000) antibodies were used to detect Myc-tagged Cdc45 and Cdc6. Anti-Mcm2 antibody (yN-19 goat polyclonal, Santa Cruz, 1:500) along with Alexa Fluor 488 donkey anti-goat IgG (Invitrogen, 1:3000) antibodies were used to detect Mcm2. Blots were also probed with TAT1 antibody (gift from the Gull lab) for visualizing α-tubulin, which should be exclusive to SUP samples as it does not associate with chromatin. Blots were incubated in primary and secondary antibodies for 2h each, proceeding 4h of blocking in 5% skim milk. Between blocking and each antibody treatment, blots were washed for 2×10min with 1 x TEN+0.05% Tween-20. A Typhoon 9410 scanner (GE healthcare) was used to analyze the blots. Densitometry readings were performed using Image Quant TL software (BD) and were normalized to total protein concentration as judged by the Ponceau S stain and/or tubulin band intensity. The same protocol for Western blotting was used in perturbation experiments involving HA-tagged strains. In this case, mouse anti-HA (Sigma, 1:5000) and Alexa Fluor 488 goat anti-mouse IgG (Invitrogen, 1:3000) antibodies were used to determine levels of Cdc6, Cdt1 and Dbf4.
Determination of in vivo cellular concentrations for each factor was performed for each time point. Normalization of densitometry readings was carried out by averaging the means of all SUP values from each set of experiments for a given protein. This average was divided by the means of the SUPs for each trial to give a scaling factor (S1) for each trial. Each SUP value for a given trial was then multiplied by its scaling factor (S1, S2 or S3). The same procedure was applied to all PEL sample values. In general, when a protein is in the chromatin fraction it is DNA-bound. To correct for non-specific DNA-binding, we determined, for each protein, a background level of non-specific binding corresponding to the observed abundance from a time-point at which the factor is known to be absent from origins. To obtain densitometric values, the program Imgage Quant™ was used to analyze blots scanned by a Typhoon™ 9400 imager (both GE Healthcare). To convert the densitometry measures to molecules/cell concentrations, we determined a scaling factor for each protein from the molecule counts reported in . For each experiment, we averaged the total densitometry measure (SUP plus PEL) across the time-points to arrive at an averaged asynchronous densitometry reading (weighted according to the time contribution of each sample to a 90 minute cycle), which was then compared to the database to arrive at a scaling factor. We could not follow this procedure for Cdc6 or Dbf4, since they are not included in the Ghaemmmaghami et al.  database. A scaling factor was determined for these proteins by determining the relative abundance between Cdc45, Cdc6 and Dbf4 using similarly tagged asynchronous cultures. Whole cell extract levels of Cdc45-Myc and Cdc6-Myc were analysed by Western blotting and densitometry to yield an approximately 3:1 ratio of Cdc45 to Cdc6 copies per cell. Similarly, Cdc6-HA and Dbf4-HA were compared. This analysis allowed us to arrive at an asynchronous value of 576 molecules/cell for Cdc6 (Cdc6Total=576), i.e. one third of the value reported for the concentration of Cdc45, i.e. 1730 copies/cell according to  (Cdc45Total=1730). Similarly, the concentration of Dbf4 in an asynchronous population was 270 molecules/cell (Dbf4Total=270). The copy/cell number used for Mcm2 was 40,000  while that for Cdt1 was 2190  (Cdt1Total=2190). As an example of the agreement of our simulated values with literature-observed origin stoichiometry, Mcm2 (representing the MCM complex) was present at levels that were consistent with having two MCM complexes bound to each origin. This reflects the head to head placement of the heterododecamer at the origin. Once firing occurs, two forks are produced, illustrated in our model as one Mcm2-7·Cdc45 species molecule being generated (each FORK in the model represents a pair of these complexes). The level of chromatin-associated MCM protein that we obtained corresponds to roughly 300 origins being bound in this manner, which is the range of the number of origins that are reported to potentially fire per cell per cell cycle according to various global origin characterization studies [65,90-92].
The values used to assign copies/cell numbers were taken from the best literature source available. The number for Mcm proteins in the GFP-tag database  differ by orders of magnitude from the number that has been reported by . The latter is widely accepted, and so this discrepancy cannot be ignored. While similar inconsistencies might exist for other proteins in our model, where possible, we have carefully evaluated whether the GFP-tag database number gives a plausible cellular abundance through comparison with other reported values. For example, levels for Cdc45 in  are consistent with the value reported by , while no value has been reported for Cdt1 except for that in the GFP-tag database. In the event that future studies provide more accurate values for the various cellular abundances, the model can be easily revised to accommodate these changes by scaling the parameters associated with the relevant species. Indeed, it is common practice for models to be presented in terms of ‘arbitrary’ concentration units. Our use of a molecule-per-cell concentration scale allows us to make absolute predictions about species time-courses. Any future adjustments to the species levels will scale the concentrations and parameter estimates, but will have no direct bearing on the overall dynamic behaviour of the network.
DY-139 (GAL1-CDC6), DY-140 (GAL1-CDT1) and DY-255 (GAL1-DBF4) strains were grown to 1×107 cells/ml in 2% galactose/1% raffinose medium at 30°C, centrifuged at 6000g for 5min and washed with dH20. Cells were then resuspended in 2% glucose medium, maintaining the same cell concentration. Culture aliquots were removed for FACS analysis and preparation of whole cell extracts (as described in ), both before, and at various intervals after the switch to glucose. As references for normal endogenous protein levels, whole cell extracts were also prepared from strains expressing DBF4 (DY-256), CDC6 (DY-142) and CDT1 (DY-143) from their endogenous promoters. All strains produced fusion proteins with a 3HA epitope tag to facilitate visualization via Western blotting.
We collected protein data from yeast strains which were observed to have a generation time of ~90min in log phase. In order to fit our data to the timescale of the Chen et al.  model, which has a period of 101.2 minutes, we scaled our experimental time-course to reflect this change in timing. This value, as described in their model, is chosen to reflect the longer cycle of daughter cells (which are smaller than mother cells in asymmetric cell division). Time-point samples were collected from an alpha factor arrest in late G1 phase, corresponding to experimental time-point T=0. Additional samples were collected at 5, 10, 15, 30, 45, 60 and 75min after synchronous release from the alpha factor block. We observed that our experimental timepoint T=0 corresponds to 19min after the beginning of G1 phase in the 101.2min model. We arrived at this number by comparing the point at which cells entered S phase in vivo (~15min after alpha factor release as determined by FACS analysis) and the corresponding start of S phase in the model.
One of the two inputs we used from the Chen model was the APC co-factor Cdc20. Its role in our model is to activate the APC, which rapidly degrades Dbf4. In the Chen model, Cdc20 serves exclusively as a signal to exit mitosis. The Cdc20 degradation rate is a function of the parameter kmad2, which describes the activity level of the protein Mad2, a key factor in the spindle assembly checkpoint. In order to prevent the occurrence of mitosis before replicated chromosomes have been properly aligned along the metaphase plane, the value of kmad2 jumps discontinuously from 0.01 to 8 once DNA replication has commenced, signified by the lumped variable ORI reaching a threshold value of 1. This ensures that Cdc20 levels are low, and so prevents premature mitosis. Later, when spindle assembly is complete, the value of kmad2 falls back to 0.01, allowing Cdc20 to accumulate. The signal for this event is a second lumped parameter, SPN, hitting its threshold value of 1. As a result of these discontinuous transitions in kmad2, the Cdc20 profile shows rather sharp shifts in behaviour. When we applied this profile as an input to our model, the Dbf4 profile spiked earlier and more abruptly than our laboratory observations; this discrepancy was not observed in the Chen model since that model does not address the influence of Cdc20 on Dbf4. To address this inconsistency, we smoothed the kmad2 profile to allow for a gradual decline in Mad2 activity. We chose the timing to match our Dbf4 observations. The original formulation for kmad2 specifies a value of 0.01 when ORI is less than 1 or SPN is greater than 1, and otherwise is equal to 8. We replaced this condition with:
The difference between the original Cdc20 profile and our modified version is shown in Additional file 8.
In order to implement a combined model, we closed the loop between our model and the cell cycle model of Chen et al.  by eliminating the lumped species ORI and replacing it with an indication of the progress of replication from our FORK species. In this combined model we modified the formula for kmad2 shown above by replacing the ratio (ORI/15) with , where the integration begins at the start of the cell cycle. Since each model incorporates a description of Cdc6 dynamics, we merged them by including the dynamics of origin binding from our model with the Cdc6 dynamics of the Chen model. This resulted in good accordance between the behaviour of the replication initiation network in our model in the combined model. There are two marginal differences, neither of which affects the overall system behaviour (see Additional file 8). Firstly, due to the fact that our internal model uses a Clb5 profile generated by scaling from arbitrary units, it is nearly identical to the Chen model Clb5 profile. The Chen Clb5 profile extends farther past the 101.2 minute mark than ours resulting in a more rapid dephosphorylation of RC7 in the internal model. Thus, RC7 persists for ~5 mins longer in the combined model. Secondly, because the merged Cdc6 decreases in concentration earlier than in our model in isolation, RC1 levels stay high until fork firing occurs. Despite these differences, the essential dynamics of the system are preserved: replication fork firing follows the same pattern, with the RC7→RC1 delay having no effect on timing of firing. The behaviours of the Chen model species are not perceptibly altered by the removal of ORI and the combination of the two models (Additional file 6).
The effects of combining our models are shown to be minimal in the wildtype case. Comparing eight cell cycle mutants used to fit the Chen model in the combined model, we see the identical phenotypes (see Additional file 7 and 9). These results demonstrate that our replication initiation module is a functional replacement for the corresponding black box in Chen’s whole cell cycle model. A summary of simulated mutant phenotypes [7,55,60,73,74,94-97] is given in Additional file 9.
The authors declare that they have no competing interests.
The model was conceived by GL, BI, BM and BD. Initial model implementation was carried out by PS and BI. All experimental work (except for Figure 6, RG and LD) was carried out by RG, supervised by BD. The final model implementation and analysis were done by RG and BI. The manuscript was prepared by RG. All authors read and approved the final manuscript.
Figure S1.In vivochromatin fractionation results for Cdc6 as assayed via Western blotting.
Figure S2.In vivochromatin fractionation results for Cdc45 as assayed via Western blotting.
Figure S3.In vivochromatin fractionation results for Mcm2 as assayed via Western blotting.
Table S1.Sample conversion of densitometry values to molecules per cell values for a Cdc45-myc timecourse experiment.
Figure S4. Levels of model components when Cdc6, Cdt1 or Dbf4 have been reduced to 10% of their wild-type levels.
Figure S5. The combination of DNA replication and whole cell cycle models does not alter either’s behaviour in isolation.
Figure S6.Mutant phenotypes reported in whole cell cycle model are unaltered in the combined model.
We would like to thank Pooja Viswanathan for her help in the early stages of this project. This work was supported by grant ITG-70195 from the Canadian Institutes of Health Research.