|Home | About | Journals | Submit | Contact Us | Français|
The intracellular environment imposes a variety of constraints on biochemical reaction systems that can substantially change reaction rates and equilibria relative to an ideal solution-based environment. One of the most notable features of the intracellular environment is its dense macromolecular crowding, which, among many other effects, tends to strongly enhance binding and assembly reactions. Despite extensive study of biochemistry in crowded media, it remains extremely difficult to predict how crowding will quantitatively affect any given reaction system due to the dependence of the crowding effect on numerous assumptions about the reactants and crowding agents involved. We previously developed a two dimensional stochastic off-lattice model of binding reactions based on the Green’s function reaction dynamics (GFRD) method in order to create a versatile simulation environment in which one can explore interactions among many parameters of a crowded assembly system. In the present work, we examine interactions among several critical parameters for a model dimerization system: the total concentration of reactants and inert particles, the binding probability upon a collision between two reactant monomers, the mean time of dissociation reactions, and the diffusion coefficient of the system. Applying regression models to equilibrium constants across parameter ranges shows that the effect of the total concentration is approximately captured by a low-order nonlinear polynomial model, while the other three parameter effects are each accurately captured by a linear model. Furthermore, validation on tests with multi-parameter variations reveals that the effects of these parameters are separable from one another over a broad range of variation in all four parameters. The simulation work suggests that predictive models of crowding effects can accommodate a wider variety of parameter variations than prior theoretical models have so far achieved.
Physiological and biochemical conditions inside of a cell are very different from the well-mixed and dilute conditions in typical in vitro models . One of the most significant features of the intracellular environment is its dense molecular crowding, with total concentrations in the cell generally in the range of 50–400 mg/ml . This molecular crowding can enhance many important cellular processes, such as macromolecule association [3–6], protein folding [7,8], and enzyme activity [9,10]. Molecular crowding significantly increases equilibria and reaction rates for many cellular processes, with crowded and dilute conditions sometimes differing by more than a hundred-fold . Theoretical and computational approaches have been used to predict how crowding influences biochemical reaction processes under a variety of assumptions [11–15]. Various theoretical models have treated crowding effects as a form of “correction” to classical mass-action models of idealized chemistry. For example, Kopelman  and Schnell and Turner  developed models for reaction systems in crowded media based on the concept of fractal reaction orders. Minton and collaborators developed a theoretical framework based on statistical thermodynamics models and scaled particle theory for modeling reaction thermodynamics and kinetics in terms of additive corrections to activity coefficients relative to idealized reaction rates [18–23]. Such models have made it possible to explore how many individual parameter variations in a system influence the magnitude and direction of crowding effects relative to idealized kinetics [18, 24, 25]. Such analytical approaches are crucial for gaining insight into how various parameters affect crowding, but they can be limited by the difficulty of analytically expressing and analyzing the diversity of possibly synergistic features of a system.
Simulation methods provide an alternative that can make predictions about arbitrarily complicated parameters or combinations of parameters in crowded media. Simulation methods are limited primarily by the accuracy of their underlying models and the computational cost required to apply them [15, 26]. For example, Monte Carlo lattice models provide high efficiency but with the price of oversimplified geometry that can lead them to overestimate the influence of crowding on binding chemistry [27–29]. Off-lattice models based on Brownian or Langevin dynamics provide a more realistic model of particle trajectories, but at greatly increased computational cost [30, 31]. These computational costs in turn limit one’s ability to thoroughly explore a parameter space. To attempt to better balance these competing factors, we previously developed a coarse-grained two dimensional stochastic off-lattice model (2DSOLM)  based on GFRD  for simulating binding kinetics in various molecular crowding conditions. The prior study confirmed that this coarse-grained model provides a reasonable qualitative representation of the effects of crowding on binding chemistry, exhibiting an expected enhancement of polymerization with increasing crowding by either reactant monomers or inert crowding agents. Although this simulator still yields a highly simplified representation of the cellular environment we wish to study, it provides a way to isolate specific features of crowded systems and examine how variations in those specific features, singly or in combination, would influence binding chemistry in an idealized model. Such an idealized model will provide a basis for moving towards more realistic computational models of biochemistry in the cell by helping us identify precisely which features of the cell must be explicitly incorporated into biochemical models, which can be adequately captured by simplified mathematical corrections, and which are superfluous to producing quantitatively accurate descriptions of reaction chemistry.
In the present work, we use the simulation method to explore how the parameters of 2DSOLM affect binding thermodynamics with the goal of identifying parameters that act independently of one another and can be fit to simple analytical models. We explore how dimerization propensity is affected by four adjustable parameters: the total concentration (C), defined as the ratio of all particle area to the investigated area; the probability of binding upon a collision between two reactant monomers (B); the mean time for dissociation event (M), defined as the inverse of the rate constant; and the diffusion coefficient for reactants and inert particles (D). We show that a simple regression model fit analytically to simulation results on variations of C, B, M, and D accurately predicts their variations across a broad range of the simulation parameter space. Because our model is a hypothetical homodimerization system in two dimensional space, the simulating conditions of our model are far from the actual conditions of the intracellular environment. Our model and simulation results, however, demonstrate an opportunity for which a predictive regression model can provide a means of controlling for crowding effects in simulations too large in particle numbers or too long in timescale to allow for an explicit crowding model.
We conducted simulations using a model of homodimerization reactions developed in our prior work , which models reactant monomers and dimers as well as inert crowding agents as rigid, circular particles. Simulation proceeds through GFRD , a discrete-event method for fast off-lattice particle simulation. We applied a hard reflective boundary condition with a square boundary (100 nm × 100 nm) and used a reactant monomer radius of 2.5 nm, as in our prior work . These values are meant to approximate the size scale for a globular actin monomer  in a typical eukaryotic cell. Note, however, that while the motivation of this work is to improve models of biochemistry in the cell, the model itself is a highly simplified and is more properly regarded as an idealized model of a crowded system rather than a model of the cell per se.
In order to discover the individual parameter effects on the homodimerization reaction, we first assigned default values for all parameters and then tested variations in each single parameter individually with all others held at their default values. We established a baseline simulation parameter set with default parameter values of B=0.7, M=1ns, and D=13.9×10−11m2s−1. These default values were chosen based on our prior simulation study  to produce a reasonably strong crowding effect as well as to approximate the temperature and viscosity conditions of the cytoplasm [35, 36]. The model also has three additional adjustable parameters that were not explored here: the ratio of dimer area to monomer area ( ); the ratio of inert particle area to reactant monomer area ( ); and the threshold distance, a simulation parameter describing the maximum distance at which two particles can interact with each other (dth). For the present study, we set α=2 (dimer area is exactly twice that of an unbound monomer), β=1(crowding agents have equal size to reactant monomers), and dth=0.5nm (threshold distance is one fifth of the radius of a reactant monomer), values also chosen based on our prior simulation study  to provide a physically reasonable model of dimerization. Second, we separately varied the individual parameter values for B, M, and D with the default values for the other parameters to test these individual parameter effects on the binding chemistry. We simulated five B values (0.1, 0.3, 0.5, 0.7, 0.9), five M values (0.6, 0.8, 1.0, 1.2, 1.4ns) and five D values (3.9, 13.9, 23.9, 33.9, 43.9×10−11m2s−1). For each of these conditions, we simulated eight total concentrations (C) (0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45) with dimensionless units of fraction of total simulation area occupied by particles. To produce varying concentrations, we began with a fixed concentration of reactants of 0.1 and then added inert crowding particles to yield each higher value of total concentration (e.g., C=0.1 corresponds to purely reactant monomers while C=0.25 corresponds to concentration 0.1 of reactant monomers and 0.15 of inert crowding agents). Initially, all reactants are monomers. Initialization of particle positions is performed by establishing a grid of potential particle positions at the maximum possible packing density for whichever of reactant monomers and crowding agents occupies the larger total area and then randomly inserting particles into the corresponding grid positions. This protocol was developed because it makes it possible to initialize in highly crowded conditions where independent uniform placement of particles would usually result in overlapping particles. Each simulation was run for 25μs with 30 repetitions per simulation, with progress recorded every 0.15625μs. For each condition, we measured reaction progress by the mean number of dimers as a function of time across all simulations.
We used the results of the simulations described in the preceding section to fit an equilibrium constant (Keq) to each condition. For the homodimerization reaction, the governing chemical reaction is given by Eq. (1):
where M is the reactant monomer, D is the reactant dimer, I is the inert crowding particle, K+ is the forward reaction rate, and K− is the reverse reaction rate. The equilibrium constant can be computed from Eq. (1) as follows:
where [Deq] is the concentration of dimers at the quasi-equilibrium state, [Meq] is the concentration of monomers at the quasi-equilibrium state, M0 is the number of initial monomers, Meq is the number of monomers at the quasi-equilibrium state, Deq is the number of dimers at the quasi-equilibrium state, and A is the square investigated area ((100nm)2). The inert particle in Eq. (1) cannot influence the homodimeriation reaction in an idealized mass-action model, because [I] drops out in Eq. (2). Therefore, Eq. (2) is only used to calculate the equilibrium constant from the simulation data on the assumption that 2DSOLM appropriately represents the crowding effect of all particles in the various test cases. In addition, we derived the quadratic equation to calculate the number of dimers at the quasi-equilibrium state (Deq) using the estimated Keq value from Eq. (2), shown in Eq. (3). The analytical solution of Eq. (3) is Eq. (4).
where estimates were based on averaging dimer counts at 5μs intervals from 5–25 μs. 5μs was judged a reasonable time to mix the system based on a series of additional simulations conducted to estimate the mixing time of the system under different parameter conditions. For these experiments, we placed a single reactant monomer at the center of the investigated area and measured its displacement from the center as a function of time for varying C and D values with all the other particles inert. These plots were used to establish a mixing time usable across simulation conditions for measuring steady-state dimer counts. Errorbars in all plots of simulation results were estimated from the standard deviations across the five 5μs time points and thirty repetitions per condition, giving 150 total measurements per condition. This data collection approach produces a reasonably large amount of data while allowing us to simulate the reaction for a relatively long time, providing a post hoc check that simulation time was sufficient to reach a steady-state.
To analyze the molecular crowding effect on the homodimerization reaction in 2DSOLM, we first calculated Keq values for the simulation results using Eq. (2) from C=0.1 to 0.45 at increments of 0.05. Second, we fit polynomials of degrees 0 through 12 for varying C and default values of all other parameters and measured the root mean value of the sum of square difference between the regression equation and simulations for each value of C with 10-fold cross-validation. We used the resulting best-fit to build a regression model of Keq in terms of C, assuming defaults for all other parameters. We also estimated dimer counts at quasi-equilibrium using Keq values from the regression model and Eq. (4).
To derive regression models of B, M, and D, we first calculated Keq values for the simulation results of different B, M, and D values separately using Eq. (2) from C=0.1 to 0.45 at increments of 0.05. Second, we fit three scaling factors, expressed as multiplicative functions of B, M, and D individually, so as to provide a least-squares best fit to the simulation estimates of Keq as a function of each parameter. We then multiplied the previously-built regression model of C by these three scaling factors, corresponding to an assumption that the crowding effect acts independently of these three parameters. Finally, we created a unified regression model by assuming that the effects of each of the four parameters on Keq are independent and multiplicative.
To evaluate the unified regression model, we randomly selected 40 test cases for fixed reactant concentration 0.1 and other parameters selected uniformly at random the ranges C = 0.1 – 0.45, B = 0.1 – 0.9, M = 0.6 – 1.4ns, and D = 3.9 – 43.9 ×10−11m2s−1. For each test case, we compared the mean dimer counts from the simulation to the estimated dimer counts predicted by the unified regression model and Eq. (4). As an additional test of the performance of the model outside the parameter ranges considered in model-fitting, we conducted another 40 random trials in which we also varied the concentration of reactants (CR) uniformly between 0.1 and 0.45 for the second 40 test cases, choosing the concentration of inert particles (CI) uniformly from 0 to 0.45-CR and B, M, and D from the same ranges as above. We similarly compared these simulation results with estimates from the regression model. Simulation values are averages over 30 repetitions per data point.
In order to test whether boundary effects induced by the relatively small bounding box might influence the reactions, we conducted additional random parameter experiments using a 200nm × 200nm boundary area. For these experiments, we selected parameter values at random from the same ranges as in the preceding paragraph. As before, we conducted 40 test cases with fixed CR=0.1 and variable inert concentration from 0 to 0.35 and 40 test cases with CR uniformly between 0.1 and 0.45 and CI uniformly between 0 and 0.45-CR. We compared the results to the predictions from the regression model. Due to the larger number of particles per trial, we performed five repetitions per data point.
To help illustrate the effects of various parameters on binding kinetics, we animated the dynamics of these particles in movie files showing the progress of individual homodimerization simulations. Each movie contains two different moving images, which are distinguished by different values for a single parameter while all other parameters use the common default values. We generated movies for the following comparisons: C=0.1 (0.1 CR + 0.0 CI) vs C=0.45 (0.1 CR + 0.35 CI), B=0.1 vs B=0.9, M=0.6ns vs M=1.4ns, and D=3.9×10−11m2s−1 vs D=43.9×10−11m2s−1. The first half of each movie shows the system in the initial (pre-equilibration) state, and the second half of the movie shows a quasi-equilibrium state . High-resolution versions of the movies can be downloaded from: http://www.cs.cmu.edu/~russells/projects/crowding/crowding.html.
Fig. 1 illustrates the progress of typical simulations, illustrating a snapshot of the simulator at the quasi-equilibrium state (25μs) and the number of dimers versus time for default parameters (B=0.7, M=1ns, D=13.9×10−11m2s−1, α=2, β=1, and dth=0.5nm) under two crowding conditions: low crowding (C=0.15; 0.1 for reactants + 0.05 for inert particles, with simulation progress shown from 0 to 25ns in Fig. 1(b)) and high crowding (C=0.45; 0.1 for reactants + 0.35 for inert particles, with simulation progress shown from 0 to 5μs in Fig. 1(d)). Initially, all reactant particles are monomers, so the dimer count is zero at 0s. The system approaches the quasi-equilibrium state at roughly 5ns for the low crowding case and at 1μs for the high crowding case, and then shows fluctuation of dimer counts around an equilibrium value. There is approximately a 2-fold difference in the equilibrium level between these two scenarios, although more than two orders of magnitude difference in the kinetics of the reaction.
Given large differences in kinetics of the simulations, we ran a series of simulation experiments examining particle diffusion as a function of time to determine a reasonable upper bound on mixing time across parameter values, as described in Sec. II.B. The results are shown in Fig. 2. Based on the observed mixing times, we concluded that 5μs would be sufficient time to adequately mix at all but the lowest extreme of diffusion rate. We therefore estimate equilibrium dimer counts in subsequent experiments by averaging over 5μs intervals from 5–25μs per experiment.
In analyzing effects of parameter variations, we first attempted to determine a model of the effect of C on Keq. We calculated the average Keq from Eq. (2) and the simulation data from C=0.1 to C=0.45, with all other parameters at default values. All particles are reactants at C=0.1, and inert particles are added as C increases. The calculated Keq values for different concentrations are shown in Fig. 3(a), represented by a solid line with star marks. We fit the Keq curve by polynomial regression of varying degrees, minimizing sum-of-square errors for each degree with 10-fold cross validation. Fig. 3(a) shows best-fit first-, second-, and third-order regression models, along with the idealized Keq assuming no crowding effect. We see negligible improvement in regression quality past degree three; beyond degree nine, cross-validated accuracy begins to drop, suggestive of overfitting (Fig. 3(c)). The resulting best-fit third-order model is given in Eq. (5):
Note that because C is dimensionless, as it is the area ratio of all particles to the simulation space, we must multiply the formula by (1counts−1m2) to match the physical dimension. Fig. 3(b) shows dimer counts at pseudo-equilibrium for the homodimerization reaction from the simulation data; the expected constant dimer count assuming no crowding effect (or infinitely dilute conditions); and the calculated data from the first, second, and third degree polynomials of Fig. 3(a) as determined by Eq. (4), providing an alternative way of visualizing the quality of the fit.
We next calculated the Keq values of the simulation data for different B, M, and D values (Fig. 4) with fixed reactant monomer concentration 0.1, varying C from 0.1 to 0.45 and all other parameters set to the defaults. Fig. 4(a) shows Keq for B values 0.1 (bottom), 0.3, 0.5, 0.7, and 0.9 (top). The figure also shows a linear regression fit to the resulting Keq curves, which was combined multiplicatively with the previous third-order C regression model in Eq. (5) to yield the following two-parameter regression curve:
Increasing B yields more dimers at the quasi-equilibrium state because it increases the probability of binding between two reactant monomers upon collisions. Fig. 4(b) shows the measured and predicted equilibrium dimer counts resulting from this curve. Fig. 4(c) shows Keq for M values 0.6 (bottom), 0.8, 1.0, 1.2, and 1.4ns (top), along with the regression model
derived by multiplicative combination of the C model with a linear best-fit regression model for M. Increasing M yields more dimers at the quasi-equilibrium state due to reduced dissociation rates. Fig. 4(d) shows the measured and predicted equilibrium dimer counts from the regression models for the same variations in M and C. Fig. 4(e) shows Keq for D values 3.9 (bottom), 13.9, 23.9, 33.9, and 43.9×10−11m2s−1 (top) using a best-fit linear model of D combined with the previous third-order C model:
Increasing D produces more dimers at the quasi-equilibrium state by increasing the collision frequency of reactant monomers and thus the forward reaction rate. Fig. 4(f) shows the measured and predicted equilibrium dimer counts from the regression models for the same variations in D and C. The relative scaling factor terms in Eq. (6–8) are dimensionless because each term is divided by the default values. Therefore, Eq. (6–8) are multiplied by (1counts−1m2) to match the physical dimension. The figures demonstrate that the contributions of B, M, and D to the crowding effect can each individually be treated as independent from the contribution of C.
We next sought to test whether it is possible to build a unified model of all four parameters (C, B, M, and D) from their single-parameter fits based on the assumption that they independently contribute to the homodimerization reaction equilibrium. We therefore constructed a unified regression model of Keq for C, B, M, and D follows:
We then evaluated how well the model would predict simultaneous changes in all four parameters. To evaluate Eq. (9), we simulated two different test conditions. We first simulated 40 different test cases in which we randomly chose the concentration of inert particles (CI: from 0–0.35), B (from 0.1–0.9), M (from 0.6–1.4ns), and D (from 3.9–43.9 ×10−11m2s−1) values at a fixed concentration of reactants (CR=0.1). Table I provides the parameter values for the forty test cases. The other parameter values were set to the default values: (α=2,β=1, and dth=0.5nm). Fig. 5(a) plots the average simulation values versus predicted values from Eq. (4) and Eq. (9). The simulation values for each test case are derived from 30 different simulation runs with 5 different time progression points (5, 10, 15, 20, 25μs). The figure shows a close match between the regression model of Eq. (9) and the simulations across a broad range of crowding levels, reaction rates, and diffusion rates.
To examine how well the model would perform outside of the training parameter set, we simulated 40 additional test cases for which we randomly chose the concentration of reactants (CR:0.1–0.45) and then varied the total concentration and other parameters as in the preceding test cases: concentration of inert particles [CI: from 0 – (0.45-CR)], B (from 0.1–0.9), M (from 0.6–1.4ns), and D (from 3.9–43.9 ×10−11m2s−1). The parameter values for these forty test cases are shown in Table II. The other parameter values are set to the default values: (α=2,β=1, and dth=0.5nm). Fig. 5(b) plots the average simulation values versus predicted values from the regression model of Eq. (4) and Eq. (9). The simulation values of each test case are derived from 30 different simulation runs with 5 different time progression points (5, 10, 15, 20, 25μs). The figure reveals high accuracy for low to moderate reactant concentrations. The regression model does, however, slightly overestimate the equilibrium dimer counts under conditions in which the equilibrium favors exceptionally high dimer concentration (more than approximately 40).
We simulated 80 additional test cases sampled from the same parameter ranges using a 200nm × 200nm bounding area in order to explore whether boundary effects significantly impact the fit of the regression model. The parameter values for these eighty test cases are shown in Table III and Table IV, respectively. Fig. 5(c) plots the average simulation values versus predicted values from the regression model of Eq. (4) and Eq. (9) for the fixed reactant concentration case and Fig. 5(d) for the variable reactant concentration case. The results appear comparable to those of Fig. 5(a) and (b). The regression model and simulations provide close matches for low to moderate reactant concentrations, but the regression model slightly overestimates the equilibrium dimer counts under conditions in which the equilibrium favors exceptionally high dimer concentration (more than approximately 160). There is no noticable difference in quality of fit between the two sizes of simulation.
We have conducted a series of simulations to test the effects of various binding parameters on effective equilibria of a simple model of binding chemistry under various crowded conditions. We further showed through the use of regression models that the parameter effects we examine act as independent contributions to the dimerization equilibrium over a broad four-dimensional parameter space. Our model is consistent with previous theoretical studies [18, 22, 23, 25] in finding that total concentration acts non-linearly to affect the binding equilibrium constant, but further shows that a third-degree polynomial (Eq. (5)) accurately describes this non-linear effect over a concentration range spanning physiologically relevant intracellular crowding levels. Furthermore, our model showed the other parameters examined (B, M, and D) influence the binding chemistry linearly and independently from one another and C across the range of crowded conditions examined. This result provides support for a strategy of using simulation-assisted studies to develop empirical and analytical models of crowding effects under broad parameter domains, models that can in turn be used where crowding simulations are too computationally intensive to be practical, such as in models of complicated assembly processes .
Our model does, however, begin to slightly overestimate the crowding effect at very high reactant concentrations, where unusually high dimer concentrations occur. We believe this effect suggests a need for further corrections in the model for the equilibrium dimer counts. In addition, there are several additional parameters important to crowding that we would not expect to act linearly nor independently, such as our simulation parameters α, β, and dth. Dealing with such parameters is likely to require a more involved study of covariance among parameter subgroups and multiparameter regression to build a more comprehensive and accurate model.
In evaluating how this work can inform models for cellular biochemistry, it is important to bear in mind that it is a highly simplified representation of binding in crowded media. The most obvious simplification is the use of a two-dimensional model for what is in reality a three-dimensional system. While the prior work  confirmed that the model qualitatively captures the expected effects of crowding on binding thermodynamics, it would not be expected to provide a close quantitative match to a three-dimensional model. We cannot exclude the possibility that there may even be significant qualitative differences between crowding in two and three dimensions. Similarly, the model omits many factors found in a true cellular system that are likely to influence binding processes, such as a fine compartmentalization structure, the presence of membranes that might act as scaffolds for reactions, chaperones or other agents that assist or prevent various reaction types, and various mechanisms for active transport within the cell. Likewise, there are other parameters even of our coarse-grained model one could vary and these factors may prove more difficult to capture by simple regression models. By omitting many such complications, the present approach allows us to isolate a few specific factors and parameters likely to be relevant to reaction kinetics in the cell and determine precisely how they affect reactions and how they might be efficiently modeled. Here, the approach showed specifically that a simple regression model accurately captures the dependence of binding equilibrium on several independent parameters of a crowding model. Such observations can help us on the path towards improved simulations of cellular biochemistry in the face of finite computational resources and many unknowns. Their results, however, cannot in themselves be regarded as realistic models of the cell. Considerable additional work will be needed to fully identify precisely which factors are needed for realistic models of cellular biochemistry, to develop methods to simulate these factors efficiently, and to verify that these methods provide accurate quantitative descriptions of biochemistry in vivo.
We thank Dr. Robert Murphy and the Center for Bioimage Informatics at Carnegie Mellon University for providing the use of their computer cluster for this research. This work is supported by U.S. National Science Foundation award #0346981, U.S. National Institutes of Health award #1R01AI076318, and the Korea Science and Engineering Foundation (M06-2004-000-10505).
PACS number(s): 83.10.Rs, 87.17.Aa, 82.39.Rt, 87.15.R-