The overall hypothesis of our study was that if several physiologically-relevant characteristics of a model's behavior were known, this information would be sufficient to constrain some or all of the model's parameters. We tested this idea using two approaches: one based on multivariable regression and the other based on Bayes's theorem. We began by generating a database of candidate models. The parameters that define maximal conductances and rates of ion transport in the TNNP model 
were varied randomly, and several simulations, defining how the candidate model responded to altered experimental conditions, were performed with each new set of parameters. In general, the simulations reflected experimental tests commonly performed on ventricular myocytes, such as determining the threshold for excitation or changing the rate of pacing.
For the first approach, the results of these simulations were collected in “input” and “output” matrices X
, respectively. Each column of X
corresponded to a model parameter, and each row corresponded to a candidate model (n
300). The columns of Y
were the physiological outputs extracted from the simulation results, such as action potential duration (APD) and Ca2+
transient amplitude. Complete descriptions of the randomization procedure and simulation protocols are provided in the Methods
and Text S1
. Outputs are listed in and described in detail in Text S1
Physiological outputs in simulations with TNNP model.
Multivariable regression techniques were used to quantitatively relate the inputs to the outputs. In the “forward problem,” a matrix of regression coefficients B
was derived such that the predicted output Y
was a close approximation of the actual output Y
. This method has recently been proven useful for characterizing the parameter sensitivity of electrophysiological models 
. We reasoned that a similar approach could be used to address the question: if the measurable physiological characteristics of a cardiac myocyte are known, can this information be used to uniquely specify the magnitudes of the ionic currents and Ca2+
transport processes? Specifically, we hypothesized that if: 1) Y
was a close approximation of the true output Y
, and 2) B
was a square matrix of full rank, then Xpredicted
should be a close approximation of the true input matrix X
. This argument is illustrated schematically in .
Schematic of input, output and regression matrix structures.
demonstrates the accuracy of the reverse regression method. For four chosen conductances, the scatter plots show the “actual” values, generated by randomizing the baseline parameters in the published TNNP model, versus the “predicted” values calculated with the regression model. The large R2 values (>0.9) indicate that the predictions of the regression method are quite accurate. Of the 16 conductances in the TNNP model, 12 could be predicted with R2>0.7. The four that were less well-predicted were the Na+ background conductance (GNab), the rapid component of the K+ delayed rectifier conductance (GKr), the sarcolemmal Ca2+ pump (KpCa) and the second SR Ca2+ release parameter (Krel2).
Predictions of the linear empirical model generated by reverse regression.
To verify that these encouraging results were not specific to the TNNP model, we performed similar analyses on additional models, the human ventricular myocyte model of Bernus et al. 
, and the “Phase 1” ventricular cell model of Luo and Rudy 
. In either case (Figures S3
, respectively), the reverse regression was highly predictive of most parameters, indicating that this approach is generally applicable. The outputs used for these analyses, listed in Text S1
, differed somewhat from those used for the TNNP simulations because the Bernus et al. 
and Phase 1 Luo and Rudy 
models are relatively simple and do not consider intracellular Ca2+
handling in detail.
illustrates how the quantity and identity of the outputs in Y
affected the accuracy of the predictions. Bar graphs show R2
values for prediction of each model parameter obtained by performing the reverse regression in three ways: 1) using all 32 outputs (blue), 2) matrix inversion (green), with the 16 best outputs as identified by the output elimination algorithm (see Methods
), and 3) using only the 16 rejected outputs (red). The R2
values computed using the 16 best outputs were virtually identical to those obtained when all 32 outputs were used whereas R2
values for most conductances were substantially lower when only the 16 rejected outputs were included. These tests validate the algorithm which selected the outputs for matrix inversion. Moreover, since the 16 best outputs performed essentially as well as the full set of 32 outputs, this result implies that the model outputs were not fully linearly independent, and the 16 rejected outputs contained redundant information.
displays, as heat maps, the coefficients for both the forward and reverse regression problems. The former indicate how model parameters influence outputs, whereas the latter specify how changes in model outputs restrict the parameters. Parameter sensitivities for selected outputs and conductances are shown as bar graphs to the right. As previously argued for the case of forward regression 
, these parameter sensitivities help to illustrate the relationships between parameters and outputs. For instance, forward regression coefficients indicate that diastolic [Ca2+
] is determined primarily by a balance between SR Ca2+
uptake and SR Ca2+
leak, with other parameters making only minimal contributions. Conversely, for reverse regression, the maximal conductance of L-type Ca2+
) depends on many model outputs including action potential duration, Ca2+
transient amplitude, and, in particular, how these are altered with changes in extracellular potassium. This result underscores the centrality of intracellular Ca2+
regulation to many cellular processes.
Parameter sensitivities for forward and reverse regression.
The results shown in demonstrated that most of the model parameters used to generate the dataset could be reconstructed using the reverse regression procedure. To provide evidence that this procedure may be more broadly useful, we applied the method to a novel test case by performing simulations with the most recent version of the Hund & Rudy canine ventricular model 
. Specifically, we considered changes in seven parameters corresponding to the condition of heart failure, as previously modeled by Shannon et al 
. shows that implementing these parameter changes dramatically alters both AP shape and Ca2+
transient amplitude. After performing simulations under a range of conditions with both normal, healthy cells and pathological, failing cells (see Methods
and Text S1
), we asked how well the reverse regression matrix could calculate the parameter changes in the failing cells. We found that this method constrained 5 out of 7 parameters with excellent accuracy, while changes in two parameters (GKs
) were overestimated somewhat by the regression algorithm. This novel test cases validates our approach and suggests that it may indeed prove a useful method for developing new models based on experimental measurements.
Application of reverse regression to constrain model parameters in heart failure.
The second approach for constraining model parameters is based on Bayes's theorem. In statistics, this celebrated result describes the conditional probability of one event given another in terms of: 1) the conditional probability of the second event given the first, and 2) the marginal probabilities of the two events:
In this context, we consider event A that a model conductance lies within a given range, and event B that a model output is within a particular range. When many simulations are performed with randomly varying parameters, the probability P(A) is fixed by the user, while the probabilities P(B) and P(B|A) can be estimated from the results. This allows us to approximate P(A|B), which reflects how well a model parameter is constrained by a particular simulation result.
Since our hypothesis was that multiple outputs needed to be considered to constrain model parameters, we were interested in extensions of Bayes's theorem to more than two variables, e.g. P(A|B∩C), where B and C are events related to two model outputs. For instance, B and C could represent, respectively, that APD and Ca2+ transient amplitude are within particular ranges. If the conditional probability of the parameter increases as additional outputs are considered, this validates the thinking underlying the approach.
The application of this strategy to our data set is illustrated in . The two rows of histograms display distributions of GNa and GCa, which are typical of the 16 model parameters considered. The leftmost histogram in each row shows the distribution of conductance values in the entire population, and the remaining columns show conductance values for sub-populations that satisfy constraints on one or more model outputs. Successive columns from left to right show distributions with additional model outputs considered, as noted. In either case, the distributions become progressively narrower, and the conditional probability is unity once 3 outputs are considered.
Illustration of Bayesian probability approach.
This procedure also provides insights into which specific outputs provide the greatest information about particular model parameters. For instance, the distribution of GNa given a certain range of APD appears similar to the overall distribution of GNa because these two variables are not strongly correlated (i.e. P(B|A) ≈ P(B)). In contrast, inclusion of Vpeak, an output highly dependent on GNa, narrows the distribution significantly. In the case of GCa, restricting APD to a particular range makes the distribution narrower, which is to be expected given the relatively strong correlation between the parameter and the output. Thus, an approach based on Bayes's theorem also supports the idea that model parameters can successfully be constrained if multiple model outputs are considered.