Search tips
Search criteria 


Logo of nihpaAbout Author manuscriptsSubmit a manuscriptHHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal;
Biometrics. Author manuscript; available in PMC 2012 November 13.
Published in final edited form as:
PMCID: PMC3496749

Numerical Discretization-Based Estimation Methods for Ordinary Differential Equation Models via Penalized Spline Smoothing with Applications in Biomedical Research


Differential equations are extensively used for modeling dynamics of physical processes in many scientific fields such as engineering, physics, and biomedical sciences. Parameter estimation of differential equation models is a challenging problem because of high computational cost and high-dimensional parameter space. In this paper, we propose a novel class of methods for estimating parameters in ordinary differential equation (ODE) models, which is motivated by HIV dynamics modeling. The new methods exploit the form of numerical discretization algorithms for an ODE solver to formulate estimating equations. First a penalized-spline approach is employed to estimate the state variables and the estimated state variables are then plugged in a discretization formula of an ODE solver to obtain the ODE parameter estimates via a regression approach. We consider three different order of discretization methods, Euler’s method, trapezoidal rule and Runge-Kutta method. A higher order numerical algorithm reduces numerical error in the approximation of the derivative, which produces a more accurate estimate, but its computational cost is higher. To balance the computational cost and estimation accuracy, we demonstrate, via simulation studies, that the trapezoidal discretization-based estimate is the best and is recommended for practical use. The asymptotic properties for the proposed numerical discretization-based estimators (DBE) are established. Comparisons between the proposed methods and existing methods show a clear benefit of the proposed methods in regards to the trade-off between computational cost and estimation accuracy. We apply the proposed methods to an HIV study to further illustrate the usefulness of the proposed approaches.

Keywords: Euler’s method, HIV dynamics, Ordinary differential equation, Penalized splines, Runge-Kutta method, Trapezoidal rule, Two-stage smoothing method

1. Introduction

Differential equations are very popular for modeling dynamics of a physical process in many fields such as engineering, physics, and biomedical sciences. Parameter estimation of differential equation models using the observed data is an important step in characterizing the physical process under investigation. In this paper, we propose a new class of methods for estimating parameters in a nonlinear ordinary differential equation (ODE) model. A general nonlinear ODE model can be written as


where t [set membership] [a, b](−∞ < a < b < +∞) is time (independent variable), X(t) = {X1(t), … , Xm(t)}T is a m-dimensional state variable vector, [Upsilon](t) is an input variable vector, β = (β1, … , βd)T is a d-dimensional vector of unknown constant parameters with true value β0, and f(.) = {f1(.), … , fm(.)}T is a vector of known differentiable functions. We assume that X(t) is measured with noise at time points t1, … , tn and the measurement model can be written as


where the measurement errors (e1, … , en) are independent with mean zero and a variance-covariance matrix Σ. For presentation and notation simplicity, we mainly focus on the case of one equation (i.e., m = 1) without input variable [Upsilon] in the following methodology development and assume the variance-covariance matrix of the measurement errors to be σ2 In, where In is an identity matrix. All of the methodology and computation algorithms developed in this paper can be extended to the multi-dimensional case (m ≥ 2) with input variables [Upsilon]. Our goal is to estimate β given the observed values {ti, Yi}.

The non-linear least squares (NLS) method was the earliest and the most popular method developed for estimating the parameters of the ODE model (Hemker, 1972; Li, Osborne and Pravan, 2005). Eq.(1) usually does not have closed-form solutions for a general nonlinear function f. In this case, numerical methods such as the Runge-Kutta algorithm (see Hairer, Norsett and Wanner, 1993; Mattheij and Molenaar, 2002) have to be used to approximate the solution of the ODEs. Xue, Miao and Wu(2010) studied the properties of the NLS estimator when both numerical error and measurement error are considered, and they showed that the NLS estimator is strongly consistent and asymptotically normal with the same asymptotic covariance as that of the case with the true ODE solution exactly known. However, the NLS method suffers from some problems such as high computational cost, unstability and convergence problems since numerical algorithms are often needed to solve the ODE repeatedly in order to obtain the NLS estimates. In addition, it also requires the initial or boundary conditions of the state variables in order to solve the ODE numerically. These drawbacks of the NLS method have motivated investigators to develop alternative estimation methods in order to avoid numerically solving the differential equations directly.

Varah (1982) proposed a two-stage smoothing method to estimate the ODE parameters. In the first stage, regression splines were used to obtain the estimates for both X(t) and its first-order derivative X′(t) = dX(t)/dt, say X(t) and X(t), from the observed data, and in the second stage, parameters were estimated by minimizing the following objective function:

β^n=arg minβi=1n[X^(ti)f(ti,X^(ti),β)]2.

Liang and Wu(2008) studied the asymptotic properties of this two-stage smoothing estimator by applying a local polynomial approach in the first stage. Brunel(2008) also studied the asymptotic properties for a similar two-stage smoothing estimator using regression splines in the first stage, while the second stage used an integral objective function, instead of the summation form (3). Chen and Wu(2008a,b) established the theoretical results for such a method for a special ODE model with time-varying coefficients. This two-stage smoothing method avoids solving differential equations numerically when estimating ODE parameters. This method is computationally efficient and does not need the initial or boundary conditions of the state variables. However, there is a loss of some estimation accuracy and a large number of observations are required in order to attain a desired level of accuracy for the estimate of derivatives in the first stage. Ramsay (1996) proposed a principal differential analysis (PDA) method for the estimation of differential equation models. To obtain more accurate estimates, Poyton et al.(2006) and Varziri et al.(2008) further developed an iterative PDA algorithm. Ramsay et al. (2007) extended the PDA and developed a generalized profiling estimation procedure to estimate the parameters in ODE models, whose asymptotic properties were investigated by Qi and Zhao(2010).

In this article, we propose a novel class of estimation methods which are formed using the one-step discretization structure of the numerical algorithms for solving ODEs. We call such a new class of estimation methods as numerical discretization-based estimation (DBE) methods. We intend to improve the estimates from the two-stage smoothing method (Varah, 1982; Liang and Wu, 2008) while preserving its computational benefit.

The rest of the paper is organized as follows. We introduce our new estimation methods in Section 2. The asymptotic properties of the proposed estimators are summarized in Section 3. In Section 4, we apply the proposed methods to the HIV dynamic data from HIV-1 infected patients treated by an antiretroviral therapy. Results from simulation studies are presented to demonstrate the finite-sample behavior of the proposed methods in Section 5. In particular, we show that our new methods perform better than the general two-stage method(Varah, 1982; Liang and Wu, 2008) via our simulations. We conclude with some remarks in Section 6. The detailed technical proofs and assumptions as well as some additional simulation results are given in the web Supplemental Materials.

2. Numerical Discretization-Based Estimation Methods

In this section, we introduce our new estimation methods. The proposed methods can be divided into two stages. In the first stage, we estimate the state variable as a smooth function of time t using P-splines. In the second stage, we use the discretization schemes of the ODE numerical algorithms to form a regression-like model to relate the smoothed state variables to the unknown parameters β through the function f. Then a standard optimization technique is applied to the regression-like model to obtain the parameter estimates. This new framework of estimation methods allows us to efficiently use the well-designed ODE computational schemes, but not to implement the recursive scheme directly. Instead a regression-like approach is applied to the “pseudo-data” generated from the estimated state variables from the first stage to obtain the final parameter estimates. We will demonstrate that the new methods can improve the estimation efficiency compared to similar methods. We present the two stages of our new estimation methods in the following two subsections. We describe the methodology for a one-dimensional ODE case for notation and presentation simplicity. Extension to the multi-dimensional case is straightforward but tedious.

2.1 Estimation of X(t) and Its Derivatives Using P-Splines

In the first stage, we employ the penalized splines (P-splines) smoothing approach to estimate the curve X(t). Although other smoothing approaches such as the local polynomial estimation, regression splines and smoothing splines can also be used, we choose to use the P-splines here because of the following reasons: 1) its flexibility in fitting different types of curves and the compromise feature between regression and smoothing splines; 2) more computational efficiency compared to the local polynomial kernel smoothing; and 3) its convenience to implement using the R function, “smooth.spline”. Moreover, since the theoretical results for P-splines, in particular for the estimate of derivatives are not established, it is very useful and has its own value to establish these theoretical results. The original idea of penalized splines smoothing was given in O’Sullivan (1986). The combination of B-splines and difference penalties, coined as P-splines, was introduced by Eilers and Marx (1996). Thereafter the methodological investigation was pursued by several investigators (Yu and Ruppert, 2002; Ruppert, Wand and Carroll, 2003; Wang and Ormerod, 2008), but only a few of them developed theoretical results for P-splines (Hall and Opsomer, 2005; Li and Ruppert, 2008; Claeskens, Krivobokova and Opsomer, 2009; Kauermann, Krivobokova and Fahrmeir, 2009).

We approximate X(t) at time t by X(t)j=νKδjNj,ν+1(t)=Nν+1T(t)δ, where δ = (δ−ν, … , δK)T is the unknown coefficient vector to be estimated from the data, and Nν+1(t) = {N−ν,ν+1(t), … , NK,ν+1(t)}T is the B-spline basis function vector of degree ν (order ν + 1) and dimensional 1 × (K + ν + 1) at a sequence of knots a = τ−ν = τ−ν+1 = … = τ−1 = τ0 < τ1 < … < τK < τK+1 = τK+2 = … = τK+ν+1 = b on the interval [a, b](see Schumaker, 1981). The estimation objective function contains a penalized sum of squared differences, i.e.,


with a penalized term by the integrated squared q-th order derivative of the spline function. It is common to assume q ≤ ν and in practice, q is often chosen to be 2, which is also used in our simulation studies and real data analysis. From de Boor (2001, Ch. X) and using similar notations as those in Claeskens, Krivobokova and Opsomer(2009), we know that the derivatives of spline functions can be expressed in terms of lower order spline functions,


where the coefficients δj,i* are defined recursively via


for i = 1, 2, … with δj,0*=δj. Thus we can rewrite the penalty term in (4) as λδTΔqTRΔqδ, where matrix R has elements Rij=abNj,ν+1q(t)Ni,ν+1q(t)dt for i, j = −ν, … , K and Δi denotes the matrix corresponding to the weighted difference operator defined in (6), which is just D(i) defined by the expression (6) in Zhou and Wolfe (2000). Then δi*(δν+i,i,*,δK,i*)T=Δiδ. Further, define the n × (K + ν + 1) spline design matrix


and let Dq=ΔqTRΔq, then (4) can be written as


The minimizer of (7) takes the form


Then X^(t)=Nν+1T(t)δ^. Moreover, we can easily obtain the following expression for the estimate of ith-order derivative, X(i)(t), from (5) as:


Denote X(0)(t) = X(t), X(0)(t) = X(t), and δ0 = I. To determine the smoothing parameter λ, the generalized cross validation method can be used (Craven and Wahba, 1979).

The asymptotic bias and variance as well as the average mean squared error of the P-spline estimator X(t) have been obtained by Claeskens, Krivobokova and Opsomer(2009). However, more general asymptotic results (including the asymptotic bias and variance, the mean squared error(MSE) and asymptotic normality) of the P-spline estimator and its derivatives are required in order to establish the asymptotic properties of the proposed numerical discretization-based estimators. The details of the theoretical development are provided in Lemmas 1–4 in the Supplementary Materials.

2.2 ODE Parameter Estimation

In the second stage, by one-step discretization methods for ODEs (Hairer, Norsett and Wanner, 1993; Mattheij and Molenaar, 2002), we have


where h = maxi hi with hi = ti+1ti. Obviously, h = O(n−1). The form of F(ti, X(ti), X(ti+1), β) and the order p are determined by the discretization methods. In this article, we consider three different discretization methods, Euler’s method, trapezoidal rule and Runge-Kutta method. For each of these methods, the form of F(ti, X(ti), X(ti+1), β) and order p are given as follows. For Euler’s method, we have


with p = 1; the trapezoidal rule gives the form


with p = 2; for the fourth-order Runge-Kutta method, we have


with p = 4, where k1 = f(ti, X(ti), β), k2 = f(ti + hi/2, X(ti) + hik1/2, β), k3 = f(ti + hi/2, X(ti) + hik2/2, β), k4 = f(ti + hi, X(ti) + hik3, β).

Replacing the state variable, X(ti), with its estimate, X(ti) from the first stage, the expression (10) can be re-written as


where Γi is the sum of discretization error and estimation error from the first stage. We propose a numerical discretization-based estimator [beta]n of β by minimizing the following weighted least squares criterion:


where w(t) is a prescribed non-negative weight function on [a, b]. A more precise assumption for w(.) required for the asymptotic results is discussed in Section 3. For a multi-dimensional ODE model (1) with m ≥ 2, we use the following objective function:


where Xj(t) is the estimator of Xj(t) via the first stage using P-splines, Fj(.) is the function given by the associated discretization method, and wj(.) is the weight function.

In general, a higher order discretization method is expected to give a better estimation accuracy since its discretization error is smaller, but its computational cost is higher. The trade-off between the estimation accuracy and computational cost needs to be considered in practical applications. For convenience, we use the abbreviations EDB, TDB, and RDB for the Euler’s discretization-based estimator, the trapezoidal discretization-based estimator, and the Runge-Kutta discretization-based estimator, respectively. Compared to the existing two-stage smoothing methods(Varah, 1982; Liang and Wu, 2008; Brunel, 2008), the proposed method avoids the estimation of the derivatives of the state variables, which may not be easy to do for sparse data. We study the asymptotic properties of the proposed estimators in Section 3 and evaluate the finite-sample performance of these estimators in Section 5.

3. Asymptotic Properties

In this section, we summarize the theoretical properties for the proposed numerical discretization-based estimators (DBEs). Due to space limitations, we provide detailed regularity conditions and the technical proofs in the Supplementary Materials. First, we derived the asymptotic bias and variance of the P-spline estimators for the derivatives of the state variables and also establish their asymptotic normality. Then we apply such results to obtain the following consistency and asymptotic normality for the proposed estimators.

Proposition 1: Under Assumptions B1-B11 and B13 in the Supplementary Materials, if K ~ cnς with ς ≥ 1/(2ν + 3) and λ = O(nγ) with γ ≤ (ν + 2 − q)/(2ν + 3), then [beta]n is a strongly consistent estimator of β0.

Proposition 2: Under Assumptions B1-B11 and B13 in the Supplementary Materials, (i) For w(a) = w(b) = 0, if K ~ cnς with ς ≥ 1/(2ν+3) and λ = O(nγ) with γ ≤ (ν+2−q)/(2ν+3), then n(β^nβ0)dN(0,Σ1*) with Σ1* given by (A.7) in the Supplemental Materials. (ii) For w(a) ≠ 0 or w(b) ≠ 0, if K ~ cnς with ς > 1/(2ν + 3) and λ = O(nγ) with γ ≤ (ν + 2 − q)/(2ν + 3), then nκ(β^nβ0)dN(0,Σ2*) with Σ2* given by (A.8) in the Supplementary Materials and κ = max0≤jKj+1 − τj).

Remark 1: All of these theoretical results (Lemmas 1–5 and Propositions 1–2) can be extended to random designs if the random design points, t1, … , tn are i.i.d. with a cumulative distribution function Q(t), where Q(t) is a continuous distribution function with a positive and continuous density ρ(t) on [a, b]. We only need to replace E(X(i)(t)), Var(X(i)(t)) and o(.) by E(X(i)(t)|t1, … , tn), Var(X(i)(t)|t1, … , tn) and oP(.), respectively, and in this case [beta]n is a weakly consistent estimator of β0 in Proposition 1.

Remark 2: From Lemma 5 in the Supplementary Materials, when h approaches to zero as n goes to infinity, we have F(ti, X(ti), X(ti+1), β) → f(ti, X(ti), β). Therefore for large n, different numerical methods are expected to behave similarly, which is supported by our asymptotic results that the rates for consistency and the asymptotic variance for different numerical discretization-based methods are the same. Our numerical studies also demonstrate this trend. But for small sample sizes, our simulation results in the next section show that there is some difference among different discretization methods.

Remark 3: In Proposition 2, the estimator [beta]n has different convergence rates for different boundary conditions of the weight function. The standard root n rate can be achieved if the weight function is zero at both boundaries. Otherwise, the proposed estimators only achieve the nonparametric rate. Similar results were obtained by employing a similar strategy for handling the boundary problem in literature (Gasser and Müller, 1983; Bickel and Ritov, 1988; Huang and Fan, 1999; Brunel, 2008). Liang and Wu(2008) did not notice this phenomenon which resulted in an error in their derivation of the asymptotic variance.

Remark 4: In Proposition 2(ii), in order to ensure that the asymptotic bias of [beta]n is zero, a faster rate of K compared to the optimal rate n1/(2ν+3) is required. In this case, the choice of K gives an undersmoothed X(t). Then we cannot directly use the knots selected according to standard GCV or other similar methods. Zhou, Shen and Wolfe(1998) proposed an alternative two-step procedure.

4. Application: HIV Dynamics

The work presented in this article is motivated by a clinical data set obtained from HIV-1 infected patients who were treated with an antiretroviral therapy. This clinical study was conducted to monitor HIV dynamics for individual patients. Very frequent viral load measurements were collected from the patients after initiating the antiretroviral regimen at hours: 0, 1, 2, 3, 4, 6, 8, 10, 12, 14, 16, 18, 20, 24, 28, 32, 40, 46, 52, 58, 64, 70, 144, 240 and 336 during the first two weeks of treatment, and then at weeks 3, 4, 6, 8, 10, 12, 14, 16, 20, 24, 28, 32, 36, 40, 44 and 48 during follow-up treatment. In addition, the measurements of total CD4+ T cell counts were also taken at Day 1, weeks 2 and 4, and monthly thereafter. Liang and Wu(2008) fitted a 3-dimensional ODE model to this HIV data set.

We apply the proposed methods to estimate the unknown parameters in a HIV dynamic model based on the clinical data set. One popular HIV dynamic model is given by the following set of differential equations (Huang, Liu and Wu, 2006; Liang and Wu 2008)


where TU(t) is the concentration of uninfected target cells, TI(t) is the concentration of infected cells, and V(t) is the concentration of plasma virus (viral load) at time t; λ represents the rate at which new T cells are continuously generated; ρ is the death rate of uninfected T cells; η(t) is the time varying infection rate of T cells which depends on antiviral drug efficacy; δ is the death rate of infected cells; c is the clearance rate of free virions; N is the number of virions produced from each infected cell. The functions V(t), TU(t), and TI(t) are state variables and (c, δ, λ, ρ, N, η(t))T are unknown dynamic parameters. Estimation of these parameters is a challenging problem and has been studied by many authors (Wu and Ding, 1999; Putter et al., 2002; Huang, Liu and Wu, 2006; Huang and Wu, 2006; Chen and Wu, 2008a,b; Liang , Miao and Wu, 2009). Note that we only have the observation data for total CD4+ T cell count T(t) = TU(t) + TI(t) and viral load V(t). Viral load V(t) was observed in all the clinical visits, but total CD4+ T cell count, considered as an input variable, was measured less frequently. The viral load data were taken very frequently in the first two days (20 observations) of antiviral treatment because a significant drop in the viral load is expected during the early time period. We can first simplify the ODE model (16) as


where α0=Nδλρδ,α1=Nδρρδ, and α2=Nδρδ, so that measurement data can be directly used. Derivation of model (17) is given in Liang and Wu (2008).

In the first stage, we use the P-spline smoothing method to obtain the estimates of V(t), V′(t), T(t) and T′(t), respectively. R function “smooth.spline” was used to fit a P-spline to the data. This function uses the generalized cross validation criterion to estimate the smoothing parameter. Note that we imputed the CD4+T cell count data and its derivative if missing at the time points with viral load data available using the smoothing estimate as suggested by Liang and Wu (2008). Then from (17), we have


We apply the proposed EDB, TDB and RDB methods to estimate the parameters (α0, α1, α2) or (λ, ρ, δ). Here we used w(ti) = 1 in the objective function (15). For comparison, we also obtain the estimates of the two-stage smoothing method in Liang and Wu(2008) (named LW). But our implementation of the LW method is slightly different from the original paper (Liang and Wu, 2008). We used P-splines instead of the local polynomial method for the first-stage smoothing in order to make the comparisons fair. Note that we fixed the parameter values for N and c as the values from the literature and our preliminary data analysis (Liang and Wu 2008) in order to avoid identifiability problems. Parameters λ, ρ, and δ were estimated from the data for the two patients. The estimation results are reported in Table 1.

Table 1
Application of HIV Dynamics: Point estimates with 95% bootstrap confidence intervals obtained from different methods when N and c were fixed at 264.74 and 2.46 for the first patient, and 1114.37 and 3.78 for the second patient based on our preliminary ...

From Table 1, we can see that the estimation results for the first patient from all the methods are close to each other although there are some differences for different parameters. For the second patient, the estimates from the TDB and RDB methods are very close, while the estimates from the LW and EDB methods are quite similar. The difference in the estimates from these two groups of methods is large. We suspect that this is probably due to the difference in the performance of these different orders of discretization-based estimation methods as we observed in our simulation studies in the next section.

5. Simulation Studies

We designed two simulation studies to evaluate the finite-sample performance of the proposed numerical discretization-based estimation methods. The average relative error (ARE), calculated by the following formula, is used to compare the performance of different methods:


where [beta]i is the estimate and β0 is the true value of a parameter β. To summarize the overall performance for all simulation scenarios, we use the total average of the AREs for all cases. In addition, to evaluate the trade-off between the estimation accuracy and the computational cost, we calculate the overall average computational cost for all simulation cases.

Simulation Example I: A linear ODE model. In order to assess the difference in finite-sample performance for different orders of the discretization algorithms, EDB, TDB, and RDB in the linear HIV dynamic model (17), first we selected to use a simple two-parameter ODE model for our simulation study:


We used parameter values (α0, α1) = (3, 1/3) and V(0) = −1 to generate the data. Two different data generation schemes were used. The first is to generate the observations at every time interval of 0.4 in the range of t [set membership] [0, 20] which gives n = 51 observations; and the second is to generate the observations at every time interval of 0.1 in the range of t [set membership] [0, 20], which produces n = 201 observations. The observations were generated by numerically solving the ODE model (18) using the 4th-order Runge-Kutta method, and then measurement errors, generated from a normal distribution with mean 0 and standard deviation σ, were added to the observations. The measurement error standard deviation is taken as σ [set membership] (0.2, 0.4, 0.6, 0.8).

We compare the proposed EDB, TDB, and RDB methods with the LW method. We run simulations for the NLS method as well. But the results from the NLS method varied from very good to very bad depending on the assumption about the prior knowledge of the initial conditions. The unstable results from the NLS method due to the variation in the search range for the initial conditions made it difficult to compare the results from the NLS method with the results from other methods. Therefore, we have excluded the NLS method from the simulation comparisons. In this simulation study, we used w(ti) = 1 in the objective function (15). For each simulation case, 500 runs were replicated.

We report the AREs for the estimates of different methods in Table 2. From Table 2, we see a clear difference in the performance of different methods. When the sample size is larger, this difference is smaller. We observe a general trend that the EDB method produces the largest AREs for almost all the simulation cases. The TDB and RDB methods are quite similar, and about 50% of the cases the TDB is better than the RDB method in terms of AREs. The RDB tends to be slightly better than the TDB when the sample size is larger. The LW method seems to be somewhere in between the EDB and TDB.

Table 2
Simulation Results: Average relative errors in percentage (sample mean± sample standard deviation) for the estimates of the parameters obtained from 500 replications (note that true α0 = 3.0 and α1 = 0.33 for Example I; a = 1.5, ...

Simulation Example II: A non-linear ODE model. This simulation example was designed to further evaluate the performance of the proposed estimators under a more general non-linear ODE model setting. We adopted a simplified model for gene-protein interactions (Chen, He and Church, 1999) which is given by:


where R denotes mRNA concentration and P denotes protein concentration. Parameter a is the transcription effect, parameter b is the degradation rate for mRNA, and parameter c is the degradation rate for protein. We used parameter values (a, b, c) = (1.5, 1, 2) and initial values (R(0), P(0)) = (0, 1) to generate the observations at every time interval of 0.8 and 0.4 in the range of t [set membership] [0, 20] giving n = 26 and n = 51 observations, respectively. The high computational cost prohibited us from using too large sample sizes for the non-linear model. The observations were generated by numerically solving the ODE model (19) using the 4th-order Runge-Kutta method, and then adding measurement errors to R(t) generated from a normal distribution with mean 0 and standard deviation σ1 and adding measurement errors to P(t) generated from a normal distribution with mean 0 and standard deviation σ2. The measurement error standard deviations were taken as (σ1, σ2) [set membership] {(0.02, 0.01), (0.05, 0.03)}. For each simulation case, 500 runs were replicated.

We report the AREs for the estimates of different methods in Table 2. From Table 2, we can see a similar trend as that in simulation results of the linear ODE model above. The results confirm a general trend that the EDB method produces the largest AREs for all the simulation cases. The RDB method is slightly better than the TDB method. The LW method seems to be somewhere in between the EDB and the TDB.

For all above simulations, a gradient-based method (the method “L-BFGS-B” in the R function “optim”) was used for optimizing the objective function. We report the overall average computational cost and overall average AREs for the proposed methods as well as the LW estimates in Table 3.

Table 3
Summary of Simulation Results: Average simulation time (cost) in seconds and overall average AREs for Examples I and II.

From Table 3, again we can see that the EDB produces the largest AREs among all the methods for all cases and the computational cost of the EDB method is similar to that of the TDB method. The RDB method is slightly better than the TDB method in terms of AREs. However, the computational cost of the TDB is much lower for all cases (ranging from 23% to 55% lower in computational cost) compared to the RDB method. Again, the LW method is in between the EDB and the TDB methods in terms of AREs, but the computational cost of the LW method is higher than that of the EDB and the TDB methods. To summarize these simulation results, we recommend the TDB method for practical applications. A theoretical explanation for these simulation results are provided in Remark 5 below.

In all above simulations, the weight function w(ti) in the objective function (15) is assumed to be a constant. To evaluate the effect of the weight function as suggested by one referee, we performed additional simulations based on Model (19). To save space, we report the AREs for the estimates of different methods in the online Supplementary Materials (Appendix B). The results show considerable improvement in the ARE for all the methods if the appropriate weight function is used. Comparing different methods, the EDB method again has the largest ARE. The LW method performed better than the TDB method for the cases of small noise. For the larger noise cases, the TDB method has the smallest ARE. The RDB method was slightly worse than the LW and the TDB method for most of the cases. In addition, we also performed more simulations for the cases of unequally-spaced observations as suggested by the referee, similar trends and conclusions are obtained. These additional simulation results are also included in the online Supplementary Materials (Appendix B).

Remark 5: The finite-sample behavior of the proposed numerical discretization-based methods can be explained by the expression (A.3) in the proof of Proposition 1 in the Supplemental Materials. The bias in the estimate of β is due to replacement of state vectors by their estimates in the objective function (15) which introduces an estimation error given by x̂(ti) − x′(ti) + O(h). In addition, the omission of remainder terms in the Taylor series expansion of x(ti+1) introduces a numerical error given by an order of O(hp). Thus, the numerical error of the EDB method (p = 1) is in the same order as the estimation error (O(h)). But for the TDB method (p = 2) and the RDB method (p = 4), the numerical error is in a higher order compared to the estimation error. That is why we can see a significant improvement in terms of AREs when we move from the lower order EDB method to the higher order TDB method in our simulations. However, the improvement of AREs from the TDB method to the RDB method diminishes. The performance of the LW method is between that of the EDB and TDB. This is probably because that the LW method uses the estimates of derivative curves of the state variables to form a regression model by a first-order approximation (similar to the EDB method), and the estimate of derivatives of the LW method utilizes multiple data points, but the EDB method only uses the difference of two data points to approximate the derivative. That is why the LW method improves the first-order EDB method, but not to the level of the second-order TDB method.

6. Conclusion and Discussion

In this paper, we have proposed a new class of numerical discretization-based estimation (DBE) methods to estimate unknown parameters in ODE models. This new class of methods are shown to have some benefits in computational efficiency and estimation accuracy over the existing approaches. In particular, the proposed methods intend to provide a flexible balance between the computational cost and estimation accuracy. The higher order of discretization algorithms can produce better accuracy in the estimates, but require a higher computational cost. Our simulation studies demonstrate that the trapezoidal discretization-based (TDB) method provides a good balance between the computational cost and estimation efficiency and is recommended for practical use. Our theoretical analysis of the proposed methods suggests that the benefit of using a higher order discretization algorithm will diminish when the sample size is large. Thus, a lower-order dicretization method can be used when the sample size is large. Our simulation studies also support this conclusion. Our simulation results and theoretical analysis also show that using an appropriate weight at the second stage to discount the boundary effect of the first stage smoothing can also improve the estimates considerably.

Since the proposed estimates are formulated based on the numerical algorithms of ODE solvers, the reasonably frequent observations may be required in order to obtain reasonable estimates of ODE parameters. Although more frequent smoothing estimates of the state variables from the first stage can be used in the regression estimation of the ODE parameters in the second stage, this data augmentation strategy in the second stage may need to be carefully evaluated from both theoretical and practical perspectives, which is an interesting future research topic.

Besides the P-spline smoothing method, we are currently exploring other smoothing methods such as regression splines, smoothing splines and local polynomial techniques for the first-stage of the proposed methods. We plan to compare how these different smoothing methods affect the proposed numerical discretization-based estimates from theoretical perspectives and finite-sample behaviors. We may also need to extend the proposed methods to deal with the high-dimensional case and with the latent (unobservable) state variables. In summary, this paper reflects our initial work on this new promising class of estimation methods for ODE models, and more work in this direction is necessary in the future.

Supplementary Material



This research was partially supported by the NIAID/NIH grants AI50020, AI087135, AI078842 and AI078498. The authors appreciate the AE and two referees for their insightful comments and suggestions.


Supplementary Materials

Web Appendix A and B, referenced in Sections 3 and 5, are available under the Paper Information link at the Biometrics website


  • Bickel PJ, Ritov Y. Estimating integrated squared density derivatives: sharp best order of convergence estimates. Sankhyā 1988;50:381–393.
  • Brunel N. Parameter estimation of ODE’s via nonparametric estimators. Electronic Journal of Statistics. 2008;2:1242–1267.
  • Chen J, Wu H. Efficient local estimation for time-varying coefficients in deterministic dynamic models with applications to HIV-1 dynamics. J. Amer. Statist. Assoc. 2008a;103:369–384.
  • Chen J, Wu H. Estimation of time-varying parameters in deterministic dynamic models with application to HIV infections. Statistical Sinica. 2008b;18:987–1006.
  • Chen T, He HL, Church GM. Modeling gene expression with differential equations. Pacific Symposium of Biocomputing. 1999 [PubMed]
  • Claeskens G, Krivobokova T, Opsomer J. Asymptotic properties of penalized spline estimators. Biometrika. 2009;96:529–544.
  • Craven P, Wahba G. Smoothing noisy data with spline functions: estimating the correct degree of smoothing by the generalized cross-validation. Numer. Math. 1979;31:337–403.
  • de Boor C. A Practical Guide to Splines. Revised Edition. New York: Springer; 2001.
  • Eliers PHC, Marx BD. Flexible smoothing with B-splines and penalties (with discussion) Stat. Science. 1996;11:89–121.
  • Gasser T, Müller HG. Estimating regression functions and their derivatives by the kernel method. Scand. J. Statist. 1984;11:171–185.
  • Hairer E, Norsett S, Wanner G. Solving Ordinary Differential Equation I: Nonstiff Problems. Berlin: Springer-Veralag; 1993.
  • Hall P, Opsomer JD. Theory of penalised spline regression. Biometrika. 2005;92:105–118.
  • Hemker P. Numerical methods for differential equations in system simulations and in parameter estimation. Analysis and Simulation of Biochemical Systems. 1972:59–80.
  • Huang L, Fan J. Nonparametric estimation of quadratic regression functionals. Bernoulli. 1999;5:927–949.
  • Huang Y, Liu D, Wu H. Hierarchical Bayesian methods for estimation of parameters in a longitudinal HIV dynamic system. Biometrics. 2006;62:413–423. [PMC free article] [PubMed]
  • Huang Y, Wu H. A Bayesian approach for estimating antiviral efficacy in HIV dynamic models. Journal of Applied Statistics. 2006;33:155–174.
  • Kauermann G, Krivobokova T, Fahrmeir L. Some asymptotic results on generalized penalized spline smooting. J. R. Statis. Soc. B. 2009;71:487–503.
  • Li Y, Ruppert D. On the asymptotics of penalized splines. Biometrika. 2008;95:415–436.
  • Li Z, Osborne MR, Pravan T. Parameter estimation of ordinary differential equations. IMA Journal of Numerical Analysis. 2005;25:264–285.
  • Liang H, Miao H, Wu H. Estimation of constant and time-varying dynamic parameters of HIV infection in a nonlinear differential equation model. Ann. Appl. Statist. 2009;4:460–483. [PMC free article] [PubMed]
  • Liang H, Wu H. Parameter estimation for differential equation models using a framework of measurement error in regression. J. Amer. Statist. Assoc. 2008;103:1570–1583. [PMC free article] [PubMed]
  • Matteij R, Molenaar J. Ordinary Differential Equations in Theory and Practice. Philadelphia: SIAM; 2002.
  • O’Sullivan F. A statistical perspective on ill-posed inverse problems (with discussion) Stat. Science. 1986;1:505–527.
  • Poyton AA, Varziri MS, McAuley KB, McLellan PJ, Ramsay JO. Parameter estimation in continuous-time dynamic models using principal differential analysis. Computers and Chemical Engineering. 2006;30:698–708.
  • Putter H, Heisterkamp S, Lange J, Wolf F. A Bayesian approach to parameter estimation in HIV dynamic models. Statistics in Medicine. 2002;21:2199–2214. [PubMed]
  • Qi X, Zhao H. Asymptotic efficiency and finite-sample properties of the generalized profiling estimation of parameters in ordinary differential equations. Ann. Statist. 2010;38:435–481.
  • Ramsay JO. Principal differential analysis: data reduction by differential operators. J. R. Statist. Soc. B. 1996;58:495–508.
  • Ramsay JO, Hooker G, Campbell D, Cao J. Parameter estimation for differential equations: a generalized smoothing approach (with discussion) J. R. Statis. Soc. B. 2007;5:741–796.
  • Ruppert D, Wand MP, Carroll RJ. Semiparametric Regression. Cambridge: Cambridge University Press; 2003.
  • Schumaker LL. Spline Functions: Basis Theory. New York: Wiley; 1981.
  • Varah J. A spline least squares method for numerical parameter estimation in differential equations. SIAM J. Sci. Stat. Comput. 1982;3:28–46.
  • Varziri MS, Poyton AA, McAuley KB, McLellan PJ, Ramsay JO. Selecting optimal weighting factors in iPDA for parameter estimation in continuous-time dynamic models. Comp. Chem. Eng. 2008;32:3011–3022.
  • Wand M, Ormerod J. On semiparametric regression with O’Sullivan penalised splines. Aust. New Zeal. J. Statist. 2008;50:179–198.
  • Wu H, Ding A. Population HIV-1 dynamics in vivo: applicable models and inferential tools for virological data from aids clinical trials. Biometrics. 1999;55:410–418. [PubMed]
  • Xue H, Miao H, Wu H. Sieve estimation of constant and time-varying coefficients in nonlinear ordinary differential equation models by considering both numerical error and measurement error. Ann. Statist. 2010;38:2351–2387. [PMC free article] [PubMed]
  • Yu Y, Ruppert D. Penalized spline estimation for partially linear single index model. J. Am. Statist. Assoc. 2002;97:1042–1054.
  • Zhou S, Shen X, Wolfe D. Local asymptotics for regression splines and confidence regions. Ann. Statist. 1998;26:1760–1782.
  • Zhou S, Wolfe D. On derivative regression in spline estimation. Statistica Sinica. 2000;10:93–108.