PMCCPMCCPMCC

Search tips
Search criteria 

Advanced

 
Logo of nihpaAbout Author manuscriptsSubmit a manuscriptHHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal;
 
J Stat Plan Inference. Author manuscript; available in PMC 2013 May 1.
Published in final edited form as:
J Stat Plan Inference. 2012 May 1; 142(5): 1047–1062.
doi:  10.1016/j.jspi.2011.11.003
PMCID: PMC3278194
NIHMSID: NIHMS341167

Accounting for Uncertainty in Heteroscedasticity in Nonlinear Regression

Abstract

Toxicologists and pharmacologists often describe toxicity of a chemical using parameters of a nonlinear regression model. Thus estimation of parameters of a nonlinear regression model is an important problem. The estimates of the parameters and their uncertainty estimates depend upon the underlying error variance structure in the model. Typically, a priori the researcher would know if the error variances are homoscedastic (i.e., constant across dose) or if they are heteroscedastic (i.e., the variance is a function of dose). Motivated by this concern, in this article we introduce an estimation procedure based on preliminary test which selects an appropriate estimation procedure accounting for the underlying error variance structure. Since outliers and influential observations are common in toxicological data, the proposed methodology uses M-estimators. The asymptotic properties of the preliminary test estimator are investigated; in particular its asymptotic covariance matrix is derived. The performance of the proposed estimator is compared with several standard estimators using simulation studies. The proposed methodology is also illustrated using a data set obtained from the National Toxicology Program.

Keywords: Asymptotic normality, Dose-response study, Heteroscedasticity, Hill model, M-estimation procedure, Preliminary test estimation, Toxicology

1 Introduction

Often toxicologists are interested in investigating the dose-response relationship when animals are exposed to varying doses of a chemical. Usually a nonlinear regression model such as a Hill model is used to describe the relationship (Gaylor and Aylward, 2004; Sand et al., 2004; Crofton et al., 2007). There may be several problems when fitting nonlinear models. Among them, one important concern is the error variance structure. Depending upon various factors, including the bioassay, dose-spacing and the endpoint of interest etc., the variability in response may not be constant across dose groups (heteroscedasticity). The standard asymptotic confidence intervals and test procedures based on the ordinary least squares (OLS) methodology may not be robust to heteroscedasticity and consequently may produce inaccurate coverage probabilities and Type I error rates. On the other hand, the standard iterated weighted least squares (IWLS) based methodology may not be efficient when the variances are approximately equal across dose groups (homoscedasticity). However, in practice one generally does not know if the data are homoscedastic or heteroscedastic.

The problem of heteroscedasticity has been extensively discussed in the literature in a wide range of contexts involving linear and nonlinear models. For instance, Hoferkamp and Peddada (2002) considered the problem of heteroscedasticity in the context of groups of experiments, as in fertilizer trials, where the error variances are ordered. Cysneiros et al. (2007) derived a joint iterative process for estimating the location and dispersion parameters in heteroscedastic linear models with symmetrical errors. Guo and Koul (2008) developed asymptotic theory for long memory time series based on heteroscedastic linear models. Recently, the problem of heteroscedasticity has also been addressed in the context of semi-parametric partially linear models (Ma et al., 2006; You et al., 2007; Lu, 2009). A Bayesian method for testing for equality of regression parameters in a heteroscedastic linear model has also been considered in the literature (Moreno et al., 2005).

Several authors modeled error variance as a function of dose in dose-response models (cf. Davidian and Carroll, 1987). Wang and Zhou (2007) even developed a nonparametric test for checking the adequacy of a given variance function. Bellio et al. (2000) proposed the use of higher order likelihood based methods for inference in heteroscedastic nonlinear models with application to dose-response models in herbicide bioassays.

Although IWLS based methods perform well under heteroscedasticity, they may lose efficiency relative to other methods when the data are homoscedastic. To illustrate this, consider the simulated data presented in Fig. 1 which is based on homoscedastic errors. The data were generated using the Hill model (Hill, 1910), y=θ0+θ1xθ2/(θ3θ2+xθ2)+ε, which is usually used to study in vivo concentration response relationships, where y is the response at dose x, θ0 is the intercept parameter, θ1 is the difference between the maximum effect of a drug (Emax) and the intercept, θ2 is the slope parameter that reflects the steepness of the effect-concentration curve, and θ3 is the sensitivity parameter, the drug concentration producing 50% of Emax (ED50).

Figure 1
Example of model predictions by OLSE (solid), IWLSEN (dot), IWLSEM (dashes) methods for homoscedastic data.

In Fig. 1, the point estimate (with standard error in parentheses) for the ED50 (whose true value is 120) based on the OLS estimator (OLSE) is 149.7 (31.90). The IWLS estimator using the sample variances, denoted as IWLSEN, is 226.4 (65.37), and the IWLS estimator using a variance model, denoted as IWLSEM, is 233.0 (64.77). This example illustrates that a method designed for heteroscedastic data may not perform well when the data are homoscedastic.

Because the performance of a method relies on whether the data are homoscedastic or heteroscedastic, it is important to develop an estimation procedure which is robust to whether the error variance is homoscedastic or heteroscedastic. To make the procedure robust to the structure of the error variance, a preliminary test estimation (PTE) based methodology is developed in this paper. PTE has been well studied in the literature in a variety of contexts (Judge and Bock, 1978). For instance, Sen (1986) studied the asymptotic distributional risks for the preliminary test version of a maximum likelihood estimator. Recently, Ahmed et al. (2007) investigated the asymptotic properties of a pretest semiparametric estimator under quadratic loss and examined its performance using asymptotic analysis of quadratic risk functions in a partially linear model. Hoque et al. (2009) studied the performance of the PTE of the slope parameter of a simple linear regression model under a linex loss function and derived the risk function and the moment generating function of the PTE.

Outliers and influential observations are common in toxicological data. To make the proposed procedure robust against outliers, we use the principle of M-estimation. Thus in this paper the PTE is either the ordinary M-estimator (OME) or the weighted M-estimator (WME) depending upon the outcome of a preliminary test for heteroscedasticity.

The PTE methodology based on OME and WME is proposed in Section 2. Results of a sample of simulation studies are provided in Section 3 and the proposed methodology is illustrated using a data set from the National Toxicology Program (NTP) in Section 4. Proofs of the main results are provided in Appendix while the theorems needed for proving these main results are provided in the online supplementary material.

2 The proposed methodology

2.1 Weighted M-estimation

Let yi denote an ni × 1 response vector corresponding to an m × 1 vector of covariates xi in the ith sample, i = 1, 2, …, k. Let

yi=f(xi,θ)+σiεi,i=1,,k,

denote the nonlinear regression model, where f(xi, θ) is some pre-specified nonlinear function of a p × 1 parameter vector θ = (θ1, θ2, …, θp)T and εi are independent ni × 1 vectors that are identically distributed as N(0, I). The total sample size n is given by i=1kni. The components of yi are denoted by yij, j = 1, 2, …, ni.

It is assumed that σi [equivalent] σ(zi, τ), where σ(·, ·) is a known function of a known q × 1 covariate vector zi and an unknown q × 1 parameter vector τ. Such models are commonly used in practice. For example, in some studies it may be reasonable to assume that the error variance σi2 is a linear function of dose. For more examples one may refer to Carroll and Ruppert (1988).

The definition of an M-estimator depends upon the Huber score function which is defined as follows. For a pre-specified positive constant k0, the Huber score function h(u) is given by:

h(u)={u/2,if[mid ]u[mid ]<k0{k0([mid ]u[mid ]k0/2)}1/2,otherwise.

Throughout this paper we took k0 to be 1.5. Then the ordinary M-estimator for θ is obtained by solving the following minimization problem:

So(θ)=i,jh2(yijf(xi,θ)).

If we take h(u) = u then we obtain the classical OLSE. Note that the OME does not account for heteroscedasticity in the data. The estimating equations for solving the above optimization problem are given by (Sanhueza and Sen, 2001):

i,jλo(xi,yij,θ~n)=0,

where

λo(xi,yij,θ)=ψ(yijf(xi,θ))fθ(xi,θ),

fθ(xi, θ) = ([partial differential]/[partial differential]θ)f(xi, θ), and ψ(u) = ([partial differential]/[partial differential]u)h2(u).

To deal with heteroscedasticity, one may define a weighted M-estimation procedure similar in spirit to the popular IWLSE methodology using a variance function. Thus the WME is obtained by solving the following minimization problem:

(θ^nτ^n)=Argmin[i,j{h2(yijf(xi,θ)σ(zi,τ))+logσ(zi,τ)}:θ[set membership]Rp,τ[set membership]Rq],

where log σ(zi, τ) is added within the sum, which is analogous to maximum likelihood estimation when the errors are normally distributed. The above minimization problem can be solved using the following estimating equations:

i,jλ(xi,yij,θ^n,τ^n)=0,
(1)

where

λ(xi,yij,θ,τ)=(k(zi,τ)ψ(εij)fθ(xi,θ)k(zi,τ){ψ(εij)εij1}στ(zi,τ)),
(2)

στ (zi, τ) = ([partial differential]/[partial differential]τ)σ(zi, τ), and k(zi, τ) = 1/σ(zi, τ).

We now derive the asymptotic normality of the WME. It is important to note that all the asymptotic results discussed in this paper are valid as long as n, the total sample size, goes to infinity. Two types of asymptotics often discussed in the literature are (a) the ni is finite (e.g., ni = 1) for all i and the number of doses becomes large (c.f., Wu, 1986), and (b) ni goes to infinity for i = 1, …, k (c.f., Peddada and Smith, 1997). Although our asymptotic results are valid under both situations, in the following theorem we provide the proofs for (a). The proof for (b) can be obtained similarly and hence omitted. Essentially, all the asymptotic results obtained in this paper are valid as long as the total sample size goes to infinity.

We require the following sets of regularity conditions concerning (A) the score function ψ, (B) the function f and (C) the function σ.

  • [A1]
    Let ε = {yf(x, θ)}/σ(z, τ). Then γ1 = E{ψ(ε)ε}(≠ 0); ′(ε) = γ2 (≠ 0); E{ψ′(ε)ε2} = γ3 (≠ 0); Eψ2(ε)=σψ12<;var{ψ(ε)ε}=σψ22<.
  • [B1]
    1. limn→∞n−1Γ1n(θ, τ) = Γ1(θ, τ), where
      Γ1n(θ,τ)=γ2i=1nk2(zi,τ)fθ(xi,θ)fθT(xi,θ).
    2. limn→∞n−1Γ31n(θ, τ) = Γ31(θ, τ), where
      Γ31n(θ,τ)=σψ12i=1nk2(zi,τ)fθ(xi,θ)fθT(xi,θ),
      and Γ31(θ, τ) is a positive definite matrix.
    3. maxi{k2(zi,τ)fθT(xi,θ)Γ31n1(θ,τ)fθ(xi,θ)}0, as n → ∞
  • [C]
    1. limn→∞n−1Γ2n(θ, τ) = Γ2(θ, τ), where
      Γ2n(θ,τ)=i=1n{2γ1+γ31σ2(zi,τ)στ(zi,τ)στT(zi,τ)+1γ1σ(zi,τ)τ(zi,τ)},
      and Στ(zi, τ) = ([partial differential]2/[partial differential]τ[partial differential]τT)σ (zi, τ).
    2. limn→∞n−1Γ32n(θ, τ) = Γ32(θ, τ), where
      Γ32n(θ,τ)=σψ22i=1nk2(zi,τ)στ(zi,τ)στT(zi,τ),
      and Γ32(θ, τ) is a positive definite matrix.
    3. maxi{k2(zi,τ)στT(zi,τ)Γ32n1(θ,τ)στ(zi,τ)}0, as n → ∞

The asymptotic normality of the WME is established in Theorem 1 under the above regularity conditions.

Theorem 1

Under the conditions [A1], [B1] and [C]; [S1] – [S9] in the supplementary material,

Γ^12n(θ^nθτ^nτνn(θ,τ))Np+q(0,Ip+q),
(3)

where

νn(θ,τ)=(1nΓ2n(θ,τ))1γ11ni=1nk(zi,τ)στ(zi,τ),Γ^=(1nΓ5n(θ^n,τ^n))1(1nΓ3n(θ^n,τ^n))(1nΓ5n(θ^n,τ^n))1,Γ3n(θ,τ)=(Γ31n(θ,τ)00Γ32n(θ,τ)),

and

Γ5n(θ,τ)=(Γ1n(θ,τ)00Γ2n(θ,τ)).

Note that the above theorem also provides the asymptotic covariance matrix of the WME.

2.2 Preliminary test estimation

We now describe the PTE procedure in the context of dose-response studies. Based on our experience with dose-response studies in toxicology, it is reasonable to assume that log-variance is a linear function of dose. Thus we assume log σi = τ0 + τ1xi. Then τ1 = 0 corresponds to homoscedasticity, while τ1 not equal to 0 corresponds to heteroscedasticity. Without loss of generality, in this article we consider assays where the response increases with dose. Accordingly, under heteroscedasticity, we assume that potentially the variance increases with dose (i.e., τ1 > 0). Thus we determine if the data are heteroscedastic by testing the hypotheses H0: τ1 = 0 vs. H1: τ1 > 0. Depending upon the researcher’s belief, one could test for τ1 ≠ 0 or τ1 < 0. Let rij denote the residual based on the OME. Then under suitable conditions [tau] = (ZTZ)−1ZTu is asymptotically normally distributed (Sen et al., 2009), where Z = (z11, z12, …, z1n1, …, zk,nk)T is an n × 2 matrix and u = (u11, u12, …, u1n1, …, uk,nk)T is an n × 1 vector, with zij = (1, xi)T and uij = log |rij|, i = 1, …, k, j = 1, …, ni, i=1kni=n. Hence we may test the above hypotheses using Tn = [tau]1n/√var([tau]1n), where [tau]1n is the least squares estimator of τ1. One can use a variety of test statistics for testing for τ1, but here we use a simple statistic which can be derived directly from residuals based on the OME.

Then, the PTE is defined as

θ^nPT={θ~nifTntα,n2θ~nifTn>tα,n2,
(4)

where tα,n−2 is the critical value of the t-distribution with n − 2 degrees of freedom having probability 1 − α and α is the significance level of the preliminary test.

In order to derive asymptotic results regarding PTE, we require the following sets of regularity conditions.

  • [A2]
    Let ε = {yf(x, θ)}/σ(z, τ). Then, ′(σ(z, τ) ε) = γ4 (≠ 0), Eψ2(σ(z,τ)ε)=σψ32w1(x)< and E{ψ(ε)ψ(σ(z,τ)ε)}=σψ42w2(x)<.
  • [B2]
    1. limn→∞n−1Γ4n(θ) = Γ4(θ), where
      Γ4n(θ)=γ4i=1knifθ(xi,θ)fθT(xi,θ).
    2. limn→∞n−1Γ33n(θ) = Γ33(θ), where
      Γ33n(θ)=σψ32i=1kniw1(xi)fθ(xi,θ)fθT(xi,θ),
      and Γ33(θ) is a positive definite matrix.
    3. limn→∞n−1Γ34n(θ, τ) = Γ34(θ, τ), where
      Γ34n(θ,τ)=σψ42i=1kniw2(xi)σifθ(xi,θ)fθT(xi,θ),
    4. limn→∞n−1G2n(θ, τ) = G2(θ, τ), where
      G2n(θ,τ)=(Γ31n(θ,τ)Γ34n(θ,τ)0Γ34n(θ,τ)Γ33n(θ)0002n2i=1kniwi122),
      (5)
      wi12 is the second element of wi1 = (ZTZ)−1zi1, and G2(θ, τ) is a positive definite matrix.
    5. maxi ch1{G1(xi, θ, τ) G2n(θ, τ))−1} → 0, as n → ∞, where
      G1(xi,θ,τ)=(σψ12σi2Hiσψ42w2(xi)σi1Hi0σψ42w2(xi)σi1Hiσψ32w1(xi)Hi0002n2wi122),
      and Hi=fθ(xi,θ)fθT(xi,θ).

We begin by proving the asymptotic joint normality of OME and WME which is then used for deriving the asymptotic covariance matrix of the PTE. The proof, provided in the appendix, follows arguments similar to that of Theorem 1.

Theorem 2

Let β = (θT, θT, τ1)T and β^n=(θ^nT,θ~nT,τ^1n)T. Then, under the conditions [A1], [A2], [B1], [B2] and [C]; [S1] – [S9] in the supplementary material,

n(β^nβ)N2p+1(0,G(θ,τ))asn,
(6)

where

G(θ,τ)=G31(θ,τ)G2(θ,τ)G31(θ,τ),G3(θ,τ)=(Γ1(θ,τ)000Γ4(θ)0002).

From the above theorem we deduce the asymptotic covariance matrix of PTE in the following theorem.

Theorem 3

Under the conditions [A1], [A2], [B1], [B2] and [C]; [S1] – [S9] in the supplementary material,

E[n(θ^nPTθ)(θ^nPTθ)T]=Ft(tα,n2τ1var(τ^1n))E[n(θ~nθ)(θ~nθ)T]+{1Ft(tα,n2τ1var(τ^1n))}E[n(θ^nθ)(θ^nθ)T]=Ft(tα,n2τ1var(τ^1n))(1nΓ4n(θ))1(1nΓ33n(θ))(1nΓ4n(θ))1+{1Ft(tα,n2τ1var(τ^1n))}(1nΓ1n(θ))1(1nΓ31n(θ,τ))(1nΓ1n(θ))1,
(7)

where Ft is the cdf of the t-distribution with n − 2 degrees of freedom.

As we see from the above theorem, the asymptotic covariance matrix of PTE is a weighted average of those of OME and WME. The weights are directly related to τ1. In particular, the weight corresponding to the covariance of OME is monotonically decreasing in τ1. It equals 1 − α when τ1 = 0.

3 Simulation studies

3.1 Study design

We simulated data using the following Hill model under three different error variance structures. In each case the errors are normally distributed with mean 0.

yij=f(xi,θ)+εij=θ0+θ1xiθ2x3θ2+xiθ2+εij,i=1,,8,j=1,,5.
(8)

The values of xi were set to be 0, 1, 3, 10, 30, 100, 400, 600, and (θ0, θ1, θ2, θ3) = (1, 4, 1. 5, 120). In Data 1 the errors are homoscedastic with variance e−3. In Data 2 the variances were chosen to follow the log-linear model in dose as in the previous section. Thus the generated data are heteroscedastic with variance e−6+0.01xi. Lastly, to evaluate the performance of the proposed method when the variance model is mis-specified, in Data 3 we generated data according to the variance model, 0. 01f2(xi; θ). We also investigated the performance of the proposed methodology in the presence of outliers. Typically, in toxicological studies outliers are observed in the high dose group where the observed response may drop below the expected response because of deaths due to treatment toxicity. For this reason, we generated data with outliers in the two highest dose groups using a shifted normal error with mean centered at −3 rather than 0.

There are two parts to the simulation study. Firstly, for illustration purposes, we generated one data set of each type (i.e., Data 1, Data 2 and Data 3) and the parameters were estimated using OLSE, IWLSEN, IWLSEM, OME, WME, and PTE methods. We used 0.05 as the significance level for the preliminary test in the PTE methodology.

Secondly, using 1,000 simulation runs, we compared the performance of the estimators in terms of three standard criteria: (i) mean squared error (MSE) of individual parameters as well as total MSE, (ii) the coverage probabilities of the 95% confidence intervals (CIs) of individual parameters as well as the simultaneous confidence ellipsoid defined below, and (iii) the length of the 95% CI of individual parameters as well as the volume of 95% confidence ellipsoid for each estimator. The 100(1 − α)% confidence ellipsoid centered at an estimator [theta w/ hat] is defined as

(θ^θ)T[var^(θ^)]1(θ^θ)pFp,np(α),

where p is the number of parameters and var^(θ^) is the appropriate variance estimator. Note that for simplicity we are using the critical values based on WME for the critical values for PTE because the exact critical values for PTE are not available. This is because the asymptotic distribution of PTE is not normal but a mixture of two normals.

3.2 Results

The results of various estimates (and their standard errors) for the three simulated data sets without outliers are summarized in Table 1. As described in the introduction, for homoscedastic data (Data 1), although the fitted curves using IWLSEN, IWLSEM, and WME methods may seem reasonable based on the data (Fig. 2(a)), their estimated values of θ3 (ED50) differed from the true value (120) much more than those of the other methods and also they had very large standard errors. However, PTE automatically selected OME and the standard errors of PTE were much less than those of WME. Similarly, as expected, the converse was true in the case of heteroscedastic data (Data 2 and 3).

Figure 2Figure 2Figure 2
Model predictions by OLSE (solid), IWLSEN (dot), IWLSEM (dashes), OME (long dashes), WME (dot-dash) methods when there are no outliers for: (a) homoscedastic data (Data 1), (b) heteroscedastic data (Data 2), and (c) mismodeled heteroscedastic data (Data ...
Table 1
Estimate and Standard Error for parameters of the models for Data 1, 2 and 3 without outliers using OLSE, IWLSEN, IWLSEM, OME, WME and PTE methods.

Note that if the data are homoscedastic (Data 1), then the “correct” choice of estimator is OME (OLSE when there are no outliers), whereas for heteroscedastic data (Data 2), the “correct” choice is WME (IWLSEM when there are no outliers). However, in a practical setting, for a given data set one does not know a priori whether the data are homoscedastic or heteroscedastic. In all three data sets, PTE automatically chose the “correct” estimation procedure (either OME or WME) while keeping the standard error nearly as small as that of the “correct” estimation procedure.

The results of various estimates (and their standard errors) for the three data sets with outliers are summarized in Table 2. Because there were outliers in the data, θ1 (Emax) and θ3 (ED50) were severely underestimated using the least square estimators (true values of θ1 and θ3 are 4 and 120, respectively), while OME and WME were closer to the true values. Figure 3 also reflects this result. It is noted that because of the outliers the preliminary test rejected the null hypothesis of homoscedasticity and PTE selected WME, which was closer to the true value than OME, even though the data were homogeneous.

Figure 3Figure 3Figure 3
Model predictions by OLSE (solid), IWLSEN (dot), IWLSEM (dashes), OME (long dashes), WME (dot-dash) methods when there are outliers for: (a) homoscedastic data (Data 1), (b) heteroscedastic data (Data 2), and (c) mismodeled heteroscedastic data (Data ...
Table 2
Estimate and Standard Error for parameters of the models for Data 1, 2 and 3 with outliers using OLSE, IWLSEN, IWLSEM, OME, WME and PTE methods.

Table 3 shows the results of the 1,000 simulations using data without outliers. When data were generated from homoscedastic model (Data 1), as expected, the estimated mean squared errors of OLSE and OME were smaller than those of the other estimators (except θ0, the intercept) and the estimated MSEs of PTE were slightly larger than those of OME and much smaller than those of IWLSEN and IWLSEM, especially for θ3 (ED50). Relative to OME, the loss in efficiency (in terms of total MSE) due to PTE was less than 0.1%. However, the PTE gained substantially relative to all the weighted estimators. For example, there was a 44% gain in efficiency relative to IWLSEN, an 18% gain relative to IWLSEM and a 22% gain in efficiency relative to WME. Furthermore, the coverage probability of PTE was closer to the nominal level (0.95) than that of IWLSEN and IWLSEM for θ3 with similar length of CI.

Table 3
Simulation results based on 1,000 replications generated from Data 1, 2 and 3 without outliers for OLSE, IWLSEN, IWLSEM, OME, WME and PTE.

In the case of heteroscedastic data (Table 3) we observe that OLSE and OME could potentially perform extremely poorly. This is because when there is a large variation in the higher dose groups the observed data may fail to “plateau” at higher dose groups. Consequently, estimators such as the OLSE and OME would tend to overestimate θ1 (Emax) and θ3 (ED50). For this reason, for some random samples, the estimates of ED50 became arbitrarily large. As a consequence the estimated MSEs of OLSE and OME became extremely large! However, as one would expect, estimators such as IWLSE and WME performed well for such data with smaller MSE, and approximately correct coverage probability. The PTE performed as well as IWLSE and WME in terms of MSE and coverage probability. It competed very well in terms of efficiency relative to the weighted estimators (in terms of total MSE). For example, in the case of Data 2, it was 23% more efficient than IWLSEN, about 4% less efficient than IWLSEM and was just as efficient as WME. We see similar relative efficiencies in the case of Data 3, but the striking result here is that PTE was almost 270% more efficient than IWLSEN. The PTE performed well in attaining the true coverage probability (0.95) although it had slightly wider confidence region than the weighted estimators since for some samples it used the unweighted estimator, OME. As expected, PTE performed substantially better than the unweighted estimators such as OLSE and OME in terms of all criteria. The reduction in total MSE was substantial.

Table 4 shows the results of the 1,000 simulations using data with outliers. As expected, the least squares based methods performed very poorly in terms of MSE and coverage probability in the presence of outliers, while the M-estimation based methods performed much better. In some cases the coverage probability of CIs centered at the least squares estimators were substantially smaller than the nominal level. For example, in the case of Data 2, the coverage probability of CIs centered at IWLSEM for parameter θ1 was as low as 18%.

Table 4
Simulation results based on 1,000 replications generated from Data 1, 2 and 3 with outliers for OSLE, IWLSEN, IWLSEM, OME, WME and PTE.

In all the cases, the gains in efficiency (in terms of total MSE) of WME and PTE relative to IWLSEN and IWLSEM was almost 100% whether the data were homoscedastic or heteroscedastic. Because the PTE was developed using the M-estimators (OME and WME), the PTE also performed much better than the least squares methods. It is also noted that the MSE of PTE was exactly same as that of WME because the preliminary test rejected the null hypothesis of homoscedasticity for all 1,000 data sets. As explained earlier, in the presence of outliers, the test for heteroscedasticity can potentially have a higher Type I error rate. In the present context that is not an undesirable feature.

Our simulation studies made a strong case for the use of the proposed methodology. The gains in terms of MSE were generally substantial.

As commonly understood, the performance of an estimator may be affected much by dose placement. To illustrate this point, we generated a data set from the Hill model with homoscedastic error. The true values of the parameters are (θ0, θ1, θ2, θ3) = (2, 4, 2, 30) and the values of the dose are x = 0, 1, 3, 10, 30, 100, 300. Because all estimators considered in this paper are affected, the OLSE was used to fit the curve for simplicity. The fitted curve and the estimation result are presented in Fig. 4(a) and Table 5, respectively. Even though the fitted curve is visually reasonable based on the data, the estimate of the parameter θ2 (slope), its standard error and the standard error of the estimate of θ3 (ED50) are extremely large. We then chose additional dose arbitrarily, that is, x = 70, to generate observations at the dose and estimate the parameters and their standard errors based on the new data set. The fitted curve and the estimation result are presented in Fig. 4(b) and Table 5, respectively. Now the estimate of θ2, its standard error and the standard error of the estimate of θ3 are all reasonably small. This illustration suggests the same argument presented in Lim et al. (2011) that “dose-spacing plays a major role when estimating parameters of nonlinear models, especially the ED50 and the slope parameters of a Hill model”.

Figure 4Figure 4
Data generated from the Hill model and the fitted curve (a) without x = 70 and (b) with x = 70 using OLSE.
Table 5
Estimate and Standard Error for parameters of the models for data generated from the Hill model with/without x = 70 using OLSE method.

4 Application to hexavalent chromium data

In this section the proposed PTE methodology is applied to a data set from a toxicological study that was designed to examine the relationship between concentrations of Hexavalent Chromium (CrVI), as sodium dichromate dihydrate, in drinking water and accumulation of total chromium in tissue for three species (rats, mice, and guinea pigs) (NTP, 2007).

As commonly done in toxicology, we use the Hill model (8) to describe the dose-response relationship. In our model, x denotes the dose (in mg/L), ranging from 0 to 300, and y denotes the total chromium concentration (in mg/L) in the tissues. The proposed methodology is illustrated using rat kidney and blood data sets where chromium concentration (y) is modeled. There were 7 dose levels and 4 observations at each dose level except x = 0 (3 observation). Thus, total sample size is 27. Based on each of the data sets the parameters (and their standard errors) are estimated using OLSE, IWLSEN, IWLSEM, OME, WME, and PTE methods. Estimates of parameters and their standard errors are summarized in Table 6. The corresponding fitted curves are plotted in Fig. 5.

Figure 5Figure 5
Chromium concentration in (a) rat kidney and (b) rat blood using OLSE (solid), IWLSEN (dot-dash), IWLSEM (long dashes), OME (dot), WME (dashes) methods.
Table 6
Estimate and Standard Error for parameters of the models for chromium rat kidney and blood data using OLSE, IWLSEN, IWLSEM, OME, WME and PTE methods.

Visually the scatter plot in Fig. 5(a) seems to suggest that there is some amount of heteroscedasticity in the data. The sample variances of kidney chromium for the seven dose groups were 0.0012, 0.0046, 0.0006, 0.054, 0.720, 0.213, and 6.507, respectively. Thus they ranged from about 0.0006 to 6.507, which indicates potential heteroscedasticity in the data. When the log-linear model for the absolute residual based on the OME was fitted against doses, the estimated value of τ1 was 0.005 with a standard error of 0.002. The slope was significant at the 5% level (p = 0. 009). Thus the data appears to be heteroscedastic.

The data in Fig. 5(a) appear to be heteroscedastic and hence it is not surprising that the point estimates and their standard errors are quite different between ordinary estimates (OLSE and OME) and weighted estimates (IWLSEN, IWLSEM and WME) (see Table 6). Since the preliminary test rejected the null hypothesis, PTE is the same as WME and both estimators had similar standard errors.

On the other hand, Fig. 5(b) shows that the data might be homoscedastic. The sample variances of blood chromium were 0.0006, 0.0002, 0.0003, 0.0001, 0.0007, 0.0016, and 0.0013, respectively. Thus they ranged from about 0.0001 to 0.0016. The result of the preliminary test using the absolute residuals based on the OME revealed that the slope (τ1) was not significantly greater than zero at the 5% level (p = 0. 531). Thus the data appear homoscedastic.

Visually the fitted curves are almost identical except IWLSEN. However, Table 6 shows that point estimates and their standard errors are not similar, although OLSE and OME are exactly the same. Again, point estimates of θ3 (ED50) and their standard errors using OLSE (OME) and WME are quite different from each other. For this data set, the preliminary test could not reject the null hypothesis, and hence PTE is same as OME, although standard error for OME is larger than that for WME.

5 Concluding remarks

In this paper, PTE based methodology has been developed for analyzing nonlinear models that are possibly subject to heteroscedastic variance structure. The methodology proposed here allows researchers to use estimation procedures that are robust to the error variance structure in nonlinear models. We demonstrated its utility using simulation studies and a real data set obtained by the NTP on chromium VI.

The proposed methodology depends on the model used for describing the heteroscedasticity. In our experience, a log-linear model for variance is plausible for data observed in these toxicological experiments because of the underlying sigmoidal shaped dose-response curve and variance being monotonic with mean response. Also, the log-linear model offers a simple interpretation of variability with fewer parameters. If, however, an experimenter is interested in using a different parametric model to describe the variance, then the proposed methodology can be modified easily. For example, Lim et al. (2011a) have studied WME using a different variance model, where the error standard deviation was assumed to be a nonlinear function of three unknown parameters.

According to their document “Benchmark Dose Technical Guidance Document” (USEPA, 2000), for independent continuous outcome variables, the EPA determines BMD using nonlinear least squares methodology. They typically estimate various parameters of the model, including BMD, using the ordinary least squares estimator (OLSE) for homoscedastic data and the weighted least squares estimator (WLSE) for heteroscedastic data. According to the above document, they determine whether the data are homoscedastic or heteroscedastic by visual judgment using a scatter plot of the data. After making decisions regarding the error variance on the basis of the observed data, the EPA either uses the OLSE or the WLSE. However, since the choice of WLSE and OLSE depends upon the observed data, the standard errors of the estimators of various parameters (including BMD) should account for the uncertainty in decision made regarding the error variance. This is not done by the EPA’s methodology, which is the point of our paper where we account for the uncertainty induced by the preliminary evaluation of data regarding the error variance. Hence we believe that the methodology developed in this paper can be extended to estimate the BMD.

Supplementary Material

01

Acknowledgments

This research was supported, in part, by the Intramural Research Program of the NIH, National Institute of Environmental Health Sciences [Z01 ES101744-04]. We thank Drs Gregg Dinse and Paramita Saha as well as the editors and the referees for many important comments which helped improve the presentation of the manuscript.

Appendix A. Proofs of the main results

Proof of Theorem 1

The proof relies on the asymptotic linearity of the WME (Lim et al., 2011b) and the existence of a solution to (1) which yields a √n-consistent estimator of (θT, τT)T (Theorem 4 in the supplementary material).

From Theorem 5 in the supplementary material we have that:

n(θ^nθτ^nτ)=(1nΓ5n(θ,τ))11ni=1nλ(xi,yi,θ,τ)+op(1).

Then from Theorem 6 in the supplementary material and the Slutsky Theorem we have the expression in (3).

Proof of Theorem 2

From the estimating equation (1) for the WME, we let

λw(xi,yij,θ)=1σiψ(yijf(xi,θ)σi)fθ(xi,θ).

Then, from (13) in Theorem 5 in the supplementary material, the following asymptotic representation of [theta w/ hat]n is obtained:

n(θ^nθ)=(1nΓ1n(θ,τ))11ni,jλw(xi,yij,θ)+op(1).
(9)

Now, similarly as in Theorem 4 and 5 in the supplementary material, the uniform asymptotic linearity on the OME can be shown as:

sup[mid ][mid ]t[mid ][mid ]C1ni,j{λo(xi,yij,θ+n12t)λo(xi,yij,θ)}+1nΓ4n(θ)t=op(1),

and hence the following asymptotic representation of [theta w/ tilde]n is obtained:

n(θ^nθ)=(1nΓ4n(θ))11ni,jλo(xi,yij,θ,)+op(1).
(10)

Then, from (9), (10) and (16) in the supplementary material,

n(β^nβ)=G3n1(θ,τ)1ni,jλ*(xi,yij,θ)+op(1),

where

G3n(θ,τ)=(n1Γ1n(θ,τ)000n1Γ4n(θ)0002).

Then from Theorem 7 in the supplementary material and the Slutsky Theorem the expression in (5) is shown.

Proof of Theorem 3

From (4), for arbitrary x [set membership] Rp,

P{n(θ^nPTθ)x}=P{n(θ~nθ)x,Tntα,n2}+P{n(θ^nθ)x,Tn>tα,n2}=P{n(θ~nθ)x}P{Tntα,n2}+P{n(θ^nθ)x}P{Tn>tα,n2}=Ft(tα,n2τ1var(τ^1n))P{n(θ~nθ)x}+{1Ft(tα,n2τ1var(τ^1n))}P{n(θ^nθ)x}.

The second equality above holds because of the asymptotic independence of [theta w/ tilde]n and Tn as well as the asymptotic independence of [theta w/ hat]n and Tn proved in Theorem 2. Then from (6),

E[n(θ^nPTθ)(θ^nPTθ^)T]=Ft(tα,n2τ1var(τ^1n))E[n(θ~nθ)(θ~nθ)T]+{1Ft(tα,n2τ1var(τ^1n))}E[n(θ^nθ)(θ^nθ)T],

and since from (6),

E[n(θ~nθ)(θ~nθ)T]=(1nΓ4n(θ))1(1nΓ33n(θ))(1nΓ4n(θ))1

and

E[n(θ^nθ)(θ^nθ)T]=(1nΓ1n(θ,τ))1(1nΓ31n(θ,τ))(1nΓ1n(θ,τ))1,

the expression in (7) is proved.

Appendix B. Supplementary material

Supplementary material associated with this article can be found in the online version.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References

  • Ahmed SE, Doksum KA, Hossain S, You J. Shrinkage, pretest and absolute penalty estimators in partially linear models. Aust N Z J Stat. 2007;49:435–454.
  • Bellio R, Jensen JE, Seiden P. Applications of likelihood asymptotics for nonlinear regression in herbicide bioassays. Biometrics. 2000;56:1204–1212. [PubMed]
  • Carroll RJ, Ruppert D. Transformation and Weighting in Regression. Chapman and Hall; New York: 1988.
  • Crofton KM, Paul KB, DeVito MJ, Hedge JM. Short-term in vivo exposure to the water contaminant triclosan: evidence for disruption of thyroxine. Environ Toxicol Pharmacol. 2007;24:194–197. [PubMed]
  • Cysneiros FJA, Paula GA, Galea M. Heteroscedastic symmetrical linear models. Statist Prob Lett. 2007;77:1084–1090.
  • Davidian M, Carroll RJ. Variance function estimation. J Amer Statist Assoc. 1987;82:1079–1091.
  • Gaylor DW, Aylward LL. An evaluation of benchmark dose methodology for non-cancer continuous-data health effects in animals due to exposures to dioxin (TCDD) Regul Toxicol Pharm. 2004;40:9–17. [PubMed]
  • Guo H, Koul HL. Asymptotic inference in some heteroscedastic regression models with long memory design and errors. Ann Statist. 2008;36:458–487.
  • Hill AV. The possible effects of the aggregation of the molecules of haemoglobin on its dissociation curves. J Physiol. 1910;40(Suppl):iv–vii.
  • Hoferkamp C, Peddada SD. Parameter estimation in linear models with heteroscedastic variances subject to order restrictions. J Multivariate Anal. 2002;82:65–87.
  • Hoque Z, Khan S, Wesolowski J. Performance of preliminary test estimator under linex loss function. Commun Stat–Theor M. 2009;38:252–261.
  • Judge GG, Bock ME. The Statistical Implications of Pre-test and Stein-rule Estimators in Econometrics. North-Holland, Amsterdam: 1978.
  • Lim C, Sen PK, Peddada SD. Statistical inference in nonlinear regression under heteroscedasticity. Sankhya B. 2011a;72:202–218.
  • Lim C, Sen PK, Peddada SD. Asymptotic linearity of an M-estimator in heteroscedastic nonlinear regression models. 2011b submitted.
  • Lu X. Empirical likelihood for heteroscedastic partially linear models. J Multivariate Anal. 2009;100:387–396.
  • Ma Y, Chiou JM, Wang N. Efficient semiparametric estimator for heteroscedastic partially linear models. Biometrika. 2006;93:75–84.
  • Moreno E, Torres F, Casella G. Testing equality of regression coefficients in heteroscedastic normal regression models. J Statist Plann Inference. 2005;131:117–134.
  • National Toxicology Program. Toxicity Report Series 72. U.S. Department of Health and Human Services, Public Health Service, National Institutes of Health, RTP; North Carolina, U.S.A: 2007. NTP Toxicity Studies of Sodium Dichromate Dihydrate (CAS No. 7789-12-0) Administered in Drinking Water to Male and Female F344/N Rats and B6C3F1 Mice and Male BALB/c and am3-C57BL/6 Mice; pp. 1–G4. [PubMed]
  • Sand S, von Rosen D, Eriksson P, Fredriksson A, Viberg H, Victorin K, Filpsson AF. Dose-response modeling and benchmark calculations from spontaneous behavior data on mice neonatally exposed to 2,2′,4,4′,5-pentabromodiphenyl ether. Toxicol Sci. 2004;81:491–501. [PubMed]
  • Sanhueza AI, Sen PK. M-methods in generalized nonlinear models. In: Charalambides ChA, Koutras MV, Balakrishnan N., editors. Probability and Statistical Models with Applications. Chapman & Hall/CRC; New York: 2001. pp. 359–375.
  • Sen PK. On the asymptotic distributional risks of shrinkage and preliminary test versions of maximum likelihood estimators. Sankhya Ser A. 1986;48:354–371.
  • Sen PK, Singer JM, Pedroso de Lima AC. From Finite Samples to Asymptotic Methods in Statistics. Cambridge Univ. Press; New York: 2009.
  • Peddada SD, Smith T. Consistency of a class of variance estimators in linear models under heteroscedasticity. Sankhya Ser B. 1997;59:1–10.
  • US Environmetal Protection Agency. Benchmark Dose Technical Guidance Document. 2000. EPA/630/R-00/001.
  • Wang L, Zhou XH. Assessing the adequacy of variance function in heteroscedastic regression models. Biometrics. 2007;63:1218–1225. [PubMed]
  • Wu CFJ. Jackknife, bootstrap and other resampling methods in regression analysis. Ann Stat. 1986;14:1261–1295.
  • You J, Chen G, Zhou Y. Statistical inference of partially linear regression models with heteroscedastic errors. J Multivariate Anal. 2007;98:1539–1557.