3.1 Study design
We simulated data using the following Hill model under three different error variance structures. In each case the errors are normally distributed with mean 0.
The values of xi were set to be 0, 1, 3, 10, 30, 100, 400, 600, and (θ0, θ1, θ2, θ3) = (1, 4, 1. 5, 120). In Data 1 the errors are homoscedastic with variance e−3. In Data 2 the variances were chosen to follow the log-linear model in dose as in the previous section. Thus the generated data are heteroscedastic with variance e−6+0.01xi. Lastly, to evaluate the performance of the proposed method when the variance model is mis-specified, in Data 3 we generated data according to the variance model, 0. 01f2(xi; θ). We also investigated the performance of the proposed methodology in the presence of outliers. Typically, in toxicological studies outliers are observed in the high dose group where the observed response may drop below the expected response because of deaths due to treatment toxicity. For this reason, we generated data with outliers in the two highest dose groups using a shifted normal error with mean centered at −3 rather than 0.
There are two parts to the simulation study. Firstly, for illustration purposes, we generated one data set of each type (i.e., Data 1, Data 2 and Data 3) and the parameters were estimated using OLSE, IWLSEN, IWLSEM, OME, WME, and PTE methods. We used 0.05 as the significance level for the preliminary test in the PTE methodology.
Secondly, using 1,000 simulation runs, we compared the performance of the estimators in terms of three standard criteria: (i) mean squared error (MSE) of individual parameters as well as total MSE, (ii) the coverage probabilities of the 95% confidence intervals (CIs) of individual parameters as well as the simultaneous confidence ellipsoid defined below, and (iii) the length of the 95% CI of individual parameters as well as the volume of 95% confidence ellipsoid for each estimator. The 100(1 − α
)% confidence ellipsoid centered at an estimator
is defined as
is the number of parameters and
is the appropriate variance estimator. Note that for simplicity we are using the critical values based on WME for the critical values for PTE because the exact critical values for PTE are not available. This is because the asymptotic distribution of PTE is not normal but a mixture of two normals.
The results of various estimates (and their standard errors) for the three simulated data sets without outliers are summarized in . As described in the introduction, for homoscedastic data (Data 1), although the fitted curves using IWLSEN, IWLSEM, and WME methods may seem reasonable based on the data (), their estimated values of θ3 (ED50) differed from the true value (120) much more than those of the other methods and also they had very large standard errors. However, PTE automatically selected OME and the standard errors of PTE were much less than those of WME. Similarly, as expected, the converse was true in the case of heteroscedastic data (Data 2 and 3).
Estimate and Standard Error for parameters of the models for Data 1, 2 and 3 without outliers using OLSE, IWLSEN, IWLSEM, OME, WME and PTE methods.
Figure 2 Model predictions by OLSE (solid), IWLSEN (dot), IWLSEM (dashes), OME (long dashes), WME (dot-dash) methods when there are no outliers for: (a) homoscedastic data (Data 1), (b) heteroscedastic data (Data 2), and (c) mismodeled heteroscedastic data (Data (more ...)
Note that if the data are homoscedastic (Data 1), then the “correct” choice of estimator is OME (OLSE when there are no outliers), whereas for heteroscedastic data (Data 2), the “correct” choice is WME (IWLSEM when there are no outliers). However, in a practical setting, for a given data set one does not know a priori whether the data are homoscedastic or heteroscedastic. In all three data sets, PTE automatically chose the “correct” estimation procedure (either OME or WME) while keeping the standard error nearly as small as that of the “correct” estimation procedure.
The results of various estimates (and their standard errors) for the three data sets with outliers are summarized in . Because there were outliers in the data, θ1 (Emax) and θ3 (ED50) were severely underestimated using the least square estimators (true values of θ1 and θ3 are 4 and 120, respectively), while OME and WME were closer to the true values. also reflects this result. It is noted that because of the outliers the preliminary test rejected the null hypothesis of homoscedasticity and PTE selected WME, which was closer to the true value than OME, even though the data were homogeneous.
Estimate and Standard Error for parameters of the models for Data 1, 2 and 3 with outliers using OLSE, IWLSEN, IWLSEM, OME, WME and PTE methods.
Figure 3 Model predictions by OLSE (solid), IWLSEN (dot), IWLSEM (dashes), OME (long dashes), WME (dot-dash) methods when there are outliers for: (a) homoscedastic data (Data 1), (b) heteroscedastic data (Data 2), and (c) mismodeled heteroscedastic data (Data (more ...)
shows the results of the 1,000 simulations using data without outliers. When data were generated from homoscedastic model (Data 1), as expected, the estimated mean squared errors of OLSE and OME were smaller than those of the other estimators (except θ0, the intercept) and the estimated MSEs of PTE were slightly larger than those of OME and much smaller than those of IWLSEN and IWLSEM, especially for θ3 (ED50). Relative to OME, the loss in efficiency (in terms of total MSE) due to PTE was less than 0.1%. However, the PTE gained substantially relative to all the weighted estimators. For example, there was a 44% gain in efficiency relative to IWLSEN, an 18% gain relative to IWLSEM and a 22% gain in efficiency relative to WME. Furthermore, the coverage probability of PTE was closer to the nominal level (0.95) than that of IWLSEN and IWLSEM for θ3 with similar length of CI.
Simulation results based on 1,000 replications generated from Data 1, 2 and 3 without outliers for OLSE, IWLSEN, IWLSEM, OME, WME and PTE.
In the case of heteroscedastic data () we observe that OLSE and OME could potentially perform extremely poorly. This is because when there is a large variation in the higher dose groups the observed data may fail to “plateau” at higher dose groups. Consequently, estimators such as the OLSE and OME would tend to overestimate θ1 (Emax) and θ3 (ED50). For this reason, for some random samples, the estimates of ED50 became arbitrarily large. As a consequence the estimated MSEs of OLSE and OME became extremely large! However, as one would expect, estimators such as IWLSE and WME performed well for such data with smaller MSE, and approximately correct coverage probability. The PTE performed as well as IWLSE and WME in terms of MSE and coverage probability. It competed very well in terms of efficiency relative to the weighted estimators (in terms of total MSE). For example, in the case of Data 2, it was 23% more efficient than IWLSEN, about 4% less efficient than IWLSEM and was just as efficient as WME. We see similar relative efficiencies in the case of Data 3, but the striking result here is that PTE was almost 270% more efficient than IWLSEN. The PTE performed well in attaining the true coverage probability (0.95) although it had slightly wider confidence region than the weighted estimators since for some samples it used the unweighted estimator, OME. As expected, PTE performed substantially better than the unweighted estimators such as OLSE and OME in terms of all criteria. The reduction in total MSE was substantial.
shows the results of the 1,000 simulations using data with outliers. As expected, the least squares based methods performed very poorly in terms of MSE and coverage probability in the presence of outliers, while the M-estimation based methods performed much better. In some cases the coverage probability of CIs centered at the least squares estimators were substantially smaller than the nominal level. For example, in the case of Data 2, the coverage probability of CIs centered at IWLSEM for parameter θ1 was as low as 18%.
Simulation results based on 1,000 replications generated from Data 1, 2 and 3 with outliers for OSLE, IWLSEN, IWLSEM, OME, WME and PTE.
In all the cases, the gains in efficiency (in terms of total MSE) of WME and PTE relative to IWLSEN and IWLSEM was almost 100% whether the data were homoscedastic or heteroscedastic. Because the PTE was developed using the M-estimators (OME and WME), the PTE also performed much better than the least squares methods. It is also noted that the MSE of PTE was exactly same as that of WME because the preliminary test rejected the null hypothesis of homoscedasticity for all 1,000 data sets. As explained earlier, in the presence of outliers, the test for heteroscedasticity can potentially have a higher Type I error rate. In the present context that is not an undesirable feature.
Our simulation studies made a strong case for the use of the proposed methodology. The gains in terms of MSE were generally substantial.
As commonly understood, the performance of an estimator may be affected much by dose placement. To illustrate this point, we generated a data set from the Hill model with homoscedastic error. The true values of the parameters are (θ0
) = (2, 4, 2, 30) and the values of the dose are x
= 0, 1, 3, 10, 30, 100, 300. Because all estimators considered in this paper are affected, the OLSE was used to fit the curve for simplicity. The fitted curve and the estimation result are presented in and , respectively. Even though the fitted curve is visually reasonable based on the data, the estimate of the parameter θ2
(slope), its standard error and the standard error of the estimate of θ3
) are extremely large. We then chose additional dose arbitrarily, that is, x
= 70, to generate observations at the dose and estimate the parameters and their standard errors based on the new data set. The fitted curve and the estimation result are presented in and , respectively. Now the estimate of θ2
, its standard error and the standard error of the estimate of θ3
are all reasonably small. This illustration suggests the same argument presented in Lim et al. (2011)
that “dose-spacing plays a major role when estimating parameters of nonlinear models, especially the ED50
and the slope parameters of a Hill model”.
Data generated from the Hill model and the fitted curve (a) without x = 70 and (b) with x = 70 using OLSE.
Estimate and Standard Error for parameters of the models for data generated from the Hill model with/without x = 70 using OLSE method.