These results show that non-Gaussian data that include many zeroes need not greatly complicate regression calibration. If the relation between T′ and Q′ is not well described by a straight line, a simple linear calibration will still provide appropriate estimates when the “disease” regression is linear, or under the common condition of a relatively linear region of a logistic relation. However, a poorly fitting calibration equation will cause unnecessary loss of power. For instance, if $R_C^2 = 0.25$ and $R_T^2 = 0.3$, there is marked loss of power due to the calibration, measured as the relative sample size needed for equal power: the calibrated analysis requires a sample 5.29 times as large to obtain the same power as if T were observed. If $R_C^2 = 0.40$, this ratio improves to 3.14.
Thus, if there is substantial nonlinearity, an important improvement in power may be produced by the relatively small additional effort necessary to find a calibration model with better fit. In the AHS-2 data, we hardly ever found evidence of strong departures from linearity in the calibration equation, and the soy example () is typical. However, in other data sets or with other variables this may not be the case.
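The two sample-size ratios quoted above (5.29 and 3.14) can be checked numerically. The sketch below is an assumption, not the authors' stated formula: it uses the expression (1 − R_C² R_T²) / (R_C² (1 − R_T²)), which reproduces both quoted values when R_C² is read as the R² of the calibration equation and R_T² as the R² of the outcome regression on true intake T. The function name `sample_size_ratio` is hypothetical.

```python
def sample_size_ratio(r2_calib, r2_true):
    """Relative sample size needed by a calibrated analysis for equal power.

    Assumed form (inferred from the quoted numbers, not taken from the text):
        (1 - r2_calib * r2_true) / (r2_calib * (1 - r2_true))
    """
    if not (0.0 < r2_calib <= 1.0 and 0.0 <= r2_true < 1.0):
        raise ValueError("need 0 < r2_calib <= 1 and 0 <= r2_true < 1")
    return (1.0 - r2_calib * r2_true) / (r2_calib * (1.0 - r2_true))


# Values used in the text: R_T^2 = 0.3 with R_C^2 = 0.25 and 0.40.
for r2_calib in (0.25, 0.40):
    print(r2_calib, round(sample_size_ratio(r2_calib, 0.3), 2))
```

Under this assumed expression, a perfectly fitting calibration (R_C² = 1) gives a ratio of 1, which matches the idea that nothing is lost when T is effectively observed.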
A 2-part method described by Kipnis et al. (10) was developed for dietary surveys (with associated outcome data). It gives a combined analysis of calibration and main study data but requires that repeated reference measures (i.e., 24-hour recalls) be available for all subjects as the primary dietary measure. Logistic regression is used to model the probability that true exposure is greater than 0, and a Box-Cox transformation followed by linear regression is then used to model exposures greater than 0. Investigators in future cohort studies may be able to collect repeated recall data from all subjects, but such data are not available in most existing cohort studies.
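The two stages described above can be illustrated with a deliberately simplified sketch. It is not the method of Kipnis et al.: it uses one recall per subject, ignores the repeated recalls and any within-person modelling, and fixes the Box-Cox parameter at an assumed value of 0.5 instead of estimating it. All names (`simulate`, `fit_two_part`, `ffq`, `recall`) and the simulated coefficients are hypothetical.

```python
import math
import random


def expit(z):
    return 1.0 / (1.0 + math.exp(-z))


def box_cox(v, lam):
    """Box-Cox transform of a positive value v."""
    return math.log(v) if lam == 0 else (v ** lam - 1.0) / lam


def fit_logistic(x, y, iters=25):
    """Newton-Raphson fit of P(y = 1) = expit(b0 + b1 * x)."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = expit(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p               # score for the intercept
            g1 += (yi - p) * xi        # score for the slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def fit_line(x, y):
    """Ordinary least squares for y = a0 + a1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def simulate(n=4000, lam=0.5, seed=1):
    """Hypothetical data: many zero recalls, skewed positive amounts."""
    rng = random.Random(seed)
    ffq, recall = [], []
    for _ in range(n):
        q = rng.gauss(0.0, 1.0)
        if rng.random() < expit(-0.5 + 1.0 * q):       # part 1: any intake?
            z = 3.0 + 0.5 * q + rng.gauss(0.0, 0.3)    # part 2, Box-Cox scale
            amount = (lam * z + 1.0) ** (1.0 / lam)    # inverse Box-Cox
        else:
            amount = 0.0
        ffq.append(q)
        recall.append(amount)
    return ffq, recall


def fit_two_part(ffq, recall, lam=0.5):
    """Part 1: logistic model for recall > 0. Part 2: linear model on
    Box-Cox-transformed positive recalls."""
    consumed = [1 if r > 0 else 0 for r in recall]
    part1 = fit_logistic(ffq, consumed)
    pos = [(q, box_cox(r, lam)) for q, r in zip(ffq, recall) if r > 0]
    part2 = fit_line([q for q, _ in pos], [z for _, z in pos])
    return part1, part2


ffq, recall = simulate()
print(fit_two_part(ffq, recall))
```

With the simulated coefficients used here, the fitted values should fall close to (−0.5, 1.0) for the logistic part and (3.0, 0.5) for the linear part.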
Was the calibration worthwhile in the AHS-2 results of ? As expected, power is not greatly changed compared with an uncorrected regression, but the magnitude of effect was markedly changed. This is an important benefit, giving greater interest to the effects of the dietary variable on disease risk. In multivariate regression calibration, we have previously shown that effect size, the sizes of statistical tests (especially in large studies), and power may change markedly depending on the correlations between errors (16), providing an even greater motivation for corrected analyses.
Although the variance in the calibration equation coefficients was effectively removed in analyses of the simulated data by using a large calibration study, the main results presented still hold when the calibration coefficients are less precisely identified. $\mathrm{Var}(\beta_{\mathrm{calib}})$ can be divided into the 2 parts identified in equation 2. The first part results when the calibration coefficients are known precisely, and the second results from random variation in these estimates.
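Equation 2 is not reproduced in this section, so its exact form is not asserted here. As an illustration of a two-part split of this kind, the sketch below uses the standard first-order (delta-method) approximation for a univariate linear calibration, in which the corrected coefficient is the uncorrected coefficient divided by the calibration slope. That the authors' equation takes this form is an assumption, and the function and argument names are hypothetical.

```python
def var_beta_calibrated(beta_unc, var_beta_unc, slope, var_slope):
    """Delta-method variance of beta_unc / slope, returned as two parts.

    known_part:      variance if the calibration slope were known exactly
    estimation_part: extra variance from estimating the calibration slope
    """
    if slope == 0:
        raise ValueError("calibration slope must be non-zero")
    known_part = var_beta_unc / slope ** 2
    estimation_part = beta_unc ** 2 * var_slope / slope ** 4
    return known_part, estimation_part


# Hypothetical inputs, chosen only to show the arithmetic.
known, extra = var_beta_calibrated(0.2, 0.0025, 0.5, 0.01)
print(known, extra, known + extra)
```

A very large calibration study drives `var_slope` toward 0, so the second part vanishes and only the first part remains, which is the situation described for the simulated data.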
The log transformation is warranted when values of Q exhibit significant heteroscedasticity or skewness. Thoresen (23) has demonstrated in logistic regression calibration that although heteroscedasticity is not a severe problem, markedly skewed variables (rather than residuals) can produce biased results. However, as noted, transformations can disturb the required approximation that $E(R' \mid T') = T'$, even if it does hold exactly for the untransformed R. We also demonstrate that transformations can markedly affect the relative bias of uncorrected analyses.
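A small simulation can illustrate why a transformation may disturb the unbiasedness condition for the reference measure. The sketch assumes a hypothetical multiplicative error model, R = T·U with U uniform on (0.5, 1.5), so that E(R | T) = T holds exactly on the original scale; the true value T = 50 and the function name are likewise illustrative assumptions.

```python
import math
import random


def mean_shifts(true_t=50.0, n=100_000, seed=0):
    """Return (mean(R) - T, mean(log R) - log T) under R = T * U, E(U) = 1."""
    rng = random.Random(seed)
    total_r = 0.0
    total_log_r = 0.0
    for _ in range(n):
        u = rng.uniform(0.5, 1.5)   # error factor with mean exactly 1
        r = true_t * u
        total_r += r
        total_log_r += math.log(r)
    return total_r / n - true_t, total_log_r / n - math.log(true_t)


raw_shift, log_shift = mean_shifts()
print(raw_shift, log_shift)
```

On the original scale the shift should be close to 0, whereas on the log scale it should be close to the analytic value E(log U) ≈ −0.045. The reference measure is therefore no longer unbiased for the transformed true value, which is a consequence of Jensen's inequality.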
In summary, where the distribution of intake of a food is markedly non-Gaussian, perhaps containing many zeroes, linear regression calibration can safely be applied, with the usual assumptions. If the calibration relation is nonlinear, the calibration population must properly represent the cohort. In this situation, it could also be worth considerable effort to optimize the fit of the calibration equation, since this will improve the power of a calibrated analysis. As usual, when compared with uncorrected analyses, appropriate regression calibration will reduce bias (usually markedly), and where there is a nonlinear calibration relation, a better-fitting calibration equation will potentially also improve power.