In the example presented, we have demonstrated an association of nuclear cataract risk with both reported dietary lutein/zeaxanthin and serum lutein/zeaxanthin. The association with reported dietary lutein/zeaxanthin is somewhat weaker than that reported in the original publication, with odds ratios between the 90th and 10th percentiles of approximately 0.75 compared with approximately 0.65 reported originally (6). The difference is due to our having modeled the risk according to a continuous intake model rather than in quintiles. Nevertheless, both our analysis and that presented previously (6) clearly demonstrate the association.
We have shown in our analysis that combining a biomarker of dietary intake with self-reported dietary intake can increase the statistical power for detecting a diet-disease association. The gains demonstrated in this example are rather similar to those reported previously in computer simulations (5). Besides using Howe's method for combining measures, we also applied a principal components method, described in detail previously (5). Given that the principal components results were very similar to those using Howe's method, we have reported only the latter here.
The gains obtained from combining a marker with a self-report may be considered in one of two ways. Firstly, we may ask what we gain from introducing a marker into a study. To answer this question, the correct comparison is between the combination of marker and self-report versus self-report alone, or between the marker alone and self-report alone. Here, we have demonstrated substantial efficiencies, equivalent to halving the sample size, although the monetary cost of introducing the marker may be high. Secondly, we may ask what we gain from combining a marker and a self-report when both are already included in the study. In this case, the correct comparison is between the combination and the best of either self-report or marker alone. Here, the power gains we have demonstrated are more modest, sometimes a few percent, sometimes a 20%–30% reduction in sample size. However, the monetary cost of performing the combined analysis is essentially nil because both measurements are already available.
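The correspondence between a gain in power and an equivalent reduction in sample size can be illustrated with a standard large-sample approximation that is not taken from the analysis reported here: for a Wald-type test, the test statistic grows roughly in proportion to the square root of the sample size, so the sample size needed for a given power scales with the inverse square of the statistic. The sketch below rests on that assumption; the function name and the z values are hypothetical.

```python
import math


def relative_sample_size(z_reference: float, z_new: float) -> float:
    """Approximate fraction of the reference sample size that the new
    analysis needs for equal power, assuming z grows like sqrt(n)."""
    if z_reference == 0 or z_new == 0:
        raise ValueError("test statistics must be nonzero")
    return (z_reference / z_new) ** 2


# Hypothetical values: a test statistic larger by a factor of sqrt(2)
# corresponds to roughly half the sample size.
z_self_report = 2.0
z_combined = 2.0 * math.sqrt(2)
fraction = relative_sample_size(z_self_report, z_combined)
```

Under the same approximation, a 20% to 30% reduction in sample size corresponds to a test statistic that is larger by roughly 12% to 20%.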
There are several limitations to our suggested approach. First, the measure of lutein in either the diet or serum might, to some extent, reflect other aspects of diet that, in addition to the effects of lutein itself, might also lower the risk of nuclear cataract. These, rather than improved estimation of lutein status, may explain stronger associations with cataract. For example, women who have high, compared with low, estimates of dietary lutein are likely to also have lower intakes of fat and higher intakes (and blood levels) of many other micronutrients that may lower risk of cataract (6). Thus, the stronger associations with cataract may reflect broader aspects of diet that are captured by a measure of lutein in the serum rather than less error in measuring lutein status. On the other hand, the gains in power demonstrated here are rather close to those predicted from computer simulations of a simple model in which all gains were due to reductions in measurement error (5).
Second, Howe's score has no recognized units, being the sum of 2 ranks. Nevertheless, as shown in , odds ratios for the 90th versus the 10th percentile may be estimated as a useful quantity to express the strength of the association with disease. This measure is really no different from the conventional odds ratio between highest and lowest quintiles often used in epidemiologic research reports. The combined approach proposed here could be used as an efficient means of establishing the existence of a nutrition-disease association. Subsequent analyses could explore the associations between dietary intake, biomarker level, and disease risk in more depth, including, for example, whether the dietary effect is mediated by the biomarker.
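As a minimal sketch of the score described above, assuming only that Howe's score is the sum of each participant's ranks on the 2 measures, the following code computes the score and converts a log-odds coefficient per unit of score into an odds ratio between the 90th and 10th percentiles. The function names, the handling of ties by average ranks, the example data, and the coefficient `beta` are illustrative assumptions; in practice, `beta` would come from a logistic regression of disease on the score.

```python
import numpy as np


def average_ranks(values):
    """Ranks from 1 to n, with tied values sharing their average rank."""
    values = np.asarray(values, dtype=float).ravel()
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    # Replace the ranks within each group of tied values by their mean.
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    rank_sums = np.bincount(inverse, weights=ranks)
    return rank_sums[inverse] / counts[inverse]


def howe_score(self_report, biomarker):
    """Sum of the 2 ranks for each participant (unitless)."""
    return average_ranks(self_report) + average_ranks(biomarker)


def odds_ratio_90_vs_10(beta, score):
    """Odds ratio comparing the 90th with the 10th percentile of the score,
    given beta, the log-odds change per unit of score (assumed known)."""
    spread = np.percentile(score, 90) - np.percentile(score, 10)
    return float(np.exp(beta * spread))


# Hypothetical data for 5 participants.
diet = [1.2, 0.4, 2.5, 3.1, 0.9]
serum = [0.30, 0.10, 0.25, 0.40, 0.20]
score = howe_score(diet, serum)
odds_ratio = odds_ratio_90_vs_10(-0.05, score)  # beta is a made-up value
```

Because the percentile spread is computed on the score itself, the resulting odds ratio can be compared across exposure measures even though the score has no units.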
Use of the combined exposure measure will not always increase statistical power over that obtained by using one of the separate exposures. For example, if the reported intake were to demonstrate no association (estimated odds ratio = 1) with disease while the serum level were found to be associated strongly (estimated odds ratio much greater than 1), then the combined exposure measure would likely have an estimated odds ratio intermediate between those for the separate exposure measures and would have less power than that for the serum-level analysis. The most successful results from the combined exposure measure will arise when the associations for each separate exposure are of similar strength, as occurs in the example presented here.
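The dilution described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reproduction of the analyses in this paper or in the earlier simulation work: the risk model, the error variances, the sample size, and the use of a two-sample z statistic on ranks as a stand-in for the regression test are all hypothetical choices made for illustration.

```python
import numpy as np


def ranks(values):
    """Ranks from 1 to n (ties are not expected in continuous simulated data)."""
    order = np.argsort(values)
    out = np.empty(len(values), dtype=float)
    out[order] = np.arange(1, len(values) + 1)
    return out


def case_control_z(score, is_case):
    """Two-sample z statistic comparing the mean score in cases and noncases."""
    cases, noncases = score[is_case], score[~is_case]
    se = np.sqrt(cases.var(ddof=1) / len(cases) + noncases.var(ddof=1) / len(noncases))
    return (cases.mean() - noncases.mean()) / se


rng = np.random.default_rng(0)
n = 20000
true_exposure = rng.normal(size=n)
serum = true_exposure + rng.normal(size=n)  # informative marker
reported = rng.normal(size=n)               # carries no information (odds ratio = 1)

# Hypothetical risk model: log-odds increase by 0.5 per unit of true exposure.
risk = 1.0 / (1.0 + np.exp(-(-2.0 + 0.5 * true_exposure)))
is_case = rng.random(n) < risk

z_serum = case_control_z(ranks(serum), is_case)
z_combined = case_control_z(ranks(serum) + ranks(reported), is_case)
```

In this setup the uninformative report adds variance but no signal to the summed ranks, so `z_combined` falls below `z_serum`; when both measures carry signal of similar strength, the opposite ordering would be expected.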
The markers that should be considered for combination with dietary reports are those that have been shown in controlled feeding studies to be modified by changes in the relevant dietary intake. If changes in such a marker causally affect disease risk, then dietary intake will also affect disease risk, which justifies combining the 2 measures. Clearly, situations will arise in which no suitable marker exists for the nutrient or food under study.
It is not essential that there be a high correlation between biomarker and reported dietary intake. More important is the biomarker's correlation with true intake. To be helpful in combination, this correlation should be similar to, or higher than, the correlation between reported intake and true intake (5). Information about these correlations may be available from controlled feeding studies.
It would also be helpful if the biomarker were known not to be affected by other risk factors for the disease. If other risk factors were to affect the biomarker, then the association between biomarker and disease would be at least partly a result of confounding. In the worst case, modifying the biomarker level through diet change might not affect disease. This problem of confounding has been an important consideration in previous studies of biomarkers and disease. If a strong risk factor for the disease is known to affect the marker, that risk factor must be included in the disease risk model to avoid ascribing its effect to nutritional causes. In our study, smoking was included in the full model since it is associated with nuclear cataract and may also lead to depletion of lutein and zeaxanthin in blood, being a source of free radicals and oxidative stress (12).
Including biomarker measurements for all participants in a large study is a considerable challenge and can be extremely expensive. However, collecting biologic samples from participants is no longer uncommon, and their uses are manifold. Thus, the proposed approach will be feasible for studies with an already established “biobank.” Note that, for large prospective studies, the analytic cost of the bioassays need not be prohibitive if analyses are based on a nested case-control design. In view of the potential reductions of approximately 50% in sample size, it seems worth considering, in the design stage of a cohort study, the budgets required for a study with and without biomarkers relevant to the exposures of greatest interest.
The extent to which a dietary intake effect is mediated by the biomarker is often unknown. Methods that perform well under different disease risk models are therefore to be encouraged. We found in previous simulation work (5) that Howe's method seemed to do this. In the scenarios examined, it was uniformly superior to univariate dietary intake analysis. It was also superior to univariate biomarker analysis under the no-mediation model and was not substantially inferior to that analysis under full mediation. Thus, when the extent to which the biomarker mediates the dietary effect is unknown, a combination approach would appear to be a good strategy.
In summary, we have provided an example of how combining a biomarker of dietary intake with self-reported dietary intake somewhat increases the statistical power to detect a diet-disease association. The gains in statistical power are fairly modest if compared with the best of the 2 separate exposure measures but are potentially useful in a research area in which measurement error severely limits our ability to elucidate such associations.