Am J Epidemiol. 2010 May 15; 171(10): 1079–1089.
Published online 2010 April 25. doi:  10.1093/aje/kwq026
PMCID: PMC2866739

Improvements in Ability to Detect Undiagnosed Diabetes by Using Information on Family History Among Adults in the United States


Family history is an independent risk factor for diabetes, but it is not clear how much adding family history to other known risk factors would improve detection of undiagnosed diabetes in a population. Using the National Health and Nutrition Examination Survey for 1999−2004, the authors compared logistic regression models with established risk factors (model 1) with a model (model 2) that also included familial risk of diabetes (average, moderate, and high). Adjusted odds ratios for undiagnosed diabetes, using average familial risk as referent, were 1.7 (95% confidence interval (CI): 1.2, 2.5) and 3.8 (95% CI: 2.2, 6.3) for those with moderate and high familial risk, respectively. Model 2 was superior to model 1 in detecting undiagnosed diabetes, as reflected by several significant improvements, including weighted C statistics of 0.826 versus 0.842 (bootstrap P = 0.001) and integrated discrimination improvement of 0.012 (95% CI: 0.004, 0.030). With a risk threshold of 7.3% (sensitivity of 40% based on model 1), adding family history would identify an additional 620,000 (95% CI: 221,100, 1,020,000) cases without a significant change in false-positive fraction. Study findings suggest that adding family history of diabetes can provide significant improvements in detecting undiagnosed diabetes in the US population. Further research is needed to validate the authors’ findings.

Keywords: decision analysis, logistic regression, mass screening, model fitting, nutrition surveys, risk

Family history is a consistent risk factor for many chronic diseases of public health significance (1) and, in the past few years, it has increasingly been discussed as a tool for preventing common diseases and for promoting health (2–4). In 2005, the US Surgeon General launched a public health campaign to enhance the public's awareness of the importance of family history, and the Centers for Disease Control and Prevention (CDC) has initiated a public health research initiative on this topic. The CDC's initiative is focused primarily on several common chronic diseases, including diabetes, stroke, heart disease, and cancers. Yet, in spite of the increased interest in family history as a public health tool, the clinical validity and utility of this readily obtained risk factor have not been systematically evaluated.

In the present study, we assessed the improvements in detecting undiagnosed diabetes among US adults that might be obtained by using information on family history. Among an estimated 24 million individuals with diabetes in the United States in 2007 (based on fasting plasma glucose), 28% (6.6 million) were undiagnosed (5). One of the rationales for asking undiagnosed people about their family history is that a number of diabetes risk models/tools have included family history of diabetes as a risk factor, with an estimated relative risk 2–6 times that of people without family history (6–15). Furthermore, other studies suggest that family history might be an effective screening tool for identifying both diabetes and undiagnosed diabetes (1, 3, 14, 16–18). Even so, none of these studies has formally evaluated the improvements in detecting undiagnosed diabetes by using family history. This is important in part because both empirical and theoretical analyses have suggested that a significant and independent risk factor for a disease does not necessarily increase the ability to detect the disease or enhance discrimination between people with and without disease (19).

Receiver-operating characteristic (ROC) curves and the associated C statistics are commonly used to summarize the diagnostic accuracy of risk models and to assess the improvements made to such models that are gained from adding other risk factors (20). Some studies, however, have criticized ROC curves for lacking the ability to display the risk in a particular population and to assess the reclassification of individuals into different risk groups (e.g., higher risk, lower risk) (19, 21). Recently, researchers have developed several alternative methods to assess the improvements made by a new marker or risk factor in risk models (22–25). Predictiveness curves, for example, display the distribution of risk in the population and also assess the classification ability of additional risk factors (24). Alternatively, the net reclassification improvement and integrated discrimination improvement integrate sensitivity, specificity, and the information from reclassification tables to assess improvements in risk models that include new risk factors (23). A third method involves net benefit curves, which might help to determine whether it would be cost-effective to include an additional risk factor in the risk model (25). We applied both conventional and recently developed methods to assess the improvements made from using family history to detect cases of undiagnosed diabetes among adults. Our source of data was the National Health and Nutrition Examination Survey (NHANES) for 1999–2004.


NHANES is a series of stratified, multistage probability surveys designed to obtain information on the health and nutritional status of the civilian, noninstitutionalized US population. From 1999, NHANES data have been collected continuously, with every 2 years serving as 1 analytical cycle. The data are collected by the National Center for Health Statistics, CDC, via household interviews and physical examinations and are intended to provide estimates that are representative of the US population. Detailed information is available elsewhere. The present study included 3 cycles (1999–2000, 2001–2002, and 2003–2004) of samples of adults aged ≥20 years who were examined in the morning after overnight fasting (between 8 and 23 hours) and did not have diagnosed diabetes. When combined data sets are analyzed, the sampling weights must be recalculated to produce unbiased estimates, because weights for the 1999–2000 cycle were based on population data prior to the 2000 US Census, and weights for the other cycles were based on the 2000 US Census. Detailed NHANES analytic and reporting guidelines provide algorithms to recalculate the sampling weights.
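The weight recalculation for the combined 1999–2004 cycles can be sketched as follows. This is a minimal illustration, assuming the rule in the NHANES analytic guidelines as we understand it: the 4-year weight supplied for 1999–2002 is scaled by 2/3, and the 2-year weight for 2003–2004 is scaled by 1/3. The function and argument names are hypothetical and do not come from the article.

```python
def six_year_weight(cycle, weight_4yr=None, weight_2yr=None):
    """Return a combined 1999-2004 sampling weight for one participant.

    Assumption (not stated in the article): 1999-2002 records carry a
    4-year weight that is scaled by 2/3, and 2003-2004 records carry a
    2-year weight that is scaled by 1/3.
    """
    if cycle in ("1999-2000", "2001-2002"):
        if weight_4yr is None:
            raise ValueError("a 4-year weight is required for 1999-2002")
        return (2.0 / 3.0) * weight_4yr
    if cycle == "2003-2004":
        if weight_2yr is None:
            raise ValueError("a 2-year weight is required for 2003-2004")
        return (1.0 / 3.0) * weight_2yr
    raise ValueError(f"unexpected cycle: {cycle!r}")


# Example with made-up weights for two participants.
print(six_year_weight("1999-2000", weight_4yr=30000.0))
print(six_year_weight("2003-2004", weight_2yr=45000.0))
```

Because each cycle contributes one third of the 6-year period, the rescaled weights sum to roughly the same population total as any single cycle.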

Undiagnosed diabetes and family history of diabetes

We excluded pregnant women and persons with diagnosed diabetes, with unknown diabetes status, and with missing values for some of the covariates. Participants with a fasting plasma glucose level of ≥126 mg/dL (7.0 mmol/L) who reported no previous diagnosis of diabetes were defined as cases of undiagnosed diabetes (26).

We classified all participants into 3 mutually exclusive groups of familial risk on the basis of their family history of diabetes among first- and second-degree relatives: 1) high (at least 2 first-degree relatives or 1 first-degree and at least 2 second-degree relatives from the same lineage); 2) moderate (just 1 first-degree and 1 second-degree relative with diabetes, or only 1 first-degree relative with diabetes, or at least 2 second-degree relatives with diabetes from the same maternal or paternal line); or 3) average (no family history of diabetes or, at most, 1 second-degree relative with diabetes) (15). We use the term “family history of diabetes” to mean all 3 groups (high, moderate, and average). Limited information on family history of diabetes in NHANES 1999–2004 does not allow further detailed analysis.
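The 3-level classification above can be written as a small decision function. The sketch below is illustrative only: the function name and inputs are hypothetical, "same lineage" is taken to mean that the second-degree relatives are on the same (maternal or paternal) side, and combinations the definition does not spell out (for example, 1 second-degree relative on each side and no first-degree relative) are assigned by assumption, as noted in the comments.

```python
def familial_risk(n_first, n_second_maternal, n_second_paternal):
    """Classify familial risk of diabetes as 'high', 'moderate', or 'average'.

    n_first: number of first-degree relatives with diabetes
    n_second_maternal / n_second_paternal: number of affected
        second-degree relatives on each side of the family
    """
    same_line = max(n_second_maternal, n_second_paternal)

    # High: >=2 first-degree, or 1 first-degree plus >=2 second-degree
    # relatives from the same lineage.
    if n_first >= 2 or (n_first == 1 and same_line >= 2):
        return "high"

    # Moderate: exactly 1 first-degree relative (with or without 1
    # second-degree relative). Assumption: 1 first-degree relative plus
    # second-degree relatives split across both lines is also moderate.
    if n_first == 1:
        return "moderate"

    # Moderate: no first-degree relative but >=2 second-degree relatives
    # on the same maternal or paternal line.
    if same_line >= 2:
        return "moderate"

    # Average: no family history or at most 1 second-degree relative.
    # Assumption: 1 second-degree relative on each line is also average.
    return "average"


print(familial_risk(1, 2, 0))
print(familial_risk(0, 0, 2))
print(familial_risk(0, 1, 0))
```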

Risk models

We used logistic regression models to calculate the predicted risk for undiagnosed diabetes. To select the appropriate models, we started with the list of those risk factors suggested by the American Diabetes Association that were available in NHANES 1999–2004 (27); these risk factors included age, race/ethnicity (non-Hispanic white, non-Hispanic black, Mexican American, others), body mass index, physical activity (inactive, irregularly active, regularly active), hypertension (≥140/90 mm Hg or on therapy for hypertension), a high density lipoprotein cholesterol level of ≤35 mg/dL (0.90 mmol/L) and/or a triglyceride level of ≥250 mg/dL (2.82 mmol/L), history of cardiovascular disease, and family history of diabetes. We used the backwards selection approach, including all suggested risk factors in the multiple logistic regression models with α = 0.10 to select the final models (28, 29). These final models included age, gender, body mass index, hypertension, low high-density lipoprotein cholesterol and/or elevated triglycerides, and family history of diabetes. We found no evidence of multicollinearity among the selected risk factors (30). We tested interactions between family history and other risk factors by including the product terms in the risk models based on the Satterthwaite-adjusted F test. There was no evidence of significant interaction. We also included age as a nonlinear term, the logarithm of high density lipoprotein cholesterol as a continuous variable, and the interaction between body mass index and high density lipoprotein cholesterol in the model. The full model did not offer significant improvements over the main effects models (results not shown). For simplicity, we used the main effects models. Similar sets of risk factors have been used and validated by other studies using the NHANES data (15, 31, 32).
For assessments of the improvements in detecting undiagnosed diabetes by using family history of diabetes, we calculated 2 risk models: one that had the selected risk factors excluding family history of diabetes (model 1) and the other with the selected risk factors plus family history of diabetes (model 2, nested model).

Statistical analysis

The adjusted and weighted prevalence and odds ratios and 95% confidence intervals for undiagnosed diabetes were obtained by logistic regression models by using the predicted margins by the 3 categories of family history of diabetes (33). The prevalence and odds ratios were adjusted by the risk factors selected for the final model. We estimated the mean and standard error for continuous variables, proportions for categorical variables, and their 95% confidence intervals by levels of family history of diabetes. We tested for significant differences in the mean and prevalence across levels of family history of diabetes based on Satterthwaite-adjusted F statistics and on the χ² test, respectively. All tests were 2-tailed at the α = 0.05 level of significance.

Assessment of risk models and improvements from using family history in detection of undiagnosed diabetes

For the global measure of models’ fit, we used the Akaike Information Criterion (AIC) estimated from the logistic regression models; a difference in AIC between 2 models of >2 was interpreted as a significant improvement for the model with the smaller AIC (34). For models’ calibration, we calculated Hosmer-Lemeshow goodness-of-fit statistics on the basis of deciles of risk (29). For the discrimination abilities of family history of diabetes, we constructed the weighted ROC curves and calculated the C statistics (35). To test for the significance of differences between AIC values, between weighted ROC curves, and between C statistics of different risk models, we used the rescaling bootstrap method of Cheng et al. (36) and Rao et al. (37) that takes into account the complex survey design by changing the sampling weights for each resample. We generated 1,000 rescaled bootstrap weights, calculated the distribution for the 2.5 and 97.5 percentiles, and reported these values as the 95% confidence intervals of the differences between different risk models (38).
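As an illustration of the discrimination measure, the weighted C statistic can be computed as the weighted proportion of case/noncase pairs in which the case has the higher predicted risk, with ties counted as one half. The sketch below is a plain pairwise implementation under that standard definition; the function name is hypothetical, and it does not reproduce the article's SAS/SUDAAN procedures or the rescaling bootstrap.

```python
def weighted_c_statistic(y, p, w):
    """Weighted C statistic (area under the weighted ROC curve).

    y: 1 for undiagnosed diabetes, 0 otherwise
    p: predicted risks from a model
    w: sampling weights
    """
    cases = [(pi, wi) for yi, pi, wi in zip(y, p, w) if yi == 1]
    noncases = [(pi, wi) for yi, pi, wi in zip(y, p, w) if yi == 0]
    if not cases or not noncases:
        raise ValueError("need at least one case and one noncase")

    concordant = 0.0
    for p_case, w_case in cases:
        for p_non, w_non in noncases:
            pair_weight = w_case * w_non
            if p_case > p_non:
                concordant += pair_weight        # correctly ordered pair
            elif p_case == p_non:
                concordant += 0.5 * pair_weight  # tie counts one half

    total = sum(wi for _, wi in cases) * sum(wi for _, wi in noncases)
    return concordant / total


# Toy data (not from NHANES).
y = [1, 1, 0, 0]
p = [0.9, 0.4, 0.5, 0.1]
print(weighted_c_statistic(y, p, [1, 1, 1, 1]))
print(weighted_c_statistic(y, p, [1, 1, 1, 3]))
```

The pairwise loop is quadratic in sample size, which is acceptable for a sketch; a rank-based formulation would be preferable for repeated use across 1,000 bootstrap replicates.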

The predictiveness curve described earlier is an integrated plot of predicted risks from logistic regression models formed by the percentiles of risk in the population (24). From the predictiveness curves, one could read off the predicted probability of an event for any corresponding true-positive fraction (sensitivity) or false-positive fraction (1 − specificity). We constructed the weighted predictiveness curves. For the summary measure of weighted predictiveness curves, we calculated the proportion of explained variation (R²) for each risk model and used the rescaling bootstrap method to make the inference about significant differences between the R² values (37). The difference between R² values is equivalent to the integrated discrimination improvement index proposed by Pencina et al. (23) and Pepe et al. (39), which measures the ability of the additional risk factor to increase the predicted probability among those who had the event and to decrease the predicted probability among those who were event free (23).
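The integrated discrimination improvement described above can be sketched as the change in mean predicted risk among cases minus the change in mean predicted risk among noncases. The code below is a minimal weighted version under that definition; the function names are hypothetical and the data are invented for illustration.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def integrated_discrimination_improvement(y, p_old, p_new, w):
    """IDI = (rise in mean risk among cases) - (rise in mean risk among noncases)."""
    def group_mean(p, label):
        vals = [pi for yi, pi in zip(y, p) if yi == label]
        wts = [wi for yi, wi in zip(y, w) if yi == label]
        return weighted_mean(vals, wts)

    gain_in_cases = group_mean(p_new, 1) - group_mean(p_old, 1)
    gain_in_noncases = group_mean(p_new, 0) - group_mean(p_old, 0)
    return gain_in_cases - gain_in_noncases


# Toy data: the new model raises risk for one case and lowers it for one noncase.
y = [1, 1, 0, 0]
p_model_1 = [0.3, 0.5, 0.2, 0.2]
p_model_2 = [0.5, 0.5, 0.1, 0.2]
print(integrated_discrimination_improvement(y, p_model_1, p_model_2, [1, 1, 1, 1]))
```

A positive value means the new model separates cases from noncases more widely on the predicted-risk scale.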

For risk prediction, it is important to examine if the model with the additional risk factor can more accurately stratify individuals into higher or lower risk categories (risk reclassification) (21). Some recently developed risk reclassification measures require use of recognized risk thresholds (22, 23), but at present no researchers or clinicians have proposed any risk classification schemes (risk thresholds) for clinical use in identifying higher- or lower-risk patients for diabetes. Nor have they proposed follow-up tests such as glucose testing for people at higher or lower risk to identify those who really have diabetes. Accordingly, we used logistic regression model 1 to determine the predicted probability of events that corresponded approximately to 20%, 40%, 60%, and 80% of undiagnosed diabetes (dichotomous cutpoints at quintiles of sensitivity) and used these probability thresholds to identify the true-positive fraction and false-positive fraction from the predictiveness curves. We also calculated the net reclassification improvement index, positive predictive values, and negative predictive values for each dichotomous risk threshold for model 1 and model 2, respectively. The net reclassification improvement index is a special case of integrated discrimination improvement with the recognized risk thresholds (23). We used the rescaled bootstrap method with 1,000 samples to estimate the 95% confidence intervals of integrated discrimination improvement and net reclassification improvement (39).
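For a single dichotomous risk threshold, the net reclassification improvement reduces to the gain in true-positive fraction minus the gain in false-positive fraction. The sketch below applies that two-category form with sampling weights; the function names and the toy data are hypothetical, and the bootstrap confidence intervals reported in the article are not reproduced.

```python
def weighted_fractions(y, p, w, threshold):
    """Return (true-positive fraction, false-positive fraction) at a risk threshold."""
    case_total = sum(wi for yi, wi in zip(y, w) if yi == 1)
    noncase_total = sum(wi for yi, wi in zip(y, w) if yi == 0)
    true_pos = sum(wi for yi, pi, wi in zip(y, p, w) if yi == 1 and pi >= threshold)
    false_pos = sum(wi for yi, pi, wi in zip(y, p, w) if yi == 0 and pi >= threshold)
    return true_pos / case_total, false_pos / noncase_total


def net_reclassification_improvement(y, p_old, p_new, w, threshold):
    """Two-category NRI: (TPF_new - TPF_old) - (FPF_new - FPF_old)."""
    tpf_old, fpf_old = weighted_fractions(y, p_old, w, threshold)
    tpf_new, fpf_new = weighted_fractions(y, p_new, w, threshold)
    return (tpf_new - tpf_old) - (fpf_new - fpf_old)


# Toy data evaluated at a 7.3% threshold.
y = [1, 1, 0, 0]
p_model_1 = [0.10, 0.05, 0.08, 0.01]
p_model_2 = [0.10, 0.09, 0.05, 0.01]
w = [1, 1, 1, 1]
print(net_reclassification_improvement(y, p_model_1, p_model_2, w, 0.073))
```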

To help to determine whether including a risk factor in a risk model might be cost-effective, we used decision curve analysis (25). Briefly, decision curve analysis estimates the net benefit of a model by taking the difference between the number of true positives and the number of false positives weighted by the odds of the selected threshold probability of risk for a range of threshold probabilities (25, 40). The net benefit of a model compared with the reference net benefit or compared with another model might be interpreted as the net increase in the proportion of cases identified. The reference was calculated by assuming that all people were tested for the events, and testing no one was set to a net benefit of zero. For any given threshold probability cutpoint, the risk model with the higher net benefit is the preferred model (41). We calculated and plotted the weighted net benefit curves for the reference model (testing all), model 1, and model 2, respectively. We used the quintile cutpoints of the predicted probabilities to compare the net benefit curves of model 1 and model 2 and calculated the differences in the net benefit between the 2 models for each cutpoint, along with the 95% confidence interval of the difference between the 2 models, using the rescaled bootstrap method. Unless otherwise specified, data were analyzed by using SAS, version 9.2, software (SAS Institute, Inc., Cary, North Carolina) and SUDAAN, release 9.0, software (Research Triangle Institute, Research Triangle Park, North Carolina) to account for the complex sampling design of NHANES 1999–2004 (42).
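The net benefit calculation described above can be sketched in a few lines. The code assumes the usual decision curve formula, net benefit = TP/N − (FP/N) × pt/(1 − pt), where pt is the threshold probability, with counts replaced by sums of sampling weights. The function names are hypothetical and the example data are invented.

```python
def net_benefit(y, p, w, threshold):
    """Weighted net benefit of testing everyone with predicted risk >= threshold."""
    total = sum(w)
    true_pos = sum(wi for yi, pi, wi in zip(y, p, w) if yi == 1 and pi >= threshold)
    false_pos = sum(wi for yi, pi, wi in zip(y, p, w) if yi == 0 and pi >= threshold)
    odds = threshold / (1.0 - threshold)  # weight applied to false positives
    return true_pos / total - (false_pos / total) * odds


def net_benefit_test_all(y, w, threshold):
    """Reference strategy: everyone is tested, so everyone counts as positive."""
    return net_benefit(y, [1.0] * len(y), w, threshold)


# Toy data: 1 case among 4 people, evaluated at a 20% threshold.
y = [1, 0, 0, 0]
p = [0.5, 0.3, 0.05, 0.05]
w = [1, 1, 1, 1]
print(net_benefit(y, p, w, 0.2))
print(net_benefit_test_all(y, w, 0.2))
```

Testing no one yields zero true and false positives, hence a net benefit of zero, which is the baseline against which the other strategies are read.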


NHANES 1999–2004 surveyed 5,551 adults aged ≥20 years without diagnosed diabetes who were asked for a blood sample after fasting overnight. The 498 persons excluded included 324 pregnant women, 1 person with unknown diabetes status, and 173 people with missing covariates. Of the final sample (n = 5,053), 73.6% were non-Hispanic white; 10.5%, non-Hispanic black; 7.2%, Mexican American; and 8.8%, other race/ethnicity.

The prevalence of undiagnosed diabetes, adjusted odds ratios, and characteristics of the people by level of familial risk of diabetes are summarized in Table 1. The prevalence increased significantly with level of familial risk from 2.2% (95% confidence interval (CI): 1.7, 2.6) to 7.2% (95% CI: 4.2, 10.1) (P = 0.001). The adjusted odds ratio increased from 1.7 (95% CI: 1.2, 2.5) to 3.8 (95% CI: 2.2, 6.3) for moderate and high familial risk, respectively. Familial risk of diabetes was significantly associated with all the selected covariates except for physical activity.

Table 1.
Characteristics of Participants by Family History of Diabetes, National Health and Nutrition Examination Survey, 1999–2004

Assessing improvements in the detection of undiagnosed diabetes by using family history

Table 2 includes several statistical measures of overall fit, discrimination ability, and reclassification of risk for models 1 and 2. Compared with model 1, model 2 represented significant improvements in 3 statistical measures in detecting undiagnosed diabetes: a lower AIC, a significant improvement in the weighted C statistic, and a significant improvement in reclassification as measured by integrated discrimination improvement. Models 1 and 2 demonstrated similar levels of calibration (goodness-of-fit tests), suggesting the adequate fit of both models.

Table 2.
Comparison of 2 Models’ Fit, Discrimination Ability, and Risk Reclassification, National Health and Nutrition Examination Survey, 1999–2004

Figure 1A plots the weighted predictiveness curves, and Figure 1B shows the weighted true-positive fraction and false-positive fraction by risk percentiles in the population. These graphs show that using family history of diabetes, in addition to the selected risk factors, reclassified the people with undiagnosed diabetes to the higher predicted risk and the diabetes-free people to the lower predicted risk. Appendix Table 1 presents a detailed analysis of the selected risk thresholds. In a comparison of model 2 with model 1, for a higher risk threshold (e.g., at 7.3%, or approximately the 89th percentile of risk distribution in the population) (Figure 1), the weighted true-positive fraction (Appendix Table 1) increased from 40.0% (95% CI: 29.4, 51.5) in model 1 to 49.4% (95% CI: 37.9, 60.9) in model 2. The weighted positive predictive value rose from 11.0% (95% CI: 8.5, 14.4) to 14.2% (95% CI: 10.9, 18.2), and the net reclassification improvement in model 2 was 10.1% (95% CI: 1.0, 18.1; P = 0.009). The weighted false-positive fraction and the negative predictive value remained largely unchanged at this risk threshold. At this level of risk, model 2 would identify approximately 620,000 (95% CI: 221,100, 1,020,000) more cases of undiagnosed diabetes in the population than would model 1 (3.26 million vs. 2.64 million). As the risk thresholds lowered, model 2 was associated with a decreased false-positive fraction and little change in negative predictive value compared with model 1. However, these changes were not large enough to produce a significant improvement in risk reclassification as measured by net reclassification improvement.
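The population counts quoted for the 7.3% threshold follow from applying each model's true-positive fraction to the estimated 6.6 million undiagnosed cases. The short check below reproduces that arithmetic; the variable and function names are illustrative.

```python
TOTAL_UNDIAGNOSED = 6_600_000  # estimated undiagnosed cases in the United States, 2007


def cases_identified(true_positive_fraction, total_cases=TOTAL_UNDIAGNOSED):
    """Expected number of undiagnosed cases flagged at a given sensitivity."""
    return true_positive_fraction * total_cases


model_1_cases = cases_identified(0.400)  # about 2.64 million
model_2_cases = cases_identified(0.494)  # about 3.26 million
additional_cases = model_2_cases - model_1_cases
relative_increase = model_2_cases / model_1_cases - 1.0

print(f"Model 1: {model_1_cases:,.0f}")
print(f"Model 2: {model_2_cases:,.0f}")
print(f"Additional cases: {additional_cases:,.0f}")
print(f"Relative increase: {relative_increase:.1%}")
```

The difference is about 620,000 cases, and the relative increase of roughly 23.5% corresponds to the 24% increase quoted in the Discussion.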

Figure 1.
Weighted predictiveness curves (A) and true-positive fraction (TPF) and false-positive fraction (FPF, 1 − specificity) (B) for model with selected risk factors (model 1) and model with selected risk factors plus family history of diabetes (model ...

Decision curve analysis

Figure 2 presents the weighted net benefit curves derived for testing all people versus testing strategies based on model 1 and model 2. Model 2 appeared to offer greater net benefit across most risk thresholds, especially from the predicted risk of around 5% to 15%. Both of the model-based net benefits were higher than testing all (the reference testing strategy). Appendix Table 2 presents the detailed analysis of net benefits for 4 selected risk thresholds. Comparing model 2 with model 1, for example, at a 7.3% risk threshold (40% sensitivity based on model 1), the difference of net benefit equals 0.32 per 100 people (95% CI: 0.06, 0.58), indicating that 3 extra cases of undiagnosed diabetes would be detected per 1,000 subjects based on model 2. The differences in net benefits between the 2 models diminished at either higher or lower risk thresholds, especially at the lower risk thresholds.

Figure 2.
Weighted decision curves for models predicting undiagnosed diabetes using models with family history of diabetes (solid line) and without this history (small dashed line), National Health and Nutrition Examination Survey, 1999–2004. The dash-dot-dot-dash ...


This study confirms that family history of diabetes is an independent risk factor for undiagnosed diabetes, a finding that is consistent with those of many other studies (6–15, 43). Recent National Institutes of Health state-of-the-science statements on family history recognized the important role of family history in the practice of medicine, motivation of positive lifestyle changes, and influence of clinical interventions (44). Our study assessed the improvements in detecting undiagnosed diabetes that would come from including family history in risk assessment and population screening. Our findings suggest that using a risk model with family history of diabetes offers significant improvements over a model with common risk factors in detecting undiagnosed diabetes, especially among populations at higher risk. For example, by using a risk threshold of 7.3% (the median predicted risk = 1.3% in the population), approximately 11% of the population had a predicted risk ≥7.3% based on model 1. With model 2 we had a net reclassification improvement of 10.1% (95% CI: 1.0, 18.1; P = 0.009) that was mainly due to the increase in true-positive fraction from 40.0% (95% CI: 29.4, 51.5) in model 1 to 49.4% (95% CI: 37.9, 60.9) in model 2, a 24% increase in the number of undiagnosed diabetes cases identified. In other words, using model 2 at a risk threshold of 7.3%, one would identify approximately 3.26 million cases instead of 2.64 million cases of undiagnosed diabetes of an estimated 6.6 million total cases without an increase in false-positive fraction.

Some researchers have argued that the statistical measures of risk models for performance in prediction and reclassification have limited value for evaluation of the clinical utility of the additional risk factor/marker because they do not consider cost-effectiveness (25, 41, 45). However, the traditional cost-effectiveness analysis of diagnostic tests has involved collecting additional data on alternative treatments that could involve substantial cost and sometimes might be difficult to collect (46, 47). The decision curve analysis, which does not require collecting additional data on cost and effectiveness, offers a simple approach to examining the clinical consequences of alternative testing strategies and to comparing the different risk models in terms of net benefits over a range of predicted probabilities for an event (25). The focus of the net benefit curves is not on any particular point estimate, but rather on the entire range of threshold probabilities in a way that one net benefit curve is greater or lesser than the other alternatives (25, 48). Our findings indicate that the net benefit curves derived from model 2 (versus model 1) were greater over nearly the whole range of risk thresholds, especially from 5% to 15% predicted risks, indicating the net benefit of detecting extra cases of undiagnosed diabetes based on model 2. Given the fact that little cost might be involved in collecting information on family history of diabetes, the evaluation of added value of using family history should mainly focus on the magnitude of the benefit rather than on cost-effectiveness.

The limitations to our study include, first, that NHANES is a cross-sectional survey, and it cannot be used to predict the risk of developing diabetes. Accordingly, we focused our analysis on the improvements in detecting undiagnosed diabetes that might be realized by incorporating family history of diabetes in a model. Second, NHANES 1999–2004 measured fasting glucose but did not assess glucose tolerance, and thus it might have underestimated the prevalence of diabetes. However, the American Diabetes Association has recommended that, for epidemiologic studies and estimates of diabetes prevalence, a fasting plasma glucose level of ≥126 mg/dL (7.0 mmol/L) should be used (49). Third, diabetes was self-reported in NHANES 1999–2004, and reporting bias by different groups might exist. Studies indicated that the proportion of undiagnosed diabetes was higher in men, Mexican Americans, and the uninsured compared with women, non-Hispanic whites, and the insured, suggesting some reporting bias of diagnosed diabetes (50). Undiagnosed diabetes might therefore be overrepresented in certain groups in NHANES 1999–2004. Fourth, the family risk of diabetes was significantly related to sex and race/ethnicity (31, 51, 52). Women tend to know more about the presence of the disease among their relatives. In addition, larger families (for example, those of non-Hispanic blacks and Mexican Americans compared with those of non-Hispanic whites) are more likely than smaller families to include relatives with diabetes, especially in populations where the disease prevalence is high. To examine the possible effect of sex, ethnicity, or racial differences in the familial risk of diabetes on the detection of undiagnosed diabetes, we conducted stratified analysis by sex and race/ethnicity; the results suggested that the improvements in detecting undiagnosed diabetes by using family history of diabetes are consistent across sex and race/ethnicity strata (Appendix Table 3).
Fifth, there are no generally recognized risk thresholds for undiagnosed diabetes; we arbitrarily used the quintile cutpoints of predicted risk that included 20%, 40%, 60%, or 80% of undiagnosed diabetes cases based on risk model 1. Some statistical measures of how well a model performs in prediction, such as net reclassification improvement, might be sensitive to the risk thresholds used (23). Sixth, using the same data to fit a risk model and to assess its performance could lead to overfitting. We conducted 5-fold cross-validation and obtained an average weighted area under the curve of 0.84 for the final model with family history, and external validation using the NHANES III (1988–1994) data set yielded a weighted area under the curve of 0.89, indicating adequate performance of our risk models.

The major strengths of our study include the availability of fasting glucose measurements from a nationally representative sample of the US adult population and the large number of potential risk factors for undiagnosed diabetes to investigate.

Our findings suggest that family history of diabetes provides significant improvements in the detection of additional cases of undiagnosed diabetes, especially among people with higher predicted risk. It also provides greater net benefits than a risk model without family history when applied to the US population. Unlike other biomarkers, for example, prostate-specific antigen for prostate cancer or C-reactive protein for cardiovascular diseases, or genetic testing, obtaining information on family history of diabetes costs little, and no adverse effect is associated with the process. With increased awareness and education, family history could be a useful part of a public health tool designed for the detection and control of diabetes in populations.


Author affiliations: National Office of Public Health Genomics, Centers for Disease Control and Prevention, Atlanta, Georgia (Quanhe Yang, Tiebin Liu, Rodolfo Valdez, Muin J. Khoury); and Office of Minority Health, Centers for Disease Control and Prevention, Atlanta, Georgia (Ramal Moonesinghe).

The findings and conclusions in this report are those of the author(s) and do not necessarily represent the official position of the Centers for Disease Control and Prevention.

Conflict of interest: none declared.



Abbreviations: AIC, Akaike Information Criterion; CDC, Centers for Disease Control and Prevention; CI, confidence interval; NHANES, National Health and Nutrition Examination Survey; ROC, receiver-operating characteristic.


Appendix Table 1.

Weighted True-Positive Fraction, False-Positive Fraction, Positive Predictive Value, Negative Predictive Value, and Net Reclassification Index of Undiagnosed Diabetes Using Risk Models With and Without Family History of Diabetes, National Health and Nutrition Examination Survey, 1999–2004

| Model | Predicted Probability of Events, % | True-Positive Fraction (95% CI) | False-Positive Fraction (95% CI) | Positive Predictive Value (95% CI) | Negative Predictive Value (95% CI) | Net Reclassification Index, % (95% CI)^c | No. of Cases of Undiagnosed Diabetes Identified in Population (× 100,000) | No. of People ≥ Predicted Probability in Population (× 100,000) |
|---|---|---|---|---|---|---|---|---|
| Model Without Family History (Model 1)^a | ... | ... (..., ...) | ... (..., 3.9) | 15.8 (10.7, 22.6) | 97.6 (97.1, ...) | | ... | ... |
| | 7.3 | 40.0 (29.4, 51.5) | ... (..., 10.5) | 11.0 (8.5, 14.4) | 98.1 (97.4, 98.5) | | 26.4 | 170.9 |
| | 5.4 | 60.0 (50.8, 69.5) | 15.7 (14.5, 16.9) | 10.5 (8.6, 12.7) | 98.6 (98.1, 99.0) | | 39.6 | 268.7 |
| | 3.6 | 80.0 (71.5, 86.8) | 26.1 (24.5, ...) | ... (..., 9.9) | 99.2 (98.7, 99.5) | | 52.8 | 448.3 |
| Model With Family History (Model 2)^b | ... | ... (..., ...) | ... (..., 4.4) | 18.3 (13.1, 25.0) | 97.8 (97.2, ...) | ... (..., 12.1) | 17.9 | 63.1 |
| | 7.3 | 49.4 (37.9, 60.9) | ... (..., ...) | 14.2 (10.9, 18.2) | 98.4 (97.8, 98.8) | 10.1 (1.0, 18.1) | 32.6 | 159.4 |
| | 5.4 | 60.0 (48.2, 70.4) | 14.4 (13.4, 15.5) | 11.2 (9.0, 13.9) | 98.7 (98.0, 99.0) | 0.7 (−7.1, 7.6) | 39.6 | 246.8 |
| | 3.6 | 75.2 (66.8, 82.1) | 23.5 (22.0, ...) | ... (..., 10.5) | 99.0 (98.6, 99.3) | −2.3 (−9.3, 3.8) | 49.6 | 403.5 |

Abbreviation: CI, confidence interval.

^a Model 1 included age, gender, body mass index, hypertension, and a high density lipoprotein cholesterol level of ≤35 mg/dL (0.90 mmol/L) and/or a triglyceride level of ≥250 mg/dL.
^b Model 2 included, in addition to model 1 risk factors, family history of diabetes.
^c The 95% confidence intervals of the net reclassification index were estimated by using 1,000 rescaled bootstrap samples for the complex surveys.

Appendix Table 2.

Weighted Net Benefit and Differences in Net Benefit for Testing All People for Undiagnosed Diabetes or According to Risk Models With or Without Family History Using Selected Thresholds of Predicted Probabilities of Undiagnosed Diabetes, National Health and Nutrition Examination Survey, 1999–2004

| Predicted Probability of Events, % | True-Positive Fraction, % (95% CI) | Models | Net Benefit, % (95% CI^a) | Differences Between Testing All vs. Model 1 and Model 1 vs. Model 2 (95% CI^a) |
|---|---|---|---|---|
| ... | ... (..., 29.6) | Testing all^b | −10.30 (−10.86, 9.85) | |
| | | Model 1^c | 0.16 (−0.09, 0.39) | 10.5 (9.99, 11.00) |
| | | Model 2^d | 0.32 (0.03, 0.58) | 0.16 (0.03, 0.33) |
| 7.3 | 40.0 (29.4, 51.5) | Testing all^b | −4.70 (−5.23, 4.27) | |
| | | Model 1^c | 0.44 (0.10, 0.74) | 5.14 (4.61, 5.65) |
| | | Model 2^d | 0.76 (0.37, 1.13) | 0.32 (0.06, 0.58) |
| 5.4 | 60.0 (50.8, 69.5) | Testing all^b | −2.61 (−3.13, 2.19) | |
| | | Model 1^c | 0.91 (0.52, 1.21) | 3.52 (3.11, 3.90) |
| | | Model 2^d | 0.96 (0.55, 1.33) | 0.05 (−0.12, 0.32) |
| 3.6 | 80.0 (71.5, 86.8) | Testing all^b | −0.69 (−1.2, 0.28) | |
| | | Model 1^c | 1.41 (1.00, 1.74) | 2.10 (1.82, 2.37) |
| | | Model 2^d | 1.36 (0.92, 1.68) | −0.03 (−0.23, 0.14) |

Abbreviation: CI, confidence interval.

^a Ninety-five percent confidence intervals of the difference in net benefit between testing all versus model 1 and model 1 versus model 2 were estimated by using 1,000 rescaled bootstrap samples for complex surveys.
^b Assuming that all people were tested for fasting glucose concentrations for diagnosis of diabetes.
^c Model 1 included age, gender, body mass index, hypertension, and a high density lipoprotein cholesterol level of ≤35 mg/dL (0.90 mmol/L) and/or a triglyceride level of ≥250 mg/dL.
^d Model 2 included, in addition to the risk factors of model 1, family history of diabetes.

Appendix Table 3.

Comparison of Models’ Fit, Discrimination Ability, Risk Stratification, and Risk Reclassification Between Models With and Without Family History of Diabetes for Detecting Undiagnosed Diabetes Stratified by Sex and Race/Ethnicity, National Health and Nutrition Examination Survey, 1999–2004

| Statistical Measures | Without Family History (Model 1)^b | With Family History (Model 2)^c | Differences (Model 1 − Model 2) | 95% CI^a |
|---|---|---|---|---|
| AIC^d | 357.4 | 356.3 | 1.1 | −3.6, 9.9 |
| Goodness-of-fit test^e | 7.4 (0.289) | 3.5 (0.743) |  |  |
| Weighted C statistics | 0.837 | 0.848 | 0.011 | 0.001, 0.024 |
| R²/IDI | 0.0677 | 0.0734 | 0.006 | −0.001, 0.031^f |
| AIC^d | 247.3 | 239.2 | 8.1 | −0.5, 18.9 |
| Goodness-of-fit test^e | 5.1 (0.280) | 2.0 (0.732) |  |  |
| Weighted C statistics | 0.820 | 0.847 | 0.027 | 0.005, 0.054 |
| R²/IDI | 0.049 | 0.073 | 0.024 | 0.006, 0.065^f |
| Non-Hispanic white |  |  |  |  |
| AIC^d | 319.6 | 316.4 | 3.2 | −2.6, 11.4 |
| Goodness-of-fit test^e | 4.2 (0.124) | 3.1 (0.213) |  |  |
| Weighted C statistics | 0.848 | 0.863 | 0.015 | 0.003, 0.031 |
| R²/IDI | 0.063 | 0.073 | 0.010 | 0.001, 0.039^f |
| Non-Hispanic black |  |  |  |  |
| AIC^d | 131.9 | 128.3 | 3.6 | −3.2, 15.5 |
| Goodness-of-fit test^e | 3.2 (0.788) | 6.3 (0.392) |  |  |
| Weighted C statistics | 0.831 | 0.856 | 0.025 | −0.006, 0.061 |
| R²/IDI | 0.072 | 0.113 | 0.041 | 0.010, 0.125^f |
| Mexican American |  |  |  |  |
| AIC^d | 125.7 | 118.5 | 7.3 | −3.3, 22.2 |
| Goodness-of-fit test^e | 3.1 (0.381) | 4.6 (0.203) |  |  |
| Weighted C statistics | 0.854 | 0.896 | 0.042 | −0.003, 0.085 |
| R²/IDI | 0.064 | 0.125 | 0.061 | 0.010, 0.191^f |

Abbreviations: AIC, Akaike Information Criterion; CI, confidence interval; IDI, integrated discrimination improvement.

^a The 2.5th and 97.5th percentiles of the distribution of the differences between the risk models across 1,000 rescaled bootstrap samples.
^b Model 1 was adjusted for age, gender, body mass index, hypertension, and a high-density lipoprotein cholesterol level of ≤35 mg/dL (0.90 mmol/L) and/or a triglyceride level of ≥250 mg/dL.
^c Model 2 included, in addition to the risk factors in model 1, family history of diabetes.
^d The means and differences of AIC were generated from 1,000 rescaled bootstrap samples for the different risk models.
^e Hosmer-Lemeshow goodness-of-fit test; the numbers are χ², with P values in parentheses.
^f The difference between the R² values of the 2 risk models equals the IDI.
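The footnotes to Appendix Table 3 describe two computations that a short sketch may make concrete: the IDI as a difference between the two models' R² values, and a confidence interval taken from the 2.5th and 97.5th percentiles of bootstrap replicates. The sketch below assumes that R² here means the mean predicted risk among cases minus the mean predicted risk among non-cases (the discrimination slope). It is unweighted, uses simple resampling with replacement rather than the rescaled bootstrap for complex surveys used in the study, and holds predictions fixed instead of refitting the models in each replicate. All function names and data are hypothetical.

```python
import random


def discrimination_slope(y, risk):
    """Mean predicted risk in cases minus mean predicted risk in non-cases."""
    cases = [r for yi, r in zip(y, risk) if yi == 1]
    noncases = [r for yi, r in zip(y, risk) if yi == 0]
    return sum(cases) / len(cases) - sum(noncases) / len(noncases)


def idi(y, risk_old, risk_new):
    """Integrated discrimination improvement: new slope minus old slope."""
    return discrimination_slope(y, risk_new) - discrimination_slope(y, risk_old)


def percentile_ci(values, lower=2.5, upper=97.5):
    """Approximate percentile interval from a list of bootstrap replicates."""
    ordered = sorted(values)
    n = len(ordered)

    def pick(p):
        # Nearest-rank style lookup, clamped to the valid index range
        idx = min(n - 1, max(0, int(round(p / 100.0 * (n - 1)))))
        return ordered[idx]

    return pick(lower), pick(upper)


def bootstrap_idi_ci(y, risk_old, risk_new, n_boot=1000, seed=0):
    """Percentile bootstrap interval for the IDI (simple, unweighted resampling)."""
    rng = random.Random(seed)
    n = len(y)
    replicates = []
    while len(replicates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y[i] for i in idx]
        # Skip resamples that lack either cases or non-cases
        if 0 < sum(ys) < n:
            replicates.append(
                idi(ys, [risk_old[i] for i in idx], [risk_new[i] for i in idx])
            )
    return percentile_ci(replicates)


# Hypothetical toy data: predictions from a model without and with family history
y_toy = [1, 1, 0, 0, 0, 0]
risk_model1 = [0.30, 0.20, 0.10, 0.10, 0.05, 0.05]
risk_model2 = [0.40, 0.30, 0.10, 0.05, 0.05, 0.05]

print("IDI:", idi(y_toy, risk_model1, risk_model2))
print("95% percentile interval:", bootstrap_idi_ci(y_toy, risk_model1, risk_model2))
```

With survey data, each replicate would also need to carry rescaled sampling weights, so the interval from this sketch should not be expected to match the table.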



Articles from American Journal of Epidemiology are provided here courtesy of Oxford University Press