Division of Nephrology, Department of Medicine, Columbia University Medical Center, 622 West 168th Street, PH 4-124, New York, NY 10032.
Section of Atherosclerosis and Lipoprotein Research, Department of Medicine, Baylor College of Medicine, 6565 Fannin, M.S. A-60, Houston, TX 77030.
Department of Medicine, Weill Cornell Medical College, Baker Pavilion, 20th Floor, 525 East 68th Street, New York, NY 10065.
FOJP Service Corporation, 28 East 28th Street, New York, NY 10016.
Los Angeles County Public Health, 313 N Figueroa St, Rm 708, Los Angeles, CA 90012.
National guidelines disagree on who should be screened for undiagnosed diabetes. No existing diabetes risk score is highly generalizable or widely followed.
To develop a new diabetes screening score and compare it to other available screening instruments (Centers for Disease Control and Prevention, American Diabetes Association (ADA) and U.S. Preventive Services Task Force guidelines; two ADA risk questionnaires; and the Rotterdam model).
National Health and Nutrition Examination Survey (NHANES) 1999–2004 for model development, and NHANES 2005–2006 plus a combined cohort of two community studies, Atherosclerosis Risk in Communities (ARIC) and Cardiovascular Health Study (CHS), for validation.
U.S. adults ≥20 years old.
A risk scoring algorithm for undiagnosed diabetes, defined as fasting plasma glucose ≥7.0 mmol/L (126 mg/dL) without known diabetes, was developed in the development dataset. Logistic regression was used to determine participant characteristics that were independently associated with undiagnosed diabetes. The new algorithm and other methods were evaluated by standard diagnostic and feasibility measures.
Age, sex, family history of diabetes, history of hypertension, obesity, and physical activity were associated with undiagnosed diabetes. In NHANES (in ARIC/CHS), the cutpoint of ≥5 selected 35(40)% of persons for diabetes screening and yielded sensitivity of 79(72)%, specificity of 67(62)%, positive predictive value of 10(10)% and likelihood ratio-positive of 2.39(1.89). In contrast, the comparison scores yielded sensitivity of 44–100%, specificity of 10–73%, positive predictive value of 5–8%, and likelihood ratio-positive of 1.11–1.98.
Data during pregnancy were not available.
This new diabetes screening score, simple and easily implemented, seems to demonstrate improvements upon the existing methods. Future studies are needed to evaluate it in diverse populations in real world settings.
Clinical and Translational Science Center at Cornell Medical College.
Diabetes and its complications are major causes of morbidity and mortality worldwide. Over 60 million U.S. adults are estimated to have either diagnosed diabetes, undiagnosed diabetes, or pre-diabetes, with approximately 30% of diabetes cases estimated to be undiagnosed. With the steadily rising prevalence of diabetes, prevention of diabetes has become a major health priority (1–5). Recent clinical trials demonstrate that lifestyle(6–8) and pharmaceutical(6, 9, 10) interventions in individuals with impaired glucose tolerance can prevent or delay the development of diabetes, providing a rationale for the identification of high-risk subjects who may benefit from early lifestyle interventions.
National guidelines for diabetes screening are available to help detect undiagnosed disease, and various risk assessment tools for prevalent or incident diabetes have been developed to identify individuals most in need of screening. Yet many of these risk assessment tools were developed from specific cohorts, often with restrictive age range or race/ethnic groups, limiting generalizability to the entire population. There are three national guidelines for diabetes screening in the U.S.: the Centers for Disease Control and Prevention (CDC)(11), the American Diabetes Association (ADA)(12), and the U.S. Preventive Services Task Force (USPSTF)(13). Additionally, two risk scoring algorithms for undiagnosed diabetes have been derived from nationally representative samples: Herman et al.’s model from the National Health and Nutrition Examination Survey (NHANES) II (conducted in 1976–1980)(14), and Heikes et al.’s model from the NHANES III (conducted in 1988–1994)(4). These two algorithms are also known as the ADA diabetes questionnaires.
In this paper, we have developed a new screening score for undiagnosed diabetes in multi-ethnic U.S. adults using readily available health information. Our aim was to improve existing diabetic risk scoring algorithms by using a more contemporary NHANES (conducted in 1999–2006) and formulating an easy, systematic scoring system that enables lay persons to assess their own risk of undiagnosed diabetes.
The NHANES is a cross-sectional study conducted by the National Center for Health Statistics in the CDC. In order to represent the US population, NHANES utilized complex, multistage probability sampling of the civilian, non-institutionalized population. To produce reliable statistics, NHANES over-sampled elderly persons and some racial/ethnic minorities. We utilized de-identified data from multiple waves of NHANES (i.e., 1999–2006) that are publicly available.
We included participants who were aged ≥20 years and had fasting plasma glucose (FPG) results. We excluded pregnant women. We used NHANES 1999–2004 for prediction modeling and screening score development, and NHANES 2005–2006 for validation. Additionally, we conducted further validation combining the baseline data from two biracial cohort studies: the Atherosclerosis Risk in Communities (ARIC) study and the Cardiovascular Health Study (CHS). Detailed descriptions of these two studies have been published previously(15, 16). Briefly, ARIC enrolled 15,792 participants aged 45 to 64 years between 1987 and 1989 from four communities, and CHS recruited 5,201 participants 65 years and older between 1989 and 1990 from four communities. Between 1992 and 1993, CHS enrolled an additional 687 blacks to increase minority participation.
For each participant, we retrieved data that were collected through interviews, physical examinations, and laboratory tests. Specifically, we utilized data regarding participants’ demographic and socioeconomic characteristics, health care utilization, personal and family medical histories, health habits, physical examinations including anthropometric findings, and laboratory tests. For the obesity measure, we combined body mass index and waist circumference. For variable categorization, we used conventional cutoffs or well accepted clinical guidelines, if available. If information was missing or unknown in categorical variables, we defined the condition as absent, a convention commonly adopted in the risk questionnaire setting. We planned to instruct users, “Enter your score (But if you don’t know the answer, enter 0),” in our questionnaire.
We stratified the participants into four groups by diabetes status: known diabetes, normal glucose metabolism (FPG < 5.6 mmol/L (100 mg/dL)), impaired fasting glucose or pre-diabetes (FPG 5.6–6.9 mmol/L (100–125 mg/dL)), and undiagnosed diabetes (FPG ≥ 7.0 mmol/L (126 mg/dL))(4, 17, 18). Specifically, we defined known diabetes as participants who answered “yes” to the question “Other than during pregnancy, have you ever been told by a doctor or health professional that you have diabetes or sugar diabetes?” or reported using insulin or other diabetic medications.
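The four-group stratification can be sketched in code. This is a minimal illustration rather than the study's analysis code: the function and argument names are hypothetical, and FPG values below the lower bound of the impaired range (5.6 mmol/L) are assumed to count as normal.

```python
def classify_glycemic_status(fpg_mmol_l, told_has_diabetes=False,
                             on_diabetes_medication=False):
    """Assign one of the four diabetes-status groups described in the text.

    Hypothetical helper: names are illustrative, thresholds come from the text.
    """
    # Known diabetes: self-reported diagnosis, or insulin / other diabetic medication.
    if told_has_diabetes or on_diabetes_medication:
        return "known diabetes"
    # Undiagnosed diabetes: FPG >= 7.0 mmol/L (126 mg/dL) without known diabetes.
    if fpg_mmol_l >= 7.0:
        return "undiagnosed diabetes"
    # Impaired fasting glucose (pre-diabetes): FPG 5.6-6.9 mmol/L (100-125 mg/dL).
    if fpg_mmol_l >= 5.6:
        return "impaired fasting glucose"
    # Assumption: everything below 5.6 mmol/L is normal glucose metabolism.
    return "normal glucose metabolism"
```

Known diabetes is checked first, so a participant with a self-reported diagnosis is never counted as undiagnosed, whatever the FPG value.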
For medical history variables, we considered data from multiple sources. For example, we classified participants as having hypertension if they reported a history of hypertension, reported using prescribed medication for hypertension, had a systolic blood pressure ≥ 140 mmHg, or had a diastolic blood pressure ≥ 90 mmHg. We defined hyperlipidemia if total cholesterol ≥ 5.17 mmol/L (200 mg/dL) or triglycerides ≥ 1.69 mmol/L (150 mg/dL)(19), and high cholesterol if a person reported a history of high cholesterol, used cholesterol-lowering medication, or had a fasting low-density lipoprotein cholesterol ≥ 2.59 mmol/L (100 mg/dL) with a history of cardiovascular disease(20). Cardiovascular disease was defined as myocardial infarction or stroke.
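As an illustration of combining multiple sources into one history variable, the hypertension rule above might be coded as follows. The function and argument names are hypothetical; treating a missing blood pressure reading as not meeting the threshold follows the "missing means absent" convention described earlier and is an assumption for measured values.

```python
def has_hypertension(reported_history, on_bp_medication,
                     systolic_mmhg=None, diastolic_mmhg=None):
    """Return True if any of the four hypertension criteria in the text is met."""
    # Assumption: a missing (None) measurement does not satisfy the criterion.
    high_systolic = systolic_mmhg is not None and systolic_mmhg >= 140
    high_diastolic = diastolic_mmhg is not None and diastolic_mmhg >= 90
    return bool(reported_history or on_bp_medication
                or high_systolic or high_diastolic)
```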
When definitions of variables were not identical across the different studies (e.g., physical activity), we tried to use the best available variables to achieve reasonable consistency across databases. For example, in NHANES, we classified a person as ‘physically active’ if the person answered “more active” to the question, “Compare your activity with others of the same age”. Otherwise, we classified subjects as ‘not physically active’. In ARIC, physical activity was assessed in a yes vs. no question, while in CHS, we dichotomized the physical activity question into “no” or “low” vs. “moderate” or “high”. None of the databases we used collected data during pregnancy. Finally, some ARIC participants (N=521) did not fast. For those, we used a random blood glucose cutpoint of ≥ 11.1 mmol/L (200 mg/dL) to define diabetes(18, 21, 22). Non-fasting participants were not included in the model development (using NHANES) but they were included in the external validation to reflect a realistic scenario.
We used descriptive statistics to characterize the 4 groups according to diabetes status: mean and standard error were used for continuous variables and percentage was used for categorical variables. For model fitting, multiple logistic regression was adopted with undiagnosed diabetes cases as the endpoint, excluding diagnosed diabetes cases. Due to small proportions of missing data, we used all non-missing observations available in the relevant analyses. The only exception is that we imputed ‘family history of diabetes’ using a statistical technique (missing data imputation procedure, Proc MI, in SAS) for handling missing data in CHS, one of the external validation datasets, as this information was not collected in CHS(23, 24).
Using the development dataset (NHANES 1999–2004), we included a comprehensive list of predictors known to be potentially associated with undiagnosed diabetes in an initial model. Specifically, the main effects of all variables listed in Table 1 and their interaction effects with age were included. Due to a large number of covariates, we started with continuous variables and later categorized them in the final model. We employed backward elimination (deleting the covariate with the largest p-value, one at a time) from the initial model until we reached a final model with statistically significant covariates. We were guided by statistical significance for the model building but also used scientific and qualitative judgments as well. For example, although income and health insurance status were statistically significant, we decided not to keep these variables in the final model as they were deemed less appropriate or less user-friendly in risk assessment questionnaires. Physical activity showed borderline significance, p-value=0.06, in the development dataset but was kept as this is an underlying protective factor that often fails to reach statistical significance for various reasons (e.g., difficult to quantify, misclassification, or insufficient statistical power). Moreover, physical activity is highly modifiable, in contrast to demographic and health history variables which are not(25).
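The backward elimination step can be sketched generically. This is not the SAS code used in the study: `fit_pvalues` is a hypothetical callable standing in for a survey-weighted logistic regression fit, the 0.05 threshold is an assumption (no threshold is stated above), and `always_keep` mimics the decision to retain physical activity on scientific grounds.

```python
def backward_eliminate(covariates, fit_pvalues, alpha=0.05, always_keep=()):
    """Drop the least significant covariate, one at a time, until all remaining
    removable covariates have p <= alpha.

    fit_pvalues(names) is a hypothetical callable that refits the model on the
    given covariate names and returns {name: p_value}.
    """
    selected = list(covariates)
    protected = set(always_keep)  # kept regardless of p-value (judgment call)
    while True:
        pvalues = fit_pvalues(selected)
        # Only non-protected covariates above the threshold can be removed.
        removable = {name: p for name, p in pvalues.items()
                     if name in selected and name not in protected and p > alpha}
        if not removable:
            return selected
        # Delete the covariate with the largest p-value, then refit.
        selected.remove(max(removable, key=removable.get))
```

In a toy run with fixed p-values of 0.001 (age), 0.6 (noise), 0.06 (activity) and 0.3 (junk), protecting `activity` leaves age and activity in the model, whereas an unprotected run would drop activity as well.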
In the final model, we double-checked that no important covariates were erroneously omitted in this sequential process. We intentionally used only categorized variables that captured easy but relevant and validated health information in the prediction model, aiming to develop a user-friendly and educational screening score. We created a weighted scoring system by rounding up all regression coefficients in the final model to the nearest integer (when strong monotonicity was observed, we broke the tie accordingly).
We evaluated our scoring system in NHANES 2005–6. We computed standard validation measures: the proportion of high risk individuals, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), Youden index (=1-false positive rate-false negative rate=sensitivity+specificity-1), likelihood ratios for a positive test (=sensitivity/(1–specificity)) and for a negative test (=(1–sensitivity)/specificity), and the area under the receiver-operating-characteristic curve (AUC) as a discrimination statistic(26–28). In addition, our final model was re-fitted to ARIC/CHS. We estimated the prevalence of undiagnosed diabetes per individual score in NHANES and ARIC/CHS.
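The closed-form validation measures listed above can be computed directly from a 2×2 table of screening result against true disease status. The sketch below uses hypothetical names and unweighted counts; it omits the AUC and the survey weighting applied in the actual NHANES analyses.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Standard screening measures from a 2x2 table.

    tp/fp/fn/tn are true positives, false positives, false negatives and true
    negatives. Assumes specificity is neither 0 nor 1, so that both likelihood
    ratios are defined.
    """
    total = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "proportion_high_risk": (tp + fp) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # Youden index = sensitivity + specificity - 1
        "youden": sensitivity + specificity - 1,
        # LR+ = sensitivity / (1 - specificity)
        "lr_positive": sensitivity / (1 - specificity),
        # LR- = (1 - sensitivity) / specificity
        "lr_negative": (1 - sensitivity) / specificity,
    }
```

For instance, a sensitivity of 0.79 with a specificity of 0.67 gives a likelihood ratio for a positive test of 0.79/0.33 ≈ 2.39 and a Youden index of 0.46.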
In validation samples using the aforementioned evaluation measures, we compared our new classification rule with the national screening guidelines and other assessment algorithms for undiagnosed diabetes: CDC(11), ADA(22), USPSTF(13), two ADA diabetes risk questionnaires(4, 14), and the Rotterdam model that was derived from a European sample(29). The last model was included to evaluate the generalizability and transferability of a validated non-U.S. model to the U.S. population.
We performed three ancillary analyses for checking sensitivity/ robustness and testing the utility of the new screening score in broadened practical contexts. We repeated the NHANES analyses 1) using ‘undiagnosed diabetes and pre-diabetes’ as an expanded endpoint (with ‘normal glucose’ as the reference group); 2) separately for those <45 years old vs. those ≥45 years old, as 45 years is the age threshold proposed by the ADA for universal screening; and 3) using an alternative definition of the endpoint based on Hemoglobin A1c ≥ 6.5%(12, 30). The first analysis may have particular importance for predicting pre-diabetes, the condition for which prevention has been shown to be more beneficial compared to undiagnosed yet manifest disease.
For statistical analyses, SAS version 9.1 software (Cary, NC) was used. For NHANES analyses, survey procedures with options of strata, cluster and weight were used to account for the complex survey design(24, 26). Two-sided hypotheses and tests were adopted for all statistical inferences.
The Clinical and Translational Science Center at Weill Cornell Medical College (UL1-RR024996) provided partial support for data analyses. This funding source did not have a role in the design of our analysis, its interpretation, or the decision to submit the manuscript for publication.
Our sample was comprised of 5,258 participants in the development dataset. Characteristics of participants according to diabetes status are summarized in Table 1. The crude weighted prevalence of undiagnosed diabetes based on fasting glucose in adults aged 20 years or older was 2.8%(2). Diabetic (diagnosed or undiagnosed) participants tended to be older and have lower education and household income than their non-diabetic counterparts. Diabetics also tended to be hypertensive, do less exercise, and have family history of diabetes and personal history of cardiovascular disease. Interestingly, participants with undiagnosed diabetes were more likely to have higher blood pressure, body mass index, waist circumference, total cholesterol, and dyslipidemia than participants with diagnosed diabetes (differences were not formally tested).
Table 2 presents the final regression model derived from the development dataset. Age, sex, family history of diabetes, personal history of hypertension, obesity, and physical activity were statistically significant predictors of undiagnosed diabetes. Age and obesity status needed multiple categories (with scores 0–3 assigned) to capture the risk gradient, while other risk factors were binary (with score 0 or 1 assigned). The six risk factors jointly yielded an AUC of 0.79.
The diagnostic characteristics of different cutpoints for total score were assessed in the development as well as validation NHANES. We selected the cutpoint ≥ 5 to designate an individual as having a high risk for undiagnosed diabetes. This cutpoint defined approximately 35% of the adult population as high risk for undiagnosed diabetes and yielded a sensitivity of 79%, specificity of 67%, PPV of 10%, and NPV of 99%, with an AUC of 0.83 in the validation NHANES dataset (see Table 3). Based on these results, if we assume 1,000 new persons will go through the risk assessment and use the cutpoint of 5, then we can estimate 350 persons (35%) would undergo diagnostic testing, 35 new cases of diabetes would be identified, and 6–7 persons with diabetes would remain untested and undetected(31). If a lower cutpoint of 4 is used, then approximately 510 persons (51%) would undergo diagnostic testing, and we can expect 41 cases of diabetes newly identified and fewer than 3 diabetes cases untested and undetected.
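A back-of-the-envelope version of this per-1,000 projection is sketched below. It assumes that detected cases follow from the reported PPV applied to those tested, and missed cases from (1 − NPV) applied to those not tested; this is an assumption about how such estimates can be derived, not a description of the authors' calculation. Because the reported PPV and NPV are rounded, the outputs are approximate. Function and key names are hypothetical.

```python
def screening_yield(n_assessed, proportion_high_risk, ppv, npv):
    """Project testing volume and case yield for a cohort completing the score."""
    tested = n_assessed * proportion_high_risk    # score at or above the cutpoint
    untested = n_assessed - tested                # score below the cutpoint
    return {
        "tested": tested,
        "cases_detected": tested * ppv,           # true cases among those tested
        "cases_missed": untested * (1 - npv),     # true cases among those not tested
    }
```

With 1,000 persons, 35% flagged as high risk, a PPV of 10% and an NPV of 99%, this gives 350 persons tested, about 35 detected cases, and about 6.5 cases left untested.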
When our final prediction model was re-fitted to ARIC/CHS, consistent results were obtained: all of the risk factors were significant (with p values ≤ 0.001) and the magnitude of the associations was comparable, with an AUC of 0.74 (see Appendix Table 1, available at www.annals.org).
Figure 1 depicts the prevalence of undiagnosed diabetes for individual total scores in NHANES and ARIC/CHS. A monotonic (quadratic) relationship was clearly observed. ARIC/CHS showed higher disease prevalence than NHANES, probably due to older ages in the ARIC/CHS populations (≥ 45 years old). Table 4 summarizes the performance characteristics of the existing guidelines or scores and our own method. Our screening score (cutpoint ≥ 5) tended to identify smaller proportions of people being at high risk but resulted in higher overall test accuracy (reflected in Youden index), PPV, and likelihood ratio for a positive test, compared to other methods. NPV was high (≥0.96) for all methods. Among existing methods, the Rotterdam model (developed from a European sample) and the new ADA questionnaire seemed to perform best.
We performed three ancillary analyses, detailed in the Methods. We again used cutpoint scores of 5 and 4; results below are for a cutpoint of 5 with values in parentheses reflecting a cutpoint of 4. In the first ancillary analysis, discrimination ability was somewhat reduced as anticipated when the endpoint combined undiagnosed diabetes and pre-diabetes together (AUC=0.72), yielding a sensitivity of 57 (73)%, specificity of 74 (57)%, PPV of 56 (50)%, and NPV of 74 (78)%. In the second ancillary analysis using only participants ≥ 45 years old, we had a sensitivity of 88 (97)%, specificity of 40 (20)%, PPV of 9 (8)%, and NPV of 98 (99)% with an AUC=0.73. Using only participants < 45 years old, we had a sensitivity of 35 (76)%, specificity of 93 (80)%, PPV of 6 (5)%, and NPV of 99 (100)% with an AUC=0.83; this analysis may be limited due to a small number of diabetes cases. Lastly, when we used hemoglobin A1c in place of fasting glucose for the diabetes definition, we obtained a sensitivity of 80 (91)%, specificity of 63 (47)%, PPV of 6 (5)%, and NPV of 99 (99)% with an AUC of 0.78. PPV increases with the prevalence of the disease/condition (26, 32), which explains why our score, like previous methods, yielded low PPV for the low-prevalence outcomes.
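The dependence of PPV on prevalence follows from Bayes' rule. The short sketch below (hypothetical function name) shows the relationship for fixed sensitivity and specificity; it is an illustration only and does not reproduce the survey-weighted estimates reported above.

```python
def ppv_from_prevalence(sensitivity, specificity, prevalence):
    """Positive predictive value implied by Bayes' rule."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1 - specificity) * (1 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)
```

Holding sensitivity and specificity fixed, the function returns a higher PPV as prevalence rises, which is why the same score gives a much higher PPV for a common composite endpoint than for a rare one.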
Finally, a sample questionnaire that can be used for community screening for undiagnosed diabetes or pre-diabetes is provided in Figure 2.
Clinical trials demonstrate that high risk individuals can reduce their risk of diabetes by more than half when they follow a well-structured, intensive, lifestyle modification program(6, 8, 18). Therefore, early diagnosis could be crucial to reduce the global burden of diabetes. Widespread blood glucose testing may not be the best way to identify undiagnosed diabetes in large community or resource limited settings. Indeed, existing recommendations for diabetes screening that rely on blood testing are not widely followed, resulting in 30% of diabetics going undiagnosed(4).
Our goal was to develop a screening score that can be used in a wide variety of community settings and clinical encounters (including patient waiting rooms or internet) via a simple pencil-and-paper method. Our new diabetes score appeared to perform better than existing methods by quantitative criteria. We believe that it also has good feasibility characteristics – as a simple (with 6 easily answered health-related questions) and efficient (with minimal time needed for survey and no need for a calculator with the maximum score less than 10) screening score with which patients and health care providers can assess their or their patients’ need for formal diabetes testing.
We found that the national guidelines for diabetes screening did not perform very well. The three diabetes risk assessment scores showed lower overall accuracy and tended to select larger proportions of people for diabetes screening compared to our new score. Low specificities of existing methods have been reported previously(33–35). The screening criteria recommended by different organizations were developed using different frameworks and purposes, e.g., to enhance efficiency of screening or to target those who could benefit most from screening. So although they differ in numerical performance characteristics (e.g., sensitivity and specificity) based on our analysis, they may be more appropriate for those purposes.
The primary endpoint in our study was undiagnosed diabetes rather than the composite outcome of undiagnosed diabetes and pre-diabetes, but the same questionnaire may well be justified for these closely related outcomes (a disease and its precursor) with different cutpoints (5 for diabetes and 4 for pre-diabetes) based on the evidence obtained from our ancillary analyses. In addition, our score is for prediction of currently undiagnosed diabetes and not for incident diabetes in the future. However, strong consistency in risk factors for the prediction of prevalent and incident events in diabetes and other chronic diseases has been reported(36–38), and we expect that the same set of risk factors in our model play important roles in the prediction of future diabetes or pre-diabetes. Nonetheless, other laboratory or behavioral/lifestyle variables could be useful in predicting future events rather than current events(18, 21, 31, 39–41).
A risk prediction approach that can capture a continuous risk spectrum is a popular tool that has been used to identify important risk factors and to estimate average risk; results can be used in decision making about public health and clinical care. Risk prediction has even been proposed as an alternative to diagnosis for some diseases(42). We believe that ideal risk assessment methods or prediction models should be derived from large representative samples of a target population and consist of fixed and modifiable risk factors together. Simplicity and user-friendliness (including optimal presentation), in addition to accuracy, are keys for successful implementation and utilization, especially for lay persons(25, 39). To achieve these goals, we 1) adopted a statistical method that yields a systematic scoring system and accounts for design effects of the study appropriately (i.e., logistic regression model suited for complex survey data); 2) carefully selected a parsimonious set of predictors (guided not only by numerical and scientific evidence but also by feasibility perspectives); 3) chose categorized variables in intuitive or well-accepted ways (e.g., using decades for age and obesity definition); and 4) emphasized an educational purpose of the screening score, highlighting the important risk factors to motivate high-risk people to be screened or to modify health behaviors (e.g., combining body mass index and waist circumference together, rather than using height, weight and waist separately). This combination of factors may explain the enhanced properties of our new score.
For this study, we tried to identify all existing screening guidelines or risk assessment scores for prevalent undiagnosed diabetes available for the U.S. population and one best-suited score for non-U.S. population for comparisons. We found that there are 3 national guidelines and 2 scores/questionnaires for diabetes screening in the U.S., whereas many prediction models exist for incident diabetes. Our search for the best suited non-U.S. model was guided by recent comparison studies(36, 43); we selected the Rotterdam model as it was developed for prevalent undiagnosed diabetes, has been externally validated in different samples, and only requires routinely available demographic or health information in its simple scoring system.
This study does have some limitations. First, some variables that are parts of existing methods (e.g., gestational diabetes) were not available in the databases we used. Therefore, some caution should be exercised in making comparisons between our and others’ methods. Nonetheless, we believe that the vast majority of key information was available and utilized, minimizing the unfairness in the comparisons. Second, we could not incorporate oral glucose tolerance test results because these data were not collected in the newer NHANES (1999–2006) and in the baseline visits in ARIC and CHS. Thus, we defined the outcome based solely on the FPG. The FPG is a recommended screening test, however, and the lack of oral glucose tolerance test data has not been shown previously to affect the stability of diabetes risk assessment methods(4, 44). Our results seemed to be robust to different definitions of the endpoint, either based on FPG or Hemoglobin A1c (e.g., AUC=0.79 vs. 0.78).
Although the lay population is increasingly appreciating the danger of diabetes and its complications, more education is still needed in community and clinical settings. In that sense, although further validation of our screening score in other samples is important, this newly developed algorithm could still have immediate applications. In addition to its use in clinical encounters, targeted screenings, and health education programs, the screening score can be applied by health plans to existing databases for case-finding. The new algorithm can also potentially help identify optimal populations for enrollment in clinical trials that test new strategies to prevent or manage diabetes.
In conclusion, we envision our screening score to serve as a method for identifying individuals in need of formal diabetes screening and calling for more attention to pre-diabetes. A self-assessment method that helps people decide whether they should seek medical care for diabetes testing may serve as one way to address the lack of interaction with health care facilities/providers that may underlie the high percentage of the population with undiagnosed diabetes, particularly the underserved. Although a consensus on diabetes screening has not yet been reached(45, 46), we believe a priority in formal screening for undiagnosed diabetes should be given to those who are at high risk. This new diabetes screening score could help identify these high risk individuals, while patients and caregivers alike await more definitive evidence-based recommendations(47, 48).
The authors thank the staff and participants of the NHANES, ARIC and CHS studies for their important contributions to research and data sharing.
Financial Support: Dr. Bang and Ms. Edwards were partially supported by the Clinical and Translational Science Center at Weill Cornell Medical College (UL1-RR024996).
| Variable | Odds Ratio (95% CI) | P-value | Log (Odds Ratio) |
| Age in years* | | | |
| Family history of diabetes | | | |
| Yes | 1.9 (1.7–2.2) | < 0.0001 | 0.66 |
| History of hypertension | | | |
| Yes | 2.3 (2.0–2.6) | < 0.0001 | 0.83 |
| Not overweight or obese | reference | --- | --- |
| Obese | 3.6 (2.9–4.5) | < 0.0001 | 1.28 |
| Extremely obese | 8.8 (6.2–12.4) | < 0.0001 | 2.18 |
else if (30≤BMI<40) or (40≤waist<50 for male) or (35≤waist<49 for female) then obese;
else if (25≤BMI<30) or (37≤waist<40 for male) or (31.5≤waist<35 for female) then overweight;
else not overweight or obese.
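The obesity rule in this footnote can be written as a small function. The first branch of the rule is not shown above, so the "extremely obese" condition below is an assumption inferred from the upper bounds of the "obese" branch (BMI ≥ 40, or waist ≥ 50 for males and ≥ 49 for females). Waist must be given in the same units as the footnote's thresholds, and the function and argument names are hypothetical.

```python
def obesity_category(bmi, waist, sex):
    """Combine BMI and waist circumference into one obesity category.

    sex is "male" or "female"; waist uses the same units as the thresholds.
    """
    is_male = sex == "male"
    # ASSUMPTION: branch inferred from the upper bounds of the 'obese' rule.
    if bmi >= 40 or waist >= (50 if is_male else 49):
        return "extremely obese"
    obese_waist = (40 <= waist < 50) if is_male else (35 <= waist < 49)
    if 30 <= bmi < 40 or obese_waist:
        return "obese"
    overweight_waist = (37 <= waist < 40) if is_male else (31.5 <= waist < 35)
    if 25 <= bmi < 30 or overweight_waist:
        return "overweight"
    return "not overweight or obese"
```

Because the branches are evaluated in order, a person is assigned the most severe category met by either measure, for example "obese" for a male with a BMI of 26 and a waist of 42.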
ARIC = Atherosclerosis Risk in Communities; AUC = area under the receiver-operating-characteristic curve; BMI = body mass index; CHS = Cardiovascular Health Study.
Potential Conflicts of Interest: Dr. Teutsch is a former employee and an option holder in Merck and Co. Inc.
Disclaimer: The NHANES were supported by the Centers for Disease Control and Prevention. The ARIC and CHS studies were supported by the National Heart, Lung and Blood Institute. This manuscript was prepared using public use datasets (for NHANES) and limited access datasets (for ARIC and CHS) and does not necessarily reflect the opinions or views of these studies or agencies.
Reproducible Research Statement: Study protocol: Not available. Statistical code (secondary analysis): Available from Dr. Bang (heb2013/at/med.cornell.edu) upon request. Data set: NHANES are publicly available (at http://www.cdc.gov/nchs/nhanes.htm) and ARIC and CHS are available through a limited-access distribution agreement (http://www.nhlbi.nih.gov/resources/deca/datasets_obv.htm).