Diabetes Care. 2008 August; 31(8): 1670–1671.
PMCID: PMC2494666

Validation of Prediction of Diabetes by the Archimedes Model and Comparison With Other Predicting Models

Michael Stern, MD; Ken Williams, MS; David Eddy, MD, PHD; and Richard Kahn, PHD


OBJECTIVE—To validate the ability of the Archimedes model to accurately predict the risk of developing diabetes in individuals.

RESEARCH DESIGN AND METHODS—Subjects were randomly selected from the San Antonio Heart Study population. The area under the receiver operating characteristic (aROC) curve derived from the Archimedes model was calculated and also compared with the aROCs from two published multiple logistic regression models designed to estimate diabetes risk.

RESULTS—The aROC for the Archimedes model was 0.818 (95% CI 0.739–0.899) compared with aROCs of 0.869 (0.801–0.936) and 0.870 (0.802–0.937) for the two logistic regression models, respectively. Risk estimates from the logistic models were highly correlated with the estimates derived from the Archimedes model.

CONCLUSIONS—The Archimedes model predicts individual diabetes risk with a high level of sensitivity and specificity, comparable with that of models designed specifically for that purpose. Unlike the latter models, Archimedes also predicts the risk of numerous other health outcomes.

The Archimedes model is a large-scale simulation model of human physiology and health care systems (1). It has been extensively validated by its ability to quite closely replicate a wide variety of aggregate health outcomes in populations (1). The ability of Archimedes to make accurate predictions for individuals, however, has thus far not been validated. Using data from the San Antonio Heart Study (SAHS), we attempted such a validation. We also compared the area under the receiver operating characteristic curves (aROCs) derived from Archimedes with those derived from two other diabetes predicting models, namely, the SAHS predicting model (2) and the Atherosclerosis Risk in Communities (ARIC) predicting model (3).


The SAHS is a prospective cohort study consisting of 3,682 individuals (62% Mexican American and 38% non-Hispanic white) followed for 7–8 years (4). The SAHS predicting model is a multiple logistic regression model with incident diabetes as the dependent variable and a panel of baseline characteristics that are ordinarily available in a routine clinical setting as independent variables (2). The ARIC predicting model is a similarly constructed logistic regression model (3).
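For readers less familiar with this class of model, the general form shared by multiple logistic regression risk models can be written as below. This is only the generic functional form; the symbols x_i (baseline characteristics), β_i (fitted coefficients), and k (number of predictors) are notation assumed here for illustration, and the specific predictors and coefficient values of the SAHS and ARIC models are given in references 2 and 3, not reproduced here.

```latex
\Pr(\text{incident diabetes} \mid x_1, \dots, x_k)
  = \frac{1}{1 + \exp\!\left(-\left(\beta_0 + \sum_{i=1}^{k} \beta_i x_i\right)\right)}
```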

The Archimedes model is built from underlying anatomy and physiology and uses scores of ordinary and differential equations to represent metabolic pathways, occurrence and progression of diseases, signs and symptoms, treatments, and outcomes. A practical, free, readily available tool derived from the Archimedes model is the American Diabetes Association's Diabetes PHD (Personal Health Decisions; available online). Diabetes PHD can simultaneously predict the risk of diabetes and numerous other outcomes, including the effects of a wide variety of treatments in many different populations (e.g., those with diabetes). It was used here to provide external validation of the Archimedes prediction of the incidence of diabetes.

Among the 3,228 individuals in the SAHS who were nondiabetic at baseline, 295 developed diabetes over the 7–8 years of follow-up. All the required elements for the Archimedes risk estimation were available in the subjects selected for the present analyses. The present analyses were restricted to the recent cohort 2 of SAHS, which included 1,734 nondiabetic individuals, 195 of whom were diabetic at follow-up. Within the SAHS database, we selected 100 individuals at random, 50 of whom were diabetic at follow-up and 50 who remained free of diabetes at follow-up. This sample size would provide 80% power to detect an aROC significantly (P < 0.05) greater than 0.70 (the low end of acceptable discrimination [5]) if the true aROC was >0.80 and 90% power if the true aROC was 0.83 (benchmark values near that of other established models) (2,3).

The risk of developing diabetes for each individual was determined according to the years of follow-up for that individual (rounded to the nearest year), which ranged from 6 to 9 years with a mean of 7.5 years. Data from each individual were entered into Diabetes PHD, and the results were obtained from the graphical output displayed on the computer screen. A second person confirmed the accuracy of the input and, in a random sample of 20 forms, also confirmed the output from Diabetes PHD.

We also estimated the risk of diabetes for the same 100 individuals using both the SAHS diabetes predicting model and the ARIC predicting model. The aROCs and their 95% CIs for all three models were computed and compared (6). Finally, we computed the Spearman correlation coefficients between the risk estimates obtained from each pair of predicting models.
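The two summary statistics named above can be illustrated with a minimal sketch. It computes the aROC through its equivalence with the Mann-Whitney U statistic and the Spearman coefficient as the Pearson correlation of average ranks. The function names and the toy risk scores are hypothetical and are not study data; the sketch does not implement the DeLong method (6) used for the CIs and model comparisons.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def auc(scores, labels):
    """aROC via the Mann-Whitney U statistic (label 1 = developed diabetes)."""
    ranks = average_ranks(scores)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: six people, three of whom developed diabetes.
labels = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.7, 0.4, 0.5, 0.2, 0.1]  # made-up risk estimates
model_b = [0.8, 0.6, 0.5, 0.4, 0.3, 0.1]  # made-up risk estimates

print("aROC, model A:", auc(model_a, labels))
print("aROC, model B:", auc(model_b, labels))
print("Spearman, A vs. B:", spearman(model_a, model_b))
```

In this toy example, model A ranks one person who remained nondiabetic above one who developed diabetes, so its aROC is 8/9, whereas model B separates the two groups perfectly.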


The aROC for Diabetes PHD was 0.818 (95% CI 0.739–0.899) and was not significantly different from the aROC of the SAHS model (0.869 [95% CI 0.801–0.936]) or the ARIC model (0.870 [0.802–0.937]) (Fig. 1). The risk estimates from the SAHS model and ARIC model were highly correlated (r = 0.962), and both correlated well with Diabetes PHD (r = 0.834 and 0.842, respectively).
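As a rough plausibility check on the width of these intervals, the following sketch applies the Hanley-McNeil approximation for the standard error of an aROC to a sample of 50 cases and 50 noncases. This is a different method from the nonparametric DeLong approach used in the study (6), so it is not expected to reproduce the reported limits exactly; the function name is hypothetical.

```python
import math


def hanley_mcneil_ci(auc, n_pos, n_neg, z=1.96):
    """Approximate CI for an aROC using the Hanley-McNeil standard error.

    Assumption: this is a textbook approximation, not the DeLong
    method that the study used for its reported intervals.
    """
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    variance = (
        auc * (1 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    se = math.sqrt(variance)
    return auc - z * se, auc + z * se


# aROC reported for Diabetes PHD, with 50 cases and 50 noncases.
low, high = hanley_mcneil_ci(0.818, 50, 50)
print(f"approximate 95% CI: {low:.3f} to {high:.3f}")
```

Under this approximation the interval is roughly 0.73 to 0.90, close to but not identical with the reported 0.739–0.899.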

Figure 1
aROC curves for the PHD, the SAHS predicting model, and the ARIC predicting model.


With an aROC of 0.818, the accuracy of Diabetes PHD (i.e., Archimedes) in predicting an individual's risk of diabetes is excellent, almost as high as that of models specifically designed and used only for that purpose. The SAHS model may have had an unfair advantage over Archimedes because it was designed and optimized using the SAHS database and could be overfitted to the subset of SAHS cases selected for this analysis. For that reason, we also used the ARIC predicting model, which was developed in an entirely independent dataset and performed as well as the SAHS model.

Both the SAHS and ARIC models were built from person-specific data and optimized specifically for predicting incident diabetes. In contrast, Archimedes was designed to be used for a very wide range of purposes, calculates many different outcomes, was not built from person-specific data, and was not calibrated to determine the incidence of diabetes. Also, several of the variables Archimedes uses that may have enhanced its predictive capability were not included in this analysis.

This report extends the validation of Archimedes and demonstrates its excellent ability to discriminate between individuals who will or will not develop diabetes. Its utility is comparable with that of models developed solely for that purpose. Because Diabetes PHD, derived from Archimedes, is freely available on the internet and calculates many additional outcomes, it is a powerful tool that can be reliably used for comprehensive risk assessment and decision making. Diabetes PHD is now widely accessed (~80,000 users per year) for comprehensive risk assessment of cardiometabolic disease over a 30-year period, and it helps diabetic patients better appreciate the likely benefits of risk factor reduction. Although the tool currently uses complex distributed computing, which limits its speed and capacity, a much more rapid version with unlimited capacity will soon become available. This will allow for widespread promotion.


This study was supported by grants from the National Heart, Lung, and Blood Institute (R01 HL24799 and R01 HL36820).


Published ahead of print on 28 May 2008.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.


1. Eddy DM, Schlessinger L: Archimedes: a trial-validated model of diabetes. Diabetes Care 26:3093–3101, 2003.
2. Stern MP, Williams K, Haffner SM: Identification of individuals at high risk of type 2 diabetes: do we need the oral glucose tolerance test? Ann Intern Med 136:575–581, 2002.
3. Schmidt MI, Duncan BB, Bang H, Pankow JS, Ballantyne CM, Golden SH, Folsom AR, Chambless LE: Identifying individuals at high risk for diabetes: the Atherosclerosis Risk in Communities Study. Diabetes Care 28:2013–2018, 2005.
4. Burke JP, Williams K, Gaskill SP, Hazuda HP, Haffner SM, Stern MP: Rapid rise in the incidence of type 2 diabetes from 1987 to 1996: results from the San Antonio Heart Study. Arch Intern Med 159:1450–1456, 1999.
5. Hosmer DW, Lemeshow S: Applied Logistic Regression. 2nd ed. Hoboken, NJ, John Wiley and Sons, 2000.
6. DeLong ER, DeLong DM, Clarke-Pearson DL: Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 44:837–845, 1988.
