The objective of this study is to propose a non-parametric pharmacokinetic prediction model that addresses the individualized risk of developing type-2 diabetes in subjects with family history of type-2 diabetes.
All 191 selected healthy subjects had two parents with type-2 diabetes. Glucose was administered intravenously (0.5 g/kg body weight) and 13 blood samples taken at specified times were analyzed for plasma insulin and glucose concentrations. All subjects were followed for an average of 13–14 years for diabetic or normal (non-diabetic) outcome.
The new logistic regression model predicts the development of diabetes based on body mass index and only one blood sample at 90 min analyzed for insulin concentration. Our model correctly identified 4.5 times more subjects (54% versus 11.6%) predicted to develop diabetes and more than twice the subjects (99% versus 46.4%) predicted not to develop diabetes compared to current non-pharmacokinetic probability estimates for development of type-2 diabetes.
Our model can be useful for individualized prediction of development of type-2 diabetes in subjects with family history of type-2 diabetes. This improved prediction may be an important mediating factor for better perception of risk and may result in an improved intervention.
Undiagnosed diabetes constitutes approximately 29.3% of total diabetes prevalence. It is clear that developing strategies to screen and identify high-risk individuals should be an important public health goal. Family history of type-2 diabetes is recognized as an important risk factor for the disease [3–5]. Individuals who have a family history of diabetes have a two to six times greater risk of developing type-2 diabetes than individuals with no family history of the disease. According to NHANES III, from 1999–2002, the prevalence of diabetes among individuals who have a first-degree relative (parent or sibling) with diabetes (14.3%) was significantly higher (p < 0.001) than that among individuals without a family history of diabetes (3.2%). According to the same survey, the probability of developing diabetes among individuals who have both parents with diabetes (25.4%) was significantly higher (p < 0.001) than among individuals without any family history of diabetes (3.2%). The diabetes prevalence was 3.1% in subjects with BMI < 25 kg/m2, 5.9% in subjects with BMI from 25 to 29.9 kg/m2, and 11.2% in subjects with BMI ≥ 30 kg/m2. Thus the magnitude of increased risk is greater for family history than for obesity per se.
Although family history is a risk factor for diabetes, various studies have shown that fewer than 40% of people with a family history of the disease actually perceive themselves to be at an increased risk of developing diabetes compared to those with no family history of diabetes [7–11]. Therefore, altering risk perception is a potential target for intervention. This may be achieved by a more accurate and individualized risk assessment. An increased awareness of the risk of developing diabetes will likely increase an individual's motivation for changing his or her lifestyle to reduce the risk of developing the disease [12,13]. No studies reported in the diabetes literature specifically address a scientifically based manipulation of individualized risk perception in subjects with a family history of type-2 diabetes, although studies in other areas have demonstrated a direct relationship between altering risk perception and changing human behavior. In this study, we have proposed and validated a pharmacokinetic predictive model that addresses the individualized risk perception in subjects with a family history of type-2 diabetes (both parents type-2 diabetic). It is known that type-2 diabetes is a heterogeneous disorder characterized by a combination of impaired insulin secretion and insulin resistance [14,15]. Our exploratory data analysis is aimed at identifying a few plasma insulin concentrations that can improve the prediction of developing type-2 diabetes. This is an important goal because the use of only a few (one or two) blood samples analyzed for insulin concentration makes the test considerably more practical than tests requiring many samples for a pharmacokinetic/pharmacodynamic characterization of the glucose–insulin system, such as the intravenous glucose tolerance test (IVGTT).
Our model is useful for individualized prediction of development of type-2 diabetes in subjects with family history of type-2 diabetes. The improvement in prediction may be an important mediating factor for better perception of risk and may result in an improved intervention.
For this study, 191 healthy subjects aged 16 to 59 years with a family history of type-2 diabetes (i.e., both parents had type-2 diabetes) were selected [16,17]. At the time of recruitment, no subject had a disease or received any medication known to affect glucose metabolism. A follow-up survey of all the subjects with diabetic parents was conducted after an average of 13 years (12.7 ± 6.4 years, mean ± SD) to ascertain diagnoses of diabetes. Verification of the absence of type-2 diabetes was based on an oral glucose tolerance test. Subjects were divided into two groups based on the outcome: diabetic and normal (non-diabetic). Subjects with missing information (i.e., demographics, outcome at the end of the follow-up period, insulin or glucose concentrations during IVGTT) were not included in the analysis. We had 30 subjects in the diabetic outcome group (19.7%) and 122 subjects in the normal outcome group (80.3%). The body mass index (BMI) was significantly higher (p < 0.01) in the diabetic outcome group (31.3 ± 7.6 kg/m2) than in the normal outcome group (25.3 ± 4.3 kg/m2). The age at entry for subjects in the diabetic outcome group (36.3 ± 9.2 years) was not significantly different (p > 0.05) from that of subjects in the normal outcome group (33.6 ± 9.9 years).
All the participants were instructed to consume a high-carbohydrate diet (250–300 g/day) for at least 3 days before the test and to arrive in the clinical research laboratory at Joslin Diabetes Center (Boston, MA) after an overnight fast. Height and weight were measured and a medical history was taken. After a fasting blood sample was obtained, glucose was administered intravenously (0.5 g/kg body weight), and blood was drawn at 1, 3, 5, 10, 20, 30, 40, 50, 60, 90, 120 and 180 min. Blood glucose concentration (mg/dL) was measured by either the ferricyanide or the glucose oxidase method. Serum insulin level (μU/mL) was measured by a double-antibody radioimmunoassay technique.
We combined subjects that developed diabetes and subjects that developed impaired glucose tolerance (IGT) in one group (called the diabetic outcome group) and compared this group with the individuals that did not develop diabetes during follow-up (the normal outcome group). Subjects with IGT outcome were pooled with subjects with diabetic outcome because IGT is associated with a higher risk of developing diabetes [21,22]. The risk of development of diabetes was 6.3 times higher in subjects with IGT than in normal subjects. The criteria for diagnosis of IGT and diabetes are based on the classification by the National Diabetes Data Group. The independent contribution of risk factors like BMI and sex to the development of diabetes was tested using univariate logistic regression. Univariate logistic regression was also done for each glucose and insulin concentration independently to predict the development of type-2 diabetes. Based on the Wald test statistic obtained from univariate logistic regression, we identified insulin and glucose concentrations that can independently predict the development of diabetes. Accordingly, two different multivariate logistic regression analyses were then done, one for insulin and another one for glucose concentrations. Multivariate regression analysis for insulin concentration included as predictors only those insulin concentrations that individually predicted (p < 0.05) the development of diabetes in univariate logistic regression. The significance of each predictor variable in the model was assessed using the Wald test at the 5% level of significance. The multivariate logistic regression for glucose concentrations was done in the same way to identify the glucose concentration that predicted the development of diabetes.
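The Wald test used for screening can be illustrated with a minimal Python sketch. It assumes the usual form of the test for a single logistic regression coefficient (estimate divided by its standard error, compared with a standard normal distribution); the function name `wald_test` and the numbers in the example are hypothetical, not values from this study.

```python
import math

def wald_test(estimate, std_error):
    """Return the Wald z statistic and its two-sided p-value.

    Assumes z = estimate / std_error is approximately standard normal
    under the null hypothesis that the coefficient is zero.
    """
    z = estimate / std_error
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Hypothetical coefficient and standard error, for illustration only.
z, p = wald_test(estimate=0.05, std_error=0.02)
keep_predictor = p < 0.05  # 5% significance level, as in the text
print(f"z = {z:.2f}, p = {p:.4f}, keep = {keep_predictor}")
```

A predictor would pass the univariate screen in this sketch when `p` falls below 0.05.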
Insulin and glucose concentrations that predicted the development of diabetes in multivariate logistic regression, along with other predictors like sex and body mass index (BMI), were then used to develop a model for prediction of outcome (diabetes or normal). The predictors that did not predict (p > 0.05) the development of diabetes according to Wald statistics were removed from the model, and a new model was proposed that included an interaction term in addition to the individual predictors. The interaction term was kept in or dropped from the model according to which choice gave the lower value of the Akaike information criterion (AIC).
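As a rough illustration of the AIC comparison, the sketch below uses the standard definition AIC = 2k − 2 ln L, where k is the number of estimated parameters and L the maximized likelihood. The helper names are hypothetical, and the parameter count of three (intercept plus two predictors) in the test values is an assumption rather than something stated in the text.

```python
import math

def bernoulli_log_likelihood(probs, outcomes):
    """Log-likelihood of binary outcomes (1 = diabetic, 0 = normal)
    under the predicted probabilities of a fitted logistic model."""
    return sum(math.log(p) if y == 1 else math.log(1.0 - p)
               for p, y in zip(probs, outcomes))

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

def prefer_lower_aic(aic_by_model):
    """Return the name of the candidate model with the lowest AIC."""
    return min(aic_by_model, key=aic_by_model.get)

# AIC values reported in the Results for the two candidate models.
chosen = prefer_lower_aic({"without interaction": 119.5,
                           "with interaction": 120.0})
print(chosen)
```

The two AIC values passed to `prefer_lower_aic` are the ones reported in the Results; everything else is illustrative.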
To investigate the predictive power of our model we used a classification table and a receiver operating characteristic (ROC) curve. The classification table cross-classifies the binary response with the prediction of whether the subject develops diabetes or not for some cut-off probability value. Specifically, each cut-off probability value is associated with a particular value of specificity and sensitivity. A decision regarding acceptable levels of sensitivity and specificity involves weighing the consequences of misclassification. In the equations below, TP, FP, TN and FN denote the numbers of true-positive, false-positive, true-negative and false-negative subjects in the classification table. Sensitivity, defined as the proportion of subjects who developed diabetes who were correctly predicted to do so (true-positive test), was calculated as

$$\text{Sensitivity} = \frac{TP}{TP + FN} \qquad (1)$$

Specificity, defined as the proportion of subjects who did not develop diabetes who were correctly predicted not to develop it (true-negative test), was calculated as

$$\text{Specificity} = \frac{TN}{TN + FP} \qquad (2)$$

We also calculated the positive predictive value (PPV), defined as the percentage of subjects predicted to develop diabetes who were correctly predicted, as

$$\text{PPV} = \frac{TP}{TP + FP} \times 100\% \qquad (3)$$

The negative predictive value (NPV), defined as the percentage of subjects predicted not to develop diabetes who were correctly predicted, was calculated as

$$\text{NPV} = \frac{TN}{TN + FN} \times 100\% \qquad (4)$$
The receiver operating characteristic (ROC) curve is a plot of sensitivity on the y-axis against (1 − specificity) on the x-axis. Each point on the ROC curve represents a particular sensitivity and specificity value corresponding to a unique cut-off probability value. The area under the ROC curve quantifies how well the model distinguishes a subject with diabetes from a subject without diabetes; the area ranges from 0 to 1, and the closer the value is to 1, the better the performance of the model. All data analyses were performed using S-Plus Version 7.0 (Insightful Corporation, Seattle, WA).
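The construction of the ROC curve can be sketched in a few lines of Python. This is a minimal illustration, not the S-Plus routine used in the study: it sweeps the cut-off over the observed predicted probabilities and integrates with the trapezoidal rule. The function names and the six-subject data set are made up for the example.

```python
def roc_points(probs, outcomes):
    """Return (1 - specificity, sensitivity) pairs, one per cut-off.

    probs    -- predicted probabilities of developing diabetes
    outcomes -- 1 if the subject developed diabetes, else 0
    """
    n_pos = sum(outcomes)
    n_neg = len(outcomes) - n_pos
    points = [(0.0, 0.0)]  # cut-off above every probability: nobody flagged
    for cutoff in sorted(set(probs), reverse=True):
        tp = sum(1 for p, y in zip(probs, outcomes) if p >= cutoff and y == 1)
        fp = sum(1 for p, y in zip(probs, outcomes) if p >= cutoff and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def area_under_curve(points):
    """Trapezoidal area under a curve given as ordered (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Made-up predictions for six hypothetical subjects.
probs = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
outcomes = [1, 1, 0, 1, 0, 0]
print(area_under_curve(roc_points(probs, outcomes)))
```

Each cut-off contributes one point, so lowering the cut-off moves along the curve from (0, 0) toward (1, 1).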
The accuracy of a model is reflected by its ability to predict outcome from other samples of the target population, so an appropriate procedure is to test the accuracy with new (future) data. However, this method is expensive, delays the final establishment of the model, and is not practical in the present case. The alternative is a bootstrap procedure, which reduces the classification bias as compared to jackknife and cross-validation. To validate the model, we used 250 bootstrap repetitions, where sampling was done with replacement. A bootstrap sample (the "training sample") is drawn from the original data, the model is fit to this sample, and the area under the ROC curve is used as a measure of predictive accuracy (θ^). Another bootstrap sample (the "test sample") is drawn and the area under the ROC curve is calculated in the same way as for the training sample. The process of drawing samples is repeated 250 times, and each time the area under the ROC curve is calculated. After 250 repetitions, the areas are averaged, which produces an average area under the ROC curve for the training samples (θ^) and for the test samples (θ^*). The difference between the training and test averages (θ^ − θ^*) gives the 'bias'. The lower the value of this bias, the better the predictions will be from our model.
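The bootstrap procedure described above can be sketched in plain Python as follows. This is only an illustration under several assumptions: the logistic model is fit by simple gradient ascent rather than by the S-Plus routine used in the study, the area under the ROC curve is computed by pairwise ranking, and `make_synthetic` generates made-up data with arbitrary coefficients because the study data are not reproduced here. All function names are hypothetical.

```python
import math
import random

def auc(scores, labels):
    """AUC as the share of (diabetic, normal) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def fit_logistic(X, y, steps=200, lr=0.5):
    """Fit a logistic regression by gradient ascent on standardized predictors.
    Returns a function mapping one row of predictors to a linear score."""
    n, k = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(k)]
    sds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in X) / n) or 1.0
           for j in range(k)]
    Z = [[(row[j] - means[j]) / sds[j] for j in range(k)] for row in X]
    w = [0.0] * (k + 1)  # w[0] is the intercept
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for z, yi in zip(Z, y):
            eta = w[0] + sum(wj * zj for wj, zj in zip(w[1:], z))
            eta = max(-30.0, min(30.0, eta))  # guard against overflow in exp
            err = yi - 1.0 / (1.0 + math.exp(-eta))
            grad[0] += err
            for j in range(k):
                grad[j + 1] += err * z[j]
        w = [wj + lr * g / n for wj, g in zip(w, grad)]

    def score(row):
        return w[0] + sum(w[j + 1] * (row[j] - means[j]) / sds[j] for j in range(k))
    return score

def bootstrap_auc_bias(X, y, n_boot=250, seed=0):
    """Average training AUC, average test AUC, and their difference (the 'bias')."""
    rng = random.Random(seed)
    n = len(y)
    train_aucs, test_aucs = [], []
    while len(train_aucs) < n_boot:
        train = [rng.randrange(n) for _ in range(n)]  # sampling with replacement
        test = [rng.randrange(n) for _ in range(n)]
        y_train = [y[i] for i in train]
        y_test = [y[i] for i in test]
        if len(set(y_train)) < 2 or len(set(y_test)) < 2:
            continue  # AUC is undefined unless both outcomes are present
        score = fit_logistic([X[i] for i in train], y_train)
        train_aucs.append(auc([score(X[i]) for i in train], y_train))
        test_aucs.append(auc([score(X[i]) for i in test], y_test))
    theta_train = sum(train_aucs) / n_boot
    theta_test = sum(test_aucs) / n_boot
    return theta_train, theta_test, theta_train - theta_test

def make_synthetic(n=80, seed=1):
    """Made-up (insulin, BMI) predictors and outcomes, for illustration only."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        insulin = max(1.0, rng.gauss(28.0, 18.0))
        bmi = max(16.0, rng.gauss(27.0, 5.0))
        prob = 1.0 / (1.0 + math.exp(-(-5.0 + 0.05 * insulin + 0.12 * bmi)))
        X.append([insulin, bmi])
        y.append(1 if rng.random() < prob else 0)
    return X, y
```

Calling `bootstrap_auc_bias(*make_synthetic(), n_boot=250)` returns the three quantities named in the text. The pure-Python fit is slow, so a smaller `n_boot` is convenient when experimenting.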
We also compared the results from our model to the results that are obtained by using current knowledge about family history of diabetes (both parents type-2 diabetic) as a risk factor. To estimate the improvement our model makes over current knowledge in predicting type-2 diabetes in subjects with a family history of type-2 diabetes, we took a group of 191 subjects with a family history of type-2 diabetes and randomly assigned a diabetic outcome to 25% of them (the currently known risk of developing diabetes in subjects with two diabetic parents is 25%). Since we know the true outcome for each subject, we can construct a classification table that contains the numbers of true-positive, true-negative, false-positive and false-negative subjects. From these values we calculated sensitivity and specificity using Eqs. (1) and (2). Comparison of sensitivity and specificity values from current knowledge to the values from our model provides a metric for the improvement in prediction of type-2 diabetes using our model. We also calculated positive predictive value (PPV) and negative predictive value (NPV) from Eqs. (3) and (4). All validation analyses were performed using S-Plus Version 7.0.
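One way to picture this random-assignment baseline is the Python sketch below. It assumes that exactly 25% of the subjects (rounded) are labelled as predicted diabetic; the text does not say whether the assignment was an exact fraction or an independent 25% chance per subject. The function name and the outcome vector are hypothetical and do not reproduce the study data.

```python
import random

def random_baseline_table(outcomes, risk=0.25, seed=0):
    """Randomly label a fraction `risk` of subjects as predicted diabetic
    and cross-classify those labels against the known outcomes."""
    rng = random.Random(seed)
    n = len(outcomes)
    predicted_ids = set(rng.sample(range(n), round(risk * n)))
    table = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for i, developed in enumerate(outcomes):
        if i in predicted_ids:
            table["TP" if developed else "FP"] += 1
        else:
            table["FN" if developed else "TN"] += 1
    return table

# Hypothetical outcomes for 191 subjects (1 = developed diabetes).
outcomes = [1] * 40 + [0] * 151
table = random_baseline_table(outcomes)
print(table)
```

The resulting counts feed directly into the sensitivity, specificity, PPV and NPV calculations referred to as Eqs. (1) to (4).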
The length of follow-up observation was 13 years on average (12.7 ± 6.4 years) for all the subjects, regardless of the outcome. Tables 1 and 2 show the results of univariate logistic regression in which each risk factor is a predictor and development of type-2 diabetes is the dependent variable (Figs. 1 and 2). Body mass index (BMI) and sex independently predicted (p < 0.05) the development of type-2 diabetes (Table 1). Insulin concentrations at time 0, 20, 30, 40, 50, 60, 90, 120 and 180 min predicted (p < 0.05) the development of diabetes, while insulin concentrations at time 1, 3, 5 and 10 min did not predict (p > 0.05) the development of diabetes in univariate logistic regression according to Wald test statistics (Table 1). Glucose concentrations at time 0, 1, 3, 5, 10, 20, 30, 40, 50, 60, 90, 120 and 180 min predicted the development of diabetes (p < 0.05) in univariate logistic regression analysis (Table 2). A model with only baseline characteristics (BMI, sex, age, weight and baseline insulin or glucose concentrations) did not predict the development of diabetes. Multivariate logistic regression analysis involved insulin concentrations at time 0, 20, 30, 40, 50, 60 and 90 min to find the insulin concentration that significantly (p < 0.05) predicts the development of diabetes. Results from multivariate regression analysis showed that insulin concentration at time 90 min (INS090) best predicted the development of diabetes (p < 0.05) according to the Wald test. Glucose concentration at time 90 min (GLU090) was also found to predict the development of diabetes in multivariate logistic regression analysis (p < 0.05).
Insulin concentration at time 90 min was significantly higher (p < 0.05) in the diabetic outcome group (46.9 ± 34.3 μU/mL) than in the normal outcome group (21.5 ± 13.7 μU/mL), and glucose concentration at time 90 min was also significantly higher (p < 0.05) in the diabetic outcome group (84.1 ± 17.9 mg/dL) than in the normal outcome group (68.8 ± 13.6 mg/dL). A full model was proposed with development of type-2 diabetes as the dependent variable and BMI, gender, and insulin and glucose concentrations at time 90 min as predictor variables (Table 3). Glucose concentration at 90 min and gender were not significant (p > 0.05) predictors of development of type-2 diabetes according to the Wald statistics (Table 3) in the full model. This led to a reduced model with only BMI and insulin concentration at 90 min as predictors (Table 4). In this model we also included the interaction term between the predictors. Dropping the interaction term between BMI and INS090 gave a slightly lower AIC value (119.5 versus 120). Thus, the final diabetes predictive model was proposed as a combination of BMI and INS090 with the following logistic regression parameters:

$$p = \frac{e^{x}}{1 + e^{x}} \qquad (5)$$

where x = −5.84 + 0.040 (insulin concentration at time 90 min in μU/mL) + 0.116 (BMI in kg/m2) (Table 4 and Fig. 3). The calculated cut-off probability (p value) for our model is 0.06. This means that if the probability value calculated from Eq. (5) is ≤0.06, then the subject is predicted to 'not develop' diabetes, but if p > 0.06, then the subject is predicted to 'develop' diabetes. This cut-off value was chosen because it gave the desired combination of high specificity (99%) and moderate sensitivity (54%).
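Applying the final model to an individual can be sketched in Python as follows. The coefficients and the 0.06 cut-off are taken from the text; the sketch assumes that Eq. (5) is the standard logistic function p = 1 / (1 + e^(−x)). The function names and the two example subjects are hypothetical.

```python
import math

# Coefficients and cut-off as reported in the text.
INTERCEPT = -5.84
COEF_INS090 = 0.040  # per uU/mL of insulin at 90 min
COEF_BMI = 0.116     # per kg/m2 of body mass index
CUTOFF = 0.06

def linear_predictor(ins090, bmi):
    """The value x of the model for one subject."""
    return INTERCEPT + COEF_INS090 * ins090 + COEF_BMI * bmi

def probability(ins090, bmi):
    """Predicted probability, assuming Eq. (5) is the logistic function."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(ins090, bmi)))

def predict(ins090, bmi, cutoff=CUTOFF):
    """Classify one subject with the p > cut-off rule from the text."""
    return "develop" if probability(ins090, bmi) > cutoff else "not develop"

# Two hypothetical subjects (values invented for illustration).
for ins090, bmi in [(10.0, 22.0), (60.0, 32.0)]:
    print(ins090, bmi, round(probability(ins090, bmi), 3), predict(ins090, bmi))
```

Only BMI and a single 90 min insulin measurement are needed as inputs, which is the practical appeal of the model.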
From the cross-classification table, 29 subjects were correctly classified as diabetic (true-positive), 1 subject was falsely classified as diabetic (false-positive), 97 subjects were correctly classified as normal (true-negative), and 25 subjects were incorrectly classified as normal (false-negative). Using Eqs. (1) and (2), sensitivity and specificity were calculated to be 54% and 99%, respectively, for these outcomes. From Eqs. (3) and (4), positive predictive value (PPV) and negative predictive value (NPV) were calculated to be 96.7% and 79.5%, respectively. The area under the ROC curve was calculated to be 0.817. When the predictive model was applied to 250 bootstrap samples, the area under the ROC curve was 0.811 averaged over the test samples and 0.815 for the training samples. The bias in the area under the ROC curve was therefore 0.815 − 0.811 = 0.004.
A cross-classification table was also made using current knowledge about the development of diabetes in subjects with a family history of diabetes (both parents diabetic) as the risk factor. In this table, 14 subjects were classified as true-positive, 38 as false-positive, 33 as true-negative and 106 as false-negative. From these values, the sensitivity and specificity were calculated to be 11.6% and 46.4%, respectively. The positive predictive value (PPV) and negative predictive value (NPV) were calculated to be 26.9% and 23.7%, respectively.
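The arithmetic behind both classification tables can be checked with a short Python sketch. It assumes the standard definitions for Eqs. (1) to (4): sensitivity = TP/(TP + FN), specificity = TN/(TN + FP), PPV = TP/(TP + FP) and NPV = TN/(TN + FN). The counts are those reported in the text; the function name is hypothetical.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV and NPV from classification counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Counts reported for the logistic regression model and for the
# comparison based on current knowledge (family history only).
model = diagnostic_metrics(tp=29, fp=1, tn=97, fn=25)
baseline = diagnostic_metrics(tp=14, fp=38, tn=33, fn=106)

for name in ("sensitivity", "specificity", "ppv", "npv"):
    print(f"{name}: model {100 * model[name]:.1f}%, "
          f"baseline {100 * baseline[name]:.1f}%")
```

Under these definitions the reported percentages are reproduced to within rounding, and the ratios of model to baseline sensitivity and specificity come out a little above 4.5 and 2, respectively.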
Family history of diabetes provides valuable information because it represents the combination of inherited genetic susceptibilities and shared environmental and behavioral factors. The knowledge of family history can be crucial in the prevention, early detection, and treatment of type-2 diabetes. Studies have shown that type-2 diabetes can be prevented or delayed by adopting simple, healthy lifestyle changes, such as a healthy diet and exercise [29–31]. Obesity is defined as BMI ≥30 kg/m2, but in this analysis BMI was included as a continuous variable because the predictive value of a continuous variable is higher than that of a categorical variable. We used insulin and glucose concentrations obtained from an intravenous glucose tolerance test (IVGTT) based on the hypothesis that they may contribute to a better assessment of the risk of developing type-2 diabetes, a hypothesis that is supported by the results of this work. Since glucose and insulin concentrations obtained from the IVGTT in this study describe the subjects long before the conventional diagnosis of type-2 diabetes, association of these glucose and insulin concentrations with other risk factors provides a new opportunity to predict development of type-2 diabetes. In previous reports on non-diabetic relatives of subjects with type-2 diabetes, there has been controversy about the importance of insulin secretion in predicting development of diabetes [32–35]. Our results show that increased insulin levels are a predictor (p < 0.05) of the development of diabetes in subjects with a family history of diabetes (both parents type-2 diabetic). Our finding that an increased insulin level at 90 min is a good predictor (p < 0.05) of development of type-2 diabetes is an interesting result because insulin concentration at 90 min is part of the second phase of insulin secretion (from 20 to 120 min).
It has been proposed that it is a defect in insulin secretion in the first phase and not the second phase that is predictive of development of type-2 diabetes. Also, predictive factors related to first-phase insulin secretion, such as total insulin AUC after 5, 10 and 20 min of glucose bolus (including AUC over baseline at all three time points), did not predict the development of type-2 diabetes. Increase in BMI predicted the development of type-2 diabetes (p < 0.05). This is expected since obesity increases the risk of developing diabetes. Gender did not play any significant role in predicting the development of type-2 diabetes (p > 0.05).
Our final model states that in subjects with family history of diabetes, both higher BMI and increased plasma insulin levels at 90 min are predictors of diabetes. In validation testing, our model maintained good predicting consistency as evidenced by the value of the area under ROC curve and lower bias in area under the ROC curve calculated by bootstrap sampling.
The logistic regression model provided moderate sensitivity (54%) but very high specificity (99%). For our predictive model, the proportion of subjects who develop diabetes who are correctly predicted to do so (sensitivity) is important because we want to correctly identify subjects that will develop diabetes. The proportion of subjects who do not develop diabetes who are correctly predicted not to develop it (specificity) was also considered important. The rationale for not ignoring the need for high specificity is that once subjects know that they are less likely to develop diabetes, their lifestyle might become worse with respect to diet, exercise and other manageable risk factors, and this may put them at a higher risk of developing diabetes than they would face without this knowledge. Thus, our choice of the cut-off (classification) probability of 0.06 was based on high specificity (99%) combined with moderate sensitivity (54%).
A positive predictive value (PPV) of 96.7% means that among subjects predicted to develop diabetes, 96.7% actually had a diabetic outcome, and a negative predictive value (NPV) of 79.5% means that among subjects predicted not to develop diabetes, 79.5% actually had a normal (non-diabetic) outcome. Our logistic regression model is simple to use to predict the development of diabetes based on the probability value (p) and the x value obtained from the model. If p > 0.06 (or x > −2.8), then the subject is predicted to develop diabetes, and if p ≤ 0.06 (or x ≤ −2.8), then the subject is not predicted to develop diabetes.
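The correspondence between the two cut-offs follows from inverting the logistic function, x = ln(p / (1 − p)). A minimal Python check, assuming the model probability is the standard logistic transform of x (the helper names are hypothetical):

```python
import math

def logit(p):
    """Value of x that corresponds to probability p under a logistic model."""
    return math.log(p / (1.0 - p))

def inverse_logit(x):
    """Probability that corresponds to a given value of x."""
    return 1.0 / (1.0 + math.exp(-x))

x_cutoff = logit(0.06)         # close to the quoted x cut-off of -2.8
p_at_x = inverse_logit(-2.8)   # close to the quoted p cut-off of 0.06
print(round(x_cutoff, 2), round(p_at_x, 3))
```

Because the logistic function is monotonic, classifying on x or on p gives the same decision for every subject, up to the rounding of the two quoted cut-offs.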
Our model correctly identified a 4.5 times greater proportion (54% versus 11.6%) of subjects predicted to develop diabetes (true-positive test) and more than twice the proportion (99% versus 46.4%) of subjects predicted not to develop diabetes (true-negative test) compared with current knowledge about development of diabetes in subjects with a family history of diabetes (both parents diabetic). This substantial improvement in predictability is remarkable since it is based on a single blood sample. No model in the diabetes literature addresses individual risk assessment in subjects with a family history of type-2 diabetes using pharmacokinetic testing. This study is unique in proposing a predictive model that addresses the individualized risk in subjects with a family history of type-2 diabetes based on their BMI and insulin concentration at 90 min. The model is particularly attractive because it requires only a single blood sample at 90 min to determine the plasma insulin concentration. However, this sample comes from an IVGTT, which is less practical than an oral glucose tolerance test (OGTT). We would like to encourage future OGTT studies to explore our novel model to make individualized predictions of the development of type-2 diabetes. We believe that the individualized and more accurate risk assessment possible with our model may provide increased incentives to embark on a modified lifestyle (e.g., proper diet and exercise) to reduce the risk of developing type-2 diabetes, and should lead to closer attention to the detection of the disease to ensure early medical intervention.
We greatly appreciate the kind support of this study by Dr. James Warram, Professor Emeritus of Joslin Diabetes Center, Boston, MA, for making data available.