This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Objective To develop and validate a new diabetes risk algorithm (the QDScore) for estimating the 10 year risk of acquiring diagnosed type 2 diabetes in an ethnically and socioeconomically diverse population.
Design Prospective open cohort study using routinely collected data from 355 general practices in England and Wales to develop the score and from 176 separate practices to validate the score.
Participants 2540753 patients aged 25-79 in the derivation cohort, who contributed 16436135 person years of observation and of whom 78081 had an incident diagnosis of type 2 diabetes; 1232832 patients (7643037 person years) in the validation cohort, with 37535 incident cases of type 2 diabetes.
Outcome measures A Cox proportional hazards model was used to estimate effects of risk factors in the derivation cohort and to derive a risk equation in men and women. The predictive variables examined and included in the final model were self assigned ethnicity, age, sex, body mass index, smoking status, family history of diabetes, Townsend deprivation score, treated hypertension, cardiovascular disease, and current use of corticosteroids; the outcome of interest was incident diabetes recorded in general practice records. Measures of calibration and discrimination were calculated in the validation cohort.
Results A fourfold to fivefold variation in risk of type 2 diabetes existed between different ethnic groups. Compared with the white reference group, the adjusted hazard ratio was 4.07 (95% confidence interval 3.24 to 5.11) for Bangladeshi women, 4.53 (3.67 to 5.59) for Bangladeshi men, 2.15 (1.84 to 2.52) for Pakistani women, and 2.54 (2.20 to 2.93) for Pakistani men. Pakistani and Bangladeshi men had significantly higher hazard ratios than Indian men. Black African men and Chinese women had an increased risk compared with the corresponding white reference group. In the validation dataset, the model explained 51.53% (95% confidence interval 50.90 to 52.16) of the variation in women and 48.16% (47.52 to 48.80) of that in men. The risk score showed good discrimination, with a D statistic of 2.11 (95% confidence interval 2.08 to 2.14) in women and 1.97 (1.95 to 2.00) in men. The model was well calibrated.
Conclusions The QDScore is the first risk prediction algorithm to estimate the 10 year risk of diabetes on the basis of a prospective cohort study and including both social deprivation and ethnicity. The algorithm does not need laboratory tests and can be used in clinical settings and also by the public through a simple web calculator (www.qdscore.org).
The prevalence of type 2 diabetes and the burden of disease caused by it have increased very rapidly worldwide.1 This has been fuelled by ageing populations,2 poor diet,3 and the concurrent epidemic of obesity.4 5 The health and economic consequences of this diabetes epidemic are huge and rising.6 Strong evidence from randomised controlled trials shows that behavioural or pharmacological interventions can prevent type 2 diabetes in up to two thirds of high risk cases.7 8 9 10 Cost effectiveness modelling suggests that screening programmes aid earlier diagnosis and help to prevent type 2 diabetes or improve outcomes in people who develop the condition,11 12 making the prevention and early detection of diabetes an international public health priority.13 14 Early detection is important, as up to half of people with newly diagnosed type 2 diabetes have one or more complications at the time of diagnosis.15
Although several algorithms for predicting the risk of type 2 diabetes have been developed,16 17 18 19 no widely accepted diabetes risk prediction score has been developed and validated for use in routine clinical practice. Previous studies have been limited by size,16 and some have performed inadequately when tested in ethnically diverse populations.20 A new diabetes risk prediction tool with appropriate weightings for both social deprivation and ethnicity is needed given the prevalence of type 2 diabetes, particularly among minority ethnic communities, appreciable numbers of whom remain without a diagnosis for long periods of time.21 Such patients have an increased risk of avoidable morbidity and mortality.22
We present the derivation and validation of a new risk prediction algorithm for assessing the risk of developing type 2 diabetes among a very large and unselected population derived from family practice, with appropriate weightings for ethnicity and social deprivation. We designed the algorithm (the QDScore) so that it would be based on variables that are readily available in patients’ electronic health records or which patients themselves would be likely to know—that is, without needing laboratory tests or clinical measurements—thereby enabling it to be readily and cost effectively implemented in routine clinical practice and by national screening initiatives.
We did a prospective cohort study in a large population of primary care patients from version 19 of the QResearch database (www.qresearch.org). This is a large, validated primary care electronic database containing the health records of 11 million patients registered with 551 general practices using the Egton Medical Information System (EMIS) computer system. Practices and patients contained on the database are nationally representative for England and Wales and similar to those on other large national primary care databases using other clinical software systems.23
We included all QResearch practices in England and Wales once they had been using their current EMIS system for at least a year, so as to ensure completeness of recording of morbidity and prescribing data. We randomly allocated two thirds of practices to the derivation dataset and the remaining third to the validation dataset; we used the simple random sampling utility in Stata to assign practices to the derivation or validation cohort.
We identified an open cohort of patients aged 25-79 years at the study entry date, drawn from patients registered with eligible practices during the 15 years between 1 January 1993 and 31 March 2008. We used an open cohort design, rather than a closed cohort design, as this allows patients to enter the population throughout the whole study period rather than requiring registration on a fixed date; our cohort should thus reflect the realities of routine clinical practice. We excluded patients with a prior recorded diagnosis of diabetes (type 1 or 2), temporary residents, patients with interrupted periods of registration with the practice, and those who did not have a valid postcode related Townsend deprivation score (about 4% of the population).
For each patient, we determined an entry date to the cohort, which was the latest of their 25th birthday, their date of registration with the practice, the date on which the practice computer system was installed plus one year, and the beginning of the study period (1 January 1993). We included patients in the analysis once they had a minimum of one year’s complete data in their medical record.24 For each patient, we determined the right censor date, which was the earliest of the date of diagnosis of type 2 diabetes, date of death, date of deregistration with the practice, date of last upload of computerised data, or the study end date (31 March 2008).
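The entry and censoring rules above amount to taking the latest of several start dates and the earliest of several stop dates. The following is a minimal sketch of that logic, not the authors' code; the function and argument names are hypothetical, and missing events are represented as `None`.

```python
from datetime import date

# Study period as stated in the text.
STUDY_START = date(1993, 1, 1)
STUDY_END = date(2008, 3, 31)


def add_years(d, years):
    """Shift a date by whole years, mapping 29 February to 28 February if needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def entry_date(birth, registered, system_installed):
    """Latest of: 25th birthday, registration, system install plus one year, study start."""
    return max(add_years(birth, 25), registered,
               add_years(system_installed, 1), STUDY_START)


def censor_date(diagnosis, death, deregistered, last_upload):
    """Earliest of the recorded stop events and the study end date (None = did not occur)."""
    candidates = [d for d in (diagnosis, death, deregistered, last_upload, STUDY_END)
                  if d is not None]
    return min(candidates)


# Illustrative patient (made-up dates, not study data).
example_entry = entry_date(date(1970, 6, 15), date(1990, 1, 1), date(1996, 5, 1))
example_censor = censor_date(None, None, date(2005, 2, 1), date(2008, 1, 10))
```

In this made-up example the practice system date is the binding constraint on entry, and deregistration ends follow-up before the study end date.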
Our primary outcome measure was the first (incident) diagnosis of type 2 diabetes mellitus as recorded on the general practice computer records. We identified patients with diabetes by searching the electronic health record for a diagnosis Read code for diabetes (C10%). As in other studies, we classified patients as having type 1 diabetes if they had a diagnosis of diabetes and had been prescribed insulin under the age of 35 and classified the remaining patients as having type 2 diabetes.25
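The classification rule described above can be written as a small function. This is a sketch under stated assumptions: the argument names are hypothetical, and "prescribed insulin under the age of 35" is read as the age at first insulin prescription being below 35.

```python
def classify_diabetes(has_diabetes_code, age_at_first_insulin=None):
    """Return 'type 1', 'type 2', or None for a patient record.

    has_diabetes_code: True if a diabetes diagnosis code (C10%) is recorded.
    age_at_first_insulin: age in years at first insulin prescription, or None
    if insulin was never prescribed (assumed representation).
    """
    if not has_diabetes_code:
        return None
    if age_at_first_insulin is not None and age_at_first_insulin < 35:
        return "type 1"
    return "type 2"
```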
We examined the following variables for inclusion in our analysis, all of which are known or thought to affect risk of developing diabetes,16 17 18 19 26 27 28 29 and are also likely to be recorded in the patients’ electronic records as part of routine clinical practice: self assigned ethnicity (nine categories); age at study entry (in single years); body mass index (continuous); smoking status (current smoker, not a current smoker); Townsend deprivation score (2001 census data evaluated at output areas as a continuous variable) ranging from −6 in the most affluent to 11 in the most deprived; recorded family history of diabetes in a first degree relative (binary variable yes/no); diagnosis of cardiovascular disease at baseline (binary variable yes/no); treated hypertension at baseline—that is, diagnosis of hypertension plus more than two prescriptions for antihypertensive drugs (binary variable yes/no); systemic corticosteroids at baseline—that is, at least two prescriptions within the preceding six months (binary variable yes/no).
We restricted all values of these variables to those that had been recorded in the person’s electronic healthcare record before the diagnosis of type 2 diabetes (or before censoring for those who did not develop type 2 diabetes). We used Read codes for ethnicity to denote self assigned ethnicity. The Read classification is the coding system in use in general practice in England and Wales (ICD-10 is the equivalent coding system in use in hospitals). We grouped the codes into the English National Health Service standard 16+1 categories for the initial descriptive analysis. We then combined these 16+1 categories into the final nine reporting groups, thereby ensuring sufficient numbers of events in each group to enable a meaningful analysis. The “white or not recorded” category comprised British, Irish, and other white background, as well as those whose ethnicity was not recorded. We designated this as the reference category. We combined the group for whom ethnicity was not recorded with the white ethnic group because, assuming the study population is comparable to the United Kingdom population, 93% or more of people without ethnicity recorded would be expected to be from a white ethnic group. The “other including mixed” category comprised “white and black Caribbean,” “white and black African,” “white and Asian,” “other mixed,” “other black,” and “other ethnic group.” The “other Asian” category included Read codes for East African Asian, Indo-Caribbean, Punjabi, Kashmiri, Sri Lankan, Tamil, Sinhalese, Caribbean Asian, British Asian, mixed Asian, or Asian unspecified.
For body mass index and smoking status, we used the values recorded closest to the study entry date. We used body mass index rather than waist circumference, as the latter is not well recorded on clinical computer systems in the UK.
We calculated crude incidence rates of type 2 diabetes according to age, ethnic group, and deprivation in fifths. We then directly age standardised the incidence rates by ethnic group and deprivation by using the age distribution in five year bands of the entire derivation cohort as the standard population. We also used the same method to age standardise the means of continuous variables and proportions with risk factors by ethnic group.
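Direct age standardisation, as described above, weights each age band's rate in a subgroup by that band's share of a standard population. The sketch below illustrates the calculation with made-up numbers; the function name, the two age bands, and the weights are assumptions for illustration only, not values from the study.

```python
def directly_standardised_rate(events, person_years, standard_weights, per=1000):
    """Weighted average of age-specific rates, using standard population weights.

    events, person_years: dicts keyed by age band for the subgroup of interest.
    standard_weights: dict keyed by age band giving the standard population's
    size or share in each band (normalised internally).
    """
    total_weight = sum(standard_weights.values())
    rate = 0.0
    for band, weight in standard_weights.items():
        band_rate = events[band] / person_years[band]
        rate += (weight / total_weight) * band_rate
    return rate * per


# Illustrative data only: two age bands, equal person years, unequal weights.
events = {"25-29": 2, "30-34": 6}
person_years = {"25-29": 1000.0, "30-34": 1000.0}
weights = {"25-29": 0.25, "30-34": 0.75}
standardised = directly_standardised_rate(events, person_years, weights)
```

Here the crude rate would be 4 per 1000 person years, whereas the standardised rate is pulled towards the older band because the standard population places more weight there.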
We used a Cox proportional hazards model in the derivation dataset to estimate the coefficients and hazard ratios associated with each potential risk factor for the first ever recorded diagnosis of diabetes for men and women separately. As in a previous study,30 we used the Bayes information criterion to compare models.31 This is a likelihood measure in which lower values indicate better fit and in which a penalty is paid for increasing the number of variables in the model. We used fractional polynomials to model non-linear risk relations with continuous variables where appropriate.32 33 We tested for interactions between each variable and age and between smoking and deprivation and included significant interactions in the final model. Continuous variables were centred for analysis.
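For reference, the Bayes information criterion is usually written as follows. The symbols are not defined in the text and are introduced here for illustration: $\hat{L}$ is the maximised (partial) likelihood, $k$ the number of estimated parameters, and $n$ the effective sample size. For survival models $n$ is sometimes taken as the number of events rather than the number of patients, and the text does not state which convention was used.

```latex
\mathrm{BIC} = -2 \ln \hat{L} + k \ln n
```

The first term rewards fit and the second penalises additional variables, so a model with more terms has a lower value only if the gain in likelihood outweighs the penalty.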
We used multiple imputation to replace missing values for smoking status and body mass index, and we used these values in our main analyses. We fitted our final model on the basis of multiply imputed datasets by using Rubin’s rules to combine estimates of effects and standard errors of estimates to allow for the uncertainty caused by missing data.34 Multiple imputation is a statistical technique designed to reduce the biases that can occur in “complete case” analysis along with a substantial loss of power and precision.35 36 37 The imputation technique involves creating multiple copies of the data and replaces missing values with imputed values on the basis of a suitable random sample from their predicted distribution. Multiple imputation therefore allows patients with incomplete data to still be included in analyses, thereby making full use of all the available data, and thus increasing power and precision, but without compromising validity.38 We used the ICE procedure in Stata to obtain five imputed datasets (further details are available from the corresponding author).39
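Rubin's rules combine the per-dataset estimates by averaging them and by adding the within-imputation and between-imputation variances. The following is a minimal sketch for a single coefficient; the function name and the five example values are illustrative assumptions, not results from the study.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, standard_errors):
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(se ** 2 for se in standard_errors)  # average sampling variance
    between = variance(estimates)                     # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)


# Illustrative log hazard ratios from five imputed datasets (made-up values).
pooled_estimate, pooled_se = pool_rubin([1.0, 1.2, 0.8, 1.1, 0.9], [0.1] * 5)
```

The pooled standard error here (0.2) is twice the per-dataset standard error (0.1), which shows how disagreement between imputed datasets widens the uncertainty.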
We took the regression coefficient (that is, the log of the hazard ratio) for each variable from the final model and used these as weights for the new disease risk equations for type 2 diabetes. We combined these weights with the baseline survivor function for diagnosis of diabetes evaluated at 10 years and centred on the means of continuous risk factors to derive a risk equation for 10 years’ follow-up. We have presented the Townsend coefficients in standard deviation units so that this can be applied in a non-UK setting where other indices of deprivation might apply.
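Under a Cox model, combining coefficients with a baseline survivor function gives a predicted risk of the form 1 − S0(10)^exp(Σ β(x − mean)). The sketch below shows that general form only. The coefficient values, the baseline survival of 0.98, and the variable names are hypothetical and are not the published QDScore weights; the actual model also includes fractional polynomial and interaction terms that are omitted here.

```python
from math import exp, log


def ten_year_risk(baseline_survival_10y, coefficients, values, means):
    """Predicted 10 year risk from a Cox model with centred covariates.

    coefficients: log hazard ratios keyed by variable name.
    values: the patient's covariate values.
    means: centring values for continuous variables (0 assumed if absent).
    """
    linear_predictor = sum(
        coefficients[name] * (values[name] - means.get(name, 0.0))
        for name in coefficients
    )
    return 1.0 - baseline_survival_10y ** exp(linear_predictor)


# Hypothetical weights: a hazard ratio of 2 for family history, 0.1 per BMI unit.
coefficients = {"family_history": log(2.0), "bmi": 0.1}
means = {"bmi": 26.0}

risk_reference = ten_year_risk(0.98, coefficients,
                               {"family_history": 0, "bmi": 26.0}, means)
risk_family_history = ten_year_risk(0.98, coefficients,
                                    {"family_history": 1, "bmi": 26.0}, means)
```

A patient at the centring values has a risk of 1 − S0(10); doubling the hazard squares the baseline survival, so the risk rises from 2% to about 3.96% in this made-up example.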
We compared our final model (model A) with three other models in order to determine the additional contribution to the fit (using the Bayes information criterion in which lower values indicate better fit) and performance of the model of including both ethnicity and deprivation in the algorithm. Our first supplementary model (model B) included all the variables except for deprivation and ethnicity, the second model (model C) included deprivation but not ethnicity, and the third (model D) included ethnicity but not deprivation.
We tested the performance of the final algorithm (the QDScore) in the validation dataset. We calculated the 10 year estimated risk of acquiring type 2 diabetes for each patient in the validation dataset by using multiple imputations to replace missing values for smoking status and body mass index, as in the derivation dataset.
We calculated the mean predicted risk and the observed risk of diabetes at 10 years and compared these by 10th of predicted risk. The observed risk at 10 years was obtained by using the 10 year Kaplan-Meier estimate. We calculated the Brier score (a measure of goodness of fit where lower values indicate better accuracy40) by using the censoring adjusted version adapted for survival data,41 the D statistic (a measure of discrimination where higher values indicate better discrimination),42 and an R² statistic (a measure of explained variation for survival data, where higher values indicate that more variation is explained).43 We also calculated the area under the receiver operating characteristic curve, where higher values indicate better discrimination. Finally, we compared the performance of the QDScore with the Cambridge risk score,16 which includes age, sex, body mass index, smoking status, corticosteroids, antihypertensive treatment, and family history of diabetes.
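The observed risk at 10 years is one minus the Kaplan-Meier survival estimate at that time, which accounts for patients censored before 10 years. The following plain-Python sketch shows the estimator on a toy dataset; the function name and the five follow-up records are illustrative assumptions, and a real analysis would normally use a statistical package.

```python
def kaplan_meier_risk(times, events, horizon):
    """Return 1 - S(horizon), where S is the Kaplan-Meier survival estimate.

    times: follow-up time for each patient.
    events: 1 if the patient was diagnosed at that time, 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    i = 0
    while i < len(data) and data[i][0] <= horizon:
        t = data[i][0]
        diagnosed = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            diagnosed += data[i][1]
            leaving += 1
            i += 1
        if diagnosed:
            survival *= 1 - diagnosed / at_risk
        at_risk -= leaving
    return 1 - survival


# Toy data: diagnoses at years 2 and 6, one censoring at year 4, two at year 12.
observed_risk = kaplan_meier_risk([2, 4, 6, 12, 12], [1, 0, 1, 0, 0], horizon=10)
```

In this toy example two of five patients are diagnosed, but the estimated 10 year risk is about 46.7% rather than 40%, because the patient censored at year 4 no longer counts in the denominator at year 6.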
We calculated the proportion of patients in the validation sample who had an estimated 10 year risk of diagnosed diabetes of ≥10%, ≥15%, ≥20%, ≥30%, ≥40%, and ≥50% by age, sex, ethnic group, and deprivation according to the QDScore.
We used all the available data on the QResearch database and therefore did not do a pre-study sample size calculation. We used Stata (version 10) for all analyses and chose a significance level of 0.01 (two tailed).
Overall, 531 UK practices met our inclusion criteria, of which 355 were randomly assigned to the derivation dataset and 176 to the validation dataset. We excluded 20 practices: four practices had not completely uploaded all their electronic data for the relevant study period, seven practices were from Scotland, and nine practices were from Northern Ireland.
The derivation cohort contained 2594578 patients, of whom 53825 had type 1 or type 2 diabetes before the start of the study and were therefore excluded leaving 2540753 patients (1283135; 50.50% women) aged 25-79 years and free of diabetes at baseline for analysis. The validation cohort contained 1261419 patients aged 25-79, of whom 28587 had a previous diagnosis of type 1 or type 2 diabetes leaving 1232832 patients for analysis (50.49% women).
Overall, we studied 3773585 patients contributing 24079172 person years, of whom 115616 patients (78081 in the derivation cohort and 37535 in the validation cohort) had a new diagnosis of type 2 diabetes during follow-up. Table 1 compares the characteristics of eligible patients in the derivation and validation cohorts. Although this validation cohort was drawn from an independent group of practices, the baseline characteristics were very similar to those for the derivation cohort. Overall, 898461 patients (23.81% of 3773585) had ethnicity recorded, and 122736 (13.66%) of these were from a non-white ethnic group. Practices in areas where the proportion of patients from a non-white ethnic group is higher according to the 2001 census (such as London (28.9%), East Midlands (6.5%), and West Midlands (11.3%)) also have higher rates of completeness of recording of ethnicity on the QResearch database (40.1%, 21.4%, and 30.1% for the above areas).
Table 1 shows that 78.97% of women in the derivation cohort had body mass index recorded and 90.00% had smoking status recorded; 78.03% had both body mass index and smoking status recorded. For men, the corresponding figures were 71.19%, 83.24%, and 70.12%. Overall, 22.97% of women and 29.88% of men had either smoking or body mass index imputed by multiple imputation (data were not imputed for ethnicity; all patients with missing ethnicity were treated as white/not recorded). Similar figures were observed for men and women in the validation cohort, where multiple imputation was also used.
Table 2 shows the characteristics of men and women with complete data for smoking and body mass index compared with those who had missing data. Women with missing data had different patterns of risk factors. For example, women with complete data for body mass index were more likely to have a family history of diabetes, to be recorded as current smokers, and to have treated hypertension. They also had a lower 10 year observed risk of diabetes compared with women with missing body mass index data. Women with complete data for smoking were more likely to have a diagnosis of cardiovascular disease, a diagnosis of treated hypertension, and a family history of diabetes. The 10 year observed risk of diabetes was lower than for women whose smoking status was missing. The pattern was similar for men for most risk factors, except that the observed risks of diabetes were lower among men with missing data.
Table 3 shows the crude and age standardised rates of type 2 diabetes by sex, deprivation, and ethnicity in the derivation cohort. The age standardised rates for the white reference group were 4.13 (95% confidence interval 4.08 to 4.17) per 1000 person years for women and 5.31 (5.26 to 5.36) per 1000 person years for men. The crude and age standardised incidence rates of type 2 diabetes in the derivation cohort varied widely between ethnic groups, as shown in table 3. Age standardised rates were significantly higher for men in every ethnic group compared with the white reference group, except for Chinese men. In women, age standardised incidence rates were higher for every group compared with the white reference group. The highest age standardised rates were in South Asians, and significant differences existed between the South Asian groups. For example, the rate for Bangladeshi women was 18.20 (12.93 to 23.47) per 1000 person years and that for Bangladeshi men was 19.34 (14.28 to 24.40) per 1000 person years. For Pakistanis, the corresponding rates per 1000 person years were 11.19 (9.16 to 13.21) for women and 13.22 (11.24 to 15.21) for men.
We also found a marked difference in the age standardised incidence rates of type 2 diabetes by deprivation, with a more than twofold difference for women when comparing the most deprived fifth (6.39 (6.25 to 6.54) per 1000 person years) with the most affluent fifth (3.00 (2.93 to 3.08) per 1000 person years). A similar, but less steep gradient was seen for men. The rates seen in the validation cohort were similar to those for the derivation cohort (data not shown).
Table 4 shows the age standardised distribution of risk factors across each of the main ethnic groups. Substantial heterogeneity exists across the ethnic groups for risk factors, and the distribution also differs between men and women within ethnic groups. The notable results include substantial differences in the age standardised prevalence of smoking among men of Bangladeshi (46.04%, 95% confidence interval 43.16% to 48.92%), Caribbean (40.45%, 38.99% to 41.91%), Pakistani (32.82%, 31.29% to 34.35%), white/not recorded (33.49%, 33.40% to 33.58%), Chinese (26.63%, 24.23% to 29.03%), Indian (22.71%, 21.60% to 23.81%), and black African (17.95%, 16.76% to 19.14%) origin. Smoking rates were lower for women in each ethnic group compared with men but varied widely between women from different groups.
Treated hypertension was highest among black Caribbean and black African men and women and more than twice as high as that for the white reference group. Recorded family history of diabetes was highest among black Caribbean women (32.63%, 31.41% to 33.85%) and Indian men (29.95%, 28.78% to 31.11%), which was more than three times that for the white reference group who had the lowest rates (11.32%, 11.27% to 11.38% for women and 8.07%, 8.02% to 8.12% for men).
Bangladeshi men and women had the highest age standardised mean deprivation scores, followed by those of black African and black Caribbean origin. Indians and the white reference group had the lowest mean deprivation scores, as shown in table 4.
The highest mean body mass index was seen among black African women (age standardised mean 28.44, 28.29 to 28.58) compared with 25.47 (25.46 to 25.48) for women in the white reference group. The lowest value was in Chinese women (age standardised mean 22.87, 22.68 to 23.06). Similar patterns, although slightly less marked, were seen for men across the ethnic groups. Finally, 9.70% (7.76% to 11.65%) of Bangladeshi men had a recorded diagnosis of cardiovascular disease at baseline, which was more than twice that for men in the white reference group (4.54%, 4.50% to 4.57%) and more than four times that found in Chinese men (2.26%, 1.15% to 3.37%).
Table 5 shows the results of the Cox regression analysis for the QDScore. After adjustment for all other variables in the model, we found significant associations with risk of type 2 diabetes in both men and women for age, body mass index, family history of diabetes, smoking status, treated hypertension, use of corticosteroids, diagnosed cardiovascular disease, social deprivation, and ethnicity. We therefore included these variables in the final model and risk prediction algorithm.
We found significant heterogeneity of risk of type 2 diabetes by ethnic group compared with the white reference population, having adjusted for age, body mass index, deprivation, family history of diabetes, smoking status, treated hypertension, use of corticosteroids, and diagnosed cardiovascular disease, as shown in table 4. For example, among Bangladeshis, the adjusted hazard ratio for women was 4.07 (95% confidence interval 3.24 to 5.11) and that for men was 4.53 (3.67 to 5.59). These were significantly higher than the increased hazard ratios in Pakistani women and men (2.15, 1.84 to 2.52; and 2.54, 2.20 to 2.93). Both Pakistani and Bangladeshi men had significantly higher hazard ratios than Indian men. Black African men and Chinese women had increased risks compared with the corresponding white reference group. The only groups to have significantly lower risks than the white reference group were black African women (0.81, 0.66 to 0.98) and black Caribbean women (0.80, 0.70 to 0.92).
The fractional polynomial terms selected for inclusion in the model were as follows. For age in women, the two terms were $(\text{age}/10)^{1/2}$ and $(\text{age}/10)^{3}$. For body mass index in women, the two terms were $(\text{bmi}/10)$ and $(\text{bmi}/10)^{3}$. For men, the two age terms were $\log(\text{age}/10)$ and $(\text{age}/10)^{3}$, and the two terms for body mass index were $(\text{bmi}/10)^{2}$ and $(\text{bmi}/10)^{3}$. Figure 1 shows the estimated adjusted hazard ratios by age and body mass index for these fractional polynomial terms in men and women.
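The transformed covariates described above can be computed directly from age and body mass index. The sketch below is illustrative: the function and key names are hypothetical, the natural logarithm is assumed for the log term, and the centring applied before model fitting is omitted.

```python
from math import log, sqrt


def fp_terms_women(age, bmi):
    """Fractional polynomial terms for women: powers 1/2 and 3 for age, 1 and 3 for BMI."""
    a, b = age / 10, bmi / 10
    return {"age_1": sqrt(a), "age_2": a ** 3, "bmi_1": b, "bmi_2": b ** 3}


def fp_terms_men(age, bmi):
    """Fractional polynomial terms for men: log and power 3 for age, powers 2 and 3 for BMI."""
    a, b = age / 10, bmi / 10
    return {"age_1": log(a), "age_2": a ** 3, "bmi_1": b ** 2, "bmi_2": b ** 3}


# Example: a 40 year old with a body mass index of 20.
women_terms = fp_terms_women(40, 20)
men_terms = fp_terms_men(40, 20)
```

Each term would then be multiplied by its own regression coefficient, which lets the fitted hazard ratio curve bend with age and body mass index rather than follow a straight line on the log scale.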
We identified significant interactions between age and body mass index, age and family history of diabetes, and age and smoking status. We therefore included these interactions in the final model, and the general direction of the effects was that body mass index and family history of diabetes tended to have a greater impact on risk of diabetes at younger ages, as shown in fig 1. Smoking had a more complex relation with age; the risk peaked in middle age for both men and women.
In a comparison of models, the median Bayes information criterion for women for our final model (model A) was 875203, for the model without deprivation and ethnicity (model B) it was 876400, for the model without ethnicity (model C) it was 875270, and for the model without deprivation (model D) it was 876198, indicating that the model that incorporated both ethnicity and deprivation was superior to the other three. For men, the corresponding figures were 1086755, 1087745, 1087034, and 1087369, similarly supporting the inclusion of both ethnicity and deprivation into the final model.
Table 6 shows the results for the validation statistics for men and women after application of the QDScore and the Cambridge risk score in the validation dataset. The QDScore shows higher levels of discrimination than the Cambridge risk score. For example, in women the D statistic for the QDScore was 2.11 (95% confidence interval 2.08 to 2.14) compared with 1.88 (1.85 to 1.91) with the Cambridge risk score; a 0.1 difference in the D statistic indicates an important difference in prognostic separation between two risk algorithms.41 The QDScore explained a higher proportion of the variation: it explained 51.53% of the variation in women and 48.16% of that in men. The corresponding values for the Cambridge risk score were 45.77% and 41.82%. The Brier score, however, was slightly lower for the Cambridge risk score in both men and women.
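The D statistic and the explained variation are closely linked. Assuming the R² reported here is the Royston and Sauerbrei form derived from D (the text does not give the formula), it can be computed as (D²/κ²)/(π²/6 + D²/κ²) with κ² = 8/π. The sketch below applies that assumed formula to the rounded D values quoted above; the function name is hypothetical.

```python
from math import pi


def r_squared_from_d(d):
    """Explained variation implied by a D statistic (Royston and Sauerbrei form, assumed)."""
    kappa_squared = 8 / pi
    sigma_squared = pi ** 2 / 6
    scaled = d ** 2 / kappa_squared
    return scaled / (sigma_squared + scaled)


r2_women = r_squared_from_d(2.11)
r2_men = r_squared_from_d(1.97)
```

With these inputs the function gives roughly 0.515 for women and 0.481 for men, which agrees with the reported 51.53% and 48.16% to within the rounding of D to two decimal places.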
Figure 2 compares the mean predicted scores from the QDScore with the observed risks at 10 years within each 10th of predicted risk in order to assess the calibration of the model in the validation sample. The close correspondence between predicted and observed 10 year risks within each model 10th suggests that the model was well calibrated. For example, in the top 10th of risk, the mean predicted risk was 18.31% (95% confidence interval 18.24% to 18.38%) in women and the observed risk was 18.82% (18.39% to 19.26%). The ratio of predicted to observed risk in this 10th was 0.97, indicating almost perfect calibration (a ratio of 1 indicates perfect calibration, that is, no under-prediction or over-prediction). We found similar results for men, with a ratio of 0.99 in the top 10th of predicted risk.
Table 7 shows the percentages of men and women in the validation dataset with a 10 year predicted risk of being diagnosed as having type 2 diabetes according to a range of thresholds and by age band. For example, at the 10% threshold, 10.60% of women and 15.06% of men had a 10% or higher predicted risk of being diagnosed as having type 2 diabetes over 10 years. This varied markedly by age such that 21.43% of women aged 55-59 and 30.99% of women aged 65-69 had a 10% or greater risk of being diagnosed as having type 2 diabetes over 10 years. The corresponding figures for men were 33.28% and 44.08%.
Tables 8 and 9 show the 10 year risk of type 2 diabetes among men and women of different ethnic groups and for those living in the most deprived and affluent areas. For example, 33.83% of Bangladeshi women had a 10 year risk of being diagnosed as having diabetes of 10% or more compared with 10.48% of women in the white reference group, and 15.03% of women in the most deprived fifth had a 10% or higher risk of developing diabetes over the next 10 years compared with 6.52% of women in the most affluent fifth. The difference between affluent and deprived fifths is more marked for women than for men; the corresponding figures are 15.65% for men in the most deprived fifth and 13.21% for men in the most affluent fifth.
Overall, almost half (15545/32450; 47.9%) of cases of diabetes occurred in the top 10th of the distribution (risk of ≥10.38%) and almost 70% (22476/32450) occurred in the top fifth (risk of ≥5.98%).
The QDScore is the first diabetes prediction algorithm developed and validated by using routinely collected data to predict the 10 year risk of developing type 2 diabetes. Our final model includes both deprivation and ethnicity as well as age, sex, smoking, treated hypertension, body mass index, family history of diabetes, current treatment with corticosteroids, and previous diagnosis of cardiovascular disease. The QDScore does not require any laboratory testing or clinical measurements and so can be used in many settings, including by individual members of the public who have access to a computer. This risk prediction tool might be used to identify and proactively intervene in people identified as having an increased risk. This algorithm, like other algorithms that predict cardiovascular disease,30 44 relies on routinely collected data and has the advantage that it is readily implementable. Furthermore, it is likely to reduce, rather than exacerbate, widespread and persistent health inequalities. The QDScore performed well compared with the Cambridge risk score. Assuming that the effectiveness and cost effectiveness of suitable interventions shown in randomised controlled trials extend to unselected patients from primary care,7 8 9 10 the QDScore could be used to identify patients at increased risk of diabetes who might benefit from interventions to reduce their risk.
The traditional method for identifying patients at increased risk of type 2 diabetes has involved the detection of impaired glucose tolerance requiring an inconvenient and expensive oral glucose tolerance test. Targeted screening of higher risk groups has been proposed as a more cost effective solution,45 as the risk factors for diabetes and cardiovascular outcomes overlap considerably.46 Less expensive and more practical methods of identifying patients at increased risk are needed; these should ideally be based on models developed from contemporaneous data in ethnically and socioeconomically diverse populations obtained from the clinical setting in which these models will subsequently be applied. Simple clinical models using readily available data can offer similar discrimination to more complex models using laboratory data or biomarkers,17 and clinical models that do not need clinical measurements may have a further utility in settings where clinical measurements are not available or are too costly to collect.47 UK datasets derived from family practices have the advantage of having large and broadly representative populations with historical data tracking back well over a decade in most practices. These databases also contain data on many of the key variables known to be associated with risk of type 2 diabetes, such as age, sex, ethnicity,28 48 49 smoking,16 24 50 body mass index,16 17 28 48 family history of diabetes,16 17 28 48 49 treated hypertension,16 17 current use of corticosteroids,16 and social deprivation.51 Deprivation is not only strongly associated with increased prevalence of diabetes and diabetes related risk factors such as diet, obesity, and smoking but is also associated with poorer outcomes and intermediate measures such as achievement of lipid targets.51
Particular strengths of our study are the use of a large representative population from a validated database, our prospective cohort design, and the substantial numbers of patients with self assigned ethnicity for use in the analysis. We have modelled interactions with age and included these in the final model, so our algorithm takes account of the differential effect of three key variables (family history of diabetes, body mass index, and smoking) at different ages.
Another important strength of the QDScore is that all the variables used in the algorithm will either be known to an individual patient or are collected as part of routine clinical practice and recorded within an individual patient’s primary healthcare record in most economically developed countries. This means that the algorithm can be used by patients for self assessment in a web based calculator (www.qdscore.org) similar to the one available for self assessment of cardiovascular disease (www.qrisk.org). Alternatively, it can be implemented within clinical computer systems used in primary care and be used to stratify the practice population (aged 25-79) for risk on a continuing basis without the need for manual entry of data. Although no widely agreed thresholds for classification of patients at “high risk” exist, the QDScore could act as a basis for a systematic programme to identify patients at increased risk for intervention or to aid earlier diagnosis. Importantly, appropriate weighting for ethnicity and social deprivation should help to avoid widening the health inequalities associated with the introduction of systematic programmes of disease prevention activities.
Despite its strengths, our study has limitations compared with an ideal study. In the ideal study, a large representative and tightly phenotyped primary care cohort would be assembled and followed longitudinally over the course of a decade. No patients would be lost to follow-up, and all patients would be subjected to repeated oral glucose tolerance tests throughout follow-up to confirm or refute the diagnosis of type 2 diabetes. Although such a study would be very welcome, it would take at least 15 years to carry out and report, it would be unlikely to be feasible in routine primary care, and calibration would be inaccurate in more socially and ethnically diverse populations with different baseline risks. Our study offers a practical alternative approach, which can be implemented into primary care in a cost effective manner, while acknowledging the potential biases and their likely impact.
One limitation of our study is that the main outcome was type 2 diabetes diagnosed by a clinician and recorded on the clinical computer system. The outcome was not formally validated, and we have not used the results of laboratory tests to confirm the diagnosis. However, this diagnosis would be unlikely to be recorded if the patient did not have diabetes—other studies of similar databases have shown good levels of accuracy for common chronic conditions, especially those that are now included in the UK quality and outcomes framework.52
Undiagnosed diabetes is a well recognised problem and is not specifically considered by our study. It is estimated to affect approximately 3% of the population according to the health survey for England.53 Some evidence suggests that South Asian women are more likely to have undiagnosed diabetes than are the general population, so our hazard ratios might underestimate the association for these patients.54 The risk factors for diagnosed diabetes are very similar to those for undiagnosed diabetes.55 Nevertheless, most previously undiagnosed cases are likely to have been included in the identified high risk groups and to have been picked up by systematic further evaluation, because risk stratification improves yield.55
Our study might have been affected by recording bias if a patient diagnosed as having diabetes was not recorded as having diabetes on the practice computer system. The recording bias could lead to misclassification of patients either at baseline or at follow-up and is part of the justification for having a targeted approach. Any misclassification bias of the outcome, if non-differential, would tend to bias the hazard ratio towards one and reduce discrimination.
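The direction of this bias can be illustrated with a small worked example. The sketch is not part of the study's analysis: it uses simple risk ratios rather than hazard ratios, and the true risks, sensitivity, and specificity are made-up values chosen only for illustration. It applies the same imperfect recording to both groups (non-differential misclassification) and shows the observed ratio moving towards one.

```python
def observed_risk(true_risk, sensitivity, specificity):
    """Risk that would be recorded given imperfect recording of the outcome."""
    true_positives = sensitivity * true_risk
    false_positives = (1 - specificity) * (1 - true_risk)
    return true_positives + false_positives


def observed_risk_ratio(risk_exposed, risk_unexposed, sensitivity, specificity):
    """Risk ratio after applying the same recording errors to both groups."""
    return (observed_risk(risk_exposed, sensitivity, specificity)
            / observed_risk(risk_unexposed, sensitivity, specificity))


# Hypothetical values: true risks of 8% and 2% (true risk ratio 4.0),
# 90% of true cases recorded, 1% of non-cases wrongly recorded as cases.
true_ratio = 0.08 / 0.02
biased_ratio = observed_risk_ratio(0.08, 0.02, sensitivity=0.90, specificity=0.99)
print(f"true ratio {true_ratio:.2f}, observed ratio {biased_ratio:.2f}")
```

In this simplified setting the attenuation comes from the false positives. If specificity were perfect, under-recording alone would scale both risks by the same factor and leave the ratio unchanged.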
Recording of a positive family history of diabetes was higher among women than among men. This could reflect recording bias or information biases resulting from differences in family history among women or greater opportunity for the information to be recorded as women tend to have higher consultation rates than men. Our study might have been affected by an ascertainment bias caused by differential testing of patients for diabetes by ethnic group or in those with specific risk factors. This could lead to increased rates of detection among patients with specific risk factors, including South Asian ethnicity, a family history of diabetes, or obesity—increased awareness among patients and clinicians might increase the likelihood of testing and therefore of clinical diagnosis. The effect of this would be to increase the apparent strength of the association between the risk factors and incident diabetes. Nonetheless, our hazard ratios for the risk factors in the model are generally of a similar magnitude to those found in other studies which tested for diabetes in the entire study cohort.56 In addition, the assessment and recording of these factors in clinical practice is becoming increasingly routine and complete, so limiting the effect of this potential bias.
Another potential limitation of our study is that 25% of patients had missing values for either body mass index or smoking status. Patients with complete data tended to have different risks than those with missing data. We therefore used the technique of multiple imputation to substitute missing values for smoking and body mass index, rather than excluding these patients, as this is a less biased approach that makes the most efficient use of available data. The differences in risk factors and in the observed risks of diabetes between patients with and without missing data support the use of multiple imputation rather than a complete case analysis.
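Results from multiply imputed datasets are usually combined with Rubin's rules. This section does not describe the pooling step, so the following is only a minimal sketch of that standard approach under the assumption that it applies here; the function name and the example estimates are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine one coefficient across m imputed datasets with Rubin's rules.

    estimates: the coefficient (for example a log hazard ratio) from each dataset
    variances: its squared standard error from each dataset
    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching estimates and variances from at least 2 imputations")
    pooled = mean(estimates)
    within = mean(variances)        # average sampling variance
    between = variance(estimates)   # sample variance across imputations (n - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Made-up log hazard ratios and squared standard errors from five imputed datasets.
log_hrs = [0.70, 0.74, 0.69, 0.72, 0.75]
ses_squared = [0.0025, 0.0024, 0.0026, 0.0025, 0.0025]
pooled_log_hr, total_variance = pool_rubin(log_hrs, ses_squared)
print(pooled_log_hr, total_variance ** 0.5)
```

The between-imputation term widens the confidence interval to reflect uncertainty about the missing values, which a single imputation would ignore.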
Clinicians had recorded our predictor variables on the clinical computer system before the diagnosis of type 2 diabetes, so these will not have been subject to recall bias. We have used the entire population registered with the QResearch practices contributing to the database from England and Wales. Consequently, the population is unlikely to be affected by selection bias, in contrast to the selection bias that inevitably occurs when patients are individually recruited to clinical cohorts or clinical trials.57 We have included a proxy measure of material deprivation, the Townsend score,58 which is based on the patient’s postcode at the level of the output area (corresponding to around 150 households) and is a composite score comprising lack of a car, unemployment, over-crowding, and non-home ownership. Some people living within an output area will not be typical of the other residents, resulting in some misclassification. Deprivation is likely to be associated with other factors known to increase risk of diabetes, such as poor diet, lack of exercise, and increased alcohol intake, and so will account at least in part for some of the effect of these factors.51 Lastly, we included treated hypertension as a predictor as both blood pressure and some antihypertensive drugs (such as thiazides) may have contributed to the increased hazard ratios associated with this variable.
We used self assigned ethnicity in our analyses, as reported by patients to their general practices, which has advantages over analyses in which ethnicity is assigned by an informant rather than the patient, is imputed geographically, or is related to country of birth. The last of these is particularly problematic as increasing numbers of people from minority ethnic groups are now being born in the UK. We have also been able to disaggregate the South Asian groups and report on them separately, which answers concerns with studies that tend to combine them into one group when they differ in risk factor exposure, disease rates, and outcomes. One important limitation is that only one quarter of patients overall had self assigned ethnicity recorded. Among those with a recorded value, 13.66% were recorded as from a minority ethnic group, which is higher than the estimated figure for 2006 based on the 2001 census. This may indicate over-representation of practices from ethnically diverse areas, more complete recording of ethnicity by practices in such areas, or, most likely, a combination of both. We have assumed that where patients have self assigned ethnicity recorded (as Bangladeshi, for example) this is accurate and the patient was indeed Bangladeshi. Where patients did not have ethnicity recorded, we have assumed they were white. Any misclassification arising from these assumptions is most likely to affect the reference category of “white or not recorded,” but because of the mix of the populations of England and Wales, less than 7% of such patients are likely to be from a non-white ethnic group. This misclassification error is likely to be non-differential and if so will tend to underestimate the relative effect of ethnicity on risk of type 2 diabetes rather than generating spurious associations. Misclassification would also tend to reduce levels of discrimination and underestimate risk in some misclassified patients.
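A rough calculation shows why contamination of the reference category pulls estimates towards one. This sketch is an illustration rather than the study's method: it treats the hazard ratio as a simple ratio of rates and assumes that every misclassified patient carries the same true relative risk. The true ratio of 4.0 and the misclassified fractions are illustrative inputs.

```python
def observed_ratio(true_ratio, misclassified_fraction):
    """Ratio seen when part of the reference group truly belongs to the exposed group.

    The reference group's average rate becomes a weighted mix of the true
    reference rate (1.0) and the exposed rate (true_ratio).
    """
    mixed_reference = (1 - misclassified_fraction) * 1.0 + misclassified_fraction * true_ratio
    return true_ratio / mixed_reference


# Illustrative inputs: a true ratio of 4.0 and up to 7% of the
# "white or not recorded" group belonging to the higher risk group.
for fraction in (0.0, 0.03, 0.07):
    print(f"{fraction:.0%} misclassified -> observed ratio {observed_ratio(4.0, fraction):.2f}")
```

Under these assumptions the observed ratio can only shrink towards one as the misclassified fraction grows, which is why the error is described as conservative.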
We restricted all values of variables in the model to those that had been recorded in the person’s electronic healthcare record before the diagnosis of type 2 diabetes (or before censoring for those who did not develop type 2 diabetes) in order to avoid recording bias.
We validated the QDScore in a separate sample of general practices from those used to develop the score. The QDScore has good discrimination (that is, ability to separate out people who did and those who did not subsequently develop type 2 diabetes) and explains approximately 50% of the total variation in times to diagnosis of diabetes. The D statistic, which is a measure of discrimination appropriate for survival type data, was higher than in our cardiovascular disease algorithm and that reported in some other studies.30 42 This increases the likelihood that the algorithm will more accurately predict risk for an individual patient. An important limitation of our validation is that a degree of over-optimism could exist as, although we have used a completely physically discrete set of general practices for the validation, these practices use the same clinical computer system (EMIS) as those used to derive the algorithm. This system is, however, currently in use in 60% of UK general practices, so the diabetes clinical risk algorithm is at least likely to perform well for well over half of the UK’s population. A more stringent test of performance would involve practices using a different clinical computer system; however, recording of ethnicity in other general practice databases is at present likely to be too low for a meaningful comparison, as EMIS has more practices in ethnically diverse areas. Nonetheless, our previous algorithm for cardiovascular disease, developed with similar methods and the same database,44 has subsequently performed well on another database containing primary care data from practices using a different clinical computer system.23
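The D statistic and the explained variation are closely linked. This section does not name the measure of explained variation, so the sketch assumes it is the R² that Royston and Sauerbrei derive from D. On that assumption it can be recovered from the D statistics reported in the results (2.11 in women, 1.97 in men).

```python
import math

KAPPA_SQUARED = 8 / math.pi       # scaling constant used in the definition of D
SIGMA_SQUARED = math.pi ** 2 / 6  # error variance term for a proportional hazards model


def r_squared_from_d(d_statistic):
    """Explained variation implied by a D statistic (Royston and Sauerbrei's R-squared-D)."""
    signal = d_statistic ** 2 / KAPPA_SQUARED
    return signal / (SIGMA_SQUARED + signal)


# D statistics reported for the validation cohort.
for label, d in (("women", 2.11), ("men", 1.97)):
    print(f"{label}: D = {d:.2f} -> R^2 = {100 * r_squared_from_d(d):.1f}%")
```

The values obtained this way fall within about 0.1 percentage points of the reported 51.53% and 48.16%; the small gap is consistent with D having been rounded to two decimal places.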
Our study has good face validity, as the prevalence of established risk factors reported here corresponds to that reported elsewhere.59 We found a significant heterogeneity of risk factors, incidence rates, and hazard ratios for type 2 diabetes across the ethnic groups. The high prevalence of a recorded family history among South Asians may reflect a true increased rate or could be due to differences in what constitutes a first degree relative (for example, cousins may be regarded as siblings). Of particular interest are the significant differences in hazard ratios between the South Asian groups; Bangladeshi men and women had higher risks than Pakistanis, who in turn had higher risks than Indians.
Routinely collected data from electronic primary healthcare records have been used to develop other risk prediction algorithms. For example, data from 531 general practices were used to develop and validate the QRISK2 cardiovascular disease risk tool, which is being implemented in clinical settings in the UK.30 44 The Cambridge diabetes risk score was developed by combining data from two different general practice samples. The first sample consisted of half of the participants recruited for the study, in which patients were tested for diabetes by using an oral glucose tolerance test in one general practice in Cambridgeshire. The second sample consisted of half of the incident cases of diabetes identified over a 12 month period from 41 practices in the south of England.16 20 The combined data from a total of 650 patients, including 126 cases of diabetes, were then used to derive a risk score designed to identify patients with undiagnosed diabetes at a point in time.16 20 The score was then validated in the remaining half of the recruited patients from the practice in Cambridgeshire. The Cambridge risk score has since been applied to a prospective cohort to estimate the risk of incident diabetes in 25000 people from Norfolk.60
One advantage of the QDScore is the use of a larger and more representative cohort, which is more likely to generalise to the UK. Another advantage is the inclusion of both deprivation and self assigned ethnicity, which are independently associated with risk of incident diabetes; this is likely to help with the problems identified with the Cambridge risk score in its performance in ethnically diverse populations.20 The QDScore explained significantly more of the variation and had improved discrimination compared with the Cambridge risk score. Overall, almost half (15545/32450; 47.9%) of cases of diabetes occurred in the top 10th of the distribution and almost 70% (22476/32450) occurred in the top fifth based on the diabetes clinical risk score. This compared with 27.3% and 50% for the top 10th and fifth reported in the Cambridge risk score paper.16 We cannot determine the calibration of the Cambridge risk score, as it does not give a measure of absolute risk over a given time period.
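The proportion of cases falling in the highest risk groups can be computed from any set of predicted risks and observed outcomes. The sketch shows one way to do this; the function name and the ten patient toy dataset are hypothetical, and ties in predicted risk are broken arbitrarily by sort order.

```python
def share_of_cases_in_top(risks, outcomes, fraction):
    """Proportion of all cases found among the top `fraction` of patients by predicted risk."""
    if len(risks) != len(outcomes):
        raise ValueError("risks and outcomes must have the same length")
    total_cases = sum(outcomes)
    if total_cases == 0:
        raise ValueError("no cases observed")
    # Rank patients from highest to lowest predicted risk.
    ranked = sorted(zip(risks, outcomes), key=lambda pair: pair[0], reverse=True)
    cutoff = round(len(ranked) * fraction)
    cases_in_top = sum(outcome for _, outcome in ranked[:cutoff])
    return cases_in_top / total_cases


# Toy data: predicted 10 year risks and whether diabetes was diagnosed (1) or not (0).
toy_risks = [0.40, 0.02, 0.15, 0.01, 0.30, 0.05, 0.03, 0.08, 0.02, 0.01]
toy_outcomes = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0]
print(share_of_cases_in_top(toy_risks, toy_outcomes, 0.2))

# The proportions quoted in the text follow from the reported counts.
print(f"{15545 / 32450:.1%}", f"{22476 / 32450:.1%}")
```

A score with no discriminating ability would place about 10% of cases in the top 10th and 20% in the top fifth, which gives a baseline for reading the reported proportions.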
Our validation has some limitations. Although our validation cohort consisted of separate practices and patients, the practices used the same clinical computer system (EMIS) and so there may be a degree of over-optimism. Future studies could test the performance of the QDScore in other databases based on practices using a different clinical computer system or in cohorts in which formal diagnostic testing may be possible.
We did not do comparisons with other prospective studies that have developed a risk prediction score for which laboratory tests are needed (such as measurement of high density lipoprotein cholesterol,17 49 triglycerides,17 49 or fasting glucose17) or that have included variables which are difficult to measure consistently and reliably such as waist circumference and which, unlike body mass index, are not routinely recorded in general practice.18 49 Other diabetes scores have been developed within specific ethnic groups (for example, Mexican Americans,48 Japanese Americans28), but we have too few patients in the UK in these ethnic groups to allow a meaningful comparison to be made within this analysis. Nonetheless, our receiver operator curve statistic of 0.85 for women and 0.83 for men is substantially higher than those in many studies,16 27 28 49 60 which have reported values ranging between 0.71 and 0.80; it is comparable to those of the three studies reporting the highest receiver operator curve statistics, with values of 0.85 and 0.86.17 18 48 Lastly, although data on fasting and random glucose are recorded to some extent within primary care electronic health records,25 61 we did not think that these were suitable for use in a prediction score, as they are the basis for making diagnoses of diabetes in this context rather than being recorded in a representative sample of patients at baseline. In addition, we were interested in developing a score that did not require laboratory measurements.
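The receiver operator curve statistic equals the probability that a randomly chosen patient who developed diabetes was assigned a higher predicted risk than a randomly chosen patient who did not. A minimal sketch of that pairwise definition follows; the scores are invented, and the calculation ignores censoring, which an analysis of survival data would need to handle.

```python
def roc_statistic(case_scores, non_case_scores):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    if not case_scores or not non_case_scores:
        raise ValueError("both groups must be non-empty")
    concordant = 0.0
    for case in case_scores:
        for non_case in non_case_scores:
            if case > non_case:
                concordant += 1.0
            elif case == non_case:
                concordant += 0.5
    return concordant / (len(case_scores) * len(non_case_scores))


# Invented predicted risks for patients with and without a later diagnosis.
cases = [0.30, 0.22, 0.09]
non_cases = [0.25, 0.08, 0.05, 0.03, 0.02]
print(roc_statistic(cases, non_cases))
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation, so the reported 0.85 and 0.83 sit well towards the upper end of that range.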
Simple risk algorithms have performed well in comparison with more complex clinical evaluations in studies of diabetes and cardiovascular disease.17 47 This algorithm to predict risk of type 2 diabetes has the unique advantage of including both ethnicity and social deprivation, can be derived without laboratory measurements, and thus is suitable for use both in clinical settings and for self assessment. The QDScore could be used to identify patients at high risk of diabetes who might benefit from interventions to reduce their risk.
We acknowledge the contribution of Egton Medical Information System (EMIS) and practices using EMIS and contributing to the QResearch database.
Contributors: JH-C initiated and designed the study, obtained approvals, prepared the data, did the analysis and interpretation, and wrote the first draft of the paper. CC contributed to the development of the protocol, to the design, analysis, and interpretation, and to drafting the paper; she also did some of the primary analyses with JH-C. JR, PB, and AS contributed to the protocol, interpretation, and drafting the article. All authors approved the final draft. JH-C is the guarantor.
Funding: This study received no external funding. The authors did the work either in their personal time or during the course of their normal employment. The corresponding author (JH-C) and CC had access to all the data in the study, and all authors agreed and share responsibility for the decision to submit for publication.
Competing interests: JH-C is co-director of QResearch, a not for profit organisation, which is a joint partnership between the University of Nottingham and EMIS. JH-C is also director of ClinRisk Ltd, which produces software to ensure the reliable and updatable implementation of clinical risk algorithms within clinical computer systems to help to improve patients’ care. EMIS is the leading supplier of information technology for 60% of UK general practices and may implement the QDScore within its clinical computer system. AS chairs the Equality and Diversity Forum of the National Clinical Assessment Service and is co-investigator on an MRC/NPRI funded randomised controlled trial aiming to prevent onset of type 2 diabetes in South Asians in the UK; he is also a co-investigator on the MRC Edinburgh Translational Medicine Methodology Hub. QResearch does analyses for the Department of Health and other government organisations. All research using QResearch is peer reviewed and published. This work and any views expressed within it are solely those of the co-authors and not of any affiliated bodies or organisations.
Ethical approval: The proposal was approved by the Trent Multi Centre Research Ethics Committee.
Cite this as: BMJ 2009;338:b880