Randomized trials have examined short-term effects of lifestyle interventions for diabetes prevention only among high-risk individuals. Prospective studies have examined the associations between lifestyle factors and diabetes in healthy populations but have not characterized the intervention. We estimated long-term effects of “hypothetical” lifestyle interventions on diabetes in a prospective study of healthy women, using the parametric g-formula.
Using data from the Nurses’ Health Study, we followed 76,402 women from 1984 to 2008. We estimated the risk of type 2 diabetes under 8 “hypothetical” interventions: quitting smoking, losing weight by 5% every 2 years if overweight/obese, exercising at least 30 minutes a day, eating less than 3 servings a week of red meat, eating at least 2 servings a day of whole grain, drinking 2 or more cups of coffee a day, drinking 5 or more grams of alcohol a day and drinking less than 1 serving of soda a week.
The 24-year risk of diabetes was 9.6% under no intervention and 4.3% when all interventions were imposed (55% lower risk [95% confidence interval = 47% to 63%]). The most effective interventions were weight loss (24% lower risk), physical activity (19%) and moderate alcohol use (19%). Overweight/obese women would benefit the most, with a 10.8 percentage-point reduction in 24-year risk of diabetes. The validity of these estimates relies on absence of unmeasured confounding, measurement error, and model misspecification.
A combination of dietary and non-dietary lifestyle modifications, begun in mid-life or later in relatively healthy women, could have prevented at least half of the cases of type 2 diabetes in this cohort of US women.
Diabetes is a major cause of death and disability worldwide,1,2 and its prevalence has increased substantially in most regions of the world in the last three decades.3 Complications of diabetes put a major economic burden on health systems in both developed and developing countries.4-6 Primary prevention of type 2 diabetes, which constitutes more than 90% of cases, is a major concern for health systems worldwide.
Several randomized trials have examined the effect of physical activity, smoking cessation, weight loss and healthy diet on the incidence of type 2 diabetes in high-risk participants over relatively short periods of about 3 years.7-15 Overall, these randomized trials reported about 50% reduction in the incidence of type 2 diabetes after intensive lifestyle modifications.16-18 Prospective observational studies have mostly examined the long-term association between lifestyle and type 2 diabetes incidence19-22 in relatively healthy populations, but these studies did not specify the corresponding interventions or the time of their initiation.
Devising an informed policy for diabetes prevention requires estimating the effect of lifestyle interventions initiated in mid-life or later (as randomized trials did) over long periods and in relatively healthy populations (as observational studies did). We applied the parametric g-formula to data from the observational Nurses’ Health Study to estimate the 24-year risk of type 2 diabetes under various hypothetical lifestyle interventions that start in mid-life or later.
The Nurses’ Health Study is a prospective cohort study that started in 1976 by enrolling 121,700 US female nurses aged 30 to 55 years. A questionnaire was mailed to collect data on sociodemographic, lifestyle and dietary factors and on history of diseases and treatments. Biennial questionnaires have been mailed since then to update information on risk factors and disease incidence. More details on this cohort are available elsewhere.23
We used the 1984 questionnaire as baseline, because a detailed 131-item food frequency questionnaire (FFQ) was distributed in that year. Women were excluded from our analysis if they had a diagnosis of diabetes before 1984 or did not have information on diabetes diagnosis date. In addition, we excluded participants who left more than 70 items blank on the baseline FFQ, or who reported unusual total energy intakes (i.e., energy intake <500 or >3500 kcal/day). We also excluded participants without baseline information on date of birth, body weight and height, smoking status, physical activity, and dietary variables, and those who had cancer or cardiovascular disease at baseline (see eFigure 1 for a flowchart of participant selection). After these exclusions, 76,402 women were available for the analysis. Women were followed until the occurrence of type 2 diabetes, death, incomplete follow-up (i.e. not returning a questionnaire), or administrative end of follow-up in June 2008, whichever happened first.
Diet was measured using a semi-quantitative FFQ in 1984, 1986 and every 4 years afterwards. The FFQ asked about the usual intake of various food items, including alcoholic drinks, during the past 12 months. The reproducibility and validity of the questionnaire have been shown elsewhere.24 Dietary data recorded in the 1980 questionnaire were used to adjust for pre-baseline diet. Physical activity was reported in 1980, 1982, 1986, 1988 and every 4 years thereafter, using a validated questionnaire25 on type, frequency and intensity of each activity. We summed the duration of moderate or vigorous activities per week (i.e. requiring at least 3 Metabolic Equivalent of Task [MET] scores/hour, including brisk walking). Body weight was self-reported in each biennial questionnaire and height was self-reported in 1976.
We truncated the values of dietary risk factors, body mass index (BMI) and physical activity in each period at the 99th percentile to prevent implausible values from affecting our analyses. Sensitivity analyses using various thresholds based on expert knowledge, or setting values above the 99th percentile to missing and carrying the last observed value forward, did not change the results materially.
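The truncation rule described above can be illustrated with a minimal sketch. It assumes a nearest-rank definition of the 99th percentile, which the text does not specify; the function name and the toy BMI values are hypothetical and are not taken from the study data.

```python
import math

def truncate_at_percentile(values, pct=99.0):
    """Cap every value above the pct-th percentile at that percentile.

    Assumption: nearest-rank percentile. The paper does not state which
    percentile definition was used.
    """
    ordered = sorted(values)
    # 1-based nearest-rank position of the percentile in the sorted data
    rank = max(1, math.ceil(pct * len(ordered) / 100.0))
    cap = ordered[rank - 1]
    return [min(v, cap) for v in values], cap

# Toy data (hypothetical): 98 typical values, one high value, one implausible value
bmi = [22.0] * 98 + [31.0, 95.0]
truncated, cap = truncate_at_percentile(bmi)
```

Values at or below the cap are left unchanged, so only the extreme upper tail is affected.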
Details of diabetes ascertainment have been described elsewhere.26 Briefly, a supplementary questionnaire was mailed to participants who reported a diagnosis of diabetes. A case of type 2 diabetes was considered confirmed if, according to the National Diabetes Data Group criteria,27 at least one of the following was reported on the supplementary questionnaire: 1) one or more classic symptoms plus fasting plasma glucose levels of ≥ 7.8 mmol/L or random plasma glucose levels of ≥ 11.1 mmol/L; 2) at least two elevated plasma glucose concentrations on different occasions (fasting plasma glucose ≥ 7.8 mmol/L, random plasma glucose levels of ≥ 11.1 mmol/L, or concentrations of ≥ 11.1 mmol/L after two hours or more shown by oral glucose tolerance testing) in the absence of symptoms; or 3) treatment with insulin or oral hypoglycaemic agents. The diagnostic criteria changed in June 1998: according to the American Diabetes Association criteria,28 a fasting plasma glucose of 7.0 mmol/L instead of 7.8 mmol/L was considered the threshold for the diagnosis of diabetes. Only confirmed cases were included in the analysis. We excluded cases designated as gestational or secondary diabetes. The validity of the supplementary questionnaire has been previously documented by reviewing medical records and assessing undiagnosed diabetes in a random sample of women.26,27,29 Deaths were identified by reports from next of kin or postal authorities, or by searching the National Death Index. At least 98% of deaths among the study participants were identified.30
We considered 8 hypothetical interventions and their combinations, based on the evidence from both randomized trials and observational studies on their potential effect for diabetes prevention:19,20,31-35 quitting smoking, losing 5% of BMI every 2 years if overweight or obese (defined as BMI ≥ 25 kg/m2), exercising at least 30 minutes a day (moderate or vigorous), eating at least 2 servings of whole grain per day, drinking at least 2 cups of coffee per day, drinking at least 5 grams of alcohol per day, eating at most 3 servings of red meat per week (including unprocessed and processed), and drinking at most 1 serving of soda per week. All interventions started at baseline in 1984 and continued until the end of follow-up. Except for the weight loss intervention, which imposed a gradual decline in body weight, all other interventions imposed a minimum or maximum threshold on the level of a risk factor. For these interventions, values beyond the threshold were set to the threshold level.
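As an illustration only, the intervention rules listed above can be written as a function applied to one woman's covariate values in a given 2-year period. The field names are hypothetical. The sketch also assumes that the 5% weight loss is applied to the BMI value in the current record with no lower bound, a detail this section does not spell out.

```python
def apply_interventions(record):
    """Return a copy of one 2-year covariate record with all 8 rules applied.

    Field names are hypothetical; thresholds are those listed in the text.
    """
    r = dict(record)
    r["cigarettes_per_day"] = 0.0          # quit smoking
    if r["bmi"] >= 25.0:                   # overweight or obese
        r["bmi"] *= 0.95                   # lose 5% of BMI (assumed: no floor)
    # Minimum thresholds: values below the threshold are raised to it
    r["exercise_min_per_day"] = max(r["exercise_min_per_day"], 30.0)
    r["whole_grain_servings_per_day"] = max(r["whole_grain_servings_per_day"], 2.0)
    r["coffee_cups_per_day"] = max(r["coffee_cups_per_day"], 2.0)
    r["alcohol_g_per_day"] = max(r["alcohol_g_per_day"], 5.0)
    # Maximum thresholds: values above the threshold are lowered to it
    r["red_meat_servings_per_week"] = min(r["red_meat_servings_per_week"], 3.0)
    r["soda_servings_per_week"] = min(r["soda_servings_per_week"], 1.0)
    return r

# Hypothetical participant record, not taken from the study data
example = {
    "cigarettes_per_day": 10.0,
    "bmi": 30.0,
    "exercise_min_per_day": 10.0,
    "whole_grain_servings_per_day": 0.5,
    "coffee_cups_per_day": 3.0,
    "alcohol_g_per_day": 0.0,
    "red_meat_servings_per_week": 7.0,
    "soda_servings_per_week": 0.0,
}
intervened = apply_interventions(example)
```

Values that already satisfy a threshold (here, coffee and soda) are left unchanged, which is why only some women are "intervened on" in any period.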
The intensities of the interventions were chosen to match current public health guidelines (e.g. Centers for Disease Control and Prevention guidelines for physical activity, World Health Organization definition of overweight and obesity) or were selected to reflect feasible public health interventions. We estimated the effects of more intensive interventions for weight loss and physical activity in sensitivity analyses.
We used the parametric g-formula,36 a generalization of standardization for time-varying exposures and confounders, to estimate the 24-year risks of type 2 diabetes under the selected lifestyle interventions. The parametric g-formula has been previously used to estimate the effect of multiple lifestyle interventions on the risk of coronary heart disease.37 If all time-varying confounders have been correctly measured and modeled at all time-points, the g-formula can be used to consistently estimate the standardized risk of type 2 diabetes under hypothetical interventions. We used a Kaplan-Meier estimator that incorporates censoring by death and loss to follow-up.38 For the exact formula, see the electronic appendix (eText).
A simplified description of the process of estimating risks using the parametric g-formula is as follows: we start by fitting regression models for all potential confounders and for the disease outcome using data on the entire study population. We then use these models to simulate the risk of the disease under various interventions in five steps: (1) take the observed joint distribution of covariates at baseline; (2) estimate the joint distribution of time-varying covariates at the next time-point using the parametric models; (3) “intervene” by setting the values of some covariates to values determined by the hypothetical interventions; (4) estimate the predicted probability of the outcome using these new values; (5) repeat steps 2 through 4 for the entire duration of follow-up to estimate the predicted risk of the disease under the selected interventions. See papers by Taubman et al39 and Young et al40 for a more detailed description.
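The five simulation steps can be sketched in a toy setting with a single time-varying covariate (BMI) and a single intervention. This is not the SAS macro used in the analysis. The two "fitted" models below use made-up coefficients in place of regression models estimated from cohort data, censoring and death are ignored, and all function names are hypothetical.

```python
import math
import random

def next_bmi(prev_bmi, rng):
    """Step 2: draw next-period BMI from an assumed (made-up) linear model."""
    return 0.75 + 0.97 * prev_bmi + rng.gauss(0.0, 1.0)

def diabetes_hazard(bmi):
    """Step 4: assumed (made-up) logistic model for 2-year diabetes probability."""
    return 1.0 / (1.0 + math.exp(-(-8.0 + 0.15 * bmi)))

def no_intervention(bmi):
    return bmi

def weight_loss(bmi):
    """Step 3: lose 5% of BMI in this period if overweight or obese."""
    return bmi * 0.95 if bmi >= 25.0 else bmi

def simulate_risk(baseline_bmis, intervene, periods=12, n_sim=10000, seed=0):
    """Monte Carlo estimate of cumulative risk over `periods` 2-year intervals."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sim):
        bmi = rng.choice(baseline_bmis)   # step 1: sample baseline covariates
        disease_free = 1.0                # probability of no diabetes so far
        risk = 0.0
        for _ in range(periods):          # step 5: repeat over follow-up
            bmi = next_bmi(bmi, rng)      # step 2: simulate covariates
            bmi = intervene(bmi)          # step 3: impose the intervention
            hazard = diabetes_hazard(bmi) # step 4: predict the outcome
            risk += disease_free * hazard
            disease_free *= 1.0 - hazard
        total += risk
    return total / n_sim

# Hypothetical baseline BMI values standing in for the empirical distribution
baseline = [21.0, 23.0, 24.0, 26.0, 29.0, 33.0]
risk_natural = simulate_risk(baseline, no_intervention, n_sim=2000)
risk_weight_loss = simulate_risk(baseline, weight_loss, n_sim=2000)
```

Because the intervened BMI is fed back into the covariate model for the next period, the effect of an intervention accumulates over follow-up rather than acting on a single interval.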
More formally, the standardized cumulative risk estimated by the g-formula is a weighted average of the risks of type 2 diabetes conditional on the specified intervention values and the observed confounder history. The weights are the probability density functions of the time-varying confounders, which are estimated via parametric regression models. We approximated the weighted average by using a Monte Carlo simulation of 10,000 individuals with the baseline values of covariates sampled from their empirical distribution. The values of time-varying covariates for each 2-year interval were drawn from the distribution estimated via the regression models after setting the values of lifestyle factors to those specified by the interventions.
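For orientation, one common way of writing this weighted average for a discrete-time failure outcome is given below. It is a sketch under simplifying assumptions, not the exact formula from the eText: it considers a fixed intervention history, ignores censoring by death and loss to follow-up, and uses notation introduced here rather than taken from the article.

```latex
% Assumed notation (not from the article):
%   k = 0, ..., K      index of 2-year intervals
%   Y_{k+1}            1 if type 2 diabetes has occurred by the end of interval k
%   L_k                time-varying confounders measured in interval k
%   A_k                lifestyle factors in interval k; \bar{a} is the intervention history
%   overbars           history up to and including the subscripted interval
% Conventions: Pr[Y_0 = 0] = 1, and f(l_0 | .) is the baseline covariate distribution.
\[
\hat{R}_{\bar a} \;=\;
\sum_{k=0}^{K} \sum_{\bar l_k}
\Pr\!\left[Y_{k+1}=1 \mid \bar A_k=\bar a_k,\ \bar L_k=\bar l_k,\ Y_k=0\right]
\prod_{j=0}^{k}
\Big\{
f\!\left(l_j \mid \bar l_{j-1},\ \bar a_{j-1},\ Y_j=0\right)\,
\Pr\!\left[Y_j=0 \mid \bar A_{j-1}=\bar a_{j-1},\ \bar L_{j-1}=\bar l_{j-1},\ Y_{j-1}=0\right]
\Big\}
\]
```

The sum over covariate histories becomes an integral for continuous covariates, which is why it is approximated by Monte Carlo simulation rather than computed directly. Threshold interventions such as "exercise at least 30 minutes a day" depend on the value a woman would otherwise have had, so the intervention values in the full formula are functions of the simulated history rather than fixed constants.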
To increase comparability, the models included calendar year for each period and the following potential baseline confounders: age, history of diabetes in first-degree relatives, smoking and oral contraceptive use prior to 1980, marital status, education, husband’s education, employment, and stress in daily life and work, as well as pre-baseline values of the variables corresponding to the 8 selected interventions (i.e. BMI, smoking, exercise, and intake of meat, whole grain, coffee, alcohol and soda). These latter adjustments enabled us to estimate the effect of changes in lifestyle in middle-aged women as opposed to the life-long effect of a healthy lifestyle.
We modeled the distribution of 20 time-varying covariates: multivitamin use, aspirin use, statin use, post-menopausal hormone use, smoking, physical activity, soda intake, coffee intake, red meat intake, whole grain intake, alcohol use, BMI, high blood pressure, high serum cholesterol, coronary heart disease, stroke, angina or coronary artery bypass graft, cancer, menopause and osteoporosis (see eTable 1).
We compared the estimated risks under various interventions with the risk under no intervention to calculate the population risk ratio and population risk difference. The population attributable risk is one minus the population risk ratio. To estimate the 95% confidence intervals (CIs), we used non-parametric bootstrapping with 500 samples.41 We also computed the proportion of women who were hypothetically intervened on in any period and the average proportion of women intervened on in each 2-year period. The latter measures overall adherence in the observed data to the hypothetical intervention among those following the intervention up until the previous period. We examined the possibility of effect modification by conducting the analysis separately in subsets of the study population defined at baseline according to age, BMI, and family history of diabetes. All analyses were conducted using SAS 9.2 (Cary, NC). The SAS macro and its documentation are available at http://www.hsph.harvard.edu/causal/software.
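The effect measures defined above can be checked against the headline risks reported in this article (9.6% under no intervention, 4.3% under all interventions). The bootstrap function is a generic percentile-interval sketch: the text specifies non-parametric bootstrapping with 500 samples but not the interval method, so the percentile approach is an assumption, and in the actual analysis each resample would require rerunning the full g-formula rather than a simple mean. All names and the toy outcome data are hypothetical.

```python
import random

def effect_measures(risk_no_intervention, risk_intervention):
    """Population risk ratio, risk difference, and attributable risk (1 - RR)."""
    rr = risk_intervention / risk_no_intervention
    return {
        "risk_ratio": rr,
        "risk_difference": risk_intervention - risk_no_intervention,
        "attributable_risk": 1.0 - rr,
    }

def percentile_bootstrap_ci(data, estimator, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample units with replacement, re-estimate."""
    rng = random.Random(seed)
    n = len(data)
    estimates = sorted(
        estimator([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = estimates[int((alpha / 2.0) * n_boot)]
    upper = estimates[int((1.0 - alpha / 2.0) * n_boot) - 1]
    return lower, upper

# Risks reported in the article
measures = effect_measures(0.096, 0.043)

# Toy binary outcomes (hypothetical) with a 9.6% event proportion
outcomes = [1] * 96 + [0] * 904
ci_low, ci_high = percentile_bootstrap_ci(outcomes, lambda d: sum(d) / len(d))
```

With these inputs the attributable risk is about 0.55 and the risk difference is about -0.053, matching the 55% lower risk and the 5.3 percentage-point reduction implied by the reported risks.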
Table 1 shows the baseline characteristics of the 76,402 women who met all eligibility criteria in 1984. Mean age was 49.9 years and mean BMI was 24.4 kg/m2; 25% of women were current smokers and 19% had a family history of diabetes. During 24 years (1.4 million person-years) of follow-up, there were 6,044 cases of type 2 diabetes and 8,260 deaths. A total of 17,690 participants did not reach the end of follow-up (i.e. did not return a questionnaire). The models collectively performed very well in estimating the risk-factor distributions under no intervention. For example, the mean difference between the observed and simulated number of cigarettes smoked per day was less than 0.4 cigarettes during follow-up; similarly, the point estimate of the ratio of the observed to simulated BMI was never smaller than 0.99 or larger than 1.00 (see eFigures 2-9). More importantly, the simulated 24-year risk of diabetes was the same as the observed risk at 9.6%, which is a necessary condition for no model misspecification. The coefficients of the models used in the simulations are presented in eTable 2.
Table 2 shows the 24-year risk of diabetes under various hypothetical lifestyle interventions. Among the non-dietary lifestyle interventions, weight loss was estimated to reduce the risk by 24% (95% CI= 22% to 26%), exercise by 19% (6% to 30%), and quitting smoking by 0% (-1% to 2%) compared with no lifestyle intervention. Among the selected dietary interventions, drinking at least 5 grams of alcohol a day was estimated to reduce the risk by 19% (12% to 23%), eating less than 3 servings of red meat per week by 8% (5% to 11%), and drinking at least 2 cups of coffee a day by 3% (0% to 6%). The mean alcohol intake under the hypothetical intervention “drink at least 5 grams of alcohol a day” ranged from 9.2 to 11.7 grams a day during follow-up, which is equivalent to 1 drink a day.
We estimated that the 24-year risk of type 2 diabetes would be reduced by 39% (29% to 47%) under the 3 non-dietary interventions, by 29% (21% to 35%) under the 5 dietary interventions, and by 55% (47% to 63%) under all 8 interventions. The estimated 24-year risk of type 2 diabetes under all interventions was 4.3% (3.6% to 5.1%). Of all participants, 25% maintained a BMI of 25 kg/m2, fewer than 11% followed each of the dietary interventions, and 0% followed all 8 interventions for the whole duration of follow-up (9% followed all 8 interventions at some point during follow-up).
Table 3 presents the results of the analyses for more extreme weight-loss interventions. We estimated that reducing BMI at 5% per 2-year period down to 23 kg/m2 would reduce the risk of diabetes by 53% (51% to 56%), while reducing BMI at a faster pace of 10% every 2 years would reduce the risk by 60% (57% to 63%). Combining this latter intervention with the 7 other lifestyle interventions would reduce the risk of type 2 diabetes by 72%. We estimated a risk reduction of 16% (8% to 22%) under a hypothetical intervention in which participants exercise for 30 minutes/day if they have a normal BMI (i.e., BMI < 25 kg/m2) and 1 hour/day if they are overweight or obese (i.e., BMI ≥ 25 kg/m2). A more intensive intervention to engage in at least 1 hour/day of moderate or vigorous activity, regardless of BMI, would reduce the risk of type 2 diabetes by 15% (7% to 22%).
To evaluate the sensitivity of the estimates to our analytic decisions, we conducted analyses that varied the ordering of the variables measured in each questionnaire, estimated censoring due to incomplete follow-up, considered a different intervention on alcohol where women would drink 5 to 10 grams/day of alcohol, and included women who did not return 1 or 2 questionnaires by carrying their last reported values forward. The estimates of relative risks and risk differences did not change materially.
The effects of lifestyle interventions were stronger in younger women (< 50 years old) and those who were overweight or obese at baseline, in both the risk ratio and risk difference scales (Table 4). For all 8 interventions combined, the 24-year risk of type 2 diabetes would be reduced by 10.8 percentage points in women who were overweight or obese compared with only 1.9 percentage points in women who had a BMI of < 25 kg/m2 at baseline (Table 4). Women with a family history of diabetes were also estimated to benefit more from these interventions: the risk difference for all 8 interventions in this group was 7.3% versus 4.8% in those without a family history of diabetes.
Our results suggest that, in this cohort of US women, 55% of cases of type 2 diabetes could have been prevented by a combination of dietary and non-dietary lifestyle modifications. Our estimates are particularly relevant for health policy because they quantify the 24-year impact of lifestyle interventions that start in mid-life or later in relatively healthy women.
A beneficial effect of lifestyle modification on diabetes risk had been previously found in several randomized trials. However, these studies were designed to evaluate a short-term effect (over approximately a 3-year period) in overweight participants with impaired glucose tolerance. Both the Diabetes Prevention Program7 and the Diabetes Prevention Study8 found that a combined intervention on diet and physical activity reduced diabetes risk by 58% compared with general guidance or written advice. The risk of diabetes in the control group was 29% at 3 years in the Diabetes Prevention Program and 23% at 4 years in the Diabetes Prevention Study, which are much higher than our 9.6% risk at 24 years in a population of healthy US women.
Our estimates may not be generalizable to other populations with different distributions of risk factors, as the g-formula standardizes the risk to the distribution of risk factors in the particular population under study. For example, we estimated no reduction in risk of type 2 diabetes if all women had quit smoking, but only 25% of women in our population were smokers. When we compared the risk had everyone been “forced” to smoke 20 cigarettes a day to the risk had everyone quit smoking, the estimated population risk ratio was 1.1. Also, the magnitude of our estimates is specific to the set of interventions that was considered. Though our results are generally consistent with previous analyses of prospective studies,34,35,42,43 differences do exist because previous studies did not specify the time of initiation of the lifestyle interventions and considered more extreme weight loss.19-22 For example, a previous analysis of the Nurses’ Health Study cohort classified women as low-risk if they met 5 criteria (BMI < 25 kg/m2, diet high in cereal fiber and polyunsaturated fat and low in trans-fat and glycemic load, at least half an hour/day of moderate-to-vigorous physical activity, no current smoking, at least a half-serving of an alcoholic beverage/day) between 1986 and 1996. Compared with the rest of the cohort, women in the low-risk group had an average two-year relative risk of diabetes of 0.09 (95% CI= 0.05 to 0.17). The authors estimated that some hypothetical intervention (different from the ones we considered here) on the above 5 factors could have prevented 91% of diabetes cases.
Our analysis has several strengths. The Nurses’ Health Study cohort has collected detailed data on usual dietary intakes, physical activity, body weight and smoking every 2-4 years, and diabetes diagnosis was validated. We included 24 years of follow-up and had a large number of cases, which allowed meaningful subgroup analyses. By applying the parametric g-formula39,44 we could estimate the effect of hypothetical interventions starting in middle-age or later while (i) appropriately adjusting for time-varying confounders affected by prior exposure, (ii) generating adjusted estimates of absolute risk and population attributable risk, and (iii) estimating effects of interventions individually and in various combinations.
As with other analyses of observational data, the validity of our results relies on the key assumptions of no residual confounding, no measurement error, and no model misspecification. The possibility of residual confounding cannot be logically excluded despite adjustment for many potential confounders. A certain degree of measurement error is expected for lifestyle variables and may have contributed to bias. We had to rely on self-reported weight and height; a validation study on a small sample of the study population showed a 0.96 correlation between self-reported and measured weight.45 Though the prediction of our models under the hypothetical interventions cannot be directly evaluated (0% of women followed all interventions during the entire follow-up), our models provided accurate predictions under no intervention, a necessary condition for no model misspecification.
We tried to simplify the public health interpretation of our estimates by comparing interventions with well-defined start times and exposure values, by using food items rather than dietary scores19 or the Healthy Eating Index,46 and by including total caloric intake in the models so that our hypothetical interventions imply replacing the selected food item with other foods that are usually taken as a substitute in the study population. However, the interpretation of our estimates for weight loss and exercise is complicated because there are multiple versions of the interventions. For example, participants may lose weight by reducing their caloric intake or by using weight loss medications; similarly, the selected duration and level of exercise may be achieved by increasing frequency or duration of different types of activities.47
In summary, our results suggest that 55% of cases of diabetes that occurred during 24 years of follow-up in this large prospective study of US women could have been prevented by 8 lifestyle interventions initiated in mid-life or later. The most effective intervention in this population was losing weight, followed by eating a healthy diet and engaging in regular moderate or vigorous physical activity.
We are grateful to Roger Logan and Jessica Young for their technical assistance, to Walter Willett and Meir Stampfer for their comments on a previous draft, and to the Channing Laboratory, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, and to all the women enrolled in the Nurses’ Health Study.
Source of Funding: This work was funded through National Institutes of Health (NIH) grants HL080644, DK58845, CA87969.
Conflicts of Interest: Authors have no conflicts of interest to declare.
Supplemental digital content is available through direct URL citations in the HTML and PDF versions of this article (www.epidem.com). This content is not peer-reviewed or copy-edited; it is the sole responsibility of the author.