We examined whether a hypertension risk prediction model based on clinical characteristics and blood biomarkers might improve upon risk prediction based on current blood pressure alone.
A prospective cohort of 14,822 normotensive women aged 45 and older was followed over 8 years beginning in 1992 for the development of hypertension. Among a randomly selected two-thirds sample (N=9,427), hypertension prediction models were developed using 52 potential predictors and compared to a model based on blood pressure alone. Each prediction model was validated in the remaining one-third (N=5,395).
In the development cohort, the best prediction model for incident hypertension included age, blood pressure, ethnicity, body mass index, total grain intake, apolipoprotein B, lipoprotein(a), and C-reactive protein (Bayes Information Criterion [BIC]=8788). While this model was superior to a model based on blood pressure alone (BIC=8957), it was only marginally better than a simplified model including age, blood pressure, ethnicity, and body mass index (BIC=8820). In the validation cohort, the simplified model demonstrated adequate calibration, a c-index similar to that of the best model (0.703 vs 0.705), and when compared to the model based on blood pressure alone, reclassified 1499 participants to hypertension risk categories that proved to be closer to observed risk in all but one instance.
In this prospective cohort of initially normotensive women, a model based on readily available clinical information predicted incident hypertension better than a model based on blood pressure alone.
The National High Blood Pressure Education Program has recommended both a population-based and an intensive, targeted approach to the prevention of hypertension and its complications.1 However, defining the appropriate group for the targeted interventions, especially pharmacologic,2 has been challenging.3
While risk stratification is implied in the targeted approach to prevention1 and explicitly based on blood pressure categories in the Seventh Report of the Joint National Committee on Prevention, Detection, Evaluation, and Treatment of High Blood Pressure guidelines,4 we are aware of only one attempt to develop a comprehensive tool for use in a clinical setting which accurately identifies individuals at high risk for developing hypertension.5 That tool included difficult-to-obtain measures such as clinic-based parental history of hypertension, and the additional predictive ability of novel biomarkers for hypertension beyond readily available clinical information was not assessed. Beyond blood pressure, additional potentially modifiable independent risk factors for incident hypertension have been identified including obesity, dietary factors, exercise, lipid levels, and inflammatory biomarkers.1, 6–11 Furthermore, risk stratification may be especially relevant for those with normal blood pressure, as they are currently not recommended for targeted blood pressure interventions.
We developed and tested a series of hypertension risk prediction algorithms in a large prospective cohort of initially normotensive women followed for 8 years for the development of clinically overt hypertension. Our goal was to generate a hypertension prediction model that balanced simplicity with predictive accuracy and to compare that model to one based on current blood pressure alone in women with baseline systolic blood pressure less than 130 mmHg and baseline diastolic blood pressure less than 85 mmHg.
Study participants were members of the Women’s Health Study (WHS), a trial of vitamin E and aspirin in the primary prevention of cardiovascular disease and cancer in women aged 45 years or older.12, 13 Beginning in 1992, the WHS recruited U.S. female health professionals who were followed prospectively for incident cardiovascular disease, including hypertension. All participants in the WHS provided written informed consent, and the study was approved by the institutional review board of the Brigham and Women’s Hospital (Boston, Massachusetts).
Among the WHS participants, 28,345 provided blood samples that were stored in liquid nitrogen until the time of analysis. We excluded women who reported at baseline either an elevated systolic (> 140 mm Hg) or diastolic (>90 mm Hg) blood pressure or a physician diagnosis of hypertension and those in the upper half of the pre-hypertensive range of blood pressure (130 to 139 mmHg systolic or 85 to 89 mmHg diastolic), leaving 17,150 women with baseline systolic blood pressure less than 130 mmHg and baseline diastolic blood pressure less than 85 mmHg. An additional 208 women with less than 8 years of follow-up information on incident hypertension were excluded from the primary analysis.
For our analyses, we randomly assigned two-thirds of the women to a model derivation cohort which, after excluding those with missing data on candidate predictors, was made up of 9,427 women. The remaining women were reserved as an independent validation cohort of 5,395 women with complete information on the selected predictors.
Candidate predictors of hypertension risk were selected from a diverse group of demographic, lifestyle, clinical, and biochemical domains. Baseline age, race/ethnicity, diabetes, smoking status, hormone therapy use, height, weight, alcohol use, exercise frequency, parental history of myocardial infarction before 60 years, history of migraines, treatment for cholesterol, multivitamin use, and menopausal status were collected by questionnaire. Body mass index (BMI) was calculated as the weight in kilograms over the square of the height in meters. Baseline blood pressure was also assessed by questionnaire in categories of <110, 110–119, 120–129, 130–139, 140–149, 150–159, 160–169, 170–179 and ≥180 mm Hg for systolic blood pressure and <65, 65–74, 75–84, 85–89, 90–94, 95–104, and ≥105 mm Hg for diastolic blood pressure. An additional dietary questionnaire, sent at the time of randomization, was used to calculate food group and nutrient intake.14–16
Plasma biomarkers were analyzed in a core laboratory facility for total cholesterol, high- and low-density lipoprotein (HDL and LDL) cholesterol, lipoprotein(a), apolipoproteins A-I and B100, high-sensitivity C-reactive protein (CRP), soluble intercellular adhesion molecule-1, fibrinogen, creatinine, hemoglobin A1C, and homocysteine concentration. Lipid ratios and estimated glomerular filtration rate17 were also calculated. The core laboratory was certified by the National Heart, Lung and Blood Institute/Centers for Disease Control and Prevention Lipid Standardization Program.
Incident hypertension was ascertained by annual questionnaire through March of 2004 using methods previously described in detail.8 Briefly, participants were classified as hypertensive after reporting either a new physician diagnosis at year 1, 3, or annually thereafter; a new hypertensive treatment at year 1, 3, or 4; a systolic blood pressure of 140 mmHg or greater at year 1 or 4; or a diastolic blood pressure of 90 mmHg or greater at year 1 or 4.
A logistic model with an outcome of incident hypertension at 8 years was chosen as the primary modeling strategy. Model selection in the development cohort used the Bayes Information Criterion18 (BIC) for model building and to ensure parsimony. The BIC combines the model likelihood with a penalty for the number of predictors used and can be compared across non-nested models, with lower numbers indicating a better fit. This approach met our aim to find a clinically useful, and thus relatively simple, model.
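As an illustration of how the BIC trades fit against model size, the following minimal sketch uses the standard definition, BIC = -2 log L + k ln(n), where k is the number of fitted parameters and n the number of observations. The function names and the toy data are hypothetical and are not taken from the study; whether the intercept is counted in k is also an assumption here.

```python
import math

def bernoulli_log_likelihood(outcomes, predicted_risks):
    """Log-likelihood of binary outcomes (0/1) under predicted probabilities."""
    return sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for y, p in zip(outcomes, predicted_risks)
    )

def bic(log_likelihood, n_params, n_obs):
    """Standard BIC: -2 * logL + k * ln(n). Lower values indicate a better model."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

# Toy data, illustrative only (not study data).
outcomes = [1, 0, 0, 1, 0, 0, 0, 1]
risks_small_model = [0.60, 0.3, 0.2, 0.50, 0.3, 0.2, 0.4, 0.6]  # assume 2 parameters
risks_large_model = [0.62, 0.3, 0.2, 0.52, 0.3, 0.2, 0.4, 0.6]  # assume 5 parameters

n = len(outcomes)
bic_small = bic(bernoulli_log_likelihood(outcomes, risks_small_model), 2, n)
bic_large = bic(bernoulli_log_likelihood(outcomes, risks_large_model), 5, n)
```

In this toy comparison the larger model fits slightly better, but the penalty for its three extra parameters outweighs the gain, so the smaller model has the lower BIC.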
Fifty-two potential predictors were evaluated for inclusion in the best model. In addition to those listed in Table 1 we considered factors ascertained by the dietary questionnaire16 (intake of red and total meat, dairy, whole, refined and total grains, sweets, nuts, percent of calories from saturated and total fat, potassium, calcium, magnesium, sodium, fiber and vitamin D, and a composite DASH diet concordance score19), estimated physical activity in kcal/week, and additional combinations of lipid measures including ratios, as well as the randomized trial assignments to vitamin E and aspirin. We also examined potential transformations for each continuous variable. We used stepwise logistic regression, allowing both forward and backward steps, to select the model with the lowest BIC (Inclusive Model).
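The stepwise search can be sketched generically as a greedy procedure that, at each step, adds or drops the single predictor that lowers the criterion most and stops when no move helps. This is a simplified illustration, not the R procedure used in the analysis: `stepwise_select` and the scoring function `toy_bic` are hypothetical, and the toy score merely stands in for fitting a logistic model and computing its BIC.

```python
from typing import Callable, FrozenSet, Iterable

def stepwise_select(
    candidates: Iterable[str],
    score: Callable[[FrozenSet[str]], float],
) -> FrozenSet[str]:
    """Greedy bidirectional stepwise search that minimizes `score` (e.g. BIC)."""
    pool = frozenset(candidates)
    current: FrozenSet[str] = frozenset()
    best = score(current)
    while True:
        # Models one step away: add one unused predictor or drop one used predictor.
        neighbours = [current | {c} for c in pool - current]
        neighbours += [current - {c} for c in current]
        if not neighbours:
            return current
        # Break score ties deterministically by the sorted predictor names.
        candidate = min(neighbours, key=lambda m: (score(m), sorted(m)))
        candidate_score = score(candidate)
        if candidate_score >= best:
            return current  # no single step improves the criterion
        current, best = candidate, candidate_score

# Hypothetical score: each predictor improves fit by `gain`, each costs PENALTY.
gain = {"sbp": 120.0, "age": 40.0, "bmi": 35.0, "noise": 1.0}
PENALTY = 9.0  # stands in for ln(n) per added parameter

def toy_bic(model: FrozenSet[str]) -> float:
    return 1000.0 - sum(gain[p] for p in model) + PENALTY * len(model)

selected = stepwise_select(gain, toy_bic)
```

With these toy gains the search keeps `sbp`, `age`, and `bmi` and leaves out `noise`, whose small improvement in fit does not cover its penalty.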
Once the lowest BIC model was identified, we created a second model, in which we removed predictors not easily obtainable during a brief office visit, including the detailed dietary information, and limited the blood-based biomarkers to standard cholesterol measurements. Using this limited predictor group, we again selected the lowest BIC model (Simplified Model With Lipids). We also selected the lowest BIC model based only on the same group of readily available clinical predictors without any blood-based biomarkers (Simplified Model). We examined interactions between the predictors in all of the selected models to ensure that there were no interaction effects missed which could improve prediction. Finally, we generated a model using only baseline blood pressure for comparison (Blood Pressure Only Model).
The primary measure of discrimination used was Harrell’s c-index,20 analogous to the area under the receiver operating characteristic curve. This measure assesses the ability of the risk score to rank women who develop incident hypertension higher than women who do not. General calibration was assessed using the Hosmer-Lemeshow goodness-of-fit test21 to compare the average predicted risk to the observed risk across deciles of predicted risk. If model calibration was poor, meaning that the predicted risk did not match the observed risk, further division into absolute risk categories was not performed.
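For a binary outcome, both measures can be sketched in a few lines. The code below is an illustrative implementation under stated assumptions, not the study's R code: the c-index is computed as the proportion of case/non-case pairs ranked correctly (ties count one half), and the Hosmer-Lemeshow statistic is computed over equal-size groups ordered by predicted risk. The function names and toy data are hypothetical, and the p-value (which requires a chi-square distribution function) is omitted.

```python
def c_index(outcomes, risks):
    """Share of case/non-case pairs in which the case has the higher predicted risk."""
    cases = [r for y, r in zip(outcomes, risks) if y == 1]
    non_cases = [r for y, r in zip(outcomes, risks) if y == 0]
    concordant = 0.0
    for r_case in cases:
        for r_non in non_cases:
            if r_case > r_non:
                concordant += 1.0
            elif r_case == r_non:
                concordant += 0.5  # ties count as half-concordant
    return concordant / (len(cases) * len(non_cases))

def hosmer_lemeshow(outcomes, risks, groups=10):
    """Hosmer-Lemeshow chi-square statistic over groups ordered by predicted risk.

    Assumes the mean predicted risk in every group lies strictly between 0 and 1.
    """
    ordered = sorted(zip(risks, outcomes))
    n = len(ordered)
    stat = 0.0
    for g in range(groups):
        chunk = ordered[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(r for r, _ in chunk)  # expected number of events
        observed = sum(y for _, y in chunk)  # observed number of events
        mean_risk = expected / size
        stat += (observed - expected) ** 2 / (size * mean_risk * (1.0 - mean_risk))
    return stat

# Toy data, illustrative only (not study data).
toy_outcomes = [0, 0, 1, 0, 1, 1]
toy_risks = [0.1, 0.2, 0.3, 0.4, 0.6, 0.8]
toy_c = c_index(toy_outcomes, toy_risks)
toy_hl = hosmer_lemeshow(toy_outcomes, toy_risks, groups=2)
```

A c-index of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking; a large Hosmer-Lemeshow statistic (small p-value) signals disagreement between predicted and observed risk.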
Risk reclassification22, 23 was also assessed by dividing the predicted 8-year risk for each model into categories of less than 10%, 10% to less than 20%, 20% to less than 30%, and 30% or higher and then comparing the assigned categories for a pair of models. These categories correspond to the 4-year risk categories previously designated as low (<5%), medium (5 to 10%) and high (>10%),5 with an additional very high category. We then calculated the proportion of participants who were reclassified by the comparison model as compared to the reference model. Reclassification was considered correct if the actual event rate for the reclassified group was closer to the comparison category than the reference. We computed the Hosmer-Lemeshow statistic for the reclassification tables,24 which assesses agreement between the observed and predicted risk within the reclassified categories. We also computed the Net Reclassification Improvement (NRI),25 which compares the shifts in reclassified categories by observed outcome, and the Integrated Discrimination Improvement (IDI),25 which directly compares the average difference in predicted risk for women who go on to develop hypertension with women who do not for the two models.
All assessment measures were examined both in the development cohort and the validation cohort. All analyses were performed using R, version 2.6.0.
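The reclassification measures described above can be sketched as follows, using the 10%, 20%, and 30% cutoffs from the text. This is a minimal illustration of the usual definitions of the NRI and IDI, not the study's own code; the function names and toy risks are hypothetical, and significance tests are omitted.

```python
import bisect

CUTOFFS = [0.10, 0.20, 0.30]  # 8-year risk category boundaries from the text

def risk_category(risk):
    """0: <10%, 1: 10% to <20%, 2: 20% to <30%, 3: 30% or higher."""
    return bisect.bisect_right(CUTOFFS, risk)

def reclassified_fraction(ref_risks, new_risks):
    """Proportion of participants whose category differs between the two models."""
    moved = sum(risk_category(a) != risk_category(b) for a, b in zip(ref_risks, new_risks))
    return moved / len(ref_risks)

def nri(outcomes, ref_risks, new_risks):
    """Net upward moves among cases plus net downward moves among non-cases."""
    up_case = down_case = up_non = down_non = 0
    n_case = sum(outcomes)
    n_non = len(outcomes) - n_case
    for y, r_ref, r_new in zip(outcomes, ref_risks, new_risks):
        move = risk_category(r_new) - risk_category(r_ref)
        if y == 1:
            up_case += move > 0
            down_case += move < 0
        else:
            up_non += move > 0
            down_non += move < 0
    return (up_case - down_case) / n_case + (down_non - up_non) / n_non

def idi(outcomes, ref_risks, new_risks):
    """Change in the gap between mean predicted risk of cases and of non-cases."""
    def gap(risks):
        cases = [r for y, r in zip(outcomes, risks) if y == 1]
        non_cases = [r for y, r in zip(outcomes, risks) if y == 0]
        return sum(cases) / len(cases) - sum(non_cases) / len(non_cases)
    return gap(new_risks) - gap(ref_risks)

# Toy data, illustrative only (not study data).
toy_outcomes = [1, 1, 0, 0]
toy_ref = [0.15, 0.25, 0.15, 0.25]
toy_new = [0.22, 0.35, 0.08, 0.25]
toy_nri = nri(toy_outcomes, toy_ref, toy_new)
toy_idi = idi(toy_outcomes, toy_ref, toy_new)
toy_moved = reclassified_fraction(toy_ref, toy_new)
```

In the toy data both cases move up a category and one of the two non-cases moves down, so the comparison model improves classification in both groups.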
The distributions of selected potential clinical predictors in the derivation (N=9,427) and validation (N=5,395) cohorts are displayed in Table 1. During the 8 years of follow-up time, 1935 incident hypertension cases occurred in the derivation cohort (21%) and 1068 cases occurred in the validation cohort (20%).
As shown in Table 2, the Inclusive Model in the development cohort included systolic and diastolic blood pressure, being of Black or Hispanic race/ethnicity, age, BMI, CRP, apolipoprotein B, lipoprotein(a) and total grain intake. This model had the lowest BIC and, in the development cohort, the highest c-index and best calibration. The Simplified Model With Lipids, which included systolic and diastolic blood pressure, being of Black or Hispanic race/ethnicity, age, BMI, and total to HDL cholesterol ratio, had a slightly lower BIC and higher c-index than the Simplified Model but was the least calibrated. The Blood Pressure Only Model had the highest BIC and lowest c-index and was well calibrated. Beta coefficients were similar across the models.
In the validation cohort, the c-indices were virtually identical for the Inclusive Model, the Simplified Model With Lipids, and the Simplified Model (0.705, 0.705, and 0.703, respectively [Table 3]) and all three of these discriminated better than the Blood Pressure Only Model (c-index 0.676). However, the Inclusive Model and the Simplified Model With Lipids were no longer calibrated (Hosmer-Lemeshow p-value of 0.002 and 0.008, respectively) and their effect on risk reclassification could not be assessed.
Risk reclassification for the models which remained calibrated in the validation cohort, the Simplified Model and the Blood Pressure Only Model, is shown in Table 4. The Simplified Model reclassified 27.8% of the participants when compared to the Blood Pressure Only Model; of the 1499 participants reclassified, all but 1 were placed into more accurate risk categories (99.93%). When the NRI was calculated, the Simplified Model generated a 5.6% improvement (p-value 0.001). The IDI also showed a statistically significant (p-value <0.001) improvement of 0.017 using the Simplified Model. Furthermore, the Simplified Model remained calibrated within the reclassification table (reclassification Hosmer-Lemeshow p-value 0.70) while the Blood Pressure Only Model did not remain calibrated (p-value <0.001), indicating that the observed and predicted risk estimates did not agree for the latter model.
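The two percentages reported for the Simplified Model follow directly from the counts given in the text (5,395 women in the validation cohort, 1499 reclassified, all but 1 correctly), as this short check shows. The variable names are illustrative.

```python
n_validation = 5395    # validation cohort size
n_reclassified = 1499  # participants moved to a different risk category
n_correct = n_reclassified - 1  # all but one moved toward the observed risk

pct_reclassified = 100 * n_reclassified / n_validation
pct_correct = 100 * n_correct / n_reclassified

print(f"{pct_reclassified:.1f}% reclassified, {pct_correct:.2f}% of those correctly")
```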
Table 5 provides a clinical example of risk estimates using the Simplified Model and the Blood Pressure Only Model. As shown, using the Blood Pressure Only Model results in a 15.6 percent predicted risk of developing hypertension for a woman with a blood pressure of 115/70 mm Hg. However, using the Simplified Model and including information on her age, BMI and race/ethnicity substantially widens the range of predicted risk. For example, a 50-year-old Caucasian woman with a BMI of 20 kg/m2 would have a predicted risk of only 10.4 percent, whereas a 60-year-old Black or Hispanic woman with a BMI of 30 kg/m2 would have an increased predicted risk of 36.4 percent.
For clinical application, the Box contains the formula for calculating 8-year risk of hypertension using the Simplified Model.
Using 9,427 initially normotensive women for model development and an additional 5,395 initially normotensive women for testing and validation, we developed risk prediction models for incident hypertension and compared them to predictions based only on blood pressure and to observed incident hypertension over 8 years of follow-up. Neither the Inclusive Model, which included multiple blood biomarkers (apolipoprotein B, lipoprotein(a), and CRP) and dietary information, nor the Simplified Model With Lipids, which included total and HDL cholesterol, was well calibrated in the validation set, likely due to over-fitting. However, the Simplified Model, which included age, BMI and race/ethnicity, offered substantial improvement in terms of discrimination and reclassification over the use of current blood pressure alone. This improvement was consistent across a range of risk categories from low to very high.
Our study systematically examined a variety of possible predictors of hypertension to find a balance of simplicity, parsimony, and predictive accuracy. Many of the selected predictors have previously been shown to be independently associated with incident hypertension, including current blood pressure, age and BMI26 as well as lipid measures.7, 8, 27 Elevated rates of hypertension in Black and Hispanic women have also been previously reported.28, 29 Grain intake is a component of the DASH diet30 and whole grain intake has been previously shown to be inversely associated with incident hypertension in the Women’s Health Study.31 C-reactive protein has also been previously shown to be associated with incident hypertension in multiple cohorts.9, 11, 32, 33 Other factors, such as lipoprotein(a) and apolipoprotein B, have not been as well studied.
Our study benefits from a large sample size and number of hypertensive cases, prospective design, and information on a wide range of potential predictors of incident hypertension. Additionally, we were able to use a separate validation cohort for testing and comparison of the models generated in the derivation cohort. The importance of this approach is reflected in the poor calibration of the Inclusive Model and the Simplified Model With Lipids in the validation cohort despite being calibrated in the development cohort. However, there are two main limitations to our study. First, our cohort only included female health professionals, primarily Caucasian, and our results will need to be validated in other populations. Second, BMI, blood pressure measurements and hypertension status were self-reported. While this may create additional measurement variability, we do not believe that this limits the validity of our results. Self-reported and measured BMI have been shown to be consistent in similar cohorts.34 Accuracy of reported hypertension status in a WHS sub-sample,35 as well as a similar cohort of nurses,36 suggests little misclassification. Additionally, self-reported blood pressure in physicians had a similar correlation with measured blood pressure37 to the correlations found with repeated measured values.38 Finally, self-reported and measured blood pressure were found to have similar predictive ability for cardiovascular outcomes in pooled analysis of the results of 61 cohort studies,39 and the relationship of blood pressure to cardiovascular risk in the Women’s Health Study has been found to be consistent with non-self-report cohorts.40
Given the limitations of our study population and design, replication and validation in other communities is crucial. One source of comparison is the recently published hypertension risk score presented by the Framingham Heart Study investigators, which our study complements and expands upon.5 Despite differences in population, potential predictors, and statistical approaches, both our data and those from Framingham demonstrate that a simple model that incorporates at least current blood pressure, age, and BMI is superior to a prediction model based on current blood pressure alone. Our model includes information on ethnicity, while the Framingham algorithm includes clinic-based data on parental hypertension. However, in both studies current blood pressure, age, and BMI, the three factors common to both algorithms, accounted for the majority of the overall risk of future hypertension.
These data from the Women’s Health Study suggest that hypertension risk prediction calculated from a few readily available clinical factors offers better calibration than more complicated models and improvement in risk stratification over blood pressure alone. While biomarkers and lifestyle factors may be useful in illuminating the complex etiology of hypertension and in determining preventive strategies, they did not appear to provide additional predictive information in determining the future risk of hypertension. By contrast, the Simplified Model (as shown in the Box) could be used with readily available information during an office visit to predict the risk of hypertension in women. Additional studies are needed to assess whether use of hypertension risk prediction tools improves prevention in clinical practice.
where A = 0.51 (if systolic blood pressure is between 110 and 119 mm Hg) + 1.20 (if systolic blood pressure is between 120 and 129 mm Hg) + 0.30 (if diastolic blood pressure is between 65 and 74 mm Hg) + 0.67 (if diastolic blood pressure is between 75 and 84 mm Hg) + 0.58 (if race/ethnicity is Black or Hispanic) + 1.28 * natural logarithm (age) + 1.92 * natural logarithm (Body Mass Index)
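The terms of A listed above can be evaluated as in the following sketch. Two things here are assumptions and not part of the text: the opening line of the Box formula is missing from this copy, so the code assumes the usual logistic form, risk = 1 / (1 + exp(-(intercept + A))), consistent with the logistic model described in the methods; and because no intercept appears in this text, the code back-solves one from the worked example reported with Table 5 (a 50-year-old Caucasian woman, BMI 20 kg/m2, blood pressure 115/70 mm Hg, predicted risk 10.4 percent). The function names are hypothetical, and the back-solved value is illustrative only, not a published coefficient.

```python
import math

def linear_terms(sbp, dbp, black_or_hispanic, age, bmi):
    """Sum of the terms of A listed in the Box (no intercept appears in this text)."""
    a = 0.0
    if 110 <= sbp <= 119:
        a += 0.51
    elif 120 <= sbp <= 129:
        a += 1.20
    if 65 <= dbp <= 74:
        a += 0.30
    elif 75 <= dbp <= 84:
        a += 0.67
    if black_or_hispanic:
        a += 0.58
    return a + 1.28 * math.log(age) + 1.92 * math.log(bmi)

def logit(p):
    return math.log(p / (1.0 - p))

def risk_8yr(intercept, **person):
    """ASSUMED logistic form: 1 / (1 + exp(-(intercept + terms)))."""
    return 1.0 / (1.0 + math.exp(-(intercept + linear_terms(**person))))

# ASSUMPTION: intercept back-solved from the first worked example in the text.
example_a = dict(sbp=115, dbp=70, black_or_hispanic=False, age=50, bmi=20)
ASSUMED_INTERCEPT = logit(0.104) - linear_terms(**example_a)

# Second worked example: 60-year-old Black or Hispanic woman, BMI 30, BP 115/70.
example_b = dict(sbp=115, dbp=70, black_or_hispanic=True, age=60, bmi=30)
risk_b = risk_8yr(ASSUMED_INTERCEPT, **example_b)
```

Under these assumptions the second worked example comes out close to the 36.4 percent reported in the text, which suggests the assumed form is at least consistent with the published coefficients. Any clinical use should rely on the original Box rather than on this reconstruction.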
This study was supported by grants HL43851 and CA47988 from the National Heart Lung and Blood and National Cancer Institutes (Bethesda, MD) and by the Donald W Reynolds Foundation (Las Vegas, NV).
Conflict of Interest Statement: None of the authors has any conflict of interest to disclose. All of the authors meet the criteria for authorship, had access to the data and a role in writing the manuscript, and accept responsibility for the scientific content of the manuscript.