Greater treatment intensification (TI) improves hypertension control. However, we do not know the ideal way to measure TI for research and quality improvement efforts. We compared the ability of different TI measures to predict blood pressure (BP) control.
We enrolled 819 hypertensive outpatients from an academic, urban hospital. Each patient was assigned 3 scores to characterize TI. The any/none score divides patients into those who had any therapy increases during the study vs. none. The Norm-Based Method (NBM) models the chance of a medication increase at each visit, then scores each patient based on whether they received more or fewer medication increases than predicted. The Standard-Based Method (SBM) is similar to NBM, but expects a medication increase whenever the BP is uncontrolled. We compared the ability of these scores to predict the final systolic blood pressure (SBP). The any/none score showed a paradoxical result: any therapy increase was associated with SBP 4.6 mm/Hg higher than no increase (p < 0.001). The NBM score did not predict SBP in a linear fashion (p = 0.18); further investigation revealed a U-shaped relationship between NBM score and SBP. However, the SBM score was a strong linear predictor of SBP (2.1 mm/Hg lower for each additional therapy increase per ten visits, p < 0.001). Similarly, SBM predicted dichotomized BP control, as measured by SBP < 140 mm/Hg (OR 1.30, p < 0.001).
Our results suggest that SBM is the preferred measure of treatment intensity for hypertension care.
Improving cardiovascular outcomes will require valid approaches to measuring quality of care. Measuring the quality of hypertension management is an especially important goal, because improved blood pressure (BP) control has great potential to improve cardiovascular outcomes. One possible measure of the quality of hypertension care is the intensity of clinical management when BP is uncontrolled. As early as 1979, the Hypertension Detection and Follow-up Program demonstrated that compared to usual care, an algorithmic, stepped-care approach to treating hypertension improves BP control and reduces morbidity and mortality.1 More recently, investigators have also shown that in observational settings, patients who receive more intensive management for hypertension have better BP control.2, 3 Because of this demonstrated importance, there is increasing interest in measuring treatment intensification (TI) in the management of hypertension. A valid measure of TI could be used to profile providers, and could be an important element of research and quality improvement efforts in hypertension care. However, there is no consensus regarding the best way to measure TI, and at least three methods have been used in previous studies.2-4
The first method examines whether a patient has had any medication increases during a period of time (vs. none).4, 5 This approach has two flaws. First, it cannot distinguish gradations of TI, only any intensification vs. none. Second, it fails to account for confounding by indication, the phenomenon wherein patients with the most severe disease receive more intensive medical therapy.6 As might be expected, therefore, previous studies using this method have found the paradoxical result that greater TI seems to worsen blood pressure (BP) control.4 However, this method has remained in use through 2008,4,5 so it is important to evaluate its validity.
The other two approaches can measure gradations of TI, and do not produce paradoxical results, suggesting that they are not confounded by indication. One of these approaches relies on a norm-based method (NBM) for defining care as more or less intensive, while the other relies on a standard-based method (SBM). The NBM, described by Berlowitz et al,2 first derives a model to predict the probability of a dose increase at each visit according to various visit characteristics, then compares observed vs. predicted dose changes to characterize each patient’s care as more or less intensive than expected. The SBM, described by Okonofua et al,3 simply compares the number of dose changes observed to the number of occasions on which the BP was 140/90 mm/Hg or higher. In this system, a dose change is essentially “expected” whenever the BP is uncontrolled. Some have noted that SBM has certain inherent advantages over NBM, because it is easier to calculate and interpret.7 However, NBM incorporates a more nuanced view of clinical decision-making, because it allows for the possibility that factors other than the BP may influence the decision to intensify therapy, as well as the possibility that gradations of BP may exert differential influence on this decision. If NBM were the most valid measure of TI, as measured by BP control, it might be preferred, despite difficulties of calculation and interpretation.
However, different methods of measuring TI have not been directly compared regarding their ability to predict BP control over time. Because TI is a measure of process of care, linking it to blood pressure control outcomes can demonstrate its validity and utility.8, 9 We therefore used data from a study of hypertensive patients at an academic urban safety-net medical center to address two questions: 1) To what extent do these different measures of TI identify the same patients as having received more or less intensive management, and 2) Which, if any, of these three measures of TI best predicts BP control over time? Whatever our results, we expected them to inform future efforts to measure TI in the management of hypertension.
This report is a secondary analysis of data from a randomized trial designed to test whether a clinician-directed curriculum about patient-centered counseling could improve doctor-patient communication, adherence to therapy, and blood pressure control (ClinicalTrials.gov Identifier: NCT00201149). Patients were enrolled from seven outpatient primary care clinics at Boston Medical Center, an inner-city safety net hospital affiliated with the Boston University School of Medicine. The study was approved by the Institutional Review Board of Boston University Medical Center. We identified all patients of White or Black race, age 21 and older, with outpatient diagnoses of hypertension on at least three separate occasions between August 2004 and June 2006.
From this “universe” of 10,125 hypertensive patients at seven clinics, study staff tracked clinic visits over a 19-month period and, as patients presented for care, approached 3526 of them to request participation in the study. All willing respondents were then asked a series of questions and administered a cognitive screen to determine eligibility. A total of 1082 patients were excluded. Reasons included seeing a medical student at their visit (n = 257), use of a daily medication dispenser (because it might invalidate collection of adherence data, n = 247), cognitive impairment according to our cognitive screen (n = 199), ethnicity other than White or Black (n = 149), inability to speak English (n = 71), not being prescribed antihypertensive medication (n = 61), participation in another hypertension study (n = 30), hearing impairment (n = 16), and other reasons (n = 52), leaving 2444 eligible patients. Of those, 654 patients overtly refused to participate and 920 patients responded that they did not have time to participate that day. Total enrollment was therefore 870 patients.
The primary outcome was each patient’s final systolic blood pressure (SBP) value, drawn from the clinical record of Boston Medical Center. We chose SBP rather than diastolic blood pressure (DBP) as our primary outcome, because many more patients have poorly-controlled SBP.10 However, we also examined several secondary outcomes of hypertension care, including DBP and dichotomized measures of SBP, DBP, and overall BP control.
Automated data from Boston Medical Center’s electronic medical record (EMR) were examined. Our database included all prescriptions written, as well as all clinical BP values recorded within the study period. The unit of analysis was a visit to the primary care clinic, as identified by a date on which a BP value was recorded. When there were multiple BP values recorded on one date, we chose the one with the lowest SBP; if two values were tied, we selected the one with the lower DBP.
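The rule for collapsing several same-day readings into one visit can be sketched as follows. This is a minimal illustration rather than the study's actual data-processing code; the `(date, sbp, dbp)` tuple layout and the function name are assumptions made for the example.

```python
from collections import defaultdict


def one_reading_per_visit(readings):
    """Collapse same-day BP readings to a single reading per date.

    `readings` is an iterable of (date, sbp, dbp) tuples (hypothetical layout).
    For each date, keep the reading with the lowest SBP; if SBP is tied,
    keep the reading with the lower DBP.
    """
    by_date = defaultdict(list)
    for date, sbp, dbp in readings:
        by_date[date].append((sbp, dbp))
    # min() on (sbp, dbp) tuples compares SBP first and DBP second,
    # which is exactly the tie-breaking order described in the text.
    return {date: min(values) for date, values in sorted(by_date.items())}
```

Each distinct date in the result then counts as one clinic visit.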
We recorded the patient’s initial regimen of antihypertensive medications, i.e. the regimen prior to study inception. One of the authors (AJR) manually reviewed all prescriptions for each patient to see when the BP regimen was increased. An increase in medication was defined as either a new medication being added to the regimen or an increase in the dose of an existing medication. The period between each two BP values was assigned a 1 if the regimen was increased during that period, or a 0 if it was not. Multiple increases during a single period were counted as a 1. Dose changes occurring after the final visit were not recorded. A subset of 42 patients, representing 495 (5%) of all clinic visits, were randomly selected for blind re-abstraction by another author (DRB). Agreement between the two reviewers was excellent (kappa = 0.93, 95% CI 0.87 – 0.98).
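The coding of each between-visit period as 1 or 0 can be sketched as below. The text does not say how an increase dated exactly on a visit day was handled, so this sketch assumes it belongs to the period that begins at that visit; that boundary rule, the function name, and the use of sortable dates (for example ISO date strings) are all assumptions.

```python
from bisect import bisect_right


def flag_periods(visit_dates, increase_dates):
    """Return one 0/1 flag per period between consecutive visits.

    A period is flagged 1 if at least one regimen increase falls inside it;
    several increases in the same period still count as a single 1.
    Increases before the first visit, or on or after the final visit,
    are ignored (assumed boundary rule).
    """
    visits = sorted(visit_dates)
    flags = [0] * max(len(visits) - 1, 0)
    for when in increase_dates:
        # Index of the last visit on or before the increase date.
        period = bisect_right(visits, when) - 1
        if 0 <= period < len(flags):
            flags[period] = 1
    return flags
```

A patient with four visits therefore yields three flags, one for each interval between consecutive BP values.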
We collected patient demographic data, including race (Black or White), gender, and age. Using both ICD-9 codes and problem lists from the EMR, we noted whether patients had the following comorbid conditions, all of which could affect the blood pressure, the use of antihypertensive medications, or the perceived urgency of controlling hypertension: benign prostatic hypertrophy, cerebrovascular disease, congestive heart failure, chronic kidney disease, coronary artery disease, diabetes mellitus, hyperlipidemia, obesity (BMI > 30), peripheral vascular disease, and tobacco use.
The any/none score was “1” if the patient had at least one therapy increase during the study; otherwise, it was “0”. The any/none score does not account for the number of visits or the degree of blood pressure elevation.
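A short sketch makes the limitation concrete. The per-period 0/1 flags and the two example patients are hypothetical; the point is only that very different patterns of care collapse to the same value.

```python
def any_none_score(increase_flags):
    """Return 1 if any period had a therapy increase, otherwise 0."""
    return 1 if any(increase_flags) else 0


# Two hypothetical patients: one increase in 3 periods vs. four in 12 periods.
few_visits = [1, 0, 0]
many_visits = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0]

# Both receive the same score, although their care differed considerably.
same_score = any_none_score(few_visits) == any_none_score(many_visits)
```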
To create the NBM score,2 we began by deriving and validating a model to predict medication increases at each visit. The unit of analysis was each individual clinic visit; the outcome was whether or not the medications were increased at the visit. Our hypotheses regarding likely predictors were derived from our clinical judgment as well as our experience with the strongest predictors in previous, similar models.2, 11, 12 We considered the following possible predictors: SBP at the current and the previous visit, DBP at the current and the previous visit, number of days since the previous visit, whether the medications were increased at the previous visit, and the entire list of variables described above under “Covariates”.
We initially screened variables using recursive partitioning (CART modeling),13 using the R statistical package, version 2.6 (R Foundation, 2007). This method assigns each clinic visit into one of several categories according to several important predictors; each category is characterized by a particular frequency of medication increase. The important variables and cutoff values are empirically determined by the modeling procedure.
Having used CART to screen variables, we proceeded to derive and validate our predictive model using logistic regression. The dataset was split 60/40, with the larger subset used for derivation and the smaller for validation. We tried all candidate variables in our models, focusing particularly on those identified as important by CART modeling. In selecting cutoff values for continuous variables, we were guided by the output from CART model results and results of bivariate analyses. There were five predictors in the final model: 1) current SBP, 2) current DBP, 3) days since the last visit, 4) DBP at the previous visit, and 5) whether the medication was adjusted at the last visit (see Appendix A for model details). The c-statistic was 0.74 in the derivation set and 0.72 in the validation set; the Hosmer-Lemeshow test indicated good model fit (p = 0.59 in the derivation set and 0.44 in the validation set).
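For readers less familiar with the c-statistic, it is the probability that a randomly chosen visit with a medication increase was assigned a higher predicted probability than a randomly chosen visit without one. The sketch below computes it by direct pairwise comparison. It is an illustration only (the analysis itself was run in SAS), and the function and argument names are assumptions.

```python
def c_statistic(probabilities, outcomes):
    """Concordance (c) statistic for a binary outcome.

    `probabilities` are model-predicted probabilities of a medication increase;
    `outcomes` are the observed 0/1 values. Tied pairs count as one half.
    """
    events = [p for p, y in zip(probabilities, outcomes) if y == 1]
    non_events = [p for p, y in zip(probabilities, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p_event in events:
        for p_non_event in non_events:
            if p_event > p_non_event:
                concordant += 1.0
            elif p_event == p_non_event:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))
```

A value of 0.5 corresponds to chance discrimination and 1.0 to perfect discrimination. The pairwise loop is quadratic in the number of visits, which is acceptable for an illustration but slow for large datasets.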
We then calculated the total number of expected medication changes for each patient in the dataset by summing probabilities over all of their visits. For example, if a patient had 3 visits, with predicted probabilities of a medication change of 0.20, 0.30, and 0.50, then exactly one medication change would be expected over this 3-visit period. We assigned each patient an NBM score, using the following formula:
NBM score = (observed medication increases - expected medication increases) / number of visits
NBM scores are between -1 and 1, with 0 as the midpoint of the score. A score of zero indicates a precise match between observed and expected medication increases, with positive numbers indicating more medication increases than expected and negative numbers indicating fewer increases than expected. As an example, over a 10-visit period, a patient might have a total of 5 predicted medication increases using NBM. If this patient actually had 3 visits with medication increases, the NBM score would be -0.2, indicating 2 fewer medication increases than expected per ten visits. If the patient had 6 visits with therapy increases, the NBM score would be 0.1, indicating 1 more medication increase than expected per ten visits.
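The worked example above can be reproduced with a few lines of code. This sketch assumes that the score is the difference between observed and expected increases divided by the number of visits, which is what the example implies; the function and argument names are illustrative.

```python
def nbm_score(predicted_probabilities, increases):
    """Norm-based score: (observed - expected increases) / number of visits.

    `predicted_probabilities` holds the model's predicted probability of a
    medication increase at each visit; `increases` holds the observed 0/1
    value for the same visits.
    """
    if len(predicted_probabilities) != len(increases):
        raise ValueError("one probability and one outcome are needed per visit")
    n_visits = len(increases)
    if n_visits == 0:
        raise ValueError("at least one visit is required")
    expected = sum(predicted_probabilities)  # expected increases over all visits
    observed = sum(increases)
    return (observed - expected) / n_visits


# Ten visits with 5 expected increases in total (0.5 per visit, hypothetical).
probabilities = [0.5] * 10
three_increases = nbm_score(probabilities, [1, 1, 1, 0, 0, 0, 0, 0, 0, 0])
six_increases = nbm_score(probabilities, [1, 1, 1, 1, 1, 1, 0, 0, 0, 0])
```

Here `three_increases` and `six_increases` correspond to the two cases in the example, about -0.2 and 0.1 respectively.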
We also created an alternative NBM score for each patient, based solely on the results of our CART model (Appendix B), as in the original paper by Berlowitz, et al.2 Results obtained using this score were not meaningfully different from our main NBM score, and are not shown.
For the SBM analysis,3 the expected number of medication increases was the number of occasions on which the recorded BP was 140/90 mm/Hg or higher. Using this number, and the number of occasions on which the medication was intensified, each patient was assigned a score between -1 and 1. To make comparisons with NBM more straightforward, we reversed the polarity of the SBM score from what was originally described by Okonofua, et al.3 Therefore, we computed the SBM score using the following formula:
SBM score = (visits with a medication increase / number of visits) - (visits with BP of 140/90 mm/Hg or higher / number of visits)
For example, a patient with 5 elevated BP values over 10 visits would have a predicted value of 5 therapy increases. If this patient actually had 3 visits with medication increases, the score would be 3/10 – 5/10 = -0.2, or two fewer therapy increases than expected per ten visits. If the patient had 6 visits with therapy increases, the score would be 6/10 – 5/10 = 0.1, or one more therapy increase than expected per ten visits.
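The same calculation can be sketched in code. The sketch assumes that a BP counts as elevated when either the systolic value reaches 140 or the diastolic value reaches 90, which is one reading of "140/90 or higher"; the function name, argument layout, and threshold parameters are illustrative.

```python
def sbm_score(bp_readings, increases, sbp_target=140, dbp_target=90):
    """Standard-based score: observed minus expected increases, per visit.

    `bp_readings` is a list of (sbp, dbp) pairs, one per visit;
    `increases` is the matching list of observed 0/1 medication increases.
    An increase is "expected" at every visit with SBP or DBP at or above
    the target (assumed interpretation).
    """
    if len(bp_readings) != len(increases):
        raise ValueError("one BP reading and one outcome are needed per visit")
    n_visits = len(bp_readings)
    if n_visits == 0:
        raise ValueError("at least one visit is required")
    expected = sum(
        1 for sbp, dbp in bp_readings if sbp >= sbp_target or dbp >= dbp_target
    )
    observed = sum(increases)
    return observed / n_visits - expected / n_visits


# Five elevated and five controlled readings over ten visits (hypothetical).
readings = [(150, 85)] * 5 + [(128, 78)] * 5
three_increases = sbm_score(readings, [1, 1, 1, 0, 0, 0, 0, 0, 0, 0])
six_increases = sbm_score(readings, [1, 1, 1, 1, 1, 1, 0, 0, 0, 0])
```

Because the thresholds are parameters, a different BP target can be substituted without changing the rest of the calculation.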
We recognize that for patients with diabetes or chronic kidney disease, current guidelines set a lower BP target (i.e. 130/80 mm/Hg).14 We therefore created an alternative SBM score only for patients with a low BP target. For this alternative SBM score, a medication increase was expected on each occasion when the recorded BP is 130/80 mm/Hg or higher, as opposed to 140/90 mm/Hg for the main TI score. We divided the sample into patients with the higher and the lower BP thresholds and repeated our analyses for each group using the appropriate TI score. Results of this sensitivity analysis were similar to our main analysis, and are not shown.
Each patient was assigned three scores to measure TI in their hypertension care: any/none, NBM, and SBM. We examined the degree to which these three measures of TI were inter-correlated. For comparisons involving the any/none score, we used t-tests to compare means of the other two scores when the any/none score was “any” vs. “none”. We compared the NBM and SBM scores using Spearman correlation (due to the non-Gaussian distribution of the SBM score), as well as dividing them into quartiles and constructing a 4 × 4 table.
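Spearman correlation is the Pearson correlation of the ranks, which is why it suits a non-Gaussian score. The following is a minimal pure-Python sketch with tied values given their average rank. It illustrates the statistic and is not the SAS procedure used in the analysis; all function names are assumptions.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```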
We then examined the predictive validity of these three scores for the main dependent variable, the final SBP (continuous), as well as several secondary measures of BP control, including final DBP (continuous) and whether the final SBP was <140 mm/Hg (categorical). For the any/none score, we compared the “any” group to the “none” group using t-tests or chi-square, as appropriate. For the NBM and SBM scores, we used linear or logistic regression to model the relationship between the score and the BP outcomes, as appropriate. We repeated these analyses, controlling for patient-level covariates. We also divided the NBM and SBM scores into quartiles and performed ANOVA tests regarding the ability of the quartiles to predict the final SBP. For all analyses except the CART modeling, we used SAS, version 9.1 (SAS Institute, Cary, NC). The authors had full access to the data and take responsibility for its integrity. All authors have read and agree to the manuscript as written.
Of 870 patients enrolled in the study, 51 were excluded from this analysis because they had 2 or fewer BP values. Therefore, 819 patients with hypertension, managed at Boston Medical Center, constituted our study population (Table 1). The mean follow-up time was 24 months; on average, patients visited the clinic once every 2 months. The mean age was 59.6 years, 34% of patients were male, and most (58%) were of Black race. Considering their relatively young age, the population had a high burden of comorbidity: 54% had hyperlipidemia, 33% had diabetes, 13% had coronary artery disease, and 59% were obese. Most patients (74%) were receiving two or more antihypertensive medications at study inception. The population was characterized by relatively well-controlled hypertension at baseline: the mean BP was 134/80 mm/Hg, and 55% of patients were below 140/90 mm/Hg.
After excluding the initial and final clinic visits for each patient (which were not analyzed regarding therapy increases), therapy was increased at 835 of the 9828 clinic visits (8.5%). 406 patients (50%) had at least one therapy increase during the study; among patients with at least one therapy increase, the mean number of increases was 2.1 and the median was 2.0. We calculated NBM and SBM scores for each patient in the database. NBM scores were narrowly distributed (median -0.04; Interquartile Range (IQR) -0.06, 0.05; 5th and 95th percentiles -0.13, 0.25). SBM scores were more widely distributed (median -0.25; IQR -0.50, -0.05; 5th and 95th percentiles -0.80, 0.05).
Before examining the scores as predictors of BP control, we compared their classification of patients. The mean NBM score was 0.09 when the any/none score was “any”, vs. -0.07 when it was “none” (p < 0.001). In contrast, the SBM score did not differ meaningfully between the two groups (-0.27 vs. -0.30, p = 0.15). The Spearman correlation between the NBM and SBM scores was 0.44, a fairly low correlation for two scores that are intended to measure the same construct. We also divided the NBM and SBM scores into quartiles and compared their classification of patients (Table 2). The kappa statistic for agreement between these two scores was 0.14 (95% CI 0.09 – 0.18). Extreme differences in quartile classification were not uncommon; for example, there were 209 patients (26%) whose quartile classifications differed by more than one category.
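Agreement between the two quartile classifications can be illustrated with an unweighted Cohen's kappa, together with a count of patients placed more than one quartile apart. The text does not state whether the reported kappa was weighted, so the unweighted form here is an assumption, as are the function names and the coding of quartiles as integers 1 to 4.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two classifications of the same patients."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("both classifications must cover the same patients")
    n = len(labels_a)
    # Proportion of patients placed in the same category by both scores.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    # Agreement expected by chance, from the marginal category frequencies.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


def count_far_apart(quartiles_a, quartiles_b):
    """Number of patients whose quartiles differ by more than one category."""
    return sum(abs(a - b) > 1 for a, b in zip(quartiles_a, quartiles_b))
```

A kappa of 0 indicates agreement no better than chance, and a kappa of 1 indicates perfect agreement.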
Any therapy increase (vs. none) was examined as a predictor of the final BP. Patients with at least one therapy increase had a mean final SBP of 135.2 mm/Hg, compared to a mean final SBP of 130.6 mm/Hg among patients who had no therapy increases (p < 0.001). As expected, because this measure does not control for confounding by indication, it produces a paradoxical result (therapy increases are associated with a higher final BP).
NBM score was a poor predictor of BP control (Table 3). In a linear regression, the NBM score was not a significant predictor of the final SBP (model R2 = 0.001, p = 0.28). Adding patient-level covariates improved the model fit somewhat. We investigated further by dividing the NBM score into quartiles (Table 4). A U-shaped relationship, rather than a linear relationship, was observed between NBM score quartiles and the final SBP. The NBM score also performed poorly as a predictor of the final DBP (R2 = 0.002, p = 0.19), and as a predictor of whether the final SBP would be below 140 mm/Hg (OR = 1.06 per change of 0.1, c statistic 0.56, p = 0.28).
In contrast to the NBM score, the SBM score was an excellent predictor of the final blood pressure (Table 5). In a linear regression, the beta coefficient was -2.1, indicating that for each 0.1 of the SBM score (one more therapy increase per ten visits), the final SBP was 2.1 mm/Hg lower (R2 = 0.12, p < 0.001). Adding covariates to the model improved its fit by a similar margin as with the NBM model, but SBM persisted as a powerful predictor of the final SBP. In additional stratified analyses, SBM performed similarly in males and females, in White and Black patients, and among subgroups of patients with particularly severe comorbid conditions such as chronic kidney disease, congestive heart failure, and peripheral vascular disease.
We investigated further by dividing the SBM score into quartiles (Table 4). A strong linear relationship was observed between SBM score quartiles and final SBP (p for linear trend < 0.001). The SBM score was also a predictor of the final DBP (beta coefficient -0.8, p < 0.001) and of whether the final SBP would be below 140 mm/Hg (OR = 1.30 per change of 0.1, c-statistic 0.70, p < 0.001).
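An odds ratio reported "per change of 0.1" can be rescaled to other differences in score, because odds ratios multiply on the log-odds scale. The sketch below assumes the log-odds are linear in the SBM score, as a logistic regression with a single continuous term implies; the function name and the example change of 0.2 are illustrative.

```python
import math


def odds_ratio_for_change(or_per_step, change, step=0.1):
    """Rescale an odds ratio reported per `step` to a different `change`."""
    log_odds_per_step = math.log(or_per_step)
    return math.exp(log_odds_per_step * change / step)


# Two additional therapy increases per ten visits (a change of 0.2 in score).
or_for_two_increases = odds_ratio_for_change(1.30, 0.2)
```

Under that assumption, an odds ratio of 1.30 per 0.1 corresponds to about 1.69 (1.30 squared) for a difference of 0.2.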
Optimizing approaches to measuring the quality of care delivered to patients with chronic diseases is an important research goal. This is particularly true for measuring treatment intensity in the care of hypertension, because we have decades of evidence showing that more intensive treatment improves blood pressure control.1-3 We therefore compared the predictive criterion validity of three approaches of measuring TI in hypertension care. We found that the any/none measure produces paradoxical results because it does not account for confounding by indication. To our surprise, we found that the NBM score was not predictive of BP control. Further investigation demonstrated that the NBM score appeared to have a U-shaped relationship with BP outcomes, complicating its use as a predictor and calling into question its validity as a measure of TI, which is meant to be monotonic.
In contrast, the SBM score was a powerful predictor of the final BP, a relationship which remained undiminished after controlling for covariates. It is important to remember that the beta coefficient we found for the effect of SBM upon final SBP, -2.1 mm/Hg, was for each additional therapy increase per ten visits, a relatively small difference in management. Larger differences in management would obviously improve BP control much more. For example, the difference in final SBP between the highest and lowest quartiles of TI (125 mm/Hg vs. 141 mm/Hg) suggests an effect of considerable magnitude and clinical significance.
We had expected to find that the any/none measure performs poorly as a measure of TI, because previous studies have shown that a failure to account for confounding by indication produces paradoxical (or attenuated) results.4-6 We had also expected to find that NBM is superior to SBM as a predictor of BP control, because it incorporates a more nuanced representation of clinical decision-making. The apparent lack of predictive criterion validity for NBM in our study contrasts with the findings of earlier studies, particularly the original paper by Berlowitz, et al.2 This difference may be attributable to improved BP control: mean initial BP was 134/80 mm/Hg in our study vs. 146/83 in the earlier study.2 NBM may have worked better in an era of mediocre BP control, while SBM may be more suited to pursuing what are ultimately smaller improvements in BP.
Our study has several limitations. First, TI is not universally accepted as an ideal theory to understand poor control of asymptomatic, chronic conditions, especially when it is presented as “clinical inertia”,15 the obverse of TI. Some studies have suggested that on deeper inspection, what seems to be clinical inertia could also be attributed to “competing demands”,16, 17 “clinical uncertainty”,18 or “appropriate inaction”.19 Other studies have explored the relationship between TI and adherence,4, 5, 20-23 or the patient and visit-level predictors of TI.10, 17, 24-26 This study did not include specific measures of adherence, competing demands, patient complexity, clinical uncertainty, or appropriate inaction, although we did account for the burden of comorbid disease, which relates to several of these concepts (competing demands and patient complexity). However, because we compared multiple measures of TI using the same database, any unmeasured covariates would have affected all comparisons equally. In addition, while refinements to the TI concept are always welcome, our study reinforces the notion that TI, as embodied in the SBM score, is an important determinant of BP control.
Second, our study compared different methods of measuring TI in hypertension care. It should not be assumed, however, that SBM would also be the ideal system for measuring TI in the care of diabetes or hyperlipidemia; future research should address those questions. Finally, our data were drawn from an academic urban hospital, which may limit generalizability. The clinicians at Boston Medical Center may have managed hypertension differently than non-academic clinicians. Similarly, the BP control in this cohort was quite good; it is possible that the SBM score may work particularly well in such a setting. In addition, many of the patients in our study were immigrants, ethnic minorities, and of low socioeconomic status. However, given the relatively good BP control achieved among this population, the challenges these patients face in their everyday lives do not seem to threaten the generalizability of our findings.
We have known for 30 years that more intensive management leads to better hypertension outcomes, in both clinical trials and observational settings.1-3 What we have lacked is consensus about the best method to measure TI in the care of hypertension. Our study found that any/none and norm-based measures were not valid measures of TI, while a standard-based measure was an excellent predictor of BP control. Unless these results are challenged by other studies, standard-based measurement should be the preferred method of characterizing TI in future studies of hypertension care. SBM can now serve as the basis of research and quality improvement efforts to improve the process and outcomes of hypertension care.
The authors thank Mark Glickman, PhD, for his help with the CART models, and Al Ozonoff, PhD, for other statistical advice.
Funding Sources: This research was supported by a grant from the National Institutes of Health (HL072814, NR Kressin, PI). Dr. Rose is supported by a career development award from the Department of Veterans Affairs, Health Services Research and Development Service. Dr. Kressin is supported by a Research Career Scientist award from the Department of Veterans Affairs, Health Services Research & Development Service (RCS 02-066-1). The views expressed in this article are those of the authors and do not necessarily represent the views of the Department of Veterans Affairs.
Disclosures: The authors have no conflicts of interest to disclose.