The current emphasis on comparative effectiveness research will provide practicing physicians with increasing volumes of observational evidence about preventive care. However, numerous highly publicized observational studies of the effect of prevention on health outcomes have reported exaggerated relationships that were later contradicted by randomized controlled trials. A growing body of research has identified sources of bias in observational studies that are related to patient behaviors or underlying patient characteristics, known as the healthy user effect, the healthy adherer effect, confounding by functional status or cognitive impairment, and confounding by selective prescribing. In this manuscript we briefly review observational studies of prevention that have appeared to reach incorrect conclusions. We then describe potential sources of bias in these studies and discuss study designs, analytical methods, and sensitivity analyses that may mitigate bias or increase confidence in the results reported. More careful consideration of these sources of bias and study designs by providers can enhance evidence-based decision-making.
Practicing clinicians face a substantial challenge when attempting to interpret data from observational studies that report the effects of prevention on patient health outcomes. Numerous high-profile descriptive studies of preventive screening tests, behaviors, and treatments have reported dramatically reduced mortality or improved health outcomes. However, many of these findings were later thrown into question when randomized controlled trials (RCTs) indicated contradictory results. In some cases, the flawed observational studies were the source of evidence for broad practice recommendations.1 While it would be a mistake to ignore all evidence from observational studies—there are many questions that will never be answered by RCTs—clinicians must be careful when interpreting observational studies demonstrating what seem to be surprisingly large beneficial effects of preventive therapy.
With the investment of over $1 billion in comparative effectiveness research, clinicians will be faced with increasing volumes of complex results. Proper interpretation will require familiarity with a host of sources of bias in observational research. Bias results when features of a study’s design lead to estimates that do not accurately reflect the relationship between the study variables. In this review, we explore a specific subset of these sources of bias: confounding in observational studies resulting from patient-level tendencies to engage in healthy behaviors or from physicians’ perceptions of the health of patients. A recent body of research has emerged examining these sources of bias and their effect on the interpretation of observational research findings. In this paper, we provide a brief review of observational studies that have appeared to reach incorrect conclusions due to healthy user and other related types of bias. We describe the sources of bias in these studies and discuss study designs, analytic methods, and sensitivity analyses that may mitigate bias or increase confidence in the results reported. We offer guidance to physicians to encourage a more critical review of the literature with the goal of enhancing evidence-based, rational clinical decision-making.
The best known example of a divergence between observational and RCT evidence is the story of hormone replacement therapy (HRT) and cardiovascular disease. In 1985, investigators from the Nurses’ Health Study reported that post-menopausal women taking HRT had one third the risk of coronary heart disease (CHD) of women not taking HRT.2 A series of observational studies reporting similar findings followed.1 On this basis, both the American Heart Association and the American College of Physicians recommended HRT for prevention of coronary heart disease in post-menopausal women, and by 2001, an estimated 15 million women were filling HRT prescriptions annually.3 The HERS trial, which found no overall cardiovascular benefit of HRT in women with existing coronary disease and some evidence of early increased risk, was published in 1998.4 The Women’s Health Initiative, a more definitive RCT of HRT in post-menopausal women published in 2002, reported a 29% increase in the incidence of CHD among women randomized to HRT.5 While debate continues about the risks and benefits of HRT for specific subgroups, the original observational studies seemed to overstate the benefits of preventive therapy.
Similar stories can be told about a wide array of vitamins and prescription drugs. Observational studies have suggested that vitamins B, C, and E, and beta-carotene consumption all reduce cardiovascular mortality, only to be overturned by subsequent RCT evidence suggesting no benefit.6–8 The apparent protective effects of fiber and folic acid intake on the incidence of colorectal cancer found in observational studies also were not supported by RCTs.9 Observational studies of statin use and hip fracture have consistently reported a protective effect, with a 23% reduction estimated in a recent meta-analysis.10 However, published secondary analyses of trial data have not found a protective effect of statins on hip fracture.10 The reported benefits of statins on Alzheimer’s disease,11 sepsis,12 and cancer13 have also been questioned.
Another topic of recent debate is the magnitude of the benefit of influenza vaccination on mortality among elderly patients. Observational studies have typically reported 40%–50% reductions in all-cause mortality.14,15 However, the observation that influenza vaccination appears to protect patients against mortality prior to the start of the flu season has cast doubt on these findings,16 as have results indicating that improved statistical adjustment greatly reduces the apparent benefit.17
The deviations in results of observational studies and RCTs examining the effectiveness of preventive care have fueled a field of inquiry to seek new sources of bias in observational research that can explain these deviations and methods to help adjust for them. In this review, we discuss sources of bias related to patient health-seeking behavior and physician perceptions of patient health, an area gaining increased attention in the literature. We do not perform an exhaustive review of all sources of bias in observational research.
The healthy user effect
The healthy user effect is best described as the propensity for patients who receive one preventive therapy to also seek other preventive services or partake in other healthy behaviors.18 Patients who choose to receive preventive therapy may exercise more, eat a healthier diet, wear a seatbelt when they drive, and avoid tobacco. As a result, an observational study evaluating the effect of a preventive therapy (e.g., statin therapy) on a related outcome (e.g., myocardial infarction) without adjusting for other related preventive behaviors (e.g., healthy diet or exercise) will tend to overstate the effect of the preventive therapy under study. The healthy user effect has been widely cited as a likely source of bias in observational studies of HRT. Studies indicate that women who took HRT were more likely to engage in healthy behaviors such as regular exercise, a healthy diet, abstinence from alcohol, and maintenance of a healthy weight as compared to non-users.2 The apparent protective effect of HRT on cardiovascular disease likely reflects these unmeasured differences in patient characteristics.
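The mechanism can be made concrete with a small worked example. The sketch below uses invented counts (not data from any cited study) for a hypothetical cohort in which the preventive therapy has no true effect, but patients with healthy behaviors are both more likely to take it and less likely to experience the outcome. All names and numbers are illustrative assumptions.

```python
# Hypothetical cohort: all counts and risks are invented for illustration.
# The therapy has NO true effect; outcome risk depends only on behavior.
strata = {
    "healthy_behaviors": {"n_treated": 800, "n_untreated": 200, "risk": 0.05},
    "other_patients": {"n_treated": 200, "n_untreated": 800, "risk": 0.10},
}


def risk_ratio(events_treated, n_treated, events_untreated, n_untreated):
    """Risk in the treated divided by risk in the untreated."""
    return (events_treated / n_treated) / (events_untreated / n_untreated)


# Crude analysis: pool everyone and ignore behavior.
treated_n = sum(s["n_treated"] for s in strata.values())
untreated_n = sum(s["n_untreated"] for s in strata.values())
treated_events = sum(s["n_treated"] * s["risk"] for s in strata.values())
untreated_events = sum(s["n_untreated"] * s["risk"] for s in strata.values())
crude_rr = risk_ratio(treated_events, treated_n, untreated_events, untreated_n)

# Stratified analysis: compare treated and untreated within each behavior group.
stratum_rr = {
    name: risk_ratio(
        s["n_treated"] * s["risk"], s["n_treated"],
        s["n_untreated"] * s["risk"], s["n_untreated"],
    )
    for name, s in strata.items()
}

print(f"crude RR = {crude_rr:.2f}")
for name, rr in stratum_rr.items():
    print(f"{name}: RR = {rr:.2f}")
```

Under these assumed numbers the pooled comparison suggests roughly a one-third reduction in risk, while the risk ratio within each behavior group is 1.0. The apparent benefit comes entirely from who chooses to be treated.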
The healthy adherer effect
Similarly, the healthy adherer effect arises when patients who adhere to preventive therapy are more likely to engage in other healthy behaviors than their non-adherent counterparts. For example, it has been observed that patients who adhere to one chronic medication are more likely to adhere to other therapies22 and more likely to receive recommended cancer screening tests and immunizations.23,24 In a study of elderly patients initiating statins, patients who filled two or more statin prescriptions during a 1-year ascertainment period were more likely than patients who filled only one prescription to receive prostate-specific antigen tests (hazard ratio [HR] = 1.57), fecal occult blood tests (HR = 1.31), screening mammograms (HR = 1.22), influenza vaccinations (HR = 1.21), and pneumococcal vaccinations (HR = 1.46) during follow-up, even after adjusting for commonly measured covariates.23
This phenomenon can result in biased estimates of the effect of adherence on clinical outcomes. The most striking example is the observation that patients adherent to placebo in RCTs had lower rates of mortality than non-adherent patients.25,26 A recent study of patients initiating statin therapy in British Columbia provides another illustration. In this study, more adherent patients were less likely to have motor vehicle accidents (hazard ratio, 0.75) and workplace accidents (hazard ratio, 0.77) than less adherent patients even after controlling for typical sociodemographic and health characteristics.24 Failing to account for behaviors that correlate with medication adherence will lead researchers to conclude that preventive medication use and adherence to preventive medications are more strongly associated with outcomes than is the case.
Confounding by functional status or cognitive impairment
Cognitive impairment limits some patients’ interest in, or ability to visit, their physician, and severe functional impairment may serve as a barrier to travel to a physician’s office. One study of elderly female patients found that patients with higher levels of functional impairment, who presumably are sicker than those without such impairment, also had significantly lower rates of breast and cervical cancer screening.27 This type of confounding has been cited as an explanation for the large observed protective effect of influenza vaccination on mortality risk. A nested case-control study found that functional limitations, such as the need for assistance bathing, were much more common in patients dying during the flu season than in age-matched controls and were associated with a decreased probability of receiving an influenza vaccination. Adjusting for these factors attenuated the estimated protective effect of flu shots on all-cause mortality from 41% to 29%.17 Observational studies that do not account for functional status or cognitive impairment will overstate the effect of a preventive therapy if sicker patients disproportionately do not receive preventive therapies.
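To illustrate how adjustment for functional status can attenuate an estimate, the sketch below compares a crude risk ratio with a Mantel-Haenszel risk ratio stratified by functional limitation. The counts are invented for illustration and are not taken from the cited study; the Mantel-Haenszel estimator is used here as one standard stratified method, not as the method the cited authors used.

```python
# Hypothetical counts, invented for illustration (not from the cited study).
# Each stratum holds deaths and totals for vaccinated and unvaccinated patients.
strata = [
    # No functional limitation: most patients vaccinated, mortality low.
    {"vax_events": 128, "vax_n": 8000, "unvax_events": 40, "unvax_n": 2000},
    # Functional limitation: fewer patients vaccinated, mortality high.
    {"vax_events": 80, "vax_n": 1000, "unvax_events": 100, "unvax_n": 1000},
]


def crude_risk_ratio(strata):
    """Risk ratio after pooling all strata, ignoring functional status."""
    vax_risk = sum(s["vax_events"] for s in strata) / sum(s["vax_n"] for s in strata)
    unvax_risk = sum(s["unvax_events"] for s in strata) / sum(s["unvax_n"] for s in strata)
    return vax_risk / unvax_risk


def mantel_haenszel_risk_ratio(strata):
    """Mantel-Haenszel summary risk ratio across strata."""
    numerator = sum(
        s["vax_events"] * s["unvax_n"] / (s["vax_n"] + s["unvax_n"]) for s in strata
    )
    denominator = sum(
        s["unvax_events"] * s["vax_n"] / (s["vax_n"] + s["unvax_n"]) for s in strata
    )
    return numerator / denominator


crude = crude_risk_ratio(strata)
adjusted = mantel_haenszel_risk_ratio(strata)
print(f"crude RR = {crude:.2f} (apparent reduction {1 - crude:.0%})")
print(f"adjusted RR = {adjusted:.2f} (apparent reduction {1 - adjusted:.0%})")
```

With these assumed counts the crude analysis suggests about a 50% reduction in mortality, while the stratified estimate is 20%, because unvaccinated patients are concentrated in the high-mortality stratum.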
Confounding by selective prescribing
Physicians frequently decide not to prescribe preventive therapy to patients who are frail or who have terminal or acute illness, both in hospitalized patients and in the outpatient setting.28 One study evaluated characteristics of patients who received statins and found that “frail” patients had 26%–33% lower odds of receiving lipid-lowering therapy than patients who were not frail and that this relationship could, at least partially, explain previously documented relationships between statin use and mortality.29 Selective prescribing has also been suggested as an explanation for the apparent effect of flu shots on mortality. Several studies have observed that patients who are hospitalized during the fall when flu shots are being administered have low rates of vaccination30,31 and high rates of mortality.31 Similarly, patients who develop serious comorbid conditions, such as cancer or end-stage renal disease, may be more likely to discontinue a preventive therapy.
While no methodology will completely eliminate bias in observational research, a number of approaches can be used to minimize bias and affirm the validity of the results. Critical readers of the observational literature must assess whether the authors have adequately minimized bias through appropriate study design and statistical analysis (Table 1).
Prevalent versus new user designs
Studies comparing prevalent users of preventive therapy with non-users are often problematic. In contrast to patients who are new users of therapy, a prevalent user population is likely to be enriched with patients who are adherent to and tolerant of the medication under study. These patients are likely different from apparently similar non-users, as they have initiated the preventive therapy and remained adherent to it. For this reason, “new user” designs, where treatment initiators are compared to similar non-initiators, are viewed as preferable. Even with a new user study design, healthy adherer effects can still bias an as-treated or on-treatment analysis (i.e., an analysis where patients who discontinue treatment are censored from the analysis) if non-adherence is potentially informative. For this reason, an intention-to-treat analysis (where patients who were once treated are analyzed as always treated) should be conducted with adjustment for other potential confounders.
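As a rough sketch of how new users might be separated from prevalent users in pharmacy claims data, the example below labels a patient as a new user only if there is no fill during a washout window before the study start. The washout approach, the 365-day default, the function name, and the data layout are all assumptions made for illustration; the text above does not prescribe a specific rule.

```python
from datetime import date, timedelta


def classify_users(fills, study_start, washout_days=365):
    """Label each patient as 'prevalent', 'new', or 'non-user'.

    fills maps a patient ID to a list of fill dates for the study drug.
    A fill inside the washout window before study_start marks a prevalent
    user; otherwise a first fill on or after study_start marks a new user.
    """
    washout_start = study_start - timedelta(days=washout_days)
    labels = {}
    for patient_id, fill_dates in fills.items():
        filled_in_washout = any(washout_start <= d < study_start for d in fill_dates)
        filled_in_study = any(d >= study_start for d in fill_dates)
        if filled_in_washout:
            labels[patient_id] = "prevalent"
        elif filled_in_study:
            labels[patient_id] = "new"
        else:
            labels[patient_id] = "non-user"
    return labels


# Hypothetical fill histories, invented for illustration.
fills = {
    "A": [date(2009, 3, 1), date(2010, 2, 1)],  # already on therapy
    "B": [date(2010, 4, 15)],                   # first fill during the study
    "C": [],                                    # never filled
}
labels = classify_users(fills, study_start=date(2010, 1, 1))
print(labels)
```

In an intention-to-treat analysis built on such a cohort, patient B would remain in the treated group from the first fill onward, whether or not later fills occur.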
Active comparator
While new user designs can guard against healthy adherer bias, healthy user bias can still occur. One way of reducing differences between patients initiating a preventive therapy and non-initiating comparators is to select the comparison group from patients who initiated a different preventive therapy.32 Comparisons between populations of patients that have all received preventive therapy will be more interpretable than studies where one group received no therapy at all, as those comparisons are more likely to be biased by unmeasured patient characteristics.
Improved statistical adjustment
Unfortunately, there is no simple way to control for most biases described here using typical health care datasets, the source of most epidemiologic studies of prevention, because many variables of interest are unmeasured. However, we may be able to identify proxies for many important hypothetical unmeasured confounders. For example, the use of preventive services such as age- or condition-specific use of vaccines, mammography, or colonoscopy may help to control for the healthy user effect. Similarly, we may control for the healthy adherer effect by adjusting for past adherence to other chronic medications. We may be able to develop a functional or cognitive capacity score that incorporates use of medications for dementia as well as length of stays in nursing homes and rehabilitation centers. Further predictors of selective prescribing may also be used in multivariable analyses to more fully control for confounding. There is an important opportunity for researchers to create and validate measures that better adjust for confounding in studies of preventive therapies. New computing-intensive techniques provide the ability to simultaneously adjust for hundreds of covariates and may better control for these types of confounding.33
Sensitivity/secondary analyses
In the absence of better indices to control for these biases, a number of diagnostic tests can be used. One approach is to evaluate “negative control outcomes,” events that should be unaffected by the treatment under study but may be related to the phenomenon causing bias (e.g., healthy user effect). In an unbiased analysis, there would be no association between the treatment and these control outcomes. An example of such a design was a recent study evaluating the effect of flu shot receipt on mortality in the period before the flu season began. It found a strong apparent protective association between flu shots and mortality even in this period, when the vaccine could not yet have prevented influenza deaths, suggesting that the improved outcomes were not due to the influenza vaccines themselves.16
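A negative control check of this kind can be sketched as a simple calculation: estimate the risk ratio for the control outcome and see whether its confidence interval excludes 1. The example below uses the standard large-sample interval for a log risk ratio; the counts are invented for illustration and do not come from the cited study.

```python
import math


def risk_ratio_with_ci(events_exposed, n_exposed, events_unexposed, n_unexposed, z=1.96):
    """Risk ratio with a large-sample confidence interval on the log scale."""
    rr = (events_exposed / n_exposed) / (events_unexposed / n_unexposed)
    se_log_rr = math.sqrt(
        1 / events_exposed - 1 / n_exposed + 1 / events_unexposed - 1 / n_unexposed
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical pre-season deaths: the vaccine cannot plausibly prevent these,
# so an unbiased analysis should give a risk ratio close to 1.
rr, lower, upper = risk_ratio_with_ci(
    events_exposed=60, n_exposed=10000,      # vaccinated
    events_unexposed=100, n_unexposed=10000,  # unvaccinated
)
bias_suspected = upper < 1 or lower > 1
print(f"negative control RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
print("residual confounding suspected" if bias_suspected else "no signal of bias")
```

An interval that excludes 1 for an outcome the treatment cannot affect points to residual confounding in the main analysis. An interval that includes 1 is reassuring but does not prove the absence of bias.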
Negative control exposures may also be used. In such a design, the exposure would not be expected to have a biological effect on the study outcome, but might be influenced by health status or health-seeking behavior in the same way the study exposure is. As an example, one study of the effects of statin and beta-blocker adherence on mortality in post-MI patients adjusted for adherence to calcium channel blockers (the negative control exposure). This was included on the grounds that calcium channel blockers do not have a demonstrated benefit on mortality post-MI, but the adherence behavior itself may be related to the outcome.34
Instrumental variable methods or quasi-experimental designs can also be used to generate unbiased estimates in some situations.23,35,36 Dose-response relationships can be tested to better assess the strength of the relationship between the preventive therapy and the outcome in question. Additionally, researchers can restrict the study to a population that is likely to be more homogeneous with regard to health-seeking status to assess whether the result changes. A study of the effect of flu shots restricted the population to those who had an optometry visit in the past year, dramatically reducing the strength of the observed relationship between flu shots and mortality.17
When interpreting epidemiologic studies of prevention in the scientific literature, we recommend a healthy skepticism when encountering what seem like surprisingly large beneficial effects of preventive therapies. Readers must first assess the plausibility of results. For example, one argument against the plausibility of the finding that flu shots reduce mortality by 50% is the observation that the flu season is only associated with a 10% increase in all-cause mortality.37
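The plausibility argument is simple arithmetic, sketched below. It assumes, generously, that all of the 10% excess in-season mortality is caused by influenza and that the vaccine prevents every such death; the function name and the efficacy parameter are illustrative assumptions.

```python
def max_plausible_reduction(excess_fraction, vaccine_efficacy=1.0):
    """Upper bound on the reduction in in-season all-cause mortality.

    excess_fraction: relative increase in all-cause mortality during flu
    season (0.10 for a 10% increase), assumed here to be entirely due to flu.
    vaccine_efficacy: assumed share of flu deaths the vaccine prevents.
    """
    in_season_mortality = 1.0 + excess_fraction       # relative to baseline
    flu_share = excess_fraction / in_season_mortality  # share of in-season deaths
    return flu_share * vaccine_efficacy


bound = max_plausible_reduction(0.10)
print(f"maximum plausible reduction: {bound:.1%}")
```

Even under these favorable assumptions, the bound is about 9% of in-season deaths, far below a reported 50% reduction.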
It is also important to consider whether the sources of confounding listed here could have biased the results and, if so, whether they were considered in the analysis. When evaluating observational studies of preventive services, clinicians should ask themselves whether the service may have been administered differentially between the study groups because of baseline health or health-seeking characteristics of patients. If so, clinicians should work through a quick checklist before interpreting the results: Did the authors consider this source of bias? Do the absolute values of the effect size make sense? Are they clinically meaningful? Clinicians should also assess whether the authors conducted sensitivity analyses to explore the effect of these biases on the results. Did they look at only new users of a preventive service or all users, and did they require controls to have used an active comparator? What types of sensitivity analyses were performed to adjust for these sources of bias? Did the authors use instrumental variables, consider negative control exposures or outcomes, or limit the sample to a more homogeneous subsample as sensitivity analyses, and did these attenuate the relationships?
Consideration of these sources of potential bias and efforts to control for them are essential to better evaluate the credibility of this literature. While randomized controlled trials are required before bringing new preventive therapies to market, we often rely on observational data to more fully assess the risks and benefits of therapies. A more comprehensive approach to reading this literature will help providers to better assess the quality of the information presented and to judge how conclusive the results are. Intriguing findings that have not thoughtfully accounted for these sources of bias should lead clinicians to call for specific prospective RCT evidence to enhance clinical decision-making regarding preventive services. Clinicians should be skeptical when interpreting results of observational studies of preventive services that have not accounted for healthy user and related biases. Such skepticism should enhance a clinician’s ability to prioritize the available evidence and hopefully will improve clinical decision-making.
This work is supported by a career development award from the National Heart, Lung and Blood Institute (K23HL090505-01) for Dr. Shrank. Dr. Brookhart is supported by a career development award from the National Institutes of Health (AG-027400).
Conflict of Interest
None disclosed.