Despite the high prevalence of alcohol consumption in the US, ‘mainstream’ physicians generally consider it to be peripheral to most patient care. This may be due in part to a dearth of rigorous research on alcohol’s effect on common diseases.
To evaluate this issue, we examined six systematic reviews, four of which were conducted as part of a research initiative supported by the Robert Wood Johnson Foundation, the Program of Research to Integrate Substance Use Information into Mainstream Healthcare (PRISM). PRISM aimed to assimilate and improve the evidence on the medical impact of alcohol (and other drugs of abuse) on common chronic conditions.
From these reviews, we summarize the methodological limitations of research on alcohol’s impact on the development and/or clinical course of depression, hypertension, diabetes, bone disease, dementia, and sexually transmitted diseases. The studies included in these reviews were largely fair to good quality, and few were in primary care settings. Syntheses were hampered by the myriad definitions of alcohol consumption, ranging from any/none to seven levels, and by a plethora of types of alcohol use disorders.
We recommend more high-quality observational and experimental studies in primary care settings as well as a more standard approach to quantifying alcohol use and to defining alcohol use disorders.
A 2005 Institute of Medicine (IOM) report warned that physicians’ failure to screen for and address the effects of alcohol on health and disease compromised the quality of care for Americans1. The IOM report is responsive to warning signs from national survey data. For example, according to 2006 National Center for Health Statistics data, 61% of Americans drink alcohol, and 33% of drinkers report binging (i.e., five or more drinks/day)2. Despite the high prevalence of alcohol consumption, often at levels deemed to be unhealthy3, most physicians pay little attention to their patients’ alcohol consumption or to its effects on common diseases4–6. When physicians do ask about alcohol use, they record it in the social history section of the chart so alcohol use information is usually not accessible when diagnosing and treating diseases.
The limited relevance and quality of research on the effects of alcohol use on common clinical conditions seen in primary care practice may contribute to this lack of attention. Thus, with support from the Robert Wood Johnson Foundation, the Program of Research to Integrate Substance Use Information into Mainstream Healthcare (PRISM) was launched in 2004 to assimilate and improve the evidence on the medical impact of alcohol (and other drugs of abuse) on common chronic conditions. To these ends, we commissioned systematic reviews about the risks and benefits of alcohol use on a variety of common medical conditions that internists manage on a daily basis. In addition, other critical reviews have recently been published on the impact of alcohol use disorders on common clinical conditions. In this commentary, we identify themes from the critiques of the evidence from six reviews and offer suggestions to improve the validity and relevance of the research on the impact of alcohol consumption on common clinical conditions.
For this methodological critique, we included six reviews: four reviews commissioned by PRISM7–10 and two identified in a search of the literature. We searched PubMed, MEDLINE, and Cochrane Libraries for all English language systematic reviews published from 1/2004 through 5/2009 using the search terms “alcohol drinking/adverse effects” AND “systematic reviews.” The search found 37 unique studies, of which two11–12 met the following criteria shared by the PRISM reviews: examined the impact of alcohol on the development, clinical course, or management of common diseases; performed explicit, reproducible searches; critiqued methods in identified studies; and were written for a general audience (i.e., published in non-addiction specialty journals).
As shown in Table 1, most of these six systematic reviews initially identified large numbers of potentially relevant studies, but only small fractions were deemed eligible for inclusion in the final review. Most identified studies employed a prospective cohort design, but experimental trials were also conducted for several of the review topics. A meta-analysis was not conducted by two reviews because of the heterogeneity of research designs, the myriad ways of defining alcohol consumption, and the diversity of the outcome measures.
In regard to the relevance of this research to primary care practice, the depression review included seven studies from primary care settings, but four of these were combined with psychiatric care settings. The other systematic reviews did not specifically address whether the studies were conducted in primary care patients. However, in four reviews, some evidence came from observational cohorts drawn from general populations that are likely to be relevant to primary care practice.
Four reviews employed previously published quality assessment tools to evaluate the rigor of identified studies (Table 1). The depression review reported that several randomized trials were of excellent quality7, but, based on the Quality Score Index score13, the observational studies on alcohol and depression were of lower quality than studies included in other systematic reviews14. The hypertension review focused primarily on critiquing the method of measuring blood pressure8. The diabetes and bone reviews judged the eligible cohort studies to be primarily of “fair quality” because most were deficient in adjustment for confounders. For example, in the diabetes review, the authors noted failure to adjust for waist-to-hip ratio, family history of diabetes, and race, while, for the bone disease review, the authors found that most studies did not adjust for obvious correlates of bone disease such as estrogen use9–10. The dementia review did not report serious flaws in selected studies, but it employed a measure that did not provide an overall research quality grade11. The STD review primarily critiqued the types of alcohol measures and STD outcomes in identified studies12.
The overriding methodological concern noted by the systematic reviews was the heterogeneity of the alcohol use measures employed by the research studies (Table 2). The review of alcohol’s effect on blood pressure was least affected by this variability because all of the included experimental trials reported on the quantity of alcohol that was administered. But these trials frequently did not describe study subjects’ usual alcohol consumption prior to the study. Further, a dose-response relationship of alcohol use on blood pressure was not estimated by this review, presumably because of the variability in amounts and timing of the doses of administered alcohol.
Studies included in the other five reviews used a myriad of quantities, time frames, and terms to describe patterns of alcohol use. Literally dozens of approaches were used to define the amount of alcohol consumed, ranging from a simple yes/no measure to an idiosyncratic seven-category measure. The units of analysis in summary measures also varied from grams of alcohol per day to “drinks” per varying time periods. The diversity of approaches used to examine alcohol consumption is best demonstrated by the STD review where the authors were forced to examine four different summary categories of alcohol use measures12. The depression review included only studies of persons with “alcohol problems” but had to create a detailed table of the definitions for the many terms used for these problems including: at risk, hazardous, harmful, abuse, dependence, and alcoholism7. In other reviews, the spectrum of alcohol consumption in the included studies ranged from rare to heavy, but both the diabetes and bone disease reviews noted that few studies included women consuming larger amounts of alcohol9–10. In addition, five reviews critiqued many studies because the ‘non-drinker’ category combined persons who never drank alcohol with former users who may have previously suffered from an alcohol use disorder. Finally, few studies addressed the type of alcohol consumed.
Despite the ubiquity of alcohol consumption in the US and its potential to affect the diagnosis, management, outcomes, and costs of common chronic diseases, outside of the PRISM-sponsored reviews, we found few additional reviews in medical journals that systematically synthesized the evidence of the impact of alcohol drinking on diseases that are routinely treated by primary care and other mainstream physicians. The best evidence identified by these reviews was generally of mediocre quality as judged by established research quality measures. Except for hypertension, the reviews found few randomized clinical trials regarding the effect of alcohol on the selected diseases. Admittedly, randomized trials of alcohol consumption over an extended time frame may be impractical for some diseases. However, there is good reason for physicians to be skeptical of the results of simple observational designs concerning this or any other topic.
Beyond the fundamental design issues regarding the studies in these reviews, there are other important methodological flaws including limited adjustment for potential confounders. However, the most serious but potentially correctable flaw across all these reviews is the unacceptable variability in approaches to measure the quantity, frequency, and duration of alcohol use. To standardize experimental research with alcohol, Brick has proposed several mathematical approaches to determine ounces of pure (100%) alcohol provided in a single drink and over the course of an entire trial15. Observational studies included in these reviews frequently analyzed the effect of alcohol as quantified by a ‘standard drink.’ The US Department of Agriculture (USDA) defines a standard drink as 13.7 g (0.6 ounces) of pure alcohol, which corresponds to a 12-oz can of beer, 5-oz glass of wine, or 1.5-oz glass of distilled spirits16. However, in Australia, for example, a standard drink has 10 g of alcohol17, making syntheses of the health effects of alcohol from international studies more challenging. The USDA and the National Institute on Alcohol Abuse and Alcoholism (NIAAA)18 set the standard for “non-problematic” drinking for a healthy adult man as no more than 14 drinks per week and, for healthy women, as no more than 7 drinks per week. However, these two federal agencies differ on the amount that would be considered excessive for a single day. The USDA recommends only one drink for women and two for men, whereas the NIAAA defines excessive consumption (binging) as more than three drinks for women and more than four for men. Special attention to binge drinking is highly relevant to defining alcohol use disorders because researchers using the Behavioral Risk Factor Surveillance System showed that adding information about binging to an average daily alcohol consumption measure increased the relative prevalence of “heavy drinking” by up to 42%, depending on how binging is measured19.
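To illustrate why differing national definitions complicate synthesis, the following minimal sketch converts drink counts to grams of pure alcohol using the two definitions quoted above (13.7 g in the US, 10 g in Australia) and checks a one-week drinking diary against the weekly limits and the NIAAA single-day thresholds described in this section. The function names, dictionary layout, and diary format are hypothetical and are offered only as an illustration, not as a method used by any of the reviews.

```python
# Grams of pure alcohol per "standard drink", as quoted in the text above.
GRAMS_PER_STANDARD_DRINK = {"US": 13.7, "AU": 10.0}

# Weekly limits (drinks/week) and NIAAA single-day thresholds (drinks/day)
# as summarized in the text; a day above the daily threshold counts as binging.
WEEKLY_LIMIT = {"man": 14, "woman": 7}
NIAAA_DAILY_THRESHOLD = {"man": 4, "woman": 3}


def drinks_to_grams(drinks, country):
    """Convert a count of standard drinks to grams of pure alcohol."""
    return drinks * GRAMS_PER_STANDARD_DRINK[country]


def convert_drinks(drinks, from_country, to_country):
    """Re-express a drink count in another country's standard-drink unit."""
    grams = drinks_to_grams(drinks, from_country)
    return grams / GRAMS_PER_STANDARD_DRINK[to_country]


def classify_week(daily_us_drinks, sex):
    """Summarize a diary of US standard drinks per day (hypothetical format)."""
    total = sum(daily_us_drinks)
    threshold = NIAAA_DAILY_THRESHOLD[sex]
    return {
        "weekly_total": total,
        "exceeds_weekly_limit": total > WEEKLY_LIMIT[sex],
        "binge_days": sum(1 for d in daily_us_drinks if d > threshold),
    }


if __name__ == "__main__":
    # 14 US drinks per week expressed in Australian standard drinks.
    print(round(convert_drinks(14, "US", "AU"), 2))
    # A man who drinks only on weekends: under the weekly limit, yet binging.
    print(classify_week([0, 0, 0, 0, 0, 5, 5], "man"))
```

In the weekend-only diary, the weekly total stays under the limit while two days exceed the single-day threshold, which is the kind of pattern an average daily consumption measure alone would miss.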
In addition to the need for standardized measures of alcohol consumption, researchers also must routinely assess the pattern, quantity, and time frame/duration of alcohol consumed at baseline by subjects in a trial, as noted in the hypertension review8.
Other non-standard approaches abound in research on the effects of alcohol consumption above acceptable levels. Terms used to describe these patterns of alcohol consumption include: alcohol problems, unhealthy alcohol use, alcohol use disorders, excessive drinking, alcohol abuse, alcohol dependence, and harmful drinking. The depression review provided a separate table just to clarify the definitions of the multiple measures used in the identified studies7. Current Diagnostic and Statistical Manual of Mental Disorders (DSM) IV categories add to the confusion with 15 codes used to describe alcohol consumption patterns, such as abuse, dependence, withdrawal, alcohol-related disorder, and intoxication20. Hopefully, DSM V will correct this morass of terms that fail to even offer a category for potentially excessive alcohol consumption without currently evident negative health effects, such as for a woman drinking two glasses of wine a night. Research increasingly supports the use of the AUDIT instrument as a valid and reliable measure to identify a full range of alcohol use disorders ranging from abstinence through non-problem use to dependence21. Thus, standardization of the instruments used to define alcohol use disorders must also be a goal.
A serious threat to validity in studies of the health effects of alcohol occurs when the reference group of non-drinkers includes current abstainers who have suffered health consequences and ceased to drink22. Even the “never drinkers” may be heterogeneous because one study reported that over half of the persons who claimed to be lifetime abstainers had previously reported heavy to problematic use of alcohol23. Among drinkers, under-reporting alcohol consumption may present an even greater threat to the validity of research on the health effects of alcohol. Klatsky and colleagues reported that persons with hypertension who reported one to two alcoholic drinks a day were 75% more likely to have high liver transaminase enzymes than persons reporting no or less frequent use24. These laboratory results suggest alcohol-related liver injury in some of these ‘moderate’ alcohol drinkers. These challenges reinforce the need to continue to develop innovative approaches to reduce the stigma of reporting about alcohol use and to improve the validity of self-report data. A final dimension that most studies have not attempted to address is the type of alcohol consumed because it adds another layer of complexity in addition to the multiple alcohol use measures.
Based on the evidence synthesized by these reviews, we offer the following recommendations. First, wherever possible, research on health effects of alcohol needs to use experimental designs with a standard measure of pure alcohol consumed over a specific time frame. Specifically, we suggest the reporting of standard (NIAAA) drinks per day of beer or wine and of spirits over a 1-week period—permitting the calculation of total ounces of alcohol consumed within a report period that should be easy for participants to remember.
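As a rough sketch of the calculation this first recommendation would permit, the function below totals a 7-day report of standard drinks of beer or wine and of spirits, then converts the total to ounces of pure alcohol. It assumes the 0.6 ounces of pure alcohol per standard drink cited earlier; the function name and the one-entry-per-day input format are hypothetical.

```python
# Ounces of pure alcohol in one standard drink, as cited earlier in the text.
OUNCES_PURE_ALCOHOL_PER_DRINK = 0.6


def weekly_pure_alcohol_ounces(beer_or_wine_per_day, spirits_per_day):
    """Total ounces of pure alcohol over a 7-day report period.

    Both arguments are lists of standard drinks per day, one entry per day
    (an assumed reporting format, not one prescribed by the source).
    """
    if len(beer_or_wine_per_day) != 7 or len(spirits_per_day) != 7:
        raise ValueError("expected one entry per day for a 7-day report period")
    total_drinks = sum(beer_or_wine_per_day) + sum(spirits_per_day)
    return total_drinks * OUNCES_PURE_ALCOHOL_PER_DRINK


if __name__ == "__main__":
    beer_or_wine = [1, 1, 0, 2, 0, 3, 1]  # 8 standard drinks in the week
    spirits = [0, 0, 0, 0, 1, 1, 0]       # 2 standard drinks in the week
    print(round(weekly_pure_alcohol_ounces(beer_or_wine, spirits), 2))
```

Keeping the two beverage groups as separate inputs preserves the information about type of alcohol that, as noted above, most studies have not addressed.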
Second, observational studies need to be conducted in primary care populations using standard measures of quantity/frequency as well as maximum daily drinking (binging). Specifically, we recommend the use of valid and reliable measures such as the AUDIT or the shorter AUDIT-C.
Third, we recommend standardization of terms for alcohol use disorders based on the NIAAA guidelines, but, in the near future, DSM V may offer a better codification of these terms.
Fourth, the rigor of observational studies needs to be improved by measuring alcohol consumption at multiple points in time and by assessing a broad array of key confounders. The list of potential confounders varies by study outcome, but must at least include measures of tobacco and other drug use because of their correlation with alcohol use and the independent, powerful effects they can have on health25. Because alcohol use may promote health or can have powerful negative effects that cause disease, research in the future needs to be more rigorous, relevant, and practical for clinical practice.
Both authors gratefully acknowledge the support of the Robert Wood Johnson Foundation for PRISM and the support of the Betty Ford Foundation for the development of this paper. Dr. Turner receives support from Pfizer, Inc., through a grant to the University of Pennsylvania for unrelated research. Dr. McLellan reports no sources of other support or conflicts of interest.