Because many HIV care providers fail to detect patients’ hazardous drinking, we examined the potential use of the AUDIT-C, the first three of the 10 items comprising the Alcohol Use Disorders Identification Test (AUDIT), to efficiently screen patients for alcohol abuse. To perform this examination, we applied Item Response Theory to the individual AUDIT items and to the AUDIT instruments completed by patients (N=400) at a Designated AIDS Center in New York City. Sensitivities and specificities were then examined at various AUDIT-C cutoff scores, using the AUDIT as a “gold standard.” For cutoff scores on the AUDIT from 4 to 8, an AUDIT-C cutoff score of 3 resulted in sensitivities of .94–.98 and specificities of .82–.91, and an AUDIT-C cutoff score of 4 resulted in sensitivities of .81–.89 and specificities of .91–1.0. In busy HIV care centers, the AUDIT-C with cutoff scores of 3 or 4 is a reasonable alternative to the full AUDIT as an alcohol screening instrument.
Excess alcohol use is common among HIV patients,1–4 with many drinking above established national guidelines.5 According to the National Institutes of Health in the United States, healthy men up to age 65 should have no more than 4 drinks in a day and no more than 14 drinks in a week, and healthy women should have no more than 3 drinks in a day and no more than 7 drinks in a week.5 Even lower limits apply for patients with health conditions like HIV that are exacerbated by alcohol use.5
Unfortunately, HIV care providers often fail to detect excess alcohol use among their patients.1 This is especially unfortunate in view of the relationship between excess drinking and increased morbidity and HIV disease progression;6–9 missed or off-schedule doses of antiretroviral medication;10,11 and sexual practices that place these patients and their sexual partners at increased risk for sexually transmitted infections.12,13 If HIV patients’ excess alcohol use is not identified, providers remain unaware of the need to intervene and counsel them to reduce alcohol consumption in order to limit its harms.
A promising approach to increase the likelihood of identifying excess drinking among HIV patients is to implement routine screening for excess alcohol use in the HIV care setting. One possible screening instrument is the highly regarded and widely used 10-item Alcohol Use Disorders Identification Test (AUDIT), designed and developed by the World Health Organization to screen for potentially harmful and hazardous drinking patterns in primary health care.14 A large body of research has explored the AUDIT’s psychometric properties, factor structure, and cutoff scores. In one study in the United States in which inner city general medical clinic patients aged 18 to 84 completed the AUDIT, there was very high sensitivity (.96) and specificity (.96) when AUDIT results were evaluated against DSM-III-R criteria for alcohol abuse and dependence.15 While other studies have shown considerable variation in the AUDIT’s sensitivity and specificity, most studies have found this sensitivity and specificity to be .7 or more.16 With regard to the AUDIT’s factor structure, analyses often support 2 factors,17–19 with the first 3 items representing a ‘consumption’ factor, and the remaining 7 items representing an ‘adverse consequences of drinking’ factor. Although correlated, these factors represent separate dimensions, indicating the possibility, for example, of a high level of consumption together with or without the existence of alcohol-related problems or adverse consequences. Scores of each item on the AUDIT vary from 0 to 4, with the item scores totaled to provide a summary score ranging from 0 to 40. Volk and colleagues20 identified a cutoff score of 4 as optimal to screen for “at-risk” drinking (i.e., any pattern of use or alcohol-related consequences that rules out non-problem drinking - e.g., drinking in excess of national guidelines, meeting the criteria for hazardous and harmful use, or meeting the criteria for abuse or dependence). 
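The scoring rules just described can be sketched in a few lines of Python. This is a minimal illustration, not part of the study's procedures: the function names and the example responses are hypothetical, and a screen is treated as positive when the score is at or above the chosen cutoff.

```python
from typing import Sequence


def audit_total(item_scores: Sequence[int]) -> int:
    """Sum the 10 AUDIT item scores (each 0-4) into a total from 0 to 40."""
    if len(item_scores) != 10:
        raise ValueError("the AUDIT has 10 items")
    if any(score < 0 or score > 4 for score in item_scores):
        raise ValueError("each item score must be between 0 and 4")
    return sum(item_scores)


def audit_c_total(item_scores: Sequence[int]) -> int:
    """Sum the first 3 (consumption) items into an AUDIT-C score from 0 to 12."""
    audit_total(item_scores)  # reuse the validation above
    return sum(item_scores[:3])


def screens_positive(score: int, cutoff: int) -> bool:
    """A screen is positive when the score is at or above the cutoff."""
    return score >= cutoff


# Hypothetical responses for one patient (not study data).
responses = [2, 1, 1, 0, 0, 0, 1, 0, 0, 0]
full_score = audit_total(responses)         # total across all 10 items
consumption_score = audit_c_total(responses)  # total across items 1-3
```

With these hypothetical responses, the patient would screen positive on the full AUDIT at a cutoff of 4 but not at a cutoff of 8, which shows how much the choice of cutoff matters.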
A cutoff score of 8 on the AUDIT has been recommended more frequently,14 and with this cutoff score, the AUDIT has been used to screen various groups of male and female HIV patients for harmful and hazardous alcohol use patterns. These individuals include veterans with HIV infection;1 HIV patients in outpatient and infectious disease clinics;2,21 women with HIV infection who have a history of childhood sexual abuse;22 and HIV patients who have alcohol problems, including gay and bisexual men and patients currently taking HIV medications.12,23
Although the AUDIT has been shown to be superior to the widely used, briefer, 4-item, CAGE screening instrument in identifying active hazardous or harmful drinkers in a variety of populations,24–26 the AUDIT’s longer length has discouraged its even more widespread use. In particular, while documented to take 2 minutes when administered by a health care provider trained in its administration,27 the experience of HIV care providers indicates that it takes considerably longer with their patients.28 In order to save time, some have suggested using the first 3 of the 10 AUDIT items constituting the consumption factor as a stand-alone screening measure for hazardous or harmful drinking. Support for this approach comes from the fact that Cronbach alpha coefficients have been found to vary from .69 to .81 on the consumption factor of the AUDIT in a variety of populations.29,30 In addition, there is a frequent association between heavy recent alcohol consumption and the development of alcohol related adverse consequences,31–33 and a large proportion of the total AUDIT score (90% in a population-based sample in Sweden34) is typically obtained from the 3 consumption factor items. The use of these 3 consumption factor items (AUDIT-C) rather than the full 10-item AUDIT as a screening tool cuts the time of administration by a factor of three. This may be especially useful when time or other resources do not permit administration of the full AUDIT.35 In fact, the AUDIT-C has generally been found to be adequate in order to detect heavy drinking and/or alcohol abuse or dependence in general practice, primary care, and veteran populations.35,36 This shorter version of the AUDIT may be especially welcome in busy HIV primary care settings if it is able to identify the vast majority of individuals with harmful and hazardous alcohol use patterns as would be identified with the full AUDIT instrument. 
However, we currently lack information about the usefulness of this abbreviated AUDIT version among HIV patients.
To address this gap in our understanding, the AUDIT was administered to all HIV patients appearing for their annual examinations (N=400) over a consecutive six month period in a hospital-based HIV care center in New York City. In addition to reporting patients’ scores on the 10-item AUDIT, we examine the extent to which the AUDIT-C would yield comparable screening results regarding patients’ harmful and hazardous drinking patterns to those obtained when the full AUDIT is used. For this examination, using a variety of cutoff scores for the AUDIT-C and the 10-item AUDIT, we determine the sensitivities and specificities of the AUDIT-C, with the full AUDIT serving as the “gold standard.”
Data were collected in 2007 from HIV patients (N=400) receiving care at a Designated AIDS Center (DAC) in New York City. DACs are comprehensive, hospital-based, state licensed HIV treatment centers providing both inpatient and outpatient care. They utilize interdisciplinary teams and provide case management services, emphasizing quality improvement in order to provide a high level of clinical and support services.
The DAC at which the data were collected was participating in a larger study that was evaluating a state-of-the-art training on alcohol reduction for HIV patients. The 3-hour, National Institute on Alcohol Abuse and Alcoholism (NIAAA)-funded training, an adaptation of NIAAA’s manualized alcohol screening and brief intervention intended for clinicians,5 was created expressly for DAC providers. It emphasized the AUDIT instrument as a brief, psychometrically strong screening tool for alcohol consumption and its consequences, and participating providers practiced the use and scoring of the AUDIT during the training. At the training’s conclusion, providers were encouraged to implement the AUDIT with their patients.
A 30-minute interview with the director of the DAC elicited basic information about the DAC’s patients (e.g., the number served each year, their gender and race/ethnicity), the number of staff, and the alcohol reduction and elimination policies, procedures, and/or services that existed at the DAC before the training took place. A physician or a physician-assistant administered the AUDIT to every patient who appeared for an annual comprehensive examination between June 2007 and December 2007 (N=400). At the request of the research team, the DAC director provided copies of each of these AUDITs for further analysis, with all patient identifiers removed (including gender, race/ethnicity, etc.). The study received approval from the Institutional Review Boards (IRBs) of the National Development and Research Institutes and New York University.
We analyzed the 10 items on the AUDIT using Item Response Theory (IRT), also known as latent trait theory. (An introduction to IRT can be found at edres.org/irt.) IRT is a body of theory describing the application of mathematical models to data from questionnaires and tests as a basis for measuring abilities, attitudes, or other variables. IRT models assume that there is an underlying (latent) distribution of people along a dimension (such as hazardous and harmful patterns of alcohol consumption), and that each item is an imperfect indicator of where people lie on this dimension. In IRT, discrete item responses are viewed as observable manifestations of a hypothesized trait, construct, or attribute that is not directly observed but must be inferred from the manifest responses. Items may be questions that have incorrect or correct responses, statements on questionnaires that allow respondents to indicate level of agreement, or patient symptoms scored present or absent.
IRT is based on the idea that the probability of a discrete outcome, such as a particular response to an item, is a mathematical function of person and item parameters. The person parameter is called a latent trait or ability; it may, for example, represent the extent to which a person exhibits a hazardous drinking pattern. The most important item parameters concern the location of the item on the underlying scale; for dichotomous items this would be called item difficulty. (The terminology was developed in the context of cognitive testing, and is used even though not strictly appropriate in all contexts, such as ours.) The item difficulty indicates where a person who had a 50 percent chance of answering the item positively (“correctly” in terms of cognitive tests) would be situated on the latent trait. For the AUDIT items, which have ordered response categories, the comparable item characteristics are thresholds; each item threshold indicates the place on a latent trait where a person crosses from answering in one category (e.g., “never”) to the next (e.g., “monthly or less”). Low thresholds indicate that many people will answer in higher categories of an item; higher thresholds indicate that most people will be in lower categories, and fewer in higher categories. In our analyses, we allowed the distance between thresholds to vary from item to item. This means that some items might have thresholds close together, indicating that a small change in the underlying trait results in a large change in the response; such an item is said to have high discrimination. Other items could have thresholds spread further apart, so that a larger change in the underlying trait is needed to change response categories.
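To make the role of thresholds concrete, the sketch below computes response-category probabilities under Samejima's graded response model, one common IRT model for items with ordered categories. This is an illustrative assumption: the text above does not state which model was fitted, and the function name, discrimination value, and threshold values are hypothetical rather than estimates from our data.

```python
import math
from typing import List, Sequence


def grm_category_probs(theta: float, discrimination: float,
                       thresholds: Sequence[float]) -> List[float]:
    """Category probabilities for one ordered item at trait level theta.

    thresholds must be increasing; K-1 thresholds define K categories.
    """
    # Cumulative probabilities P(response >= k) for k = 1 .. K-1.
    cumulative = [
        1.0 / (1.0 + math.exp(-discrimination * (theta - b)))
        for b in thresholds
    ]
    upper = [1.0] + cumulative   # P(response >= k)
    lower = cumulative + [0.0]   # P(response >= k + 1)
    # P(response == k) is the difference of adjacent cumulative probabilities.
    return [u - l for u, l in zip(upper, lower)]


# Hypothetical parameters for a five-category item (scored 0-4).
example_thresholds = [-1.0, 0.0, 1.0, 2.0]
probs_at_first_threshold = grm_category_probs(-1.0, 1.5, example_thresholds)
probs_low_trait = grm_category_probs(-2.0, 1.5, example_thresholds)
probs_high_trait = grm_category_probs(2.0, 1.5, example_thresholds)
```

A person located exactly at the first threshold has a 50 percent chance of remaining in the lowest category, and moving up the latent trait shifts probability toward the higher categories.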
In IRT, statistical theory and item parameter estimates from a data set are used to provide information about the psychometric properties of a given assessment, and the quality of estimates. Overall, IRT is intended to provide a framework for evaluating how well assessments work, and how well individual questions on assessments work.
Notably, IRT extends the concept of reliability, or precision in measurement, recognizing that precision is not uniform across the entire range of assessment scores. In particular, there is generally more error for scores in the outer range of the distribution of scores than those near the middle of the range. In IRT, reliability is replaced by item and test information, with each item reducing uncertainty about the person’s standing on the trait. This information is a function of the model parameters, with more information implying less error of measurement. Items with a high level of discrimination contribute a great deal of information, but over a narrow range, while less discriminating items provide less information but over a wider range.
Items with several response options provide a great deal of information if there is wide and even spacing between options. “Wide and even spacing” means that on the underlying dimension, (i) the point at which people transition from endorsing answer option “1” on an item to endorsing answer option “2” on that item is fairly far from the point at which they transition from endorsing answer option “2” to answer option “3” (and so on), and (ii) the distance between such points is about the same for all adjacent pairs of transition points. Items with response options that are widely and evenly spaced provide a great deal of information about where a person should be placed on the underlying dimension, compared with items that lack such characteristics. Narrow spacing indicates that at least one response option is chosen very infrequently, and is therefore useful only for the few people who fall in exactly the right place on the underlying (latent) dimension. One important implication of the wide and even spacing for a group of assessment items is that summing these items (as is usually done in scale construction) should be a reasonable procedure because item information functions are additive and the test information function is the sum of the information functions of the items on the assessment. In this way, IRT enables a determination of whether adding the scores on individual items to get a total score is justified.
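The additivity described above can be written compactly in standard IRT notation. The symbols are introduced here for illustration and are not taken from the analysis itself: \theta is a person's position on the latent dimension, I_i(\theta) is the information function of item i, and I_C(\theta) is the information from the three consumption items alone.

```latex
I(\theta) = \sum_{i=1}^{10} I_i(\theta), \qquad
\mathrm{SE}(\hat{\theta}) \approx \frac{1}{\sqrt{I(\theta)}}, \qquad
I_C(\theta) = \sum_{i=1}^{3} I_i(\theta) \le I(\theta)
```

Because each item's information is non-negative, dropping items can only reduce total information, and the ratio I_C(\theta)/I(\theta) indicates how much of the full instrument's precision a subscale retains at a given point on the dimension.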
By itself, the information measure is difficult to interpret in a non-technical manner, but it is a useful basis for comparison with possible subscales formed from selected items, or for different methods of scoring each item. Overall, for each individual, the amount of information indicates how accurately that person can be placed on the underlying dimension; for the whole instrument, the information is a summary of the amount of information for individuals. Of interest in the current work is how much information is lost by not using all of the items on the 10-item AUDIT.
To perform the IRT analyses, we used IRT software (the ltm package written in the R language37) to determine the characteristics of each AUDIT item individually, and of the AUDIT instrument as a whole. We examined whether it was appropriate to assign consecutive integers (0–4) to the answer options for the items with five answer choices when the AUDIT is used in a population of HIV patients. We also determined how well the shortened version of the AUDIT, the AUDIT-C, would perform in this population.
We also created receiver operating characteristic (ROC) curves38 to illustrate the relationship between sensitivity and specificity using various cutoff points on the AUDIT and the AUDIT-C. Such curves are useful when there is a binary classification of categories or a binary classification formed from continuous data based on an established threshold (cutoff) value. Such curves involve plotting sensitivity (× 100) on the vertical axis and (1 − specificity) (× 100) on the horizontal axis. This can also be thought of as a plot of the fraction of true positives [true positives/(true positives + false negatives)] versus the fraction of false positives [false positives/(true negatives + false positives)].39 Perfect test performance (on both sensitivity and specificity) would be indicated by a point in the upper left-hand corner of the plot. Chance performance is a diagonal line from lower left to upper right. The graphical approach makes it relatively easy to understand the inter-relationships between the sensitivity and specificity of a particular measurement.40
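The calculation behind each point on such a curve can be sketched as follows, with the full AUDIT serving as the reference standard and the AUDIT-C as the test. This is a minimal illustration: the function names are hypothetical, and the eight pairs of scores are invented toy values, not data from the DAC.

```python
from typing import List, Sequence, Tuple


def sensitivity_specificity(audit_c_scores: Sequence[int],
                            audit_scores: Sequence[int],
                            audit_c_cutoff: int,
                            audit_cutoff: int) -> Tuple[float, float]:
    """Sensitivity and specificity of the AUDIT-C against the full AUDIT."""
    tp = fp = tn = fn = 0
    for c_score, full_score in zip(audit_c_scores, audit_scores):
        test_positive = c_score >= audit_c_cutoff
        reference_positive = full_score >= audit_cutoff
        if test_positive and reference_positive:
            tp += 1
        elif test_positive:
            fp += 1
        elif reference_positive:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


def roc_points(audit_c_scores: Sequence[int], audit_scores: Sequence[int],
               audit_cutoff: int) -> List[Tuple[float, float]]:
    """(1 - specificity, sensitivity) for every AUDIT-C cutoff from 0 to 13."""
    points = []
    for c_cutoff in range(0, 14):
        sens, spec = sensitivity_specificity(
            audit_c_scores, audit_scores, c_cutoff, audit_cutoff)
        points.append((1 - spec, sens))
    return points


# Invented toy scores for eight patients (not study data).
toy_audit_c = [0, 1, 2, 3, 4, 5, 3, 6]
toy_audit = [0, 1, 2, 3, 9, 10, 8, 12]

at_cutoff_3 = sensitivity_specificity(toy_audit_c, toy_audit, 3, 8)
at_cutoff_4 = sensitivity_specificity(toy_audit_c, toy_audit, 4, 8)
curve = roc_points(toy_audit_c, toy_audit, 8)
```

In the toy data, raising the AUDIT-C cutoff from 3 to 4 removes the one false positive but misses one patient who is positive on the full AUDIT, the same kind of tradeoff that an ROC curve displays across all cutoffs.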
In the participating DAC, a total of 39 staff served 1100 patients each year. At the time of data collection, two thirds (66%) of patients were male, 28% were African American, 54% were Latino, and 16% were White. Before the training, alcohol screening at the DAC was limited to several non-standardized questions concerning current drinking.
After the training, the DAC director saw the implementation of the AUDIT as an intervention in and of itself. He opted to have physicians and physician-assistants administer the tool in his DAC for a six-month period beginning in June 2007 to all patients who appeared for their annual exams. The AUDIT was introduced to these patients with the following words: “We’re doing a systematic evaluation to see how much people are drinking. I’m going to go over these questions.” No patient during these 6 months refused to respond to the AUDIT questions.
To examine responses on the AUDIT and the AUDIT-C, analyses were first conducted on each of the items and on the scales as a whole using Item Response Theory (IRT). Results indicated that AUDIT items 1, 2, and 3, which comprise the AUDIT-C, performed well in that (i) each item provided a large amount of information (in the technical sense of IRT) about the underlying dimension of hazardous and harmful patterns of alcohol consumption, and (ii) each option on the items provided useful information. In particular, the options for responses were widely and approximately evenly spaced (with the exception of the two highest response options for item 2), suggesting that summing these three AUDIT-C items (as is usually done in scale construction) should be a reasonable procedure for HIV patients.
Naturally, using only three items out of ten will lose some information in the ability to accurately place a person on the underlying dimension of harmful and hazardous drinking patterns. In this case, using an information measure common in IRT,41 information obtained from the score on the 3-item AUDIT-C was about 40 percent of the information available in the total 10-item AUDIT score. However, much of that information loss might be at the lower end of the underlying dimension, primarily differentiating within those who would have a negative screen and therefore not affecting screening results. Thus, an investigation of the sensitivity and specificity of the AUDIT-C was conducted for the purpose of classifying respondents as having a positive or negative screen.
As can be seen in Table 1, of the 400 individuals who completed the 10-item AUDIT, 70 (17.5%) scored at least 4, 54 (13.5%) scored at least 5, 42 (10.5%) scored at least 6, 34 (8.5%) scored at least 7, and 27 (6.8%) scored at least 8. Thus, depending on the specific value of the cutoff score between 4 and 8 to indicate a positive screen, between 6.8% and 17.5% of the 400 individuals who completed the 10-item AUDIT screened positive for at-risk drinking. With cutoff scores of 3 or 4 on the AUDIT-C, between 14.2% and 23.7% of the 400 HIV patients would have been classified as having a positive screen for at-risk drinking. Notably, of individuals who scored 2 or less on the AUDIT-C, only 1.3 percent would get totals of 4 or more on the full AUDIT, and 0.3 percent would score 8 or higher on the full AUDIT. Using a cutoff score of 3 or greater on the AUDIT-C would therefore miss few people if used as a screening measure.
Screening results using the AUDIT-C were compared to those that would occur using the full AUDIT instrument with several possible criteria (i.e., cutoff scores on the AUDIT from 4 to 8). Sensitivities and specificities were computed for a variety of AUDIT-C scores using full AUDIT scores ranging from 4 to 8 as the “gold standard;” these are summarized in Table 1. Using the data in Table 1, receiver operating characteristic (ROC) curves were also plotted (see Figure 1). Several interesting results are apparent in the plot. Regardless of the cutoff score used on the AUDIT (between 4 and 8), the ROC curves are very similar for the most part, being little affected by the cutoff score used on the AUDIT. The exception is the extreme left-hand part of the plot, which shows (as might be expected) that for points with great specificity (where the cutoff score would be the same for both the AUDIT and the AUDIT-C), higher cutoff scores result in lower sensitivity. For example, with a cutoff score of 4 on both the AUDIT and the AUDIT-C, sensitivity is .81 and specificity is 1.0 [with (1-specificity) therefore 0.0]. In order to increase sensitivity, specificity must be lowered; there is always a tradeoff between the two. Several possible combinations of cutoff points on the AUDIT and AUDIT-C give sensitivity or specificity values of at least .9, with the other (at least) near .9. Of special note, for any cutoff score on the AUDIT from 4 to 8, a cutoff of 3 on the AUDIT-C gives a sensitivity of at least .94 and a specificity of at least .82, while a cutoff of 4 on the AUDIT-C gives a sensitivity of at least .81 and a specificity of at least .91. Viewed another way, a cutoff score of 3 or more on the AUDIT-C has sensitivity between 94% and 98%, and specificity between 82% and 91% depending on the actual AUDIT cutoff score between 4 and 8. 
In addition, depending on the specific cutoff scores from 4 to 8 on the AUDIT, a cutoff score of 4 or more on the AUDIT-C has sensitivity between 81% and 89%, and specificity varying from 91% to 100%.
Routine alcohol screening in the HIV care setting is an important strategy to identify patients with harmful and hazardous drinking patterns so that they can be supported in reducing alcohol-related harms. Unfortunately, a variety of barriers limit its routine use, including time constraints and a lack of training in how to screen patients for such patterns of alcohol use and how to manage patients with a positive screen.42–48 One possible time-saving approach is to have patients self-administer the screening items either using Audio Computer Assisted Self Interviewing (ACASI) or paper-and-pencil forms.49 Such administration may also be helpful in terms of reducing the social desirability of responses: compared with interviewer-administered surveys, patients report more socially undesirable responses about drinking behavior with self-administered questionnaires, whether paper-and-pencil or computer-assisted.50,51 However, limited literacy among HIV patients prevents many from completing paper-and-pencil instruments without assistance, suggesting that provider administration is preferable.28 In addition, a variety of logistical issues regarding ACASI administration (e.g., lack of computer literacy, limited availability of computers and printers)49 tend to hamper self-administration opportunities.
Our findings support the use of the AUDIT-C, the first 3 items of the AUDIT, as a time-saving alternate approach to alcohol screening with the full AUDIT in an HIV population. In particular, using Item Response Theory, we determined that the 3 AUDIT-C items performed well in that each item provided a large amount of information about the underlying dimension of hazardous and harmful patterns of alcohol consumption, that each option on the items provided useful information, and that it was therefore reasonable to sum the 3 items to form a scale. In addition, for a cutoff of 8 on the full AUDIT, a cutoff of 3 on the AUDIT-C gave a sensitivity of .96 and a specificity of .82, while a cutoff of 4 on the AUDIT-C gave a sensitivity of .89 and a specificity of .91. For cutoff scores of 4 to 7 on the full AUDIT, a cutoff of 3 on the AUDIT-C gave sensitivities between .94 and .98, and specificities between .83 and .91, while a cutoff of 4 on the AUDIT-C gave sensitivities between .81 and .88 and specificities between .93 and 1.0. Our results are consistent with those of Caviness and colleagues52 whose examination of the specificity and sensitivity of the AUDIT-C relative to the 10-item AUDIT in a sample of 1,751 incarcerated women also suggests that cutoff values of 3 or 4 were acceptable. Whether 3 or 4 is an optimal cutoff score on the AUDIT-C would depend on the relative concerns about false positives and false negatives as reflected in the sensitivities and specificities.
Our findings also suggest that a substantial proportion of patients receiving care at the DAC have hazardous or harmful drinking patterns. With cutoff scores of 3 or 4 on the AUDIT-C, our results indicate that between 14.2% and 23.7% of the 400 screened DAC patients would have been classified as having a positive screen for at-risk drinking. In addition, between 6.8% (cutoff score of 8) and 17.5% (cutoff score of 4) of the 400 individuals at the DAC had a positive screen for at-risk drinking using the full AUDIT. In their analyses using the full AUDIT with a cutoff score of 8 to identify at-risk drinking among HIV patients, other researchers have found the proportion of at-risk drinkers to vary between 3% and 24%.1,2,21,22
We acknowledge a number of limitations to the research. First, we did not validate the AUDIT-C results against “gold standard” diagnostic interviews (such as those based on the DSM-IV) for alcohol abuse and/or dependence. In addition, because the AUDIT-C questions are a component of the full AUDIT, the “gold standard” in our study, the measures tested are not independent. Of note, studies conducted by Dawson and colleagues53,54 using the DSM-IV as the “gold standard” have found similar sensitivities (.88 to .93) but lower specificities (.66 to .72) than those determined in our study. Other limitations include the lack of individual level data (e.g., gender, age), and the fact that all of the AUDIT scores were collected from one DAC. However, because the AUDIT was administered by physicians and physician-assistants to all HIV patients appearing for their annual visit over a consecutive 6-month period, some major sources of potential bias in the data have been eliminated.
In spite of these limitations, our results suggest that the 3-item AUDIT-C provides an acceptable alternative to the use of the 10-item AUDIT to screen for a hazardous and harmful pattern of alcohol consumption among HIV patients. Acknowledging the limited time available in busy HIV care centers for such a screening, the shorter length of the AUDIT-C may encourage its routine use in order to identify HIV patients in need of support to reduce or eliminate alcohol-related harms.
Funding for this study was provided by the US National Institute on Alcohol Abuse and Alcoholism (grant #R21 AA016743). We also gratefully acknowledge the support of the Muriel and Virginia Pless Center for Nursing Research at the New York University College of Nursing.