The Clinical Opiate Withdrawal Scale (COWS) is an 11-item clinician-administered scale assessing opioid withdrawal. Though commonly used in clinical practice, it has not been systematically validated. The present study validated the COWS in comparison to the validated Clinical Institute Narcotic Assessment (CINA) scale.
Opioid-dependent volunteers were enrolled in a residential trial and stabilized on morphine 30 mg given subcutaneously four times daily. Subjects then underwent double-blind, randomized challenges of intramuscularly administered placebo and naloxone (0.4 mg) on separate days, during which the COWS, CINA, and visual analog scale (VAS) assessments were concurrently obtained. Subjects completing both challenges were included (N=46). Correlations between mean peak COWS and CINA scores as well as self-report VAS questions were calculated.
Mean peak COWS and CINA scores of 7.6 and 24.4, respectively, occurred on average 30 minutes post-injection of naloxone. Mean COWS and CINA scores 30 minutes after placebo injection were 1.3 and 18.9, respectively. The Pearson correlation coefficient for peak COWS and CINA scores during the naloxone challenge session was 0.85 (p<0.001). Peak COWS scores also correlated well with peak VAS self-report scores of bad drug effect (r=0.57, p<0.001) and feeling sick (r=0.57, p<0.001), providing additional evidence of concurrent validity. Placebo was not associated with any significant elevation of COWS, CINA, or VAS scores, indicating discriminant validity. Cronbach’s alpha for the COWS was 0.78, indicating good internal consistency (reliability).
COWS, CINA, and certain VAS items are all valid measurement tools for acute opiate withdrawal.
The opiate withdrawal syndrome, a constellation of characteristic signs and symptoms, has been called “one of the most stereotyped syndromes in clinical medicine” (Isbell, 1950). The first instrument to quantitatively measure withdrawal was developed by Kolb and Himmelsbach in the mid-1930s (Kolb and Himmelsbach, 1938). That scale was based on clinical observations and was weighted heavily towards physical signs of withdrawal, such as systolic blood pressure changes, mydriasis, fever, and respiratory rate changes. In the 1960s, the Opiate Withdrawal Experience Scale, a subset of self-report questions from the Addiction Research Center Inventory (ARCI), was used to quantify the subjective symptoms of withdrawal (Haertzen and Meketon, 1968). However, this scale was time consuming for subjects to complete, even with the derived short form Opiate Withdrawal Questionnaire (Haertzen et al., 1970).
Following development of those initial instruments, multiple other subjective and objective scales have been developed and used (Handelsman et al., 1987; Judson et al., 1980; Wang et al., 1974; Bradley et al., 1987; Gossop, 1990). Methods for using these scales have sought to improve on sensitivity and specificity for detecting withdrawal by controlling the level of physical dependence, the time point within the withdrawal syndrome when the assessment is made, and the possibility of feigned responses. In 1988, Peachey and Lei reported on the reliability and validity of the Clinical Institute Narcotic Assessment (CINA), one of the first scales to include both opiate withdrawal signs and symptoms (Peachey and Lei, 1988). This scale was validated using a naloxone challenge in heroin-dependent subjects and the peak score was found to predict the clinically determined maintenance methadone dose used to treat these patients. However, the CINA required nursing support to measure heart rate and blood pressure and contained items which could be easily feigned. As well, there was no fixed upper limit to the scale given the variable contribution of blood pressure and pulse ratings.
Wesson and colleagues, therefore, developed the Clinical Opiate Withdrawal Scale (COWS). This scale was designed to be administered quickly, was intended to improve upon existing measurement tools, and was first published in a training manual for buprenorphine treatment (Wesson et al., 1999). The COWS consisted of an 11-item rating system that could be completed within two minutes by a trained observer and could track opioid withdrawal as differentiated from opioid toxicity through serial measurements. Total scores ranged from 0 to 47, and withdrawal was classified as mild (5-12), moderate (13-24), moderately severe (25-36), or severe (>36). These category scores were not derived using standard statistical techniques but were based upon the authors’ clinical expertise (Wesson and Ling, 2003). Because of its clinical utility, its association with buprenorphine maintenance, and ease of application, the COWS has become widely used for assessing opiate withdrawal (Center for Substance Abuse Treatment, 2004). Although the scale was modeled after items on previously validated scales, the COWS itself has never been systematically validated (Wesson and Ling, 2003). The present project assessed the validity of the COWS in comparison to a previously validated instrument, the CINA, using a double-blind, placebo-controlled naloxone challenge in opioid-dependent individuals. As well, comparisons between the COWS, CINA, and single-item subjective ratings (Visual Analogue Scales) were done to examine the validity and possible utility of using one overall item to assess opioid withdrawal.
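The severity bands described above can be written as a simple lookup. The following is a minimal sketch, not part of the COWS itself: the function name is hypothetical, and the label "none" for totals below 5 is an assumption, since the text names no band for that range.

```python
def cows_severity(total_score: float) -> str:
    """Map a COWS total score (0-47) to the severity bands quoted in the text."""
    if not 0 <= total_score <= 47:
        raise ValueError("COWS total score must be between 0 and 47")
    if total_score < 5:
        return "none"  # assumption: the text names no band below 5
    if total_score <= 12:
        return "mild"  # 5-12
    if total_score <= 24:
        return "moderate"  # 13-24
    if total_score <= 36:
        return "moderately severe"  # 25-36
    return "severe"  # >36
```

Under these bands, the mean peak COWS score of 7.6 reported in this study would fall in the mild range.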
Forty-six out-of-treatment opioid-dependent volunteers participated while residing on a supervised research unit at the Johns Hopkins Behavioral Pharmacology Research Unit (BPRU). Participants were recruited for a clinical trial that will be reported on separately; the trial was registered at www.clinicaltrials.gov, identifier NCT00637000. The analyses in this paper were done as part of the confirmation of opioid physical dependence required for the subsequent opioid clinical pharmacology study. To be enrolled, participants had to meet DSM-IV-TR (American Psychiatric Association, 2000) criteria for opioid dependence, be between 18 and 65 years of age, be willing to stay on the residential research unit for up to 12 days in order to complete the full clinical trial, and use adequate birth control (if female). Exclusion criteria were clinically significant medical or psychiatric diagnoses (e.g., schizophrenia or active suicidal ideation); engagement in opioid agonist, partial agonist, or antagonist treatment immediately prior to admission; pregnancy or lactation; physical dependence on alcohol or sedative-hypnotics; and poor oral health (e.g., active aphthous stomatitis, active oral herpes, tongue or mouth piercing, or a need for immediate dental attention). This last criterion was included because the main clinical trial involved sublingual drug administration.
The Johns Hopkins Institutional Review Board approved this study and all participants provided written, informed consent. Subjects in the present analysis were primarily male (74%), Caucasian (61%) and had a mean age of 41.7 years. The primary opioid abused by subjects was either heroin (mean use 6 years, SD 9.5) or prescription opioids (mean use 4.7 years, SD 4) prior to study entry, and all subjects had been using opioids daily (96%) or near daily (at least 16 days; 4%) in the 30 days before study entry. Forty-nine subjects initially enrolled; the present report includes the 46 who completed both the placebo and naloxone challenges. Two volunteers withdrew for non-study-related personal reasons after one challenge session, and one participant withdrew after experiencing a panic attack during the naloxone challenge session. Additionally, two participants had their naloxone sessions stopped after 30 minutes for excessive withdrawal symptoms.
Participants were screened on an outpatient basis and then admitted to the research unit where they were stabilized on 30 mg of subcutaneously administered morphine given four times daily (120 mg/day) for 2-8 days prior to the challenge sessions (mean 4.4 days, SD 1.3). After stabilization, participants received intramuscularly administered injections of placebo and 0.4 mg naloxone in a randomized, double blind fashion in two sessions separated by at least 24 hours. Withdrawal assessments and drug effect rating scales, vital signs, and pupil measurements were collected every fifteen minutes, starting 30 minutes pre- and through 150 minutes post-injection, except for the time of drug administration (time 0). Trained research assistants, who were present during the entire session, collected the data and administered the scales.
Withdrawal measurements consisted of the CINA, COWS, and VAS self-report items.
The item content and scoring of the CINA and COWS are summarized in Table 1. There is substantial overlap in content, but each scale also includes items absent from the other. The CINA consists of 13 items: 1 purely subjective symptom item, 7 purely objective sign items, and 5 items that include both subjective and objective components. The COWS consists of 11 items: 1 purely subjective symptom item, 6 objective sign items, and 4 items that include both subjective and objective components. Item scoring options are specified differently for the two scales, but each scale sums the scores of its items to produce a total score. The COWS provides instructions for categorical ratings of pupil size and pulse, including an option for a zero score. On the CINA, the heart rate and blood pressure items ensure a minimum score of approximately 20 even in the absence of any withdrawal.
Visual analog scales (VAS) were single-item questions that assessed subjective drug effects at the time of scale completion (Preston et al., 1988). Ratings were completed on a computer; using a mouse, the subject positioned an arrow along a 100-point line marked at either end with “none” and “extremely.” VAS items in the present study were: “Do you feel any DRUG EFFECT?”, “Does the drug have any GOOD EFFECTS?”, “Does the drug have any BAD EFFECTS?”, “How HIGH are you?”, “Does this drug make you feel SICK?”, and “Do you LIKE the drug?”
Pupil diameter was assessed with a digital pupillometer (Neuroptics, Inc.) in constant room lighting. The measurements provided by the pupillometer were also used for the pupil score in the COWS, which required observers to categorize the pupil diameter (Table 1).
Mean scores and standard deviations were calculated for each time point in the naloxone challenge sessions using SAS™ software, Ver. 9.1 (SAS Institute, 2003). Repeated measures regressions were used to assess significant differences on the separate opioid withdrawal measurements, using drug (naloxone versus placebo), time, and drug-by-time interaction terms. All the rest of the statistical calculations used SPSS 16.0 (SPSS Inc., Chicago, IL). Concurrent validity was assessed using Pearson correlation coefficients calculated between peak CINA, COWS, VAS items, and pupil diameter during the naloxone challenge session. Correlations of time to peak on the different measures were similarly calculated. Internal consistency reliability of the 11 COWS items was assessed with Cronbach’s alpha statistic. Lastly, inter-item correlation matrices were created to describe the association of individual COWS and CINA items to each other and to the total scale score.
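For readers who wish to reproduce the two central statistics named above, the following is a minimal sketch using only the Python standard library. It is not the SAS or SPSS code used in this study; the function names and input layout (one row of item scores per subject) are assumptions made for illustration.

```python
from math import sqrt
from statistics import variance


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cronbach_alpha(item_scores):
    """Cronbach's alpha for rows of item scores (one row per subject).

    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(item_scores[0])
    if k < 2 or len(item_scores) < 2:
        raise ValueError("need at least 2 items and 2 subjects")
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)
```

In this layout, `pearson_r` would take the per-subject peak COWS and peak CINA scores, and `cronbach_alpha` would take one row of 11 COWS item scores per subject.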
Overall, the COWS and CINA scales were very similar in terms of both the magnitude and time course of their withdrawal score changes in the naloxone challenge session, demonstrating the concurrent validity of the COWS. Figure 1 shows the mean scores and standard errors (SEM) of the COWS and CINA graphed versus time. The mean peak COWS (7.6) and CINA (24.4) scores occurred on average at 30 minutes post-injection, which is within the expected time range of peak withdrawal following intramuscular naloxone (Daftery, 1974; Wang et al., 1974; Judson et al., 1980). Additionally, time to peak (TTP) analysis revealed a positive correlation between COWS TTP and CINA TTP (r=0.66, p<0.0001). Repeated measures regression analysis revealed statistically significant effects on the COWS for drug (naloxone vs. placebo) (F=79.3, df=45, p<0.0001), time (F=15.03, df=495, p<0.0001), and drug-by-time interaction (F=13.82, df=476, p<0.0001). The CINA likewise showed significant effects for drug (F=77.4, df=45, p<0.0001), time (F=10.35, df=495, p<0.0001), and drug-by-time interaction (F=10.94, df=477, p<0.0001). Table 2 shows a strong positive correlation between peak COWS and CINA scores (r=0.85, p<0.001) during the naloxone challenge session.
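The peak score and time to peak for a single session can be extracted as sketched below. This is an illustrative assumption about the computation, not the study's analysis code: the function name and the dictionary layout (minutes relative to injection mapped to total score) are hypothetical, and ties are resolved in favor of the earliest time point, a rule the text does not specify.

```python
def peak_and_time_to_peak(scores_by_minute):
    """Return (peak score, minutes post-injection at which it first occurred).

    scores_by_minute maps minutes relative to injection to a total score.
    Only post-injection assessments (minute > 0) are considered.
    """
    post = {t: s for t, s in scores_by_minute.items() if t > 0}
    if not post:
        raise ValueError("no post-injection assessments supplied")
    # Iterating in ascending time order makes max() return the earliest
    # time point when several share the same peak score (assumed tie rule).
    ttp = max(sorted(post), key=post.get)
    return post[ttp], ttp
```

Applying this to each subject's COWS and CINA series would yield the paired TTP values that are then correlated.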
Table 2 also shows the effect of omitting various physiological measurement items from the CINA and COWS. For the COWS, removing the pupil diameter item (COWS noPUP), the heart rate item (COWS noHR), or both (COWS noPHYS) still leaves the modified COWS scores highly correlated with the CINA, indicating that these items may not be needed to detect this level of opioid withdrawal with the COWS. The COWS heart rate item is scored 0 (≤80), 1 (81-100), 2 (101-120), or 4 (>120) beats per minute. In this sample, subjects had little change in heart rate during the naloxone session (peak 7.5 bpm change from baseline); therefore, this item rarely affected the total COWS score, which may explain the similar correlation coefficients for the COWS vs. CINA and the COWS without the heart rate item vs. CINA. Similarly, CINA scores without the heart rate item (CINA noHR), the systolic blood pressure item (CINA noBP), or both (CINA noPHYS) were highly correlated with the CINA total score, as well as with the COWS total score and the various modified versions of the COWS.
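The heart rate item scoring can be sketched as a step function, which makes clear why a small change in heart rate rarely moves the item score. The function name is hypothetical, and the sketch treats a rate of 80 bpm or below as a score of 0.

```python
def cows_heart_rate_item(bpm: float) -> int:
    """Score the COWS heart rate item from a resting pulse in beats per minute."""
    if bpm <= 0:
        raise ValueError("heart rate must be positive")
    if bpm <= 80:
        return 0
    if bpm <= 100:
        return 1
    if bpm <= 120:
        return 2
    return 4  # >120; the item has no score of 3
```

For example, a subject with a hypothetical baseline of 70 bpm who rose by the peak change of 7.5 bpm observed here would still score 0 on this item.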
Two VAS items, Bad Effects and Sick, showed a similar time course but greater variability in mean score than the CINA (Figure 2). VAS mean peak scores for Bad Effects and Sick occurred on average somewhat later than the CINA peak, with the peak score occurring at 60 minutes for Bad Effects (Score=33.2) and 45 minutes for Sick (Score=28.1). The VAS time course of a rapid increase in scores after injection and then a gradual decline over 2.5 hours was very similar to the time course seen with the CINA and COWS (Figure 1). Results from repeated measures regression revealed statistically significant effects for drug condition, time, and drug-by-time interaction on these two VAS items (p<0.0001 in all cases). Correlation analysis showed moderately good association between peak CINA and Bad Effects (r=0.63, p<0.001) as well as Sick (r=0.65, p<0.001) (Table 2). Correlations between peak VAS and COWS were slightly lower; however, there was a strong correlation between the two VAS items (r=0.88, p<0.001). Lastly, no significant correlations were seen between peak CINA or COWS scores and VAS ratings for Good Effects, Drug Liking, or High.
The time course of pupil diameter change showed a mean peak increase (1.03 mm) that occurred 15 minutes after naloxone injection, followed by gradual return to baseline. Pupil diameter showed <0.27 mm change from baseline in the placebo session. Repeated measures regression using drug condition, time, and drug-by-time showed significance (p<0.001) in each analysis. Maximum pupil diameter and peak CINA and COWS scores showed a modest correlation (r=0.39, p=0.01 and r=0.36, p=0.01). There was no significant correlation between maximum pupil diameter and Bad Effects or Sick VAS.
Analysis of the internal consistency of the eleven COWS items revealed an overall Cronbach’s alpha of 0.78, indicating good reliability. As well, there was surprisingly little inter-item correlation between individual COWS items (Table 3). Only two item pairs were significantly correlated: restlessness with anxiety/irritability (0.67) and runny nose/tearing with yawning (0.54). A similar inter-item correlation matrix for the CINA revealed that the objective physiological measurements did not correlate well with the total CINA score, and that the item pairs showing significant inter-item correlations resembled those for the COWS (Table 4). Finally, the VAS items correlated with the total COWS and CINA scores about as well as did the individual items constituting the scales (Table 2).
Seven individuals did not differentiate between the effects of placebo vs. naloxone based upon VAS scores of Bad Effects. Of these individuals, four had opioid withdrawal (COWS scores ≥5) with both placebo and naloxone; two had no withdrawal in either session (COWS <5); and one person had mild withdrawal (COWS score of 6 two hours after injection) with naloxone only. There were no significant differences in demographic or history characteristics that explained those who did or did not differentiate naloxone from placebo. When these individuals were removed, no significant changes occurred in correlation, repeated measures regression, or time to peak analyses.
The accurate and rapid assessment of opioid withdrawal is important in the clinical management of opioid dependent patients in both inpatient and outpatient settings. As well, U.S. guidelines for opioid treatment require clinical evidence of dependence in patients, which may include the presence of withdrawal (SAMHSA, 2001). Likewise, office-based outpatient treatment requires a medical professional to assess opioid withdrawal when initiating treatment with buprenorphine or buprenorphine/naloxone (Center for Substance Abuse Treatment, 2004). The present analyses provide validation of a short, easy-to-use scale for withdrawal (COWS) as well as quantification of the relationship of that scale to the CINA and single-item VAS indices of withdrawal. Our results demonstrate that the COWS correlates well with the previously validated CINA scale in the context of a standardized naloxone challenge in opioid-dependent persons. The time course of withdrawal as measured by the COWS was congruent with the pharmacologic properties of naloxone. Finally, the overlap in content of the two scales (Table 1) supports the content validity and face validity of the COWS.
Internal consistency of the COWS was high, demonstrating the scale was reliable in measuring the construct of opioid withdrawal. Inter-item correlations indicated little item overlap, providing evidence of content validity (measuring a broad range of symptoms). There was a high degree of consistency across opioid withdrawal measures in terms of identifying and tracking the syndrome over time, demonstrating concurrent validity of these measures. The time courses of the COWS and CINA were remarkably consistent. The similarity to the CINA time course was somewhat less for the two VAS items, but the overall trend of both measures was the same. As well, the variance in the mean scores was relatively minor, except for the VAS items, which, as single items with a larger scale range, would be expected to have greater variance. This larger variance of the single-item VAS scores was probably also related to subjects’ understanding of the items, personality effects on expressing discomfort, or possibly demographic and history characteristics. Nevertheless, these single-item questions may have utility in following the progress of withdrawal distress and guiding its medical management. Given the strong correlation between CINA and COWS seen in Table 2, both scales are well suited for assessing and tracking opioid withdrawal. Modifying these scales to omit objective physiological indices may not affect each scale’s utility in discriminating the level of opioid withdrawal or guiding its medical treatment. Therefore, non-medical staff could aid in the assessment of withdrawal, and time of medical staff could be freed up for other needs. However, the physiological measures do provide objective indices to supplement the otherwise subjective responses and could thereby assist the clinician in identifying false positive withdrawal responses. 
The choice of assessment instrument should be determined by site-specific needs, including the probability of feigned responses and the desire for objective indices.
This study has several limitations. First, relatively mild opioid withdrawal was produced, most likely due to the combination of a low naloxone dose (0.4 mg) and a modest level of morphine physical dependence (120 mg/day). However, the recognition of more severe forms of opioid withdrawal is less ambiguous for most clinicians, and the more critical need is to have scales that are sensitive enough to distinguish mild but clinically significant withdrawal. The most useful aspect of an opioid withdrawal scale is in differentiating the presence versus absence of withdrawal, which the COWS does. The original COWS authors (Wesson and Ling, 2003) had specifically recommended that a validation be done for the low end of the scale, which this study accomplished. Second, we did not assess external reliability of these measurements; we did not have multiple raters of the same sessions to assess inter-rater reliability and did not perform multiple naloxone challenges to calculate test-retest reliability. This could be done in the future. Third, the modest correlation of COWS and CINA with pupil diameter is puzzling, given that mydriasis is a classic sign in opioid withdrawal (Himmelsbach, 1941). However, the measurement tool in this study may have affected this measure. The digital pupillometer decreased the ambient light reaching the eye by surrounding the eye before determining the pupil diameter; the intensity of lighting influences pupillary response to opioids (Weinhold and Bigelow, 1993). Fourth, this study included seven individuals who failed to distinguish placebo from naloxone. This did not change the overall results significantly, but it does highlight two important points: prior literature has shown that placebo can precipitate mild withdrawal in heroin-dependent individuals (Kanof et al., 1991), and not all opioid-dependent individuals respond to a naloxone challenge with signs or symptoms of withdrawal (Blachly, 1973; Wang et al., 1982). 
Finally, while these data document the sensitivity of these indices to opioid withdrawal, they do not address their specificity — i.e. the extent to which they may be affected by factors other than opioid withdrawal.
Even with these limitations, the validation of the COWS and correlations with the other opioid withdrawal measurement tools provide useful information for future clinical evaluation of this syndrome. The COWS and CINA followed comparable trajectories for the time course of opioid withdrawal. The VAS Bad Effects and Sick single-item assessments followed a parallel time course for withdrawal, suggesting these easily administered scales might be useful in certain settings for identifying and following opioid withdrawal. Clinicians who are not worried about feigned responses might simply use these questions to screen quickly for withdrawal and treat where appropriate. In other settings, the COWS or CINA could be used for the identification of withdrawal (with or without objective sign measurement) and for monitoring response to treatment interventions. Having easy and reliable quantification has distinct advantages when following withdrawal and setting up treatment protocols based upon these findings.
In summary, this study shows that the COWS is a valid instrument with sufficient sensitivity to detect mild opiate withdrawal. It would therefore be expected to detect moderate to severe withdrawal. The COWS, as well as the VAS items reported here, have potential uses in inpatient and outpatient treatment, in detoxification, and in research protocols. Their brevity and ease of use make them good choices for use in all these settings.
We thank Linda Felch for her help in analyzing the data and the many research assistants who ran the study sessions and assisted with data collection.
We would especially like to thank Lillian Salinas who helped create the figures and tables.
Role of Funding Source: Reckitt Benckiser Pharmaceuticals Inc. provided support and medications for use in this study. As well, this company’s representatives assisted in the preparation of the protocol and review of the manuscript. NIDA also provided funding but had no other role. Grants were R01 DA08045, K24 DA023186, and T32 DA07209.
Conflict of Interest: Drs. Tompkins and Bigelow as well as Mr. Harrison have no conflict of interest to report. Drs. Johnson and Fudala are employees of Reckitt Benckiser Pharmaceuticals Inc., which is a maker of buprenorphine and provided the funding and medications for the clinical trial. Dr. Strain also is a paid consultant to Reckitt Benckiser Pharmaceuticals. The terms of this arrangement are being managed by the Johns Hopkins University in accordance with its conflict of interest policies.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.