Clinicians recognize the importance of monitoring aberrant medication-related behaviors of chronic pain patients while being prescribed opioid therapy. The purpose of this study was to develop and validate the Current Opioid Misuse Measure (COMM) for those pain patients already on long-term opioid therapy. An initial pool of 177 items was developed with input from 26 pain management and addiction specialists. Concept mapping identified six primary concepts underlying medication misuse, which were used to develop an initial item pool. Twenty-two pain and addiction specialists rated the items on importance and relevance, resulting in selection of a 40-item alpha COMM. Final item selection was based on empirical evaluation of items with patients taking opioids for chronic, noncancer pain (N=227). One-week test-retest reliability was examined with 55 participants. All participants were administered the alpha version of the COMM, the Prescription Drug Use Questionnaire (PDUQ) interview, and submitted a urine sample for toxicology screening. Physician ratings of patient aberrant behaviors were also obtained. Of the 40 items, 17 items appeared to adequately measure aberrant behavior, demonstrating excellent internal consistency and test-retest reliability. Cutoff scores were examined using ROC curve analysis and reasonable sensitivity and specificity were established. To evaluate the COMM’s ability to capture change in patient status, it was tested on a subset of patients (N = 86) that were followed and reassessed three months later. The COMM was found to have promise as a brief, self-report measure of current aberrant drug-related behavior. Further cross-validation and replication of these preliminary results is pending.
Despite international attention to improve pain management, inadequate pain relief is a serious public health issue (Gilson et al., 2004; Joranson et al., 2000). While there are various nonpharmacologic and pharmacologic pain treatments, opioids have increasingly gained acceptance as key agents for treating chronic pain (American Pain Society, 1999, 2002; Michna et al., 2004). However, long-term administration of opioids to patients with chronic pain may be associated with increased risk of abuse and addiction (NIH, 2005).
With the growing support of opioid therapy as a treatment for chronic pain, the United States Government General Accounting Office (GAO, 2003) recommended efforts to improve the identification of abuse among patients of healthcare providers who prescribe controlled substances. Physicians are now in the difficult position of providing appropriate pain relief while minimizing the inappropriate use of pain medications (Hampton, 2004). Inappropriate use can include: selling and diverting prescription drugs; seeking additional prescriptions from multiple providers; and manipulating the formulations to use them in a manner in which they were not intended (e.g., snorting, injecting). Successful treatment of chronic, noncancer pain also depends on being able to monitor patients on opioid regimens frequently and to identify those patients who exhibit ongoing abuse behaviors (Passik & Kirsh, 2003; Friedman et al., 2003).
Unfortunately, there is no “gold standard” assessment for chronic pain patients undergoing opioid treatment. Probably the most well-developed tool is the interview by Compton and colleagues (1998) called the Prescription Drug Use Questionnaire (PDUQ). This 42-item measure is based on the American Society of Addiction Medicine (ASAM) definition of addiction in chronic pain patients and is designed to be used in an interview format. Chabal et al. (1997) developed a prescription abuse checklist of five criteria; patients who meet three of the five criteria are considered to be opiate abusers. Other measures include the Screening Instrument for Substance Abuse Potential (SISAP; Coambs et al., 1996), the Pain Assessment and Documentation Tool (PADT; Passik et al., 2004) and the Pain Medication Questionnaire (PMQ; Adams et al., 2004). While the authors of these measures report that some items distinguish patients currently abusing their medications from those who are not abusing their medications, none have undergone prospective testing such as that recommended by Robinson et al. (2001), and no one scale has been determined to be superior in assessing opioid abuse (Butler et al., 2004). No tool was developed exclusively for continued assessment of current opioid use.
The aim of this study was to engage experts in the field of pain medicine and addiction to develop and validate a self-report assessment tool to improve a clinician’s ability to assess a patient’s current misuse of opioids. Unlike other available predictive measures, the objective was to develop an assessment tool to periodically monitor misuse of medication for patients who have been prescribed opioids for an extended period of time. This article describes the development and initial empirical evaluation of a self-administered tool for detecting concurrent opioid misuse among patients prescribed opioid therapy for chronic pain.
Concise definitions of terms are important to minimize confusion and help to clarify the objectives of this study. For purposes of this investigation, we define substance misuse as the use of any drug in a manner other than how it is indicated or prescribed. Substance abuse is defined as the use of any substance when such use is unlawful, or when such use is detrimental to the user or to others. Addiction is a primary, chronic, neurobiologic disease that is characterized by behaviors that include one or more of the following: impaired control over drug use, compulsive use, continued use despite harm, and craving. Aberrant drug-related behaviors are any behaviors that suggest the presence of substance abuse or addiction (Kirsh et al., 2002; Nedeljkovic et al., 2002).
Content validity of the Current Opioid Misuse Measure (COMM) was established by tapping expert consensus as to those patient activities that are suggestive of current, ongoing aberrant drug-related behaviors. The consensus was achieved by having pain and addiction experts complete a concept mapping exercise (Trochim, 1989). Concept mapping uses both a qualitative and quantitative structured process to develop a consensus-based conceptual framework about a problem or issue (Jackson and Trochim, 2002; Trochim, 1989; Trochim et al., 1994). The concept mapping process is inductive (bottom-up), in that it begins with brainstorming specific ideas and moves toward more general concepts.
Concept mapping consists of three phases: (1) brainstorming, in which specific ideas from stakeholders are stimulated by a focus prompt; (2) rating each item brainstormed by the entire group; and (3) grouping the brainstormed items into conceptual clusters. Analyses of the rated and grouped data utilize multidimensional scaling and hierarchical cluster analysis statistical techniques.
Pain specialists, addiction experts, and primary care providers (doctors, nurses, and psychologists) who treat patients with chronic pain were recruited from five pain centers. The specialists came from Pennsylvania, New Hampshire, New York, and Massachusetts and were recruited through the Internet or through colleagues. We were interested in obtaining input from a range of professionals involved in the management of chronic pain patients, including doctoral-level providers (M.D., Ph.D., Psy.D.), nurses, and support staff. Support staff from the pain centers were included because of their significant interactions with pain patients and their unique perspective in identifying those patients who exhibit aberrant medication-related behaviors.
The research team developed a focus prompt to be presented to the respondents by mail or email for the brainstorming phase of the concept mapping procedure (Trochim, 1993). The task involved presenting the focus prompt and asking participants to provide at least 10 to 15 statements. The focus prompt distributed to the participants was: “Please list specific aberrant drug-related behaviors of chronic pain patients already taking opioids for pain. Please list as many indicators as possible that may signal that a patient is having problems with opioid therapy.” The items were reviewed and culled to remove duplicates. The remaining items were used in the sorting and rating stage of concept mapping.
Separate pain and addiction specialists were recruited from the International Pain and Chemical Dependency Listserv to complete the sorting and rating phase of the concept mapping procedures. Following procedures recommended by Trochim (1993), participants sorted and rated items individually using a computer program that was sent to them by email. Participants were instructed to sort the items into piles “in a way that makes sense to you.” After sorting, all statements were randomly ordered and presented to the participants for the rating process. Participants rated each item on its importance in determining whether a patient currently on long-term opioid therapy was misusing medications. Each item was rated on a five-point scale from “0 = relatively unimportant” to “4 = extremely important.”
Concept System® software was utilized to examine the conceptual consensus observed amongst the expert stakeholders (Concept mapping analysis and results conducted using The Concept System® software: Copyright 2004–2006; Concept Systems Inc.). The software uses multidimensional scaling to produce a point map of the statements, in which statements that were sorted together more frequently are relationally closer together than statements sorted together less frequently. Cluster analysis is applied to the point map to generate the concept maps and to compare the extent to which various subgroups within the stakeholder group tend to agree (or disagree) on which statements reflect coherent concepts. A self-report item pool was generated based on the concept mapping analyses results.
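The sorting data that feed the multidimensional scaling step can be illustrated with a small sketch. It is not the Concept System® implementation; it only shows, under assumed data, how pile sorts from several participants might be turned into a co-occurrence count and a dissimilarity matrix of the kind a scaling routine takes as input. The function names and the three example sorts are hypothetical.

```python
from itertools import combinations

def cooccurrence_matrix(sorts, n_items):
    """Count, for each pair of items, how many participants put them in the same pile."""
    counts = [[0] * n_items for _ in range(n_items)]
    for piles in sorts:                  # one participant's complete sort
        for pile in piles:               # one pile = list of item indices
            for i in pile:
                counts[i][i] += 1        # an item always co-occurs with itself
            for i, j in combinations(pile, 2):
                counts[i][j] += 1
                counts[j][i] += 1
    return counts

def to_dissimilarity(counts, n_sorters):
    """Rescale counts to distances in [0, 1]; 0 means always sorted together."""
    return [[1 - c / n_sorters for c in row] for row in counts]

# Hypothetical data: 3 participants each sort 4 items (indices 0-3) into piles.
sorts = [
    [[0, 1], [2, 3]],
    [[0, 1, 2], [3]],
    [[0], [1], [2, 3]],
]
counts = cooccurrence_matrix(sorts, n_items=4)
dissim = to_dissimilarity(counts, n_sorters=len(sorts))
print(counts[0][1], round(dissim[0][1], 3))
```

Items that are sorted together more often receive smaller distances, which is why they end up closer together on the point map.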
The item pool was sent to a pain management listserv of professionals (PainEDU.org) for expert feedback on the importance and wording of the items selected. Respondents (physicians, nurses, and psychologists) rated each item on a five-point scale from 0 = “relatively unimportant/irrelevant” to 4 = “extremely important/relevant” on “how important or relevant each statement is in determining whether a patient currently on a long-term opioid regimen is misusing their medication.” Respondents also rated each item for the quality of the wording of each statement on a five-point scale from 1 = “poor” to 5 = “excellent.” A field was provided for suggestions to improve the wording of an item. There was also a prompt for additional questions respondents felt should be included to help detect opioid pain medication misuse. This conceptual item evaluation was used to cull the item pool to create an alpha version of the COMM.
After construction of the alpha version of the COMM, empirical item selection consisted of examining: (1) the concurrent validity of each item with a measure of medication abuse/misuse/addiction; (2) the contribution of each item to internal consistency of the measure; and (3) the contribution of each item to test-retest reliability. Once final item selection was established, a cutoff score and associated sensitivity and specificity of the scale were determined.
Chronic pain patients were recruited from two hospital-based pain management centers (Massachusetts and Pennsylvania) and a private pain management treatment center in Ohio. Inclusion criteria for study participation were: (1) at least 18 years of age; (2) currently in treatment for chronic, noncancer pain (pain duration > 6 months); (3) currently taking opioids (greater or equal to the equivalent of 20mg oxycodone/day); (4) fluency in English; and (5) no serious psychiatric impairment. All subjects completed an informed consent form and were assured that the information obtained from the study would remain confidential and would not be a part of their clinical record nor affect their treatment in any way.
Participants were part of a larger study investigating the predictive validity of another scale and were therefore recruited three months earlier. The COMM study began only during the follow up assessment, three months following their recruitment. Participants were reimbursed $50 for initially completing the COMM and comparison measures and for giving a urine toxicology sample. Those participants who completed the test-retest assessment one-week later were paid another $50. Finally, a sub-sample of patients was followed for an additional three months. These patients, too, were paid $50. The Human Subjects Committees of Inflexxion, Inc. and the participating hospitals approved this study.
Information about the study was posted at each clinic inviting patients to participate. Those who agreed to participate signed an informed consent and completed the assessments. A convenience sample of sixty subjects from the initial pool of participants were asked to participate in a one-week test-retest reliability study. These patients, after completing the initial questionnaire, were given a packet with an identical COMM questionnaire, instructions for completing the questionnaire after one week, and a self-addressed stamped envelope to return their completed questionnaire to the researchers.
In addition to the alpha version of the COMM, participating patients also were assessed using the following measures.
The Prescription Drug Use Questionnaire (PDUQ; Compton et al., 1998) is a 42-item interview to assess abuse/misuse for pain patients. The patient answers yes/no questions about his or her pain condition, opioid use patterns, social and family factors, family history of pain and substance abuse syndromes, patient history of substance abuse, and psychiatric history. A test of the internal consistency of this measure resulted in a Cronbach’s alpha of 0.79. While there is some disagreement on what constitutes acceptable levels of internal consistency parameters (e.g., Nunnally & Bernstein, 1994; Butler, 2004; Streiner, 2003a,b), we adopted the convention of acceptable coefficient alphas as being between .70 and .80, as suggested by others developing brief health measures (e.g., Stewart et al., 1993). Compton and colleagues suggested a score below 11 ‘did not meet criteria for a substance disorder,’ while those with a score of 15 or greater ‘had a substance use disorder.’ For purposes of this study, those patients who had a score of 11 or higher on the PDUQ were identified as having a substance use disorder. Unfortunately, there is no gold standard assessment for use with chronic pain patients undergoing opioid treatment. The PDUQ has been used to evaluate the predictive validity of a screener of opioid abuse and was found in previous studies to be correlated with physician ratings of aberrant drug-related behavior (Butler et al., 2004).
The POTQ is an 11-item scale, adapted from the Physician Questionnaire of Aberrant Drug Behavior, that is completed by the treating clinician to assess misuse of opioids. The items reflect the behaviors outlined by Chabal et al. (1997) that were indicative of substance abuse. The participant patient’s chart was made available to the clinician to facilitate accurate recall of information. Providers answered yes or no to eleven questions indicative of misuse of opioids, including: multiple unsanctioned dose escalations; episodes of lost or stolen prescriptions; frequent unscheduled visits to the pain center or emergency room; excessive phone calls; and inflexibility around treatment options. Patients who were positively rated on three or more of the items met criteria for prescription opioid abuse. Clinicians were asked to complete the POTQ for each of their patients at the assessment period.
The short form of the Marlowe-Crowne scale is a 13-item self-report questionnaire designed to measure social desirability, a form of response bias. This measure was included to test items’ tendency to be associated with patients’ desire to answer questions in a socially desirable way. Reynolds (1982) found this scale to be a viable substitute for the regular 33-item Marlowe-Crowne scale (Crowne & Marlowe, 1964).
Participants provided a urine sample and informed staff of their current medications along with the date and time the medications had last been taken. Each subject was given a specimen cup and instructed to provide a urine sample (~30–75 ml of urine) without supervision in the clinic bathroom. The sample was shipped to a central Quest Diagnostics lab (www.questdiagnostics.com). Results of the urine toxicology were sent directly to the research team. The treating physician and the clinic did not have access to the results. The report included evidence of 6-MAM (heroin), codeine, dihydrocodeine, morphine, oxycodone, oxymorphone, hydrocodone, hydromorphone, meperidine, methadone, propoxyphene, buprenorphine, fentanyl, tramadol, amphetamines, barbiturates, benzodiazepines, cannabinoids, cocaine, phencyclidine, and ethyl alcohol.
Patients were classified on the Aberrant Drug Behavior Index (ADBI), which relates positively to opioid medication misuse. The ADBI is based on positive scores on the self-reported PDUQ, the physician-reported POTQ, and the urine toxicology results. A positive rating on the PDUQ is an accumulated score higher than 11 (Compton et al., 1998). A positive rating on the POTQ is given to anyone who has three or more physician-rated aberrant behaviors (Butler et al., 2004). A positive rating from the urine screens is given to anyone with evidence of having taken an illicit substance (e.g., cocaine) or an additional opioid medication that was not prescribed. We chose not to count the omission of a prescribed opioid medication as a positive rating because of multiple factors that can contribute to this result (e.g., subject ran out of the medication before the urine screen). Urine screen results were confirmed based on a chart review of prescription history and a comparison between self-report at the time of the urine screen and the toxicology report. Those with positive scores on the PDUQ (>11) were given a positive ADBI. If this score was negative, then positive scores on both the urine toxicology screen and on the POTQ (>2) were scored as having a positive ADBI. This allowed for triangulation of data to identify those patients who admitted to aberrant drug-related behavior and those who underreported aberrant behavior (e.g., low PDUQ scores, but positive POTQ and abnormal urine screen results).
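The ADBI decision rule described above can be written as a short function. This is a minimal sketch rather than the study's scoring code; the function name and parameter names are hypothetical, and the cutoffs follow the wording of this paragraph (PDUQ score above 11, POTQ count above 2). They are exposed as parameters because the PDUQ threshold is worded slightly differently elsewhere in the text.

```python
def adbi_positive(pduq_score, potq_count, urine_positive,
                  pduq_cutoff=11, potq_cutoff=2):
    """Return True when a patient is classified as ADBI-positive.

    pduq_score     -- total score on the PDUQ interview
    potq_count     -- number of POTQ items the clinician endorsed
    urine_positive -- True if the screen showed an illicit substance
                      or an unprescribed opioid
    """
    # Step 1: a positive PDUQ alone is sufficient.
    if pduq_score > pduq_cutoff:
        return True
    # Step 2: otherwise both the urine screen and the POTQ must be positive.
    return bool(urine_positive) and potq_count > potq_cutoff

# Hypothetical patients.
print(adbi_positive(14, 0, False))  # positive by self-report
print(adbi_positive(5, 3, True))    # low PDUQ, but POTQ and urine both positive
print(adbi_positive(5, 3, False))   # POTQ alone is not enough
```

The second branch is what captures patients who underreport on the interview but are flagged by both the clinician and the toxicology result.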
Items were examined with the following predetermined criteria in mind: (1) Selected items should correlate with the criterion at a level of .20 or higher; (2) The correlation with the criterion should be greater than that item’s correlation with the Marlowe-Crowne, suggesting that the item conveys more information about the criterion than it does about social desirability; and (3) an item’s test-retest IntraClass Correlation (ICC) should be greater than .50. The research team met to review each item with respect to these parameters. The final selected items were those, in the opinion of the research team, which represented the best combination of items.
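The three predetermined criteria can be expressed as a simple screen. This sketch is illustrative only: the function name, the item labels, and the numbers are hypothetical, and criterion (2) is implemented as a comparison of absolute magnitudes, which is an assumption about how the correlations were compared. The final selection in the study was a judgment made by the research team, so a screen like this would only flag candidates.

```python
def meets_item_criteria(r_criterion, r_social_desirability, icc):
    """Check one item against the three predetermined selection criteria."""
    correlates_with_criterion = r_criterion >= 0.20
    # Assumption: compare magnitudes, since social desirability correlations
    # may be negative.
    beats_social_desirability = abs(r_criterion) > abs(r_social_desirability)
    stable_over_time = icc > 0.50
    return (correlates_with_criterion
            and beats_social_desirability
            and stable_over_time)

# Hypothetical item statistics: (r with criterion, r with Marlowe-Crowne, ICC).
items = {
    "item_a": (0.35, -0.15, 0.72),
    "item_b": (0.12, -0.05, 0.80),   # fails criterion 1
    "item_c": (0.25, -0.40, 0.65),   # fails criterion 2
    "item_d": (0.30, -0.10, 0.45),   # fails criterion 3
}
candidates = [name for name, stats in items.items()
              if meets_item_criteria(*stats)]
print(candidates)
```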
Receiver operating characteristic (ROC) curves were used to assess the sensitivity and specificity of the COMM as a screening tool for the detection of aberrant drug-related behavior. As with other screening tests, a survey instrument’s sensitivity is of primary importance, but its specificity must be considered as well. ROC analysis helps to determine the appropriate cutoff score for optimizing sensitivity and specificity for the given scale, assuming patients are drawn from a comparable population (i.e., patients with chronic pain being seen at a pain treatment center). All analyses were performed using SPSS v. 13 (Chicago, IL).
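To make the cutoff analysis concrete, the sketch below computes sensitivity and specificity at a given cutoff and the area under the ROC curve for a small set of scores. It is not the SPSS procedure used in the study: the data are invented, the function names are hypothetical, a score at or above the cutoff is treated as a positive screen, and the AUC is computed through its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half).

```python
def sens_spec(scores, labels, cutoff):
    """Sensitivity and specificity when score >= cutoff counts as a positive screen."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical screening scores and criterion status (1 = criterion-positive).
scores = [3, 5, 10, 6, 9, 12, 20, 14]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

for cutoff in (9, 10):
    sens, spec = sens_spec(scores, labels, cutoff)
    print(cutoff, sens, spec)
print(auc(scores, labels))
```

Raising the cutoff by one point in this toy data lowers sensitivity without improving specificity, which is the kind of trade-off a table of cutoffs is meant to expose.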
An important goal of the COMM is to track patient status over time, so that the COMM could be used repeatedly and provide an estimate of the patient’s “current” status. Thus, items were written to capture a 30-day time period (i.e., “in the past 30 days”), and only behaviors that could change from time to time were included (i.e., historical items were excluded). Eighty-six (86) patients were randomly selected from the original pool of patients and followed for an additional three months. Resource limitations prevented a three-month follow-up of all 227 patients, and an N of 86 participants was determined to be feasible. Power calculations for ROC curve analyses, following methods described by Obuchowski and McClish (1997) and assuming that about one-third of patients (N = 29) would have a positive COMM score, indicated that the power to test the hypothesis of an AUC of .80 versus an AUC of .50 (chance detection) is high (.99). Thus, we concluded that 86 patients was a sufficient N for the follow-up portion of the study.
Assessment procedures were then repeated for these participants, including retaking the COMM, undergoing another PDUQ interview, and supplying another urine sample for toxicology testing. Additional information about aberrant drug behavior was collected from their treating physician using the POTQ. Analyses included examination of the area under the curve in the ROC curve analyses along with standard measures used to evaluate the effectiveness of cutoff scores (e.g., Sackett et al., 1991).
Twenty-six professional pain and addiction specialists were recruited from the International Pain and Chemical Dependency Listserv. Approximately 30% of the participants were female; 69% (N = 18) were doctoral level providers; 15% (N = 4) were nurses; 15% (N = 4) were support staff; and 42% (N = 11) represented a minority group.
Five hundred seventy-nine (N=579) unique items were generated during the brainstorming stage. Duplicates were removed and very similar items were reworded or combined, resulting in a list of 177 items. These items were further reduced by having research team members make two ratings on each statement, one reflecting importance of the item and another reflecting quality of item wording, on a scale from 1 = “not at all important/very poorly worded” to 5 = “very important/excellent wording.” Statements that achieved an average importance rating of 3 or higher were retained. This resulted in a final list of 94 statements that were used in the sorting and rating stage. Each of the 26 participants rated the items’ importance and sorted the statements individually. All data were analyzed as a single project using Concept Systems. A six-cluster concept map was generated (see figure 1).
The six clusters identified through the concept mapping process were labeled: (1) signs and symptoms of drug misuse; (2) emotional problems/psychiatric issues; (3) poor response to medications; (4) evidence of lying and illicit drug use; (5) inconsistent appointment patterns; and (6) medication misuse/abuse as well as noncompliance with medication. The map presents each cluster as having one to five layers that represent the average rating of the importance of statements included in the cluster. The legend presents the value range included in each layer. Thus, single-layered clusters contain statements that were rated, on average, as least important, with averages from 3.26 to 3.44 (out of a possible 5). Conversely, clusters with five layers contain statements rated, on average, as most important, with averages from 3.97 to 4.14. Note the size of the cluster is a visual representation of the extent to which the items in a given cluster were grouped together. This means the smaller the area of the cluster, the more often participants sorted these statements together. Conceptually, a larger area suggests a broader, less well-defined concept. Finally, clusters that are further apart reflect statements that were least likely to be sorted together. Closer clusters contain statements that were more likely to be sorted together. This suggests that the concepts represented by the clusters are conceptually “closer” to each other as determined by this sample of professionals.
Examination of Figure 1 suggests that the two most important concepts (more layers) are medication misuse/noncompliance (average rating = 4.05) and evidence of lying and drug use (average rating = 3.88). Medication misuse/noncompliance reflects observations such as evasiveness around providing urine samples, reports of stolen or lost prescriptions, “difficulties” with pharmacy, etc. Evidence of lying and drug use includes observations such as positive urine screens for illicit or unprescribed drugs, reports of supplementing medications with alcohol or drug use, etc. The third most important concept is emotional problems/psychiatric issues (average rating = 3.70). This reflects reports of anger or impulse control issues (such as getting into intense arguments and fights), emotional stability, suicidality concerns, emerging family or marital problems, etc. The remaining single-layer clusters are 1) poor response to medication, which taps into patient behaviors of complaining (pain is always 10 out of 10), signs of inflexibility, refusal to consider non-medication treatment alternatives, etc.; 2) signs/symptoms of drug misuse, which includes observations such as patients appearing intoxicated during the visit, patient quality of life decreasing, not able to make quick, simple decisions, etc.; and 3) appointment pattern use including being late for clinic appointments, missing mental health appointments while keeping “script” appointments, etc. Clearly, despite the relative ranking of the concepts, the participants rated all of these issues as highly important. The average cluster rating for the lowest rated cluster was 3.35 (almost a whole point higher than the midpoint of the five-point scale), suggesting that the raters felt that all clusters were relatively important.
Figure 2 represents a ladder graph that compares the importance ratings of the clusters of the doctoral-level participants (N= 18) and the non-doctoral level participants (N = 8), which in this group were nurses and support staff. This figure shows that the two groups tended to rate the clusters in a reasonably similar manner with respect to importance, achieving a high positive correlation of 0.96. This suggests that these two groups of healthcare providers tend to see the factors that may be related to identifying which patients do well or poorly on long-term opioid treatment in generally similar ways. Specifically, the two highest-ranking clusters, medication misuse/noncompliance and evidence of lying and drug use, were identical for the two groups. The relative rankings of the other four concepts were somewhat different. Nurses/support staff and non-doctoral level individuals tended to see appointment patterns as more important than physicians, perhaps because they deal more directly with such patterns. Likewise, patients being more difficult in their interactions with providers may affect nurses and support staff more directly than physicians, prompting these individuals to rate such behaviors more highly. Both groups of participants rated all the concepts about a 3 on a 5-point scale, suggesting that the participants viewed all concepts as important. While these concepts require empirical validation, it is encouraging to find reasonably high correspondence of views across disciplines.
Based on expert importance ratings, the highest ranked items in each cluster made up the alpha version of the COMM. Forty items were identified for the alpha version. The two most important clusters were: 1) medication misuse/noncompliance and 2) evidence of lying and drug abuse, contributing to half of the items in the COMM. The remaining items were derived from the other four clusters.
The final step was to consult with pain and addiction experts to wordsmith items and get consensus on the importance of each item in the alpha version of the COMM. Twenty-two different respondents were recruited from the Internet (PainEDU.org) to rate 40 items in the alpha version. The majority of respondents (55%; N=12) were nurses; 27% (N=6) were doctors; and 27% (N=6) were other specialties (pharmacists and counselor). Thirty-two percent (N=7) were male and 14% (N=3) were minority. The average number of years participants have been working in the pain and addiction field was 9.27 (range of 1 to 15 years). Participants rated “how important or relevant each statement is in determining whether a patient currently on a long-term opioid regimen is misusing their medication.” All 40 items received an average importance of 2.41 or higher (1=“not at all important” to 5=“very important”). Thirty-seven of the 40 items received average importance ratings of 3.18 or higher. Considering the overall high ratings for each item, the team decided to keep all 40 items in the alpha version.
Two-hundred twenty-seven (N = 227) chronic pain patients were recruited and completed the 40-item COMM. The intent of these first analyses was to examine the items of the alpha version of the COMM. It was decided that at least 200 participants (five for each of the 40 items on the alpha version of the COMM) would be adequate for an item-by-item analysis of the COMM. This number was also feasible given the resources available for this study. Since all 227 available participants met inclusion criteria, all were included in the study. Sixty-two percent (61.7%; N = 140) were women; 35.7% (N = 81) were men (gender data were missing for six individuals, 2.6%); and 14.1% (N = 32) were minorities (racial data were missing for six individuals or 2.6%). Mean age of the participants was 50.8 years (SD = 12.4, range = 21 to 89). Additional demographic and descriptive characteristics are presented in Table 1.
A convenience sample of sixty subjects was selected to complete the COMM one week after the first administration to assess test-retest reliability. Power calculations recommended by Walter et al. (1998) and Winer (1991) for the ICC analyses reveal that N = 28 will detect an ICC of .70 (an acceptable level of intraclass correlation) at 80% power. However, we were concerned that the magnitude of the ICCs obtained be reasonably stable and replicable. That is, a subject number that is too small will result in a test-retest value that might not cross validate. Thus, we elected to increase the numbers to 60 recruited patients, of which 55 (92%) completed the test-retest. Non-responders (mean age = 45.5) were significantly younger than the rest of the sample (mean age = 52.0; p<0.001). No other statistically significant differences were found on demographic variables between this subgroup and the total sample. Eighty-six patients from the original group (38.9%) were followed for another three months and re-evaluated. Demographics of the participants in this subgroup were not different from the rest of the sample, except for age. Those whose results were repeated were significantly older than those who were not selected (repeated testing patients’ mean age = 55.6 versus 47.8 in patients not selected, difference is significant at p < .001).
The 40 COMM items were empirically examined along the following parameters: the mean, standard deviation, and range of each item; its test-retest ICC; and its correlation with the ADBI and the Marlowe-Crowne. Furthermore, the sample was randomly split in two and the correlations rerun for the ADBI and Marlowe-Crowne to minimize the chances that the correlations observed were reliant on the particular characteristics of the sample. In general, items were considered candidates when respondents used the entire range of responses, when the test-retest ICC was at least .50, and the correlation with the ADBI was greater in absolute magnitude than the correlation with the Marlowe-Crowne (i.e., the item tells us more than merely social desirability). Items with the best balance of these characteristics were selected. Table 2 presents the final 17 COMM items and the total score; the concept mapping category represented by the item; the mean and standard deviation; item-specific and overall score test-retest reliability ICC; the correlation with the ADBI; and the correlation with the Marlowe-Crowne. In addition, the Table presents the effect size of the item with the criterion. Overall, the 17 items achieved a mean score in this sample of 10.18 (SD = 7.58) with a range from 0 to 42 (possible range = 0 to 68). In all cases except one (i.e., item 8), the item correlation with the ADBI was higher than the item’s correlation with the Marlowe-Crowne. This item was included because it performed well when the sample was randomly split. The total COMM score correlated .51 with the ADBI and -.26 with the Marlowe-Crowne. One-week test-retest for the total COMM score was excellent (ICC = .86 with a 95% confidence interval ranging between .77 and .92). Coefficient alpha for the 17-item COMM (α = .86) suggested excellent internal reliability. Cohen’s D effect size of the total COMM score with the criterion (i.e., the ADBI) was large at 1.25.
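For readers unfamiliar with coefficient alpha, the following sketch shows the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), applied to a tiny invented response matrix. The data and the function name are hypothetical and are not drawn from the study; items are scored 0 to 4 only to mirror the scale's possible range of 0 to 68 over 17 items.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Coefficient alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])                               # number of items
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical data: 5 respondents answering 3 items scored 0-4.
responses = [
    [0, 1, 0],
    [1, 2, 1],
    [2, 2, 3],
    [4, 3, 4],
    [3, 4, 3],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

Alpha rises as items covary more strongly relative to their individual variances, so it reaches 1 when every item orders respondents identically.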
The primary purpose in developing the COMM was to have a scale that could be used by clinicians to determine whether their patient may be engaged in aberrant drug-related behavior. Thus, the validity of the COMM should be established as a screener with specified sensitivity and specificity for predicting whether or not the patient is actually exhibiting aberrant drug-related behavior. The analysis for establishing the degree to which a test accurately detects a particular condition is the Receiver Operating Characteristic (ROC) curve analysis. COMM scores were analyzed in terms of their relationship to the ADBI, and the ROC curve is presented in Figure 3. The area under the ROC curve was .81 (95% confidence interval .74 to .86; standard error = .031; p < .001), suggesting that the COMM discriminates patient status significantly better than chance. Table 3 presents the sensitivity and specificity estimates for the range of COMM scores gauged against the ADBI. A cutoff score of 9 yielded a sensitivity of .77 and a specificity of .68. A cutoff score of 10 yielded a sensitivity of .74 and a specificity of .73. These data suggest that a cutoff score of 9 or higher may be a reasonably conservative choice for the COMM cutoff.
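The logic of the cutoff analysis can be sketched as follows. This is a minimal illustration, not the software used in the study: the function names are hypothetical, the scores and criterion values are invented, and the area under the curve is computed with the rank-based (Mann-Whitney) formulation, which is assumed here for simplicity.

```python
def sens_spec_at_cutoff(scores, criterion, cutoff):
    """Treat score >= cutoff as screen-positive and compare with a binary criterion."""
    pairs = list(zip(scores, criterion))
    tp = sum(1 for s, c in pairs if c and s >= cutoff)
    fn = sum(1 for s, c in pairs if c and s < cutoff)
    tn = sum(1 for s, c in pairs if not c and s < cutoff)
    fp = sum(1 for s, c in pairs if not c and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, criterion):
    """Area under the ROC curve: the probability that a randomly chosen
    criterion-positive patient outscores a randomly chosen criterion-negative
    patient, with ties counted as one half."""
    pos = [s for s, c in zip(scores, criterion) if c]
    neg = [s for s, c in zip(scores, criterion) if not c]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented total scores and criterion classifications (1 = aberrant behavior present).
scores = [2, 5, 8, 9, 12, 15, 20, 30]
criterion = [0, 0, 0, 1, 0, 1, 1, 1]

for cutoff in (9, 10):
    sens, spec = sens_spec_at_cutoff(scores, criterion, cutoff)
    print(f"cutoff {cutoff}: sensitivity = {sens:.2f}, specificity = {spec:.2f}")
print(f"AUC = {auc(scores, criterion):.4f}")
```

Raising the cutoff can only lower sensitivity and raise specificity (or leave them unchanged), which is the trade-off summarized across the range of scores in a table such as Table 3.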
In order to examine the COMM’s ability to track changes over time, data from the 86 individuals who were re-assessed three months later were analyzed to determine the extent to which a new COMM score corresponded to their misuse/abuse status at that time, using the interview data from a readministration of the PDUQ, new urine toxicology results, and provider input on the POTQ (i.e., the ADBI). Examination of the ADBI indicated that four of the 26 individuals (15.4%) who were initially classified as misusing/abusing their medications were no longer doing so, and nine of the 60 (15%) initially classified as not misusing their medication were now classified as misusing/abusing their medications. The ROC curve analysis of the COMM data compared with the ADBI yielded an area under the curve of .92 (95% confidence interval .86 to .98; standard error = .028; p < .001), suggesting excellent detection of patient status. A cutoff score of 9 yielded a sensitivity of .94 and a specificity of .73, while a cutoff score of 10 had a sensitivity of .84 and a specificity of .82. These compare well with the results obtained in the original analysis, three months earlier. The COMM identified as positive 29 of the 31 individuals determined by the ADBI to be misusing or abusing their medications.
Aside from the excellent sensitivity and specificity, we also calculated other indices used to evaluate cutoff scores, including positive predictive value, negative predictive value, positive likelihood ratio, and negative likelihood ratio. In this instance, using a cutoff score of 9 or higher, the positive predictive value was .66, while the negative predictive value was .95. The positive likelihood ratio was 3.48, and the negative likelihood ratio was .08.
Each value presents a somewhat different picture of the ability of a cutoff score to detect true positives and true negatives while minimizing false positives and false negatives. All screening tests produce some false positives and some false negatives, so the decision about which cutoff score to use depends on the provider’s judgment about what is best for his or her patients (e.g., Sackett et al., 1991). Sensitivity, for instance, is the proportion of patients with the target condition who have a positive test result, while specificity is the proportion of people without the target condition who test negative. The positive predictive value is the proportion of patients with a positive score who have the target condition, and the negative predictive value is the proportion of patients with a negative score who do not have it. It is important to note that, of the values presented, likelihood ratios are least affected by pretest prevalence of the target condition in a particular sample.
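The relationships among these indices can be made concrete with a 2 × 2 table. In the minimal sketch below, the 29 true positives and 2 false negatives follow from the text (29 of 31 criterion-positive patients detected at a cutoff of 9). The 15 false positives and 40 true negatives are not reported directly; they are an assumption, reconstructed from the reported specificity of .73 among the remaining 55 of the 86 retested patients. The function name is hypothetical.

```python
def screening_indices(tp, fn, fp, tn):
    """Standard screening indices from the four cells of a 2 x 2 table."""
    sensitivity = tp / (tp + fn)   # positives detected among those with the condition
    specificity = tn / (tn + fp)   # negatives detected among those without it
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),     # positive predictive value
        "npv": tn / (tn + fn),     # negative predictive value
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


# tp and fn follow from the text; fp and tn are reconstructed (assumed), see lead-in.
indices = screening_indices(tp=29, fn=2, fp=15, tn=40)
for name, value in indices.items():
    print(f"{name}: {value:.2f}")
```

Under these assumed cell counts, sensitivity, specificity, and both predictive values round to the figures reported above. The likelihood ratios computed from unrounded proportions come out slightly different from the reported 3.48 and .08, which are what one obtains from the rounded sensitivity (.94) and specificity (.73).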
These results suggest that the total for the 17 self-report items of the COMM appears to provide a good estimate of whether a patient is currently misusing or abusing their medications.
Further analyses were conducted in order to examine the COMM for potential bias with respect to age, gender, race, and education level. Bias with respect to a criterion measure requires an examination of slope invariance (Nunnally & Bernstein, 1994). Logistic regressions were run in which the dependent variable was the ADBI (the criterion), and the independent variables were the COMM score, the demographic variables of age, gender, race, and education, and an interaction term. Bias is assumed if the interaction term is significant in the regression. No interactions were found to be significant. These results suggest that the COMM scores are not biased on these demographic variables with respect to the criterion used in this study.
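One way to write the slope-invariance test is sketched below for a single demographic variable. The symbols are assumptions introduced for illustration: $D$ stands for one demographic variable (for example, gender), and the text does not state whether the variables were tested in separate models or jointly.

```latex
\operatorname{logit} P(\mathrm{ADBI} = 1)
  = \beta_0 + \beta_1\,\mathrm{COMM} + \beta_2\,D + \beta_3\,(\mathrm{COMM} \times D)
```

In this form, $\beta_1$ is the slope relating the COMM score to the criterion, and $\beta_3$ measures how much that slope differs across levels of $D$. A nonsignificant $\beta_3$ is consistent with slope invariance, that is, with no evidence of bias on that variable.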
This study attempts to create a valid and reliable self-report measure of current opioid medication misuse (COMM). The benefit of such a measure is to document the reliable use of opioids in the treatment of pain for persons with chronic pain. A 40-item questionnaire was developed using input from a panel of experts and concept mapping analyses. Seventeen of the items of the COMM were found to show good reliability and adequate validity in identifying which chronic pain patients currently on long-term opioid therapy would show evidence of medication misuse or abuse after an extensive assessment process. The questionnaire appears to be easy to understand and takes little effort to score.
Unlike other measures that were designed to identify risk potential for substance abuse (predictive validity), the COMM is designed to address ongoing medication misuse by asking patients to describe how they are currently using their medication. Each question asks the relative frequency of a thought or behavior over the past 30 days from “0 = Never” to “4 = Very Often.” Thus, instead of identifying character and personality traits based on past history, the COMM is mostly interested in current behaviors and cognition. We recognize that patients taking opioids for pain who misuse their medication are prone to be less than truthful when completing a current medication misuse questionnaire; however, many of the items are subtly related to misuse of medication and are less transparent. We have also found that patients are willing to admit to certain items if they rate them as 1 = ‘seldom’ on a 0 to 4 scale (Butler et al., 2004), thus decreasing the chance that the patients will falsify all of their answers.
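The scoring described here is simple enough to sketch in a few lines. The example below assumes the total is the plain sum of the 17 item ratings (each 0 to 4, for a possible range of 0 to 68) and applies the cutoff of 9 or higher discussed in the results. The function name and the validation behavior are illustrative choices, not part of the published instrument.

```python
N_ITEMS = 17          # items retained in the final COMM
MAX_RATING = 4        # ratings run from 0 = Never to 4 = Very Often


def comm_total(responses, cutoff=9):
    """Return (total score, screen-positive flag) for one completed questionnaire."""
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item ratings, got {len(responses)}")
    if any(r not in range(MAX_RATING + 1) for r in responses):
        raise ValueError("each rating must be an integer from 0 to 4")
    total = sum(responses)
    # A total at or above the cutoff is flagged for closer clinical review;
    # it is a screening signal, not a diagnosis.
    return total, total >= cutoff


# Hypothetical patient who rates nine items as 1 and the rest as 0.
print(comm_total([1] * 9 + [0] * 8))
```

A patient who answers the lowest nonzero rating on about half of the items already reaches the cutoff, which illustrates how low the threshold is relative to the 0 to 68 range.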
The COMM cutoff score was selected to over-identify misuse rather than to mislabel someone as responsible when they are not, which is why a low cutoff score was accepted. With a low cutoff, even modest endorsement of the COMM items is more likely to flag current medication misuse. We believe that it is preferable to flag patients who have only a possibility of misusing their medications than to miss those who are actually abusing their medication. Thus, this scale will result in false positives: patients identified as misusing their medication when they are not. Similar to past measures that help to predict substance abuse, the COMM may also be valuable as a scale to identify those who are not having problems with their use of opioids (very low scores). However, since there are no objective means by which to identify substance abusers, errors can be made. Clinicians are encouraged to practice caution when interpreting the results of the COMM and to take into consideration other extenuating circumstances. As with all screeners, the COMM is a single indicator of possible medication misuse. Additional information should be used in making a diagnosis of a substance abuse disorder (Savage, 2002).
There is a risk that the COMM could be used as a “gatekeeper” for discontinuing prescription of opioids by some providers. Our past experience with the SOAPP (Screener and Opioid Assessment for Patients with Pain) has shown, however, that prescribing physicians are more willing to maintain patients on opioids for pain because of the reassurance offered by the SOAPP that there are minimal signs of opioid abuse (Butler et al., 2004; Akbik et al., 2006). Thus, the COMM may also be used in helping to reassure physicians about their prescription practices.
The goal of the COMM is to identify those patients with chronic pain taking opioids who have indicators of current medication misuse. We believe that the COMM will be able to assist providers in documenting compliance along with the use of other indicators such as periodic urine screens. We do not believe that the COMM should be used to deny care but rather to make appropriate decisions about the best ways to manage chronic pain. Ideally, the results of the COMM can serve as an educational tool for patients and providers. While the COMM will require additional research, the initial results suggest that this scale could be used in a pain practice or general medical setting to help document ongoing patient compliance. Patients who score higher on the COMM could be seen on a more frequent basis, with regular pill counts and urine toxicology screens.
As noted in previous studies, physicians can be unreliable in identifying aberrant drug behavior within a busy pain practice. This was further supported in recent studies showing a 44.5% rate of abnormal urine results among random drug screening (Michna et al., in press) and a high degree of unreliability in physicians’ judgments of aberrant behavior (Wasan et al., in press). These results reinforce the notion that the COMM is only one source of information and should never be used in isolation to determine appropriate use of opioids.
We purposely divided the experts into doctoral-level and non-doctoral-level (e.g., nurses) groups to see if there would be agreement in the way that these professionals rank-ordered the factors. We found that the two groups tended to rate the clusters in a reasonably similar manner with respect to importance, achieving a high positive correlation of 0.96. Specifically, the two highest-ranking clusters, medication misuse/noncompliance and evidence of lying and drug use, were identical for the two groups. The relative rankings of the other four concepts were somewhat different. Nurses/support staff and non-doctoral level individuals tended to see appointment patterns as more important than physicians, perhaps because they deal more directly with such patterns. Likewise, patients being more difficult in their interactions with providers may affect nurses and support staff more directly than physicians, prompting these individuals to rate such behaviors more highly. Both groups of participants rated all the concepts about a 3 on a 5-point scale, suggesting that the participants viewed all concepts as important. While these concepts require empirical validation, it is encouraging to find reasonably high correspondence of views across disciplines. Including more input from other professional groups in future validity studies of the COMM would be recommended.
The following limitations of this study deserve mention. First, this study needs to be replicated with more subjects in a variety of centers. We do not know, for instance, how useful the COMM may be in primary care settings vs. tertiary university-based pain centers. Attempts were made to include minorities, but further information on the usefulness of the COMM among different ethnic groups and pain populations is also needed.
Second, the long-term reliability of the COMM is unknown. We include the results of one-week test-retest reliability and three-month repeated administration data, which produced very promising results. However, use of the COMM repeatedly over a longer period of time has yet to be assessed.
Third, COMM items were derived by consensus and concept mapping techniques. Cross-validation of empirically-derived COMM items is needed. Also, evaluation of the COMM’s ability to detect misuse or abuse, both initially and at the three-month re-administration, was based on data from the same patient sample used to develop the measurement items. We strongly believe that a balanced approach is necessary and that other reasons that might account for a patient’s behavior need to be considered in order to avoid prejudicial thinking. A study is currently underway to cross-validate the COMM and to further examine its psychometric properties with a new population of patients. Ultimately, a revised version that would incorporate more predictive yet subtle items to reduce the risk of fabrication may be needed.
Opioids will likely continue to play a critical role in the treatment and management of chronic noncancer pain. The development of the COMM may offer clinicians a way to monitor misuse behaviors and to develop treatment strategies designed to minimize continued misuse. The COMM may serve as a useful tool for those providers who need to document their patients’ continued compliance and appropriate use of opioids for pain. The results of this measure may have the added benefit of reducing physicians’ concerns related to prescribing opioids and may keep patients more cognizant of their need to be responsible with these medications.
Special thanks are extended to MaryJane Cerrone, Mary Ann Yackabonis, David Janfaza, Edward Michna, Leslie Morey, Sanjeet Narang, Srdjan Nedeljkovic, Bruce Nicholson, Sarah O’Shea, Edgar Ross, Sharonah Soumekh, and Ajay Wasan, and staff members from Brigham and Women’s Hospital, Lehigh Valley Hospital, and PainCare of Northern Ohio for their participation in this study. Thanks also to Paul Guttry for reviewing an earlier version of this paper. This research was supported in part by a grant awarded to the first author (DA015617) from the National Institutes of Health, Bethesda, MD and by an unrestricted grant to Inflexxion, Inc. from Endo Pharmaceutical, Chadds Ford, PA.