Despite the fact that the Diagnostic and Statistical Manual of Mental Disorders-IV (DSM-IV;
American Psychiatric Association [APA], 1994) has been in use for approximately 14 years, several gaps in our knowledge remain regarding whether the classification of a diagnosis should be categorical or dimensional, whether different dependence criteria should be used for different drugs, and how measurement bias influences assessment of diagnoses (
Edwards, 2007;
Hughes, 2006;
Saunders and Schuckit, 2006). As recently noted by
Edwards (2008), the dependence syndrome was a provisional idea in 1976 and in some ways it remains so today. At present, diagnoses of substance use disorders rely on interview data. In most addiction treatment studies, investigators not only depend upon self-report diagnostic instruments for the assessment of some inclusion and exclusion criteria, but also often use these data to examine differences in treatment outcomes across diagnostic categories or key demographic groups (
Carroll, 1997). The quality of diagnostic measures in turn affects the validity of interpretations of study findings and evaluation of treatment interventions and programs. This issue is centrally relevant to some of the multisite studies conducted by the Clinical Trials Network (CTN) that seek to inform practice in real-world clinical settings. It is therefore important to evaluate whether the diagnostic instruments used in these studies actually measure the intended construct and whether symptom-endorsing is equivalent for drug users with diverse demographic characteristics.
To help ensure that an instrument is a sound measure that enables unbiased diagnoses for diverse subgroups, it is essential to show measurement equivalence of the instrument (
McHorney and Fleishman, 2006). Measurement nonequivalence occurs when persons with equivalent levels of the dependence factor respond differently to an instrument as a function of group membership (
Chen and Anthony, 2003;
Grant et al., 2007). This in turn may lead to inaccurate comparisons across groups involving diagnosing participants, as well as assessing prevalence rates, risk factors, and treatment outcomes of the measured conditions (
Chen and Anthony, 2003;
McHorney and Fleishman, 2006). DSM-IV Checklist has been used increasingly by investigators to assess the status of substance dependence of study participants in addiction treatment studies (
Peirce et al., 2006;
Petry et al., 2005;
Rawson et al., 2004). Despite its increasing ubiquity, little is presently known about the psychometric properties of DSM-IV Checklist for substance use disorders (
Forman et al., 2004).
In this study, we apply item response theory (IRT;
Embretson and Reise, 2000) and multiple indicators, multiple causes (MIMIC) modeling (
Chen and Anthony, 2003;
Wu et al., in press) to examine the psychometric properties of DSM-IV criteria for cocaine and opioid dependences assessed by DSM-IV Checklist (
Wu et al., 2008). The data source for the study is a multisite randomized CTN trial examining the effects of lower-cost incentives on stimulant users in six community methadone maintenance treatment settings across the country (
Peirce et al., 2006), all of which utilized DSM-IV Checklist to make diagnoses of stimulant and opioid dependences. As noted in a recent study by
Saha and colleagues (2006), IRT methods can be very useful in evaluating how individual diagnostic criteria map onto the dependence construct, how well each item performs, how much information each criterion contributes to a diagnosis, and whether symptom-endorsing is equivalent between demographic groups. This level of information can help identify poor versus good criteria items for a given diagnosis, thus providing highly relevant information for evaluating specific criteria that define underlying constructs of disorder.
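To make the IRT quantities named above concrete, the following is a minimal sketch that assumes a two-parameter logistic (2PL) model, a common choice for binary symptom criteria. The source does not state which IRT model is fitted, and the function names and parameter values below are hypothetical illustrations, not estimates from any study.

```python
import math


def endorsement_probability(theta, discrimination, severity):
    """2PL item response function: probability of endorsing a criterion
    given a person's latent dependence level theta."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - severity)))


def item_information(theta, discrimination, severity):
    """Fisher information that a 2PL item contributes at latent level theta."""
    p = endorsement_probability(theta, discrimination, severity)
    return discrimination ** 2 * p * (1.0 - p)


# Hypothetical items (illustrative values only, not study estimates).
low_severity_item = {"discrimination": 1.5, "severity": -1.0}
high_severity_item = {"discrimination": 1.5, "severity": 1.0}

for theta in (-2.0, 0.0, 2.0):
    p_low = endorsement_probability(theta, **low_severity_item)
    p_high = endorsement_probability(theta, **high_severity_item)
    print(f"theta={theta:+.1f}  P(low)={p_low:.3f}  P(high)={p_high:.3f}")
```

Under this assumed model, an item is endorsed with probability 0.5 when the latent level equals its severity, and its information peaks at that point and scales with the square of its discrimination.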
While
Saha et al. (2006) relied on IRT methods for assessing item-level bias between demographic groups, the present study extends that work by utilizing the MIMIC model plus IRT methods. The MIMIC model incorporates the measurement model (i.e., the underlying construct of the dependence syndrome) with structural regression equations that permit us to detect item-response bias (i.e., differential item functioning) of DSM-IV criteria for multiple background variables within a regression framework. It improves comprehension of the impact of any item-response bias detected on the estimated dependence factor, while adjusting for the effects of multiple background variables (e.g., sex and race/ethnicity) on the estimated dependence factor (
Chen and Anthony, 2003;
Grant et al., 2007). This latter feature represents a unique advantage of the MIMIC model over IRT methods.
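One common way to write a MIMIC model with a term for differential item functioning is sketched below. The notation is an assumption introduced here for illustration and is not taken from the source.

```latex
\begin{align}
y_j^{*} &= \lambda_j \eta + \boldsymbol{\beta}_j^{\top}\mathbf{x} + \varepsilon_j, \\
\eta &= \boldsymbol{\gamma}^{\top}\mathbf{x} + \zeta .
\end{align}
```

Here $y_j^{*}$ is the latent response underlying the binary criterion $j$, $\eta$ is the dependence factor, $\mathbf{x}$ is the vector of background variables, $\lambda_j$ is the factor loading, $\boldsymbol{\gamma}$ holds the effects of the background variables on the factor, and $\varepsilon_j$ and $\zeta$ are residual terms. In this formulation, a nonzero element of the direct-effect vector $\boldsymbol{\beta}_j$ indicates that the corresponding background variable shifts endorsement of criterion $j$ even among persons with the same level of $\eta$, that is, differential item functioning.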
Previous studies using factor analysis have found that DSM-III-R and DSM-IV criteria of cocaine or opioid dependence represent one factor, but these studies generally did not investigate measurement equivalence of their diagnostic instruments and item-level psychometric performance (
Bryant et al., 1991;
Feingold and Rounsaville, 1995;
Morgenstern et al., 1994). One notable exception was a study by
Langenbucher and colleagues (2004) that reported findings from one of the first published studies applying IRT modeling to evaluate DSM-IV criteria for substance use disorders. They examined abuse and dependence criteria for alcohol, marijuana, and cocaine use in a sample of 372 adults enrolled at addiction treatment programs. After removing two criteria (“legal problems” and “tolerance”) that demonstrated a poor fit of the 11 criteria to a unidimensional solution, they found that the remaining 9 criteria for abuse of and dependence on alcohol, marijuana, and cocaine, respectively, reflected one continuum of severity.
More recently,
Gillespie and colleagues (2007) conducted IRT modeling of DSM-IV criteria for drug use disorders (marijuana, cocaine, opioids, hallucinogens, sedatives, and stimulants, respectively) in a sample of all male participants from the Virginia Twin Registry. They found that abuse and dependence criteria for each drug class were explained by a single underlying continuum of risk. Additionally, the study reported large differences in estimates of individual item severity and discrimination across drug classes. In particular, cocaine users were found to endorse most items at much lower levels of latent liability than users of other drugs. The investigators concluded that cocaine was the most disabling drug and had more harmful effects than the other drugs. Similarly,
Lynskey and Agrawal (2007) conducted IRT analysis to examine abuse and dependence criteria for drug use disorders in the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC). They found that abuse and dependence criteria formed a single latent construct of risk and that there were some similarities and variations in item performance with regard to item severity and discrimination. Specifically, the “inability to cut down on use” and “legal problems” exhibited poorer discrimination for most drugs, whereas “legal problems” and “withdrawal” appeared to index more severe levels of the abuse/dependence continuum for several drugs. Together, these studies suggest that DSM-IV’s hierarchical distinction between abuse and dependence may not be supported (i.e., one factor), and that the contribution of each item to the underlying construct of a diagnosis differs depending on the drugs used and the study sample as shown by variations in item performance across drugs and studies (e.g.,
Gillespie et al., 2007;
Lynskey and Agrawal, 2007;
Langenbucher et al., 2004).
While these IRT studies of drug use disorders have yielded new and important information, they have some limitations. Perhaps most importantly, the prior studies examined lifetime symptoms that occurred sporadically over the course of participants’ lives (e.g.,
Gillespie et al., 2007;
Langenbucher et al., 2004;
Lynskey and Agrawal, 2007). It is unclear whether and to what extent results from this method of indexing symptoms over a lifetime apply to a current (past year) DSM-IV diagnosis of drug use disorder that requires the occurrence and clustering of a specific number of symptoms over the past 12 months (
APA, 2000). Further, criterion information and measurement equivalence of the diagnostic instruments have not been fully reported in prior studies (e.g.,
Gillespie et al., 2007;
Langenbucher et al., 2004;
Lynskey and Agrawal, 2007). The lack of such empirical data makes it difficult to evaluate the item-level measurement precision, the extent of information that each item contributes to the underlying construct, and whether the diagnostic instruments used to measure drug use disorders represent a fair measure for drug users with diverse sociodemographic characteristics. The latter data are particularly important to research on constructs of self-reported drug use disorders because they constitute the prerequisite for making unbiased estimates.
The present study constitutes the first known IRT modeling of DSM-IV criteria for current (past year) cocaine and opioid dependence disorders. It extends prior work by addressing both item-level psychometric performance and measurement equivalence by age, sex, race/ethnicity, and educational level (
Grant et al., 2007). The study also explores whether the presence of a comorbid opioid dependence biases cocaine users’ endorsement of cocaine dependence symptoms, and whether the presence of a comorbid cocaine dependence biases opioid users’ endorsement of opioid dependence symptoms. This question heretofore has not been addressed, but is highly relevant to a valid assessment of DSM-IV diagnoses, particularly in the context of addiction treatment settings such as methadone treatment programs where drug abusers are likely to have comorbid substance dependences and a diagnosis of dependence is often required for enrolling in a treatment program (e.g.,
Brooner et al., 1997;
Disney et al., 2005;
Wu et al., 2008). It is hypothesized that symptoms of cocaine and opioid dependences will form a continuum of severity and that the pattern of individual item performance will differ across cocaine and opioids because of their differences in pharmacological effects (
APA, 2000;
Gillespie et al., 2007). Specifically, symptoms of physical dependence on opioids (i.e., tolerance and withdrawal) will be endorsed at the low severity levels on the dependence continuum, while symptoms of physical dependence on cocaine will be endorsed at the high severity levels on the continuum. It is also expected that there will be no significant item-response bias (i.e., no differential item functioning) across background variables after the level of the dependence continuum and background variables are adjusted statistically. Specifically, the following questions are evaluated: (1) where along the continuum does each criterion measure the dependence liability (item severity); (2) which criterion symptoms distinguish participants who are higher on the continuum of severity from those who are lower on the continuum (item discrimination); and (3) is the probability of endorsing symptoms of dependence at the equivalent severity level similar across groups defined by sex, age, race/ethnicity, educational level, and comorbid drug dependence (measurement equivalence)? The present study provides an excellent opportunity to investigate these questions in a geographically diverse sample of subjects from multiple treatment programs across the country (CTN) using an identical set of assessment measures.