Three standardized screening instruments—the Global Appraisal of Individual Needs Short Screener (GSS), the Mini-International Neuropsychiatric Interview–Modified (MINI-M), and the Mental Health Screening Form (MHSF)—were compared to two shorter instruments, the 6-item Co-Occurring Disorders Screening Instrument for Mental Disorders (CODSI-MD) and the 3-item CODSI for Severe Mental Disorders (CODSI-SMD) for use with offenders in prison substance-abuse treatment programs. Results showed that the CODSI screening instruments were comparable to the longer instruments in overall accuracy and that all of the instruments performed reasonably well. The CODSI instruments showed sufficient value to justify their use in prison substance-abuse treatment programs and to warrant validation testing in other criminal justice populations and settings.
Inmates with co-occurring substance use and mental disorders (COD) represent a significant problem in the criminal justice system, given the increasing numbers of prisoners diagnosed with co-occurring disorders and the difficulty of managing and treating offenders with concurrent disorders. It is estimated that, compared to the general population, prisoners are 2 to 4 times more likely to suffer from psychotic illness or major depression (Fazel & Danesh, 2002). Estimates from U.S. Department of Justice statistics indicate that 16% of those in state prisons, jails, and on probation suffer from some form of mental disorder (Ditton, 1999).
Furthermore, the number of correctional clients with mental disorders appears to be increasing. As an example, reports from the Colorado Department of Corrections chronicle a steadily rising proportion of inmates with mental illness, from 4% in 1991, to 14% in 2001 (Kleinsasser & Michaud, 2002), to 20% more recently (J. Stommel, personal communication, March 16, 2005). Among the 20% of inmates with mental illness, three quarters were estimated to have a co-occurring substance-use disorder.
The need to identify and treat these individuals is evident in the risk to the community, problems of behavior management within the prison system, and poorer treatment outcomes. The increased risk of violence among individuals with COD has now been well documented (Melnick, Sacks, & Banks, 2006; Monahan et al., 2000; Monahan et al., 2001; Monahan et al., 2005; Steadman et al., 1998). Furthermore, studies have found that among inmates treated in prison for schizophrenia and followed for as long as 10 years, 67% re-offended violently and 30% were recidivated for violent crime. From the perspective of the current article, it is noteworthy that polydrug abuse or alcohol problems increased the odds of reconviction by a factor of 3 or more (Baxter, Rabe-Hesketh, & Parrott, 1999). In addition to the risk for violence in the community, individuals with COD pose a particular challenge to the criminal justice system for several reasons. Prisoners with mental disorders generally adapt less well to the complexity of prison life (Morgan, Edwards, & Faulkner, 1993), frequently face stigmatization or isolation from other prisoners, have high rates of victimization (Correctional Association of New York, 2004), and are more likely to be written up in an incident report during their first 3 months of incarceration (DiCataldo, Greer, & Profit, 1995).
Moreover, studies show that COD presents unique challenges to treatment programs. The elevated risk factors and poorer treatment outcomes of those with COD are exemplified by their higher rates of HIV infection, relapse to substance use, rehospitalization, depression, and suicide risk, as opposed to those with a single disorder (Drake et al., 1998; Office of the Surgeon General, 1999). Numerous studies have documented the effects of COD in criminal justice populations, including higher rates of dropout from treatment alternatives to prison (Lang & Belenko, 2000), shorter lengths of stay in substance-abuse treatment among probationers (Hiller, Knight, & Simpson, 1999), and the failure to complete jail-based substance-abuse treatment programs by ratios as high as 3 to 1 when compared to prisoners who did not have a history of mental disorder (Brady, Krebs, & Laird, 2004).
Although COD clients constitute a population that has proven difficult to treat, promising interventions now offer the very real prospect of positive change. For example, Sacks and colleagues demonstrated significantly lower reincarceration rates and significantly greater reduction in substance abuse for offenders with COD in a modified therapeutic community (TC) program as compared to those in standard mental health treatment (Sacks, Sacks, McKendrick, Banks, & Stommel, 2004; Sullivan, McKendrick, Sacks, & Banks, in press). Before treatment can be matched to a client population that can derive its benefits, however, an instrument capable of identifying appropriate individuals for referral must be available.
At present, no single brief instrument has been validated to screen for both substance use and co-occurring mental disorders among prisoners. The lack of a standardized and uniformly accepted COD screening instrument has hindered both the ability to plan programs for offenders with COD and the ability to identify prisoners in need of specialized COD treatment. Peters and colleagues evaluated several frequently used screening instruments for alcohol and drug disorders in prison settings (Peters, Greenbaum, Steinberg, & Carter, 2000). They found the Texas Christian University Drug Screen (TCUDS; Broome, Knight, Joe, & Simpson, 1996; Simpson, 1995; Simpson, Joe, Rowan-Szal, & Greener, 1997), the Simple Screening Instrument (SSI; Center for Substance Abuse Treatment, 1994), and the ASI–Drug Screen (McLellan et al., 1985; McLellan et al., 1992; McLellan, Luborsky, Woody, & O'Brien, 1980) in combination with the Alcohol Dependence Scale (ADS; Skinner & Horn, 1984) to be about equally effective in detecting substance-use disorders among prison inmates and to best meet their criteria of being accurate overall, brief, in the public domain, and capable of administration by staff who are not mental health professionals.
Instruments commonly used to screen for mental disorders include the Global Appraisal of Individual Needs (Dennis, 1999), the Mini-International Neuropsychiatric Interview–Modified (MINI-M; Sheehan et al., 1998), and the Mental Health Screening Form (MHSF; Carroll & McGinley, 2001). The K6 (Kessler et al., 2003) provides a very brief screen that was 96% accurate in identifying individuals who did not have a severe mental disorder, but was only 36% accurate in identifying those with a severe mental disorder. More recently, the K6 has been found to have a sensitivity of 76% and specificity of 81% when testing for severe mental disorders in a population with a substance-use disorder (Swartz & Lurigio, 2006). The Psychiatric Diagnostic Screening Questionnaire (Zimmerman & Mattia, 2001) is another instrument widely used in criminal justice settings. The Brief Jail Mental Health Screen, which the authors recommend for men only (Steadman, Scott, Osher, Agnese, & Robbins, 2005), and the Jail Screening Assessment Tool, a semistructured interview used to assess women prisoners in jails (Nicholls, Lee, Corrado, & Ogloff, 2004), have been introduced recently to screen for mental disorders in criminal justice settings. Earlier reviews of screening instruments for mental disorders (Peters & Green Bartoi, 1997; Peters & Hills, 1997) recommended the use of two instruments, the Referral Decision Scale (Teplin & Schwartz, 1989) and the Brief Symptom Inventory (Derogatis, 1993).
Screeners for substance use and mental disorders have often been administered in separate settings, and the information has often not been consolidated to identify offenders who need specialized COD assessment and treatment services and to provide information to administrators regarding the need for specialized COD services. A more effective approach would be to have a consolidated, integrated screening instrument effective in identifying both disorders, so that this information could be gathered at the same time and by the same staff, resulting in a more comprehensive picture of the offender's needs for specialized COD services. This article reports on the first validation study in a planned development of a brief screening instrument for both mental and substance use disorders (i.e., COD) for use in a variety of criminal justice settings. The substance-use-screening component of the instrument (not examined in the present study) is composed of the first 9 items of the TCUDS, an instrument widely used in criminal justice research. The selection of the TCUDS was based on a previous validation study (Peters et al., 2000) and on the recommendation of two panels, one composed of criminal justice substance-abuse stakeholders and the other of experts in COD (Sacks et al., 2007). The study described in this article was conducted to validate the cutoff points for a mental-health screening instrument to serve as a companion to the TCUDS in screening for COD. The validation study includes cutoff points for standardized (GSS, MHSF, and MINI-M) and unstandardized (CODSI-MD and CODSI-SMD) screening instruments previously identified in a pilot study of offenders in prison substance-abuse programs.
The research study was part of the National Institute on Drug Abuse (NIDA)-funded Criminal Justice Drug Abuse Treatment Studies (CJDATS) initiative. Under this initiative, nine regional research centers, a coordinating center, and NIDA work with federal, state, and local criminal justice partners to develop and test new approaches to meeting the needs of offenders with substance-use disorders for prison and reentry services. The present research project is 1 of 13 current CJDATS studies that include screening and referral, treatment interventions for reentering offenders, engagement and retention in treatment, improving coordination and linking services in community, meeting the needs of special populations of offenders, and understanding the organizational and contextual factors and current treatment practices for offenders with histories of substance abuse.
The project consisted of an initial pilot study and this validation study, which employed a geographically diverse sample of 280 consecutive new admissions to prison substance-abuse treatment programs. The pilot study, which established cutoff scores for each of the instruments among program admissions (reported elsewhere; Sacks et al., 2007), included a total of 100 subjects. The remaining 180 cases were used in this study to validate these cutpoints. A study subject was considered to be a “new admission” for 14 days from his or her entry to the treatment program. Exceptions were made under special circumstances (e.g., potential subjects were missed when a lockdown prevented interviews from being scheduled so that the initial test battery could not be completed within 2 weeks of entry to the program). Four CJDATS Research Centers were involved in the data collection and 13 different prison substance-abuse treatment programs were used. The participating centers and the number of subjects drawn from each were as follows: NDRI Rocky Mountain in Colorado (N = 117), Lifespan at Brown University in Rhode Island (N = 75), the Institute for Behavioral Research at Texas Christian University in Texas (N = 60), and the Integrated Substance Abuse Programs at UCLA in California (N = 28). The sample was stratified to include one third women, which represented an oversampling compared to the actual percentage of women in state prison populations (7%; Harrison & Beck, 2005) that was incorporated to make the screening instrument sensitive to any possible gender differences in the accuracy of the screening instruments in the prison population.
The large numbers of individuals with substance-abuse disorder entering these prison programs afforded a unique opportunity to develop and test a measure for mental disorders appropriate to this population. Previous research indicated a high prevalence rate of mental disorders among admissions to prison substance-abuse treatment programs: For example, 59% reported previous psychological treatment, 68% reported severe depression, 61% severe anxiety, 46% trouble controlling violent behavior, and 11% a previous suicide attempt (Prendergast, Hall, Wexler, Melnick, & Cao, 2004).
A total of 311 inmates were approached to participate in the study. In addition to the 280 subjects who constituted the full sample (pilot plus validation samples), 29 (9%) refused to participate in the study and a communication barrier prevented 2 inmates (0.6%) from participating. The 2 inmates who reported a problem understanding the questions were replaced by the next subjects to enter the treatment program. The 29 inmates who refused to participate cited reasons that were consistent across sites and included the following: not being interested in participating in studies, denying a drug history, having attention-deficit hyperactivity disorder (ADHD) and being unable to sit for long periods of time, and not wanting to spend the time. Because this rate of refusal was relatively low and was not a threat to validity, the authors did not collect any further information on those who declined to participate.
The CODSI study was approved by the Institutional Review Boards at each of the four CJDATS Research Centers, and each received certification from the Office for Human Research Protections (OHRP). The project was also reviewed by a Data and Safety Monitoring Board and received a Certificate of Confidentiality from NIDA. Prior to taking part in the study, each inmate completed an informed consent process, guided by a research interviewer, and signed a consent form; all research staff had been trained in the protection of human subjects in research and in the particular concerns for the protection of prisoners. The consent process was free from any coercion; participation was entirely voluntary and had no bearing on the inmate's circumstances, either within the treatment program or as a prisoner.
A modified (shorter) version of the CJDATS Intake Interview (CJDATS, 2005), a structured interview used to collect sociodemographic background information, including education and employment, criminal history, health and psychological status, and drug history, was administered to all subjects.
The selection of screening instruments was based on an extensive review of the screening literature, stakeholder input, and the recommendations of a panel of experts in COD (Sacks et al., 2006). A literature search identified screening instruments with good reliability and validity that were in the public domain, required no more than 20 minutes to complete, and were suitable for administration by staff members without extensive training. More than 150 instruments were reviewed; those that best met the criteria were submitted to a panel consisting of 12 stakeholders (criminal justice professionals working in the area of substance abuse) and to a panel consisting of 10 experts in COD. These panels evaluated the instruments based on the type of personnel and training needed for administration, the maximum time available for administration, the ease of use and scoring, acceptability of questions to respondents, the type of profile generated, and utility estimates (e.g., the tradeoffs between the length of the instrument and the desired level of accuracy). The process identified three instruments that were selected for pilot testing: the 22-item MINI-M (Sheehan et al., 1998), the 24-item MHSF (Carroll & McGinley, 2001), and the 15-item GSS version 1.03 (Dennis, Chan, & Funk, 2006).
The Structured Clinical Interview for DSM-IV (SCID; First, Spitzer, Gibbon, & Williams, 2002) was used as the criterion measure for mental disorders. The SCID is widely accepted as the standard for assessing substance use and mental disorders (Baldassano, 2005; Blackburn, 2000; Maffei et al., 1997; Magruder, Sonne, Brady, Quello, & Martin, 2005; Ramirez Basco et al., 2000), providing DSM diagnoses for Axis I and Axis II disorders using 30-day and lifetime information. Assessment of a mental disorder in the current study is based on the lifetime information. A screening instrument is considered to be accurate if it concurs with the SCID on the presence (or absence) of a disorder.
The development of the Co-Occurring Disorder Screening Instrument for Mental Disorders (CODSI-MD) and for Severe Mental Disorders (CODSI-SMD) is fully described in Sacks and colleagues (2007). Briefly, items from the MINI-M, MHSF, and GSS were tested in logistic regression analyses to determine the best combination of items to predict a SCID diagnosis of either an Axis I or Axis II mental disorder. The rationale for item selection for the CODSI screeners was to make use of existing items from standardized scales rather than creating and testing new items.
The 6 items of the CODSI-MD were selected by this procedure with coefficients ranging from 0.194 to 0.389 (disregarding the signs). The items and their original instruments were as follows:
An additional logistic regression analysis (Sacks et al., 2007) was conducted to determine the items most associated with severe mental disorders (schizophrenia, major depression, bipolar disorder) as well as suicide potential. One item from the MINI-M and 2 items from the MHSF were selected in the regression to form the CODSI-SMD: (a) Have you felt sad, low, or depressed most of the time for the past 2 years? (MINI-M); (b) Did you ever attempt to kill yourself? (MHSF); and (c) Have you ever had a period of time when you were so full of energy and your ideas came very rapidly, when you talked nearly nonstop, when you moved quickly from one activity to another, when you needed little sleep, and when you believed you could do almost anything? (MHSF).
Testing was conducted in two face-to-face sessions within 1 month of each other. Oral administration of the interviews avoided questions of literacy in this population. The first session consisted of completing the informed consent, the CJDATS Intake Interview, and the CODSI screening battery, which included the three mental disorder instruments: the MHSF, MINI-M, and GSS (the CODSI-MD and the CODSI-SMD items were nested within these other instruments). The order of administering the three mental health screeners was randomized to control for possible ordering effects. The second session was the administration of the SCID. As the SCID diagnosis used for the study was based on lifetime information, this window was of sufficient brevity to have a minimal impact on the data while permitting the flexibility needed to accommodate prison and interviewer schedules. The initial session was conducted by experienced interviewers who received standardized training (provided separately by each of the participating research centers) in the procedures for collecting informed consent and administering the CJDATS Intake Interview and screening battery. Rater assessment or interpretation was not required in the administration of the screening instruments; therefore, training consisted of reviewing the items on each of the screeners, reading manuals about proper interviewing techniques, mandatory training on human subjects, the observation of interviews by experienced staff, and completing multiple practice interviews with supervisors to achieve fluency in presentation. The absence of the need for more specialized training, which reduces the demands on potential users, is interpreted as a strength of the screening instruments. A different group of interviewers who were trained and experienced in the use of the SCID conducted the second session. All SCID interviews were reviewed by a SCID supervisor for completeness and accuracy.
To avoid possible influence on the results, SCID interviewers and supervisors were not informed of the results of the first session (intake interview and CODSI screening battery).
The planned analysis for this validation study determined the sensitivity, specificity, and overall accuracy of the screening instruments. Sensitivity is the percentage of individuals identified by the SCID as having a mental disorder who are correctly identified as such by the screening instrument. Specificity is the percentage of individuals who are not classified by the SCID as having a mental disorder who are correctly identified as not having a disorder by the screening instrument. Overall accuracy is the percentage of individuals who are correctly identified by the screening instrument as either having or not having a mental disorder as determined by the SCID.
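The three definitions above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `screening_metrics` and the ten-subject example are hypothetical and are not study data.

```python
def screening_metrics(screen_positive, scid_positive):
    """Return (sensitivity, specificity, overall accuracy) as fractions.

    Both arguments are equal-length sequences of booleans, one per subject:
    the screener's decision and the SCID diagnosis (the criterion).
    """
    pairs = list(zip(screen_positive, scid_positive))
    tp = sum(1 for s, d in pairs if s and d)          # screened in, SCID disorder
    fn = sum(1 for s, d in pairs if not s and d)      # missed case
    tn = sum(1 for s, d in pairs if not s and not d)  # correctly screened out
    fp = sum(1 for s, d in pairs if s and not d)      # false alarm
    # Sensitivity uses only SCID-positive subjects; specificity only
    # SCID-negative subjects; overall accuracy uses everyone.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy


# Hypothetical example: 4 subjects with a SCID disorder, 6 without.
screen = [True, True, True, False, False, True, False, False, False, False]
scid = [True, True, True, True, False, False, False, False, False, False]
print(screening_metrics(screen, scid))
```

Note that the three values rest on different denominators, which is why the N reported for each differs in the tables.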
The analytic plan validated the cutpoints that the pilot study had established for each instrument. A cutpoint (or cutoff score) is the number of positively scored items that produces the best overall accuracy, sensitivity, and specificity. Receiver operating characteristics (ROC; Metz, 1978) curves were developed for the pilot study (N = 100) to determine the predictive value of the instruments along with the full range of possible cutpoints. A range of cutoff scores was calculated that was wide enough to produce a curve of ascending to descending overall accuracy. When cutpoints produced equivalent overall accuracy, the cutpoint with the highest sensitivity was chosen (i.e., the cutoff score that was best at determining those who screened in for a mental disorder). The cutpoints identified in the pilot study were as follows:
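The selection rule described in this paragraph (best overall accuracy, with ties broken in favor of sensitivity) might be sketched as follows. The function name `choose_cutpoint`, the assumption that a subject screens in when the number of endorsed items is at or above the cutpoint, and the example scores are all hypothetical; the study's own ROC procedure is not reproduced here.

```python
def choose_cutpoint(scores, scid_positive, candidate_cutpoints):
    """Pick the cutpoint with the best overall accuracy.

    Ties in overall accuracy are broken by the higher sensitivity.
    Assumption: a subject screens in when score >= cutpoint.
    """
    best_key, best_cut = None, None
    for cut in candidate_cutpoints:
        tp = sum(1 for s, d in zip(scores, scid_positive) if s >= cut and d)
        tn = sum(1 for s, d in zip(scores, scid_positive) if s < cut and not d)
        # With a fixed sample, tp + tn orders cutpoints by overall accuracy
        # and tp orders them by sensitivity, so integer counts avoid any
        # floating-point tie problems.
        key = (tp + tn, tp)
        if best_key is None or key > best_key:
            best_key, best_cut = key, cut
    return best_cut


# Hypothetical item counts for 8 subjects and their SCID status.
scores = [0, 1, 1, 2, 2, 3, 3, 4]
scid = [False, False, True, False, True, True, True, True]
print(choose_cutpoint(scores, scid, [1, 2, 3, 4]))
```

In this made-up example, cutpoints 1, 2, and 3 all classify 6 of 8 subjects correctly, so the rule falls back on sensitivity and returns 1.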
The second set of cutpoints developed in the pilot study was to screen for severe mental disorders (i.e., schizophrenia, bipolar disorder, and major depressive disorder) as well as for suicide potential, as determined by the MDE section of the SCID-IV. The CODSI-SMD with a cutpoint of 2 had an overall accuracy of 82%, with sensitivity at 60.7% and specificity at 90.3%. The MHSF, with a cutoff score of 11, had an overall accuracy of 76%, with sensitivity at 42.9% and specificity at 88.9%. The MINI-M, with a cutpoint of 10, had an overall accuracy of 72%, with sensitivity at 46.4% and specificity at 81.9%. Finally, the GSS Internal Disorder Screener (GSS-IDS; the subsection of the GSS that pertains to severe mental disorder), with a cutoff score of 5, had an overall accuracy of 68%, with sensitivity at 25% and specificity at 84.7%.
The two-tailed nonparametric McNemar test (Siegel, 1956) for related samples (as in the present instance when multiple measures are taken on the same individual) was used to test the significance of differences between the instruments in sensitivity and specificity. Differences in overall accuracy were not tested as these are affected by the prevalence of the disorder in each specific sample; tests higher in sensitivity will show improved overall accuracy when the prevalence of the disorder is high, and tests with high specificity will show greater overall accuracy when the prevalence is low. Therefore, although meaningful in the context of the particular population sampled, overall accuracy is dependent on conditions external to the validity of the instruments in the wider population in which the instrument would be used. Results of the McNemar tests are reported in the text to avoid creating confusion in the tables by presenting multiple comparisons.
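As an illustration of how a McNemar comparison of two screeners works, the sketch below computes a two-tailed p value from the discordant pairs only. The text does not state whether the chi-square approximation or the exact binomial form was used; the exact binomial form is assumed here, and the function name and counts are hypothetical. For a sensitivity comparison, the pairs would be restricted to subjects with a SCID diagnosis; for specificity, to subjects without one.

```python
from math import comb


def mcnemar_exact(only_a_correct, only_b_correct):
    """Two-tailed exact McNemar p value.

    only_a_correct: subjects classified correctly by instrument A but not B.
    only_b_correct: subjects classified correctly by instrument B but not A.
    Subjects on whom both instruments agree carry no information here.
    """
    n = only_a_correct + only_b_correct
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(only_a_correct, only_b_correct)
    # Under the null hypothesis each discordant pair is a fair coin flip.
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * lower_tail)


# Hypothetical: among SCID-positive subjects, A alone caught 12 cases
# and B alone caught 2.
print(round(mcnemar_exact(12, 2), 4))
```

Because only the discordant pairs enter the calculation, two instruments can differ by several percentage points and still fail to reach significance when few subjects are classified differently by the two screeners.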
Table 1 shows the demographic characteristics of the validation sample. Males comprised two thirds of the study sample by design. The median number of all arrests was 10, with drug-related arrests accounting for half that number. Physical health problems were prevalent, with almost two thirds (62%) reporting hospitalization for physical health problems. Half of the sample reported experiencing severe depression, and a similar proportion reported severe anxiety in their lifetime. One fifth of the sample had serious thoughts about suicide, and more than 18% had at least one suicide attempt. Twenty-two percent of the sample had at least one psychiatric hospitalization. Substance use, as expected in this population, was significant, with a lifetime prevalence of alcohol use of 98%, marijuana use of 94%, cocaine use of 71%, methamphetamine use of 61%, intravenous drug use of 34%, and heroin use of 25%. Sixty-seven percent of study subjects had received prior treatment for substance abuse during the course of their life.
Table 2 shows the sensitivity, specificity, and overall accuracy for the four instruments in screening for any mental disorder after weighting the results to adjust for the oversampling of females in the stratified sample. In this table (and those that follow), the N for overall accuracy includes all of the subjects in the validation sample. The N for sensitivity reflects only those subjects with a SCID diagnosis of mental disorder, and the N for specificity includes only those subjects who do not have a SCID diagnosis of mental disorder. As shown in Table 2, the overall accuracy was similar for the MHSF and GSS. The CODSI-MD was within 3 percentage points of the MHSF and GSS and within 4 percentage points of the MINI-M. With reference to sensitivity, the MHSF and GSS produced the highest scores, followed by the CODSI-MD. All three of these instruments produced higher scores than the MINI-M, with the CODSI-MD exceeding the MINI-M by approximately 4% (ns), the GSS exceeding the MINI-M by 10% (p < .005), and the MHSF exceeding the MINI-M by 10% (p < .05). The MINI-M produced the highest specificity score, with the CODSI-MD approximately 7% lower and the MHSF and GSS approximately 13% lower, but none of these differences reached statistical significance.
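One common way to adjust for oversampling is to weight each subject by the ratio of the stratum's population share to its sample share. The sketch below illustrates that idea only; the text does not specify the weights actually used, and the 7% target for women (taken from the state prison figure cited in the sampling description), the function name, and the six-subject example are assumptions.

```python
def weighted_accuracy(correct, gender, sample_share, population_share):
    """Overall accuracy with post-stratification weights.

    correct: booleans, True when the screener agreed with the criterion.
    gender: stratum label for each subject (same length as correct).
    sample_share / population_share: dicts mapping label -> proportion.
    """
    weights = [population_share[g] / sample_share[g] for g in gender]
    agreed = sum(w for w, c in zip(weights, correct) if c)
    return agreed / sum(weights)


# Hypothetical: 4 men (3 classified correctly) and 2 women (both correct).
correct = [True, True, False, True, True, True]
gender = ["m", "m", "m", "m", "f", "f"]
sample_share = {"m": 2 / 3, "f": 1 / 3}        # stratified design
population_share = {"m": 0.93, "f": 0.07}      # assumed target population
print(weighted_accuracy(correct, gender, sample_share, population_share))
```

In this example the unweighted accuracy is 5/6, whereas the weighted value equals 0.93 × 0.75 + 0.07 × 1.0, so the men's lower rate dominates once women are weighted down to their assumed population share.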
Table 3 shows the results by gender. The high prevalence rates for mental disorder in the study resulted in a relatively small sample for specificity when the data were disaggregated by gender, reducing the power to demonstrate statistical significance for this variable. All four instruments demonstrated higher overall accuracy for females than for males. Given the lower N values when the data were disaggregated for gender, a McNemar two-tailed 0.10 level of significance was used.
For males, the GSS, MHSF, CODSI-MD, and MINI-M all scored within 3 percentage points of each other in overall accuracy. The GSS and MHSF produced the highest values for sensitivity among the men, with the CODSI-MD producing an intermediate value, and the MINI-M the lowest value; the only significant difference was the performance of the GSS compared to the MINI-M. For specificity among men, the highest values were produced by MINI-M; the CODSI-MD produced an intermediate score approximately 7% lower, and the MHSF and GSS produced scores approximately 14% lower than the MINI-M, although this difference did not reach statistical significance. For females, the MHSF produced the highest level of overall accuracy and the MINI-M the lowest. Results were similar for sensitivity, with the MHSF again producing the highest values and the MINI-M the lowest (p < .01). With respect to specificity, the N of 10 precluded a meaningful test of significance, but the GSS score was 10% less than the other instruments.
In general, of the four instruments, the MHSF was tied with the GSS in highest overall accuracy in the weighted sample, and each of these two instruments showed superiority in at least one other score disaggregated by gender. The MHSF was superior in overall accuracy for females, and the MHSF and GSS produced the highest accuracy in males. The MINI-M and the 6-item CODSI-MD performed reasonably well in comparison to the other instruments; the CODSI-MD offered the advantage of being a shorter instrument than the MHSF or GSS, which contain 17 and 10 items, respectively.
The present study also validated previously identified cutoff scores for severe mental illness for the MHSF, MINI-M, GSS, and the CODSI-SMD (Sacks et al., 2007). Table 4 shows the sensitivity, specificity, and overall accuracy for the four instruments in screening for severe mental disorders after weighting the results to adjust for the oversampling of females. The Internal Disorder Screener (IDS) subscale of the GSS, designated as the GSS-IDS, was substituted for the entire GSS because this subset of items was the most relevant to severe mental disorders.
The CODSI-SMD produced the highest overall accuracy across the entire sample. The GSS-IDS was the second-highest scoring instrument for overall accuracy, coming within 2 percentage points of the CODSI-SMD. The CODSI-SMD produced the highest sensitivity; the other instruments produced values that were approximately 9% to 15% lower. With respect to sensitivity, the CODSI-SMD was significantly higher than the MHSF (p < .05) and approached significance when compared to the GSS-IDS (p < .10). The GSS-IDS produced the highest specificity score, followed by the MHSF and the CODSI-SMD. The specificity score of the MINI-M was 7% to 12% lower than that of the other instruments (p < .05).
When the scores were disaggregated by gender (see Table 5), the CODSI-SMD produced the highest overall accuracy score for males and tied the MINI-M for highest overall accuracy score for females; the GSS-IDS was almost 7% lower for females. The highest sensitivity scores were achieved by the CODSI-SMD for both males and females, with none of the comparisons reaching statistical significance except the CODSI-SMD and MHSF (p < .05). The GSS-IDS, followed by the MHSF, produced the highest specificity scores for men; the CODSI-SMD produced an intermediate score 4% lower than the GSS-IDS; these three instruments showed higher scores than the MINI-M, with p values of < .01, < .10, and < .10, respectively. Among females, the MHSF showed the highest specificity score and the CODSI-SMD the lowest, although these differences did not reach statistical significance.
Test/retest reliability for each instrument was calculated from a randomly selected sample of 60 subjects drawn from the Colorado sample and tested 1 week apart. The most meaningful form of reliability was considered to be replication based on whether an individual met the cutpoint for referral to assessment. Because these data were categorical, Kappa was used to determine agreement between the test and retest using the cutpoints selected for the study. All four instruments produced statistically significant Kappas, with the MHSF and MINI-M producing Kappas superior to those of the GSS as follows: MHSF with a cutoff score of 3, κ = 0.625 (p < .001); MINI-M with a cutoff score of 5, κ = 0.618 (p < .001); and the GSS with a cutoff score of 2, κ = 0.381 (p < .01); the 6-item CODSI-MD screener was κ = 0.526 (p < .001). Kappa reliability was also established on test–retest data from the same sample described above using the cutpoints for screening for severe mental disorders. Kappa reliability for each of the instruments was as follows: CODSI-SMD, κ = 0.660, p < .001; MHSF, κ = 0.761, p < .001; MINI-M, κ = 0.682, p < .001; GSS, κ = 0.487, p < .001. The percentage of times that the instrument consistently determines whether or not the individual falls below the specified cutoff point is another indication of the intertest agreement on whether the cutoff-score criterion was met at both Time 1 and Time 2. These percentages were calculated as follows: for any mental disorder, the percentage agreement between Time 1 and Time 2 was CODSI-MD (80.0%), MHSF (85.0%), MINI-M (81.7%), and GSS (76.7%); for any severe mental disorder, the percentage agreement between Time 1 and Time 2 was CODSI-SMD (86.7%), MHSF (91.7%), MINI-M (88.3%), and GSS-IDS (83.3%).
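The relationship between percentage agreement and Kappa for a dichotomous "met the cutpoint" decision can be sketched as follows. This uses the standard Cohen's kappa formula for two ratings of the same subjects; the function name and the ten-subject example are hypothetical and do not reproduce the study's data.

```python
def agreement_and_kappa(time1, time2):
    """Return (percentage agreement as a fraction, Cohen's kappa).

    time1, time2: booleans per subject, True when the subject met the
    cutpoint at that administration.
    """
    n = len(time1)
    observed = sum(1 for a, b in zip(time1, time2) if a == b) / n
    p1 = sum(time1) / n  # proportion screening in at Time 1
    p2 = sum(time2) / n  # proportion screening in at Time 2
    # Agreement expected by chance if the two administrations were independent.
    expected = p1 * p2 + (1 - p1) * (1 - p2)
    if expected == 1:
        return observed, float("nan")  # kappa undefined without variation
    return observed, (observed - expected) / (1 - expected)


# Hypothetical: 10 subjects, 8 of whom receive the same decision both times.
t1 = [True, True, True, True, False, False, False, False, False, False]
t2 = [True, True, True, False, True, False, False, False, False, False]
print(agreement_and_kappa(t1, t2))
```

Here raw agreement is 0.80 while Kappa is about 0.58, because roughly half of the agreement would be expected by chance alone. This is why the Kappa values reported above are consistently lower than the corresponding percentage-agreement figures.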
Four instruments—the MHSF, GSS, CODSI-MD, and the MINI-M—provided reasonable and comparable overall accuracy in screening for any mental disorder. Results were similar when the data were disaggregated by gender. The MHSF showed the highest overall accuracy for females, but few sizable differences were evident between the MHSF, GSS, CODSI-MD, and the MINI-M on the sensitivity and specificity scores for either males or females. Generally, each of the four instruments did best under some circumstances, such as an emphasis on sensitivity or specificity or for a particular gender, so that the choice of an instrument will depend on the particular requirements of the user. The CODSI-MD had the advantage of being shorter than the other instruments while producing comparable results, yet the use of the previously standardized instruments may well be justified because, besides performing reasonably well in this study, they are well established, already in use in many settings, and provide other information that is often valuable in a particular situation.
When screening for a severe mental disorder (i.e., schizophrenia, bipolar disorders, and depressive disorders) and suicide-risk potential, all instruments (the MHSF, GSS-IDS, CODSI-MD, and MINI-M) provided reasonable accuracy. The CODSI-SMD produced a higher sensitivity score than either the MHSF or the GSS-IDS. The CODSI-SMD ruled out 91% of those without a severe disorder and identified 51% of those with a severe disorder. Although the sensitivity of the CODSI-SMD could be substantially increased (to 84%) by using a cutoff score of 1, the specificity score would be halved (to 45%), resulting in lower overall accuracy; therefore, retaining the selected cutoff score of 2 was judged to be best for balancing the need to identify cases with the need to avoid overstressing scarce assessment resources. Attempts to add items to the CODSI-SMD also resulted in lower overall accuracy, as specificity was reduced to a greater extent than sensitivity was increased. One could argue that, in this case, the choice of instruments would depend on the prevalence of mental disorders in the population of interest and on the importance of identifying positive cases. Under conditions of very low prevalence in the population being screened, the MHSF and the GSS-IDS would be expected to produce better overall accuracy and, when prevalence was high, the CODSI-SMD would be expected to produce better overall accuracy.
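The trade-off between the two CODSI-SMD cutoff scores can be sketched in a few lines of Python. The sketch assumes that overall accuracy means the proportion of subjects correctly classified, which equals a prevalence-weighted average of sensitivity and specificity; the study may have defined it differently. The sensitivity and specificity values are those quoted above, while the prevalence values in the loop and all function and variable names are illustrative assumptions.

```python
def overall_accuracy(sensitivity, specificity, prevalence):
    """Proportion correctly classified at a given prevalence (assumed definition)."""
    return prevalence * sensitivity + (1 - prevalence) * specificity


def crossover_prevalence(a, b):
    """Prevalence at which cutoffs a and b yield the same overall accuracy."""
    spec_gain = a["specificity"] - b["specificity"]   # what a gains on non-cases
    sens_gain = b["sensitivity"] - a["sensitivity"]   # what b gains on cases
    return spec_gain / (spec_gain + sens_gain)


# Values quoted in the text for the CODSI-SMD.
CUTOFF_2 = {"sensitivity": 0.51, "specificity": 0.91}
CUTOFF_1 = {"sensitivity": 0.84, "specificity": 0.45}

# Hypothetical prevalence values, for illustration only.
for prevalence in (0.2, 0.4, 0.6, 0.8):
    acc2 = overall_accuracy(prevalence=prevalence, **CUTOFF_2)
    acc1 = overall_accuracy(prevalence=prevalence, **CUTOFF_1)
    print(f"prevalence {prevalence:.0%}: cutoff 2 -> {acc2:.3f}, cutoff 1 -> {acc1:.3f}")

crossover = crossover_prevalence(CUTOFF_2, CUTOFF_1)
print(f"cutoffs give equal accuracy at prevalence = {crossover:.1%}")
```

Under this assumed definition, the cutoff score of 2 yields the higher overall accuracy whenever fewer than roughly 58% of those screened have a severe disorder; the cutoff of 1 would be more accurate only above that prevalence.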
Clinical utility of the instruments will be determined by the prevalence rates for mental disorders in the population and the availability of diagnostic assessment and treatment resources. Generally, diagnostic assessment resources are scarce so that, when prevalence rates are low, specificity is valued for the ability to screen out individuals without the disorder. When prevalence is high and adequate diagnostic resources are available, sensitivity is preferred as a means of identifying the highest number of individuals possessing the disorder. The CODSI-MD approximates the overall accuracy of the two best-performing instruments—the MHSF and the GSS. No instrument demonstrated significantly higher sensitivity or specificity scores than the CODSI-MD in the analyses of the entire sample or in the separately reported gender findings.
The CODSI-SMD offers comparable overall accuracy to the other instruments while demonstrating the best balance between sensitivity and specificity. Thus, the CODSI-SMD identifies nearly 91% of those without a severe mental disorder as not having a disorder and 50% of those with a severe mental disorder as having the disorder. The ability to identify all cases with a disorder is particularly important when prevalence is high or when the need to identify as many of the positive cases as practicable is compelling. The three-item CODSI-SMD also provides greater efficiency in its use of screening resources; it consists of fewer items and, consequently, its response times are shorter than those of the other instruments.
Additional analyses of the unweighted data (as differentiated from the weighted data in Tables 2 and 4) showed that the CODSI screening instruments offer users a high degree of efficiency. The CODSI-MD identified 132 individuals as having a disorder, 85% (112) of whom had a SCID diagnosis of a mental disorder. Thus, only 20 cases (15%) who would have been referred for assessment would not have had a mental disorder. The CODSI-SMD, in particular, offers users with limited assessment resources a highly efficient means of identifying those prisoners most in need of further assessment. The CODSI-SMD identified 53% (40 of 75) of the subjects who met the criteria for a severe mental disorder. Furthermore, these analyses found that all 40 (100%) of the individuals who screened positively for severe mental disorder on the CODSI-SMD met SCID criteria for a severe mental disorder. Thus, using the CODSI-SMD screener would have maximized the identification of severe mental disorders and produced little or no wasted assessment resources in identifying individuals who met the mental disorder criteria for COD.
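The efficiency figures above follow directly from the unweighted counts, as the short Python sketch below shows. The counts are those reported in the text; the variable names, and the use of the term positive predictive value for the share of screen-positive cases confirmed by the SCID, are our own labels rather than the study's.

```python
# Unweighted counts reported in the text.
codsi_md_screen_pos = 132     # screened positive on the CODSI-MD
codsi_md_true_pos = 112       # of those, SCID diagnosis of a mental disorder

codsi_smd_screen_pos = 40     # screened positive on the CODSI-SMD
codsi_smd_true_pos = 40       # of those, met SCID criteria for a severe disorder
severe_cases_total = 75       # all subjects meeting severe-disorder criteria

# Share of positive screens confirmed by the SCID (positive predictive value).
codsi_md_ppv = codsi_md_true_pos / codsi_md_screen_pos
codsi_smd_ppv = codsi_smd_true_pos / codsi_smd_screen_pos

# Referrals that would not have had a disorder.
codsi_md_false_pos = codsi_md_screen_pos - codsi_md_true_pos

# Share of all severe cases that the CODSI-SMD picked up (sensitivity).
codsi_smd_sensitivity = codsi_smd_true_pos / severe_cases_total

print(f"CODSI-MD:  {codsi_md_ppv:.0%} of positive screens confirmed, "
      f"{codsi_md_false_pos} unnecessary referrals")
print(f"CODSI-SMD: {codsi_smd_ppv:.0%} of positive screens confirmed, "
      f"{codsi_smd_sensitivity:.0%} of severe cases identified")
```

Keeping the two proportions separate makes the distinction explicit: every CODSI-SMD referral was confirmed, yet slightly under half of the severe cases were not flagged by the screener.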
Although this validation study uses a convenience sample of admissions to prison substance-abuse treatment programs, the sample was geographically diverse, and the total of 141 (78.3%) admissions who were diagnosed with a mental disorder (using the SCID) was consistent with the 50% to 75% of admissions to community-based programs who are estimated to have at least some mental disorder (Compton, Cottler, Phelps, Abdallah, & Spitznagel, 2000; Havassy, Alvidrez, & Owen, 2004; Prendergast et al., 2004; Sacks, Sacks, De Leon, Bernhardt, & Staines, 1997). Furthermore, of the 141 mental disorder diagnoses, more than half were for severe mental disorders (75 of 141). Such high prevalence rates emphasize the importance of screening offenders to inform policy makers of the need for specialized services for inmates housed in prisons under their jurisdiction and to identify individuals who need further assessment to identify the potential need for specialized mental health services.
It should be further noted that the clinical utility of any screening instrument will be affected by the prevalence rates for mental disorders in the population. Although sensitivity and specificity are independent of prevalence, the overall accuracy of the screening instruments is not. For example, in a population with a high prevalence of mental disorders, sensitivity will be more influential in determining overall accuracy, whereas in a population with low prevalence of mental disorders, specificity will be more influential. In the present study population, sensitivity was favored because of the high prevalence of mental disorder, and specificity was comparatively more important in determining overall accuracy on the less prevalent severe mental disorders.
A further example of the effect of the prevalence of mental disorder can be seen in the reversal of the relationship between sensitivity and specificity scores between the two CODSI screening instruments. The CODSI-MD produced better overall accuracy for females, whereas the CODSI-SMD produced better overall accuracy for males. This may be attributed to any of several reasons, but one explanation for the difference lies in the differential prevalence rates for any mental disorder and for severe mental disorders. The higher prevalence for any mental disorder among females favors instruments with high sensitivity scores, such as the CODSI-MD. On the other hand, the lower prevalence rates for severe mental disorders favor instruments that score high in specificity, such as the CODSI-SMD. In this instance, the lower prevalence rate among males resulted in higher overall accuracy. In addition to gender, minority status may influence the overall accuracy of the results either because of the nature of the questions or the prevalence rates for any or for severe mental disorder(s). This issue is being addressed in an additional data collection and will be reported once that study has concluded.
The reader should exercise caution in interpreting the generalizability of these findings beyond the specific population studied (i.e., admissions to prison substance-abuse treatment programs). High prevalence rates of mental illness resulted in relatively small sample sizes on which to estimate specificity, particularly when the data were disaggregated by gender. Furthermore, although cutpoints were established in the pilot study within this population, the standard mental-health screening instruments studied had been designed originally to screen different populations. The MHSF, for example, was created to assist community-based substance-abuse treatment programs to determine the mental health needs of program admissions, a population that was similar, but not identical, to the offender population used in the pilot and validation studies described here. The other instruments (the MINI-M and the GSS) had been created to be screening devices for use in a variety of settings. It is reasonable to assume that these instruments may perform differently in other populations or other settings, particularly where the populations are markedly dissimilar from the present sample. Another possible drawback is that the items of both CODSI instruments were presented embedded in the other instruments rather than as stand-alone screeners. It is likely that these effects were small, and they will be corrected in planned future studies in which the CODSI instruments will be administered as distinct and separate instruments.
The investigators plan a future study to assess the validity of the CODSI-MD and the CODSI-SMD in conjunction with the TCUDS as a screener for offenders entering prison. It is expected that such a study will undertake analysis for the following mental disorder categories: Axis I, severe; Axis I, all; Axis II with Axis I; and Axis II.
The high prevalence rates for mental disorders among admissions to prison substance-abuse treatment programs found in the present study and others emphasize the need for screening, assessment, and specialized treatment services in criminal justice populations. The current study provides evidence for the overall accuracy of the 6-item CODSI-MD in determining the presence of any mental disorder and demonstrates particular strength for the 3-item CODSI-SMD in determining the presence of a severe mental disorder. The CODSI instruments can be used as a basis for referring prisoners for further assessment and as a means of collecting mental health data in prison substance-abuse treatment programs. In addition, the present study can be seen as validating the use of three standardized mental-health screening instruments (the MHSF, the MINI-M, and the GSS) in prison substance-abuse settings and as establishing the best cutoff scores for this purpose. Overall, the CODSI instruments show sufficient value in terms of brevity and efficiency to be combined with the TCUDS to form a screening device for co-occurring mental and substance-use disorders in prison substance-abuse treatment programs and to warrant future validation studies in other criminal justice populations and settings.
AUTHORS' NOTE: This study was funded (Grant No. U01 DA16200) under a cooperative agreement from the U.S. Department of Health and Human Services, Public Health Service, National Institutes of Health, National Institute on Drug Abuse (NIH/NIDA). The authors gratefully acknowledge the collaborative contributions by federal staff from NIDA, the 10 research centers, and participating field sites in the Criminal Justice Drug Abuse Treatment Studies (CJDATS) project, as described in more detail in the introduction to this special volume of Criminal Justice and Behavior.
1. Although the ADS is not in the public domain, it is relatively inexpensive for bulk administration.
2. Reading facility was not a consideration, as a research interviewer read all questions aloud to each subject (the instruments were available in English only).
3. During the course of the study, a new version of the GSS was released; all findings discussed here refer to version 1.0 of the GSS.
Views and opinions are those of the authors and do not necessarily reflect those of the Department of Health and Human Services, the NIH, NIDA, or other participants in CJDATS. This article has not been published elsewhere nor has it been submitted simultaneously for publication elsewhere.