J Affect Disord. Author manuscript; available in PMC 2013 September 27.
Published in final edited form as:
PMCID: PMC3785082

Comparison of clinical and research assessments of diagnosis, suicide attempt history and suicidal ideation in Major Depression


Over two decades ago, clinicians were challenged to demonstrate they were not superfluous as diagnosticians (Spitzer 1983). Since then, reports have compared clinical diagnostic assessments with standardized schedules to determine the level of agreement. Studies have focused on children, adolescents and adult outpatients (Basco et al 2000; Komiti et al 2001; Kranzler et al 1995; Shear et al 2000; Zimmerman and Mattia 1999; Ezpeleta et al 1997; Jensen and Weisz 2002; Lewczyk et al 2003; Thienemann 2004), inpatients (Fennig et al 1996; Kranzler et al 1995; Miller et al 2001; Rosenman et al 1997; Aronen et al 1993; Steiner et al 1995), adults transferring from emergency departments to inpatient units (Miller 2001; Taggart et al 2006) and adult epidemiologic samples (Anthony et al 1985; Eaton et al 2000). Studies have assessed psychiatric diagnostic agreement in children and adults across a range of diagnoses (Aronen et al 1993; Ezpeleta et al 1997; Jensen and Weisz 2002; Lewczyk et al 2003; Steiner et al 1995; Weinstein et al 1989; Zimmerman and Mattia 1999), and for a restricted number of diagnoses (Basco et al 2000; Fennig et al 1996; Komiti et al 2001; Kranzler et al 1995; Miller 2001; Miller et al 2001; Rosenman et al 1997; Shear et al 2000; Taggart et al 2006; Thienemann 2004). Some findings indicate moderate (Anthony et al 1985; Ezpeleta et al 1997; Fennig et al 1996; Komiti et al 2001; Kranzler et al 1995; Miller et al 2001; Taggart et al 2006), but most indicate poor (Aronen et al 1993; Ezpeleta et al 1997; Jensen and Weisz 2002; Komiti et al 2001; Lewczyk et al 2003; Miller et al 2001; Rosenman et al 1997; Shear et al 2000) agreement between diagnoses obtained by clinical versus research assessment.

A more critical task is suicide risk assessment. Prospective studies have identified risk factors for suicidal behavior (Oquendo et al 2006), but no standard clinical suicide assessment exists. Few studies have assessed the utility and accuracy of suicide-related rating scales in psychiatric inpatients and outpatients (Beck et al 1988; Beck et al 1989; Beck et al 1979; Brown et al 2000; Holden and DeLisle 2005; Pinninti et al 2002; Steer et al 1993; Steer et al 1993), and the literature comparing standardized rating scales for suicidality with clinical assessments is sparse. However, clinicians appear to fail to document suicidal behaviors that patients report on self-report measures or that research ratings identify (Healy et al 2006; Malone et al 1995).

Thus, accuracy of clinical diagnostic assessment and suicide risk evaluation, imperative to providing quality and safe care, is sub-par. To address this issue, we determined agreement between clinical and research assessments of diagnosis and suicidal behaviors in inpatients admitted to a research unit. If in fact clinical assessment is less likely to identify high-risk patients or different diagnoses compared to research assessments, then standardized scales in routine care may be useful.

Materials and methods

Adult inpatients (N=201) with a major depressive episode (MDE) in the context of major depressive or bipolar disorder, based on the Structured Clinical Interview for DSM III-R (Spitzer and Williams 1985), gave written informed consent as approved by the IRB. Postgraduate year II resident physicians (PGYIIs), with attending physician supervision, made clinical diagnostic and suicide assessments. Master’s or Ph.D. level clinicians performed independent structured diagnostic interviews and suicide assessments. The clinical and research assessments took place within 1-5 days of one another. Clinical data were obtained from a retrospective chart review of consecutively admitted patients (October 2002 - August 2006).


Women (n=120) and men (n=81) aged 18-72 had a physical examination and routine blood tests, including urine toxicology. Exclusion criteria were current substance or alcohol abuse, or active medical conditions that could confound diagnosis.


The inpatient unit is in a tertiary care, university-affiliated medical center. Attending psychiatrists, PGYIIs, nurses, social workers, recreational therapists and mental health therapy aides provide patient care.

Routine Clinical Assessment

On admission, patients had a thorough clinical assessment by a PGYII covering chief complaint, history of present illness, current medications, past psychiatric, substance use, physical and sexual abuse, family psychiatric, past medical, family medical, and psychosocial histories, allergies, mental status exam (MSE), multi-axial diagnosis, immediate needs and plan. The standard of care includes an unstructured assessment of current and past suicidal ideation, intent or plan. Attending psychiatrists evaluated patients within 24 hours and concurred with or amended the PGYIIs’ diagnostic and suicide assessment.

Charts were reviewed for documented suicide risk in the “alerts” section. When there was a suicide alert, the MSE was reviewed for current suicidal ideation, intent or plan. If no alert was documented, the chart was not reviewed. Charts were reviewed for admission and discharge diagnoses.

Research Assessment Instruments

The International Personality Disorder Examination (Loranger et al. 1994) and SCID-II (Spitzer et al 1990), the 17-item Hamilton Depression Rating Scale (HDRS-17) (Hamilton 1960), Beck Depression Inventory (BDI) (Beck et al. 1961), and Brief Psychiatric Rating Scale (Overall and Gorham 1962) were used to assess psychopathology. Suicide attempt was defined as a self-destructive act with intent to end one’s life. The number, method, and degree of medical damage of suicide attempts were characterized using the Columbia Suicide History Form (CSHF) (Oquendo et al. 2003). Suicidal ideation was assessed using the Scale for Suicidal Ideation (SSI; Beck et al. 1979).

Statistical Methods

For comparative purposes, PGYIIs were treated as providing one type of rating and research interviewers a second type. Research interviewers’ inter-rater reliability is robust (ICC: 0.80 to 0.95). We do not have similar data for clinicians, but treating them as a single type of rater was a necessary approximation because 50 PGYIIs each assessed between 1 and 5 patients. Attending staff was unchanged during the study. The SSI used in the research interview was dichotomized to classify ideation as present/absent for comparison with clinical assessment. Percent agreement and Cohen’s Kappa coefficients (Cohen, 1960) were calculated and interpreted according to standard criteria (Landis and Koch, 1977).
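As an illustration of the agreement statistics named above, the following is a minimal Python sketch of percent agreement, Cohen's kappa for two raters, and the Landis and Koch (1977) descriptive bands. It is not the authors' analysis code. The function names and the ten example ratings are hypothetical, and the handling of values that fall exactly on a band boundary is an assumption.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of cases on which the two raters give the same label."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(x == y for x, y in zip(rater_a, rater_b)) / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement corrected for agreement expected by chance."""
    n = len(rater_a)
    p_o = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the two raters' marginal proportions,
    # summed over every label either rater used.
    labels = set(counts_a) | set(counts_b)
    p_e = sum(counts_a[k] * counts_b[k] for k in labels) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1.0 - p_e)


def landis_koch(kappa):
    """Descriptive label for kappa following Landis and Koch (1977)."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical ratings (1 = ideation present, 0 = absent); not study data.
clinical = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
research = [1, 1, 1, 0, 1, 1, 1, 0, 0, 1]

agreement = percent_agreement(clinical, research)
kappa = cohens_kappa(clinical, research)
print(f"agreement={agreement:.3f} kappa={kappa:.3f} ({landis_koch(kappa)})")
```

The example shows why both statistics are reported: percent agreement alone does not account for the agreement two raters would reach by chance given how often each assigns each label.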

Proximity of the most recent suicide attempt was classified as remote, recent, or none. An attempt was recent if within one year of assessment and remote if more than a year. Associations between proximity of suicide attempt and agreement between raters were tested using Chi-squared.
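The chi-squared test of association described above can be sketched as follows. This is a minimal, dependency-free illustration that assumes a Pearson chi-squared test without continuity correction, which the text does not specify. The counts in the example table are hypothetical, not values from Table IV, and the p-value helper covers only 1 or 2 degrees of freedom, where closed forms exist.

```python
import math


def chi_square_independence(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


def chi_square_p_value(stat, df):
    """Upper-tail p-value; closed forms for df = 1 and df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise NotImplementedError("use a statistics library for df > 2")


# Hypothetical counts. Rows: recent attempt, remote attempt.
# Columns: raters agree, raters disagree.
table = [[30, 10],
         [20, 20]]

stat, df = chi_square_independence(table)
p = chi_square_p_value(stat, df)
print(f"chi2={stat:.3f} df={df} p={p:.4f}")
```

With three proximity categories (recent, remote, none) and a two-level agreement outcome, the same function would return 2 degrees of freedom.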


Clinical and demographic characteristics of the sample are in Table I. Agreement between the research diagnosis and the admission diagnosis made by PGYIIs was 67.7%, with a Cohen’s kappa of 0.407 (moderate agreement). Agreement between discharge diagnosis and research assessment was also moderate, at 68.3%, with a kappa of 0.432. Cross-tabulation of discharge diagnoses showed that over half the patients identified by structured interview as having MDD were so diagnosed by PGYIIs (Table II).

Table I
Baseline Characteristics of Study Patients (n=201)
Table II
Agreement between clinical and research discharge axis I diagnosis

There was moderate agreement for suicide attempt history, at 79.2% (kappa = 0.595). Of note, 18.8% of patients identified by research schedule as past suicide attempters were not identified as such by PGYIIs (Table III). Agreement on suicidal ideation was fair, at 66.5% (kappa = 0.250). All 54.1% of patients identified by PGYIIs as having suicidal ideation were also captured by research assessment. The converse was not true: 29.7% of patients assessed by structured interview as having suicidal ideation were not identified as such by PGYIIs.
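As a worked illustration of how the reported percent agreement and kappa values relate, the definition of Cohen's kappa can be rearranged to recover the chance-expected agreement implied by each pair of figures. The values of p_e below are back-calculated from the reported numbers under that definition; the manuscript does not report them.

```latex
\kappa = \frac{p_o - p_e}{1 - p_e}
\quad\Longrightarrow\quad
p_e = \frac{p_o - \kappa}{1 - \kappa}

% Suicide attempt history: p_o = 0.792, \kappa = 0.595
p_e = \frac{0.792 - 0.595}{1 - 0.595} = \frac{0.197}{0.405} \approx 0.486

% Suicidal ideation: p_o = 0.665, \kappa = 0.250
p_e = \frac{0.665 - 0.250}{1 - 0.250} = \frac{0.415}{0.750} \approx 0.553
```

On this reading, roughly half of the observed agreement in each comparison would be expected by chance alone, so kappa falls well below the raw percent agreement.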

Table III
Agreement between Clinical and Research Interview for Suicide Attempt History and Suicidal Ideation

Table IV shows 74.6% agreement between clinical and research assessments for attempts within a year. For suicide attempts beyond a year, agreement dropped to 56.5% (p < 0.001). Agreement on suicidal ideation did not differ significantly by proximity of the most recent suicide attempt.

Table IV
Clinical and Research Evaluation of Suicide Attempt and Suicidal Ideation Status with Proximity of Most Recent Suicide Attempt


We extend the findings of Malone et al (1995), using a larger sample to compare clinical diagnosis and suicide assessments with standardized diagnostic and suicide assessments for depressed inpatients. We report moderate agreement between clinical and research diagnoses. We also show moderate agreement on suicide attempt history obtained by clinical and standardized interview. Finally, we find fair agreement in the assessment of suicidal ideation by clinical and standardized interview.

Moderate diagnostic agreement has been previously reported using similar methodology (Anthony et al 1985; Ezpeleta et al 1997; Fennig et al 1996; Komiti et al 2001; Kranzler et al 1995; Miller et al 2001; Taggart et al 2006). Prior studies’ limitations include: (1) mismatch in experience of clinicians performing structured and unstructured interviews (Ezpeleta et al 1997; Fennig et al 1996; Jensen and Weisz 2002; Kranzler et al 1995; Komiti et al 2001; Lewczyk et al 2003; Rosenman et al 1997; Steiner et al 1995); (2) time frames between evaluations varying from days to a month (Fennig et al 1996; Jensen and Weisz 2002; Kranzler et al 1995; Rosenman et al 1997; Steiner et al 1995); (3) in two studies, evaluation of different patient groups with similar demographics (Thienemann 2004; Zimmerman and Mattia 1999); (4) no “gold standard” for assessment of psychopathology (Brugha et al 1999). Our study offers several strengths. Evaluations were performed within days of each other, on the same patients, by clinical and research teams of similarly experienced (though not identically trained) clinicians. We are left to contend with the lack of a “gold standard” for assessing psychopathology. Assessment involves knowledge about abnormal mental states, the skill to elicit them, and judgment regarding their presence and significance (Brugha et al 1999). Baca-Garcia and colleagues (2007) showed that consistency of psychiatric diagnosis ranged from 29% for personality disorders to 70% for schizophrenia, with greatest stability for inpatient and least for outpatient diagnoses. Longitudinal data demonstrating significant fluctuation of psychiatric diagnosis in clinical settings are important reminders of the inherent weakness in our current nosology. As diagnostic assessments move from clinical impressions to semi-structured schedules, reliability may improve, but the issue of validity remains unaddressed.

Our findings regarding agreement between clinical and research assessments of suicide attempts and ideation are sobering. We found moderate agreement for assessment of suicide attempt history, which dropped to only fair agreement for suicidal ideation. These findings are remarkable given that clinicians were aware of the research team’s focus on suicidal behavior. Most worrisome were patients found on semi-structured interview to have either a suicide attempt history (18.7%) or suicidal ideation (29.7%) who were not identified by PGYIIs as suicidal. Investigators have reported similar findings (Beck et al 1988; Levine et al 1989; Steer et al 1993), stating that patients reveal more information about suicide risk during computer-assisted assessment than during clinical interview. Consonant with our findings, Malone et al (1995) demonstrated that discharge summaries did not document recent suicidal ideation or planning behavior in 38% of patients identified with suicidal behavior on research assessment. Our study has a similar design and setting, but a 400% increase in sample size. Healy and colleagues (2006) reported 90% of 735 patients presenting to an emergency department had suicidal ideation. Only 37% were rated as suicidal by clinicians, although 62% scored positive on the BSSI. While the sample is large, assessment of suicidal ideation was not clinician-administered in the comparison with patient self-report.

There are limitations to our study. First, neither PGYIIs nor research raters were blinded to the goal of the study. Second, this was a retrospective chart review in which only predetermined parts of the chart were surveyed for diagnosis and suicide alert. Third, patients may have differentially expressed suicidal behaviors to “researchers” versus “clinicians”, fearing restrictive observation status by the latter. Fourth, suicidal behaviors may have changed over the 1-5 days between assessments, which could explain discrepancies. Fifth, clinical material was generated by PGYIIs on one unit at one training program. This may not reflect the clinical skills of PGYIIs at other programs or of more experienced clinicians.

Despite these caveats, this study is unique. We compared independent evaluations of diagnosis, suicide history and ideation in the same patients over a brief time using different assessment tools, allowing us to make recommendations for care. Use of semi-structured interviews and suicide assessments would improve clinical assessments by capturing almost 20% of patients clinically misidentified as not being past suicide attempters and close to 30% of patients clinically misidentified as not having suicidal ideation. User-friendly instruments may aid clinical assessment by enhancing reliability and validity in diagnostic and suicide risk assessment.




  • Anthony JC, Folstein M, Romanoski AJ, Von Koff MR, Nestadt GR, Chahal R, Merchant A, Brown CH, Shapiro S, Kramer M. Comparison of the lay diagnostic interview schedule and a standardized psychiatric diagnosis. Experience in eastern Baltimore. Archives of General Psychiatry. 1985;42:667–675. [PubMed]
  • Aronen ET, Noam GG, Weinstein SR. Structured diagnostic interviews and clinicians’ discharge diagnosis in hospitalized adolescents. Journal of the American Academy of Child and Adolescent Psychiatry. 1993;32:674–681. [PubMed]
  • Baca-Garcia E, Perez-Rodriquez MM, Basurte-Villamor I, Fernandez del Moral F, Jiminez-Arriero MA, Gonzalez de Rivera JL, Saiz-Ruiz J, Oquendo MA. Diagnostic stability of psychiatric disorders in clinical practice. British Journal of Psychiatry. 2007;190:210–216. [PubMed]
  • Basco MR, Bostic JQ, Davies D, Rush AJ, Witte B, Hendrickse W, Barnett Methods to improve diagnostic accuracy in a community mental health setting. American Journal of Psychiatry. 2000;157:1599–1605. [PubMed]
  • Beck AT, Brown G, Steer RA. Prediction of eventual suicide in psychiatric inpatients by clinical ratings of hopelessness. Journal of Consulting and Clinical Psychology. 1989;57:309–310. [PubMed]
  • Beck AT, Kovacs M, Weissman A. Assessment of suicidal intention: the scale for suicidal ideation. Journal of Consulting and Clinical Psychology. 1979;47:343–352. [PubMed]
  • Beck AT, Steer RA, Ranieri WF. Scale for suicide ideation: psychometric properties of a self-report version. Journal of Clinical Psychology. 1988;44:499–505. [PubMed]
  • Beck AT, Ward CH, Mendelson M, Mock J, Erbaugh J. An inventory for measuring depression. Archives of General Psychiatry. 1961;4:561–571. [PubMed]
  • Brown GK, Beck AT, Steer RA, Graham JR. Risk factors for suicide in psychiatric outpatients. A 20-year prospective study. Journal of Consulting and Clinical Psychology. 2000;68:371–377. [PubMed]
  • Brugha TS, Bebbington PE, Jenkins R. A difference that matters: comparisons of structured and semi-structured psychiatric diagnostic interviews in the general population. Psychological Medicine. 1999;29:1013–1020. [PubMed]
  • Cohen J. A coefficient for agreement for nominal scales. Educational and Psychological Measurement. 1960;20:37–46.
  • Eaton WW, Neufeld K, Chen L-S, Cai G. A comparison of self-report and clinical diagnostic interviews for depression. Archives of General Psychiatry. 2000;57:217–222. [PubMed]
  • Ezpeleta L, de la Osa N, Domeniech JM, Navarro JB, Losilla JM, Judez J. Diagnostic agreement between clinicians and the diagnostic interview for children and adolescents DICA-R in an outpatient sample. Journal of Child Psychology and Psychiatry and Allied Disciplines. 1997;38:431–440. [PubMed]
  • Fennig S, Naisberg-Fennig S, Craig TJ, Tanenberg-Karant M, Bromet EJ. Comparison of clinical and research diagnosis of substance use disorders in a first-admission psychotic sample. American Journal on Addictions. 1996;5:40–48.
  • Hamilton M. A rating scale for depression. Journal of Neurology, Neurosurgery and Psychiatry. 1960;23:56–62. [PMC free article] [PubMed]
  • Healy DJ, Barry K, Blow F, Welsh D, Milner KK. Routine use of the Beck Scale for Suicide Ideation in a psychiatric emergency department in a general hospital. Psychiatry. 2006;26:323–329. [PubMed]
  • Holden RR, DeLisle MM. Factor analysis of the Beck Scale for Suicide Ideation with female suicide attempters. Assessment. 2005;12:231–238. [PubMed]
  • Jensen AL, Weisz JR. Assessing match and mismatch between practitioner-generated and standardized interview-generated diagnosis for clinic-referred children and adolescents. Journal of Consulting and Clinical Psychology. 2002;70:158–168. [PubMed]
  • Komiti AA, Jackson HJ, Judd FK, Cockram AM, Kyrios M, Yeatman R, Murray G, Hordern C, Wainwright K, Allen N, Singh B. A comparison of the Composite International Diagnostic Interview (CIDI-Auto) with clinical assessment in diagnosing mood and anxiety disorders. Australian and New Zealand Journal of Psychiatry. 2001;35:224–230. [PubMed]
  • Kranzler HR, Kadden RM, Burleson JA, Babor TF, Apter A, Rounsaville BJ. Validity of psychiatric diagnosis in patients with substance use disorders: Is the interview more important than the interviewer? Comprehensive Psychiatry. 1995;36:278–288. [PubMed]
  • Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159–174. [PubMed]
  • Levine S, Ancill RJ, Roberts AP. Assessment of suicide risk by computer-delivered self-rating questionnaire: preliminary findings. Acta Psychiatrica Scandinavica. 1989;80:216–220. [PubMed]
  • Lewczyk CM, Garland AF, Hurlburt MS, Gearity J, Hough RL. Comparing DISC-IV and clinician diagnosis among youths receiving public mental health services. Journal of the American Academy of Child and Adolescent Psychiatry. 2003;42:349–350. [PubMed]
  • Loranger AW, Sartorius N, Andreoli A, Berger P, Buchheim P, Channabasavanna SM, Coid B, Dahl A, Diekstra RFW, Ferguson B, Mombour W, Pull C, Ono Y, Regier DA. The International Personality Disorder Examination; the World Health Organization/Alcohol, Drug Abuse, and Mental Health Administration International Pilot of Personality Disorders. Archives of General Psychiatry. 1994;51:215–224. [PubMed]
  • Malone KM, Szanto K, Corbitt EM, Mann JJ. Clinical assessment versus research methods in the assessment of suicidal behavior. American Journal of Psychiatry. 1995;152:1601–1607. [PubMed]
  • Miller PR. Inpatient diagnostic assessments: 2. Interrater reliability and outcomes of structured vs. unstructured interviews. Psychiatry Research. 2001;105:265–271. [PubMed]
  • Miller PR, Dasher R, Collins R, Griffiths P, Brown F. Inpatient diagnostic assessments: 1. Accuracy of structured vs. unstructured interviews. Psychiatry Research. 2001;105:255–264. [PubMed]
  • Oquendo MA, Currier D, Mann JJ. Prospective studies of suicidal behavior in major depressive and bipolar disorders: what is the evidence for predictive risk factors? Acta Psychiatrica Scandinavica. 2006;114:151–158. [PubMed]
  • Oquendo MA, Halberstam B, Mann JJ. Risk factors for suicidal behaviors: the utility and limitations of research instruments. In: First MB, editor. Standardized Evaluation in Clinical Practice. American Psychiatric Press; Arlington, VA: 2003. pp. 103–130.
  • Overall JE, Gorham DR. The Brief Psychiatric Rating Scale. Psychological Reports. 1962;10:799–812.
  • Pinninti N, Steer RA, Rissmiller DJ, Nelson S, Beck AT. Use of the Beck Scale for Suicide Ideation with psychiatric inpatients diagnosed with schizophrenia, schizoaffective, or bipolar disorders. Behaviour Research and Therapy. 2002;40:1071–1079. [PubMed]
  • Robins LN, Helzer JE, Croughan J, Ratcliff KS. National Institute of Mental Health Diagnostic Interview Schedule: its history, characteristics and validity. Archives of General Psychiatry. 1981;38:381–389. [PubMed]
  • Rosenman SJ, Korten AE, Levings CT. Computerized diagnosis in acute psychiatry: validity of CIDI-Auto against routine clinical diagnosis. Journal of Psychiatric Research. 1997;31:581–592. [PubMed]
  • Shear MK, Greene C, Kang J, Ludewig D, Frank E, Swartz HA, Hanekamp M. Diagnosis of nonpsychotic patients in community clinics. American Journal of Psychiatry. 2000;157:581–587. [PubMed]
  • Spitzer RL. Psychiatric diagnosis: are clinicians still necessary? Comprehensive Psychiatry. 1983;24:399–411. [PubMed]
  • Spitzer RL, Williams JBW. Structured Clinical Interview for DSM-III-R-Patient Version (SCID P) New York State Psychiatric Institute Biometrics Research; New York: 1985.
  • Spitzer RL, Williams JBW, Gibbon M, First MB. Structured Clinical Interview for DSM-III-R Personality Disorder (SCID-II) American Psychiatric Press; Washington DC: 1990.
  • Steer RA, Kumar G, Beck AT. Self-reported suicidal ideation in adolescent psychiatric inpatients. Journal of Consulting and Clinical Psychology. 1993;61:1096–1099. [PubMed]
  • Steer RA, Rissmiller DJ, Ranieri WF, Beck AT. Dimensions of suicidal ideation in psychiatric inpatients. Behaviour Research and Therapy. 1993;31:229–236. [PubMed]
  • Steiner JL, Tebes JK, Sledge WH, Walker ML. A comparison of the structured interview for DSM-III-R and clinical diagnosis. Journal of Nervous and Mental Disease. 1995;183:365–369. [PubMed]
  • Taggart C, O’Grady J, Stevenson M, Hand E, McClelland, Kelly C. Accuracy of diagnosis at routine psychiatric assessment in patients presenting to an accident and emergency department. General Hospital Psychiatry. 2006;28:330–335. [PubMed]
  • Thienemann M. Introducing a structured interview into a clinical setting. Journal of the American Academy of Child and Adolescent Psychiatry. 2004;43:1057–1060. [PubMed]
  • Weinstein SR, Stone K, Noam G, Grimes K, Schwab-Stone Comparison of DISC with clinicians’ DSM-III diagnosis in psychiatric inpatients. Journal of the American Academy of Child and Adolescent Psychiatry. 1989;28:53–60. [PubMed]
  • Zimmerman M, Mattia JI. Psychiatric diagnosis in clinical practice: is comorbidity being missed? Comprehensive Psychiatry. 1999;40:182–191. [PubMed]