This article reviews the current state of the literature on the assessment of bipolar disorder in adults. Research on reliable and valid measures for bipolar disorder has unfortunately lagged behind assessment research for other disorders, such as major depression. We review diagnostic tools, self-report measures to facilitate screening for bipolar diagnoses, and symptom severity measures. We briefly review other assessment domains, including measures designed to facilitate self-monitoring of symptoms. We highlight particular gaps in the field, including an absence of research on the reliable diagnosis of bipolar II and milder forms of disorder, a lack of empirical data on the best ways to integrate data from multiple domains, and a shortage of measures targeting a broader set of illness-related constructs relevant to bipolar disorder.
The goal of this review is to summarize measures that are useful for the assessment of bipolar disorder among adults. We will focus, in particular, on measures pertinent to screening, diagnosis, and symptom monitoring. With the apparent success of lithium in treating bipolar disorder, research on the disorder languished until the 1990s. Interest in bipolar disorder assessment has been renewed in recent decades. Nonetheless, research on the accurate assessment of bipolar disorder is relatively sparse when compared with other disorders such as major depression. We begin by describing the forms of bipolar disorder, then turn to available measures for its diagnosis, including both interview and self-report measures. Later sections discuss interviews and scales used for assessing symptom severity, including self-monitoring.
Several types of bipolar disorder are recognized by the Diagnostic and Statistical Manual of Mental Disorders of the American Psychiatric Association (APA, 2000), differentiated by the severity and duration of manic symptoms. A diagnosis of bipolar I disorder is made based on a single lifetime episode of mania, which is in turn defined by euphoric or irritable mood, along with at least three additional symptoms (or four if mood is only irritable) that result in marked social or vocational impairment. The duration criterion for mania specifies that symptoms must last one week or require hospitalization. Bipolar II disorder, in contrast, is defined by a history of at least one hypomanic episode and at least one major depressive episode. Criteria for hypomania are similar to those of mania, but in milder form: instead of impairment, a hypomanic episode is marked by a distinct change in functioning. Cyclothymic disorder is an even milder subtype of bipolar disorder, and is diagnosed based on a period of at least two years of recurrent mood swings. By definition, these mood swings must be in both the “up” and the “down” directions, but do not meet full criteria for mania, hypomania, or depression. In addition, the symptomatic two-year period cannot include any two-month span that is free of mood swings.
Symptoms that are secondary to drugs such as cocaine, or medical conditions such as thyroid problems, will generally yield a diagnosis of substance-induced mood disorder or bipolar disorder not otherwise specified. Those with a vulnerability to bipolar disorder may become manic when prescribed antidepressants without an accompanying mood stabilizer (Ghaemi, Lenox, & Baldessarini, 2001), yielding a diagnosis of substance-induced mood disorder with manic features.
Large epidemiological studies indicate a prevalence of 1% for bipolar I disorder and an additional 3% for bipolar II disorder (Kessler, Berglund, Demler, Jin, & Walters, 2005). As many as three quarters of those with bipolar I disorder have also experienced an episode of major depression (Karkowski & Kendler, 1997; Kessler, Rubinow, Holmes, Abelson, & Zhao, 1997). Comorbidity rates with anxiety disorders and substance abuse disorders have been reported as high as 93% and 61%, respectively (Kessler et al., 1997; Regier et al., 1990), underscoring the need for effective assessments and treatments of bipolar disorder to take comorbid conditions into account.
Twin studies suggest that heritability accounts for more than 90% of the variability in the development of bipolar disorder (Kieseppä, Partonen, Haukka, Kaprio, & Lönnqvist, 2004), leading many researchers to focus on medications such as lithium for treatment (Prien & Potter, 1990). The course of the disorder, however, may be strongly affected by psychosocial variables. Manic episodes may be triggered by sleep disturbance (Leibenluft et al., 1996) or excessive pursuit of goals (Johnson, 2005). Depressive episodes within bipolar disorder share common triggers with unipolar depression, such as negative life events, maladaptive cognitive styles, and lack of social support (Johnson & Kizer, 2002). Thus psychotherapy may serve as an effective addition to medication in the treatment of bipolar disorder (Johnson & Leahy, 2004; Rizvi & Zaretsky, 2007).
The diagnosis of bipolar disorder is based on a review of symptoms and potential medical explanations for those symptoms, as there is no biological marker for the disorder. In clinical practice, symptoms are frequently reviewed in an unstructured manner. It should be noted, though, that when practitioners do not use structured diagnostic tools, as many as half of comorbid conditions go undetected (Zimmerman & Mattia, 1999). Furthermore, many practitioners report that they do not routinely screen for bipolar disorder even among people with a history of major depression, many of whom would meet the diagnostic criteria for bipolar disorder (Brickman, LoPicollo, & Johnson, 2002). Due to informal or poor screening, the average time between onset of symptoms and formal diagnosis is more than seven years (Lish, Dime-Meenan, Whybrow, Price, & Hirschfeld, 1994; Mantere, Suiminen, Leppamaki, Arvilommi, & Isometsa, 2004). Improper diagnosis has serious repercussions because antidepressant treatment without mood-stabilizing medication can trigger iatrogenic mania (Ghaemi et al., 2001).
Several semistructured interviews have been developed to assess bipolar disorder in adults. The two most commonly used measures are the Structured Clinical Interview for DSM-IV (SCID) and the Schedule for Affective Disorders and Schizophrenia (SADS). We will not focus here on the Composite International Diagnostic Interview (CIDI; Robbins et al., 1988), which has been developed and used mostly in epidemiological surveys (e.g., Kessler & Zhao, 1999). Briefly, there is some evidence that the CIDI may systematically underdiagnose bipolar disorder (e.g., Kessler, Rubinow, Holmes, Abelson, & Zhao, 1997), but more recent work has since validated it against the SCID (Kessler et al., 2006). The SCID and the SADS both provide interview probes, symptom thresholds, and information about exclusion criteria (i.e., medical or pharmacological conditions that may induce mania). They differ, however, in the criteria they were designed to assess. The SCID is designed to help assess diagnoses according to the DSM-IV, whereas the SADS is designed to assess diagnoses according to the Research Diagnostic Criteria (RDC). RDC criteria are stricter in that psychotic symptoms are more likely to yield a diagnosis of schizoaffective disorder than they would under the DSM-IV criteria; within the DSM-IV criteria, psychotic symptoms must be present for at least two weeks outside of a mood episode to be considered evidence of schizoaffective disorder. Further details about these measures are provided next. We begin by describing the measures and their psychometric characteristics for assessing bipolar I disorder. We then turn toward some specific issues that complicate the assessment of milder forms of bipolar disorder. Table 1 summarizes some of the well-supported measures for the diagnosis of bipolar disorder.
The SCID (Spitzer, Williams, Gibbon, & First, 1992) is recommended as a routine part of clinical intake procedures. The SCID is a semistructured interview that is divided into modules to cover different diagnoses. The modular design allows for the interview to be easily tailored to capture relevant diagnoses for a given research or clinical situation. Each SCID module contains probes to cover each of the core symptoms, and interviewers can use clinical judgment in gathering supplemental information if probes do not provide sufficient information for reliable symptom assessment. A clinician’s version is available through American Psychiatric Publishing (First, Spitzer, Gibbon, & Williams, 1997). The SCID, and more specifically its bipolar disorder module, demonstrated good interrater reliability both in a large international multisite trial (Williams et al., 1992) and in at least 10 other major trials (Rogers, Jackson, & Cashel, 2001). In patient samples, reliability for current and lifetime diagnoses of bipolar disorder has been adequate to excellent, ranging from .64 to .92; establishing reliability for the SCID in community samples is more difficult due to low base rates of the disorder (Williams et al., 1992). Compared to other structured interviews including the Diagnostic Interview Schedule (DIS) and the Composite International Diagnostic Interview (CIDI), and to clinicians not using a structured interview, diagnoses of bipolar disorder based on the SCID appear substantially more reliable. Results of one study indicated that the percentage of agreement with the gold standard was higher for the SCID than for standard clinician interviews (Basco et al., 2000). In a sample of twins, SCID-based diagnoses of bipolar disorder yielded concordance rates for monozygotic and dizygotic twins similar to those reported in traditional twin studies using standard diagnostic interviews (Kieseppä et al., 2004).
The SADS (Endicott & Spitzer, 1978) was designed to assess a broad range of Axis I diagnoses. For each diagnosis, the probes focus on the symptoms for the most recent episode and then capture a broad overview of past episodes. The reliability and validity of the SADS have been established across 21 studies (see Rogers, Jackson, & Cashel, 2001, for a review). The SADS has demonstrated good to excellent reliability for both symptoms and diagnoses (Andreasen et al., 1981). Specifically, mania diagnoses have achieved good interrater reliability and good test–retest reliability over 5 to 10 years among adults (Coryell et al., 1995; Rice et al., 1986). SADS diagnoses of bipolar disorder correlate robustly with other measures of mania (Secunda et al., 1985), and the SADS appears to validly capture diagnoses across different cultural and ethnic groups within the United States (Vernon & Roberts, 1982).
Hypomania is unique among DSM syndromes, in that by definition it does not cause any functional impairment. Perhaps because of this quality, the presence of at least one major depressive episode is also required to achieve a diagnosis of bipolar II disorder. This presents a unique diagnostic challenge: the hypomanic episodes that separate bipolar II disorder from unipolar depression are by definition of only limited severity, making this a hard diagnosis to reliably detect. Complicating this picture is the fact that there are important disagreements in the field regarding the best criteria for hypomanic episodes. For instance, current DSM criteria require three or four symptoms, in addition to elevated or irritable mood, lasting at least four days. In contrast, RDC criteria only require three symptoms lasting two days. Given this uncertainty and relative lack of severity of hypomania, it is not surprising that the accurate assessment of bipolar II disorder is more difficult to achieve than bipolar I disorder.
Given that hypomania is almost always accompanied by less distress than depressive episodes, one might be tempted to focus on detecting depression. There is evidence, however, that the diagnosis of hypomania (and hence, bipolar II disorder) is important above and beyond the detection of depression. Diagnoses of bipolar II disorder are accompanied by increased mood lability (Akiskal et al., 1995) and a family history of bipolar II disorder (Rice et al., 1986). In addition, at least three studies have demonstrated that people with bipolar II disorder are at a higher risk for suicide than are those with bipolar I disorder or unipolar depression (Dunner, 1996). It is possible that the low mood of depression, combined with the impulsivity of hypomania, may be especially likely to lead to suicide attempts. In addition to suicide risk, the misdiagnosis of bipolar II disorder can have harmful pharmacological implications. The prescription of antidepressants, which is likely if bipolar II disorder is misdiagnosed as unipolar depression, may cause or exacerbate manic symptoms (Ghaemi et al., 2001). Thus, identification of bipolar II disorder may be pivotal in administering effective treatments.
The difficulties in assessing hypomanic symptoms described above have manifested in low reliability for the SADS in detecting bipolar II disorder (Andreasen et al., 1981), even when interviewers rate the same tapes (Keller et al., 1981). Some research groups have achieved better estimates, however (Simpson et al., 2002; Spitzer & Endicott, 1978). Beyond the inconsistent estimates of interrater reliability, test–retest reliability over six months to two years likewise has been low for bipolar II disorder and cyclothymic disorder alike (Andreasen et al., 1981; Rice et al., 1986). In one study, only 40% of participants with bipolar II disorder according to the SADS at baseline experienced any manic or hypomanic episodes over the ensuing 10 years (Coryell et al., 1995). This inability to accurately detect bipolar II disorder is not limited to the SADS. In one study, a SCID interview missed one third of bipolar II cases identified by expert clinical interview (Dunner & Tay, 1993; Simpson et al., 2002). In sum, the best available diagnostic interviews are limited in their psychometric characteristics for the diagnosis of bipolar II disorder.
These difficulties have led some researchers to suggest that interviews aimed at detecting bipolar II disorder should start with questions about behavioral activation and increases in goal-directed behaviors rather than mood (Akiskal & Benazzi, 2005). Although promising, such approaches have not yet been fully validated.
In sum, a set of issues mars diagnosis of bipolar II disorder. Persons who meet criteria for bipolar II disorder may be at high risk for suicidality, and they may experience a worsening of manic symptoms if prescribed antidepressants. On the other hand, available tools do not detect bipolar II disorder reliably. Thus a major goal for ongoing research is to develop ways to reliably capture diagnoses of bipolar II disorder.
The most reliable and valid way to obtain a diagnosis of bipolar disorder is through a structured interview with a trained clinician (Akiskal, 2002). Nonetheless, given the time commitment involved in conducting structured interviews, several self-report measures have been developed to help clinicians identify persons most likely to meet criteria for bipolar disorders. It should be emphasized that these measures cannot establish a diagnosis on their own; rather, they may help identify people who warrant more careful diagnostic interviews.
The General Behavior Inventory (GBI) was designed to cover the core symptoms of bipolar disorder, including both depressive and manic symptoms (Depue et al., 1981). Different versions range from 52 to 73 items (e.g., Depue et al., 1981; Depue & Klein, 1988; Mallon, Klein, Bornstein, & Slater, 1986). Items on each version assess symptom intensity, duration, and frequency on a scale ranging from 1 (“never or hardly ever”) to 4 (“very often or almost constantly”). Although the GBI has the most robust psychometric properties of the available self-report screeners, the multiple versions make generalizations regarding psychometric properties difficult.
The full 73-item version of the GBI has demonstrated excellent internal consistency and adequate test–retest reliability. It has demonstrated sensitivity to bipolar disorder of approximately 75% and specificity greater than 97% (Depue & Klein, 1988; Depue et al., 1989; Klein, Dickstein, Taylor, & Harding, 1989; Mallon et al., 1986) in clinical and nonclinical samples. Cutoff scores, however, have not been consistent across studies, further limiting the generalizability of the scale. At present, the GBI appears to be a useful screening tool for bipolar disorder, but future research to establish norms and cutoffs would increase its utility.
Another screening tool is the Mood Disorder Questionnaire (MDQ; Hirschfeld et al., 2000). The first 13 items of the MDQ ask about the DSM-IV manic symptoms using a yes–no format. To achieve a positive screen, seven items must be endorsed. Additional items assess if the identified symptoms co-occurred and caused at least moderate impairment. The MDQ has attained adequate internal consistency (Hirschfeld et al., 2000; Isometsä et al., 2003), fair one-month test–retest reliability, and fair sensitivity (.73 to .90) in distinguishing between bipolar and unipolar disorder in clinical samples (Weber Rouget et al., 2005). In addition, at least one recent study has demonstrated that high MDQ scores are associated with greater impairment and suicidal ideation in a primary care setting (Das et al., 2005). Nonetheless, specificity has been low in some studies (.47 to .90; Hirschfeld et al., 2000, 2003; Isometsä et al., 2003; Miller et al., 2004; Weber Rouget et al., 2005) and the sensitivity in a community sample was only .28 (Hirschfeld et al., 2003).
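The MDQ screening rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the published scoring instructions: the function name, the impairment labels, and the requirement that all three conditions hold together are assumptions made for the example.

```python
from typing import Sequence

MDQ_SYMPTOM_ITEMS = 13   # yes/no items covering DSM-IV manic symptoms
MDQ_MIN_ENDORSED = 7     # items that must be endorsed for a positive screen

# Hypothetical ordered labels for the impairment item (assumed for this sketch).
IMPAIRMENT_LEVELS = ("none", "minor", "moderate", "serious")


def mdq_positive_screen(symptom_answers: Sequence[bool],
                        co_occurred: bool,
                        impairment: str) -> bool:
    """Return True when the responses meet the screening rule sketched here."""
    if len(symptom_answers) != MDQ_SYMPTOM_ITEMS:
        raise ValueError("expected 13 yes/no symptom answers")
    if impairment not in IMPAIRMENT_LEVELS:
        raise ValueError(f"unknown impairment level: {impairment!r}")

    enough_symptoms = sum(bool(a) for a in symptom_answers) >= MDQ_MIN_ENDORSED
    at_least_moderate = (IMPAIRMENT_LEVELS.index(impairment)
                         >= IMPAIRMENT_LEVELS.index("moderate"))
    # Assumption: all three conditions must hold for a positive screen.
    return enough_symptoms and co_occurred and at_least_moderate
```

For example, a respondent who endorses seven symptoms that co-occurred and caused moderate impairment would screen positive, whereas thirteen endorsed symptoms with only minor impairment would not. A positive result here indicates only that a fuller diagnostic interview may be warranted.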
A review of the content of MDQ items may help clarify why the scale has achieved better performance in inpatient settings than in community settings. Several of the items appear to capture common experiences in community samples. For example, in one study, as many as 90% of college students endorsed items such as “Have you ever had a time when you were not your usual self and you felt much more self-confident than usual?” (Miller, Johnson, & Carver, 2008). These items may be less commonly endorsed by persons with schizophrenia and other severe psychopathology, explaining why the scale may appear more beneficial in an inpatient setting than in a community sample. Hence, the MDQ may be a potentially useful tool in clinical settings to screen for bipolar disorder among those with severe psychopathology, but may be less helpful in community settings.
Other scales appear helpful in nonclinical samples, but do not have enough data regarding their usefulness as screening tools in clinical settings. The Hypomanic Personality Scale (HPS; Eckblad & Chapman, 1986) predicted the development of manic episodes at 13-year follow-up in undergraduates (Kwapil et al., 2000). To date, the HPS has only been studied in one clinical sample, achieving a positive predictive value of .82 and a negative predictive value of .67, and achieving a point-biserial correlation of .56 with bipolar I diagnosis (Kwapil, 2008). The Bipolar Spectrum Diagnostic Scale (Ghaemi et al., 2005) and the Mood Spectrum Self-Reports (Dell’Osso et al., 2002) have only been examined in a single study each, and two Hypomania Checklists (Angst et al., 2005; Hantouche et al., 2006) have only been examined in Europe and China (e.g., Meyer et al., 2007; Vieta et al., 2007). The Temperament Evaluation of Memphis, Pisa, Paris, and San Diego—Autoquestionnaire version (TEMPS-A; Akiskal & Akiskal, 2005) is a measure of temperament rather than manic or hypomanic episodes per se. Although the four-factor structure that includes dysthymic, cyclothymic, hyperthymic, and irritable temperaments has been examined in several countries and languages and psychometrically validated in clinical populations, research has not directly established the usefulness of this measure as a screen for bipolar spectrum disorders (e.g., Akiskal et al., 2005; Karam et al., 2007; Kesebir et al., 2005; Matsumoto et al., 2005; Mendlowicz, Jean-Louis, Kelsoe, & Akiskal, 2005; Sandor et al., 2006; Vazquez et al., 2007). At least one study, however, has demonstrated that the cyclothymic subscale of the TEMPS-A can prospectively predict bipolar spectrum diagnoses among clinically depressed children and adolescents over a two-year period (Kochman et al., 2005). 
Although initial studies indicate that these scales demonstrate good psychometric properties, more research is needed to determine their usefulness as screening measures.
Overall, the SCID and the SADS are the most common means of diagnosing bipolar disorder in adults. Although they have excellent psychometric characteristics for the assessment of bipolar I disorder, they fare less well in assessing bipolar II disorder. This may be due to issues related to the definition of hypomania.
As a diagnostic screening tool, the scale with the best support is the GBI, as it has consistently demonstrated sensitivity of approximately .75 and specificity above .97. Readers should be cautious, however, because multiple versions of the scale exist, and cutoffs for a positive screen have not been firmly established. The MDQ has been helpful in clinical populations, but suffers from poor discriminatory power in community settings. Other promising scales require more psychometric development. When using self-report scales as screening tools, several broader issues must be kept in mind. First, the usefulness of a screening tool will vary depending on the prevalence of a disorder in the population of interest (Phelps & Ghaemi, 2006). Second, few studies provide direct comparisons of psychometric characteristics of the different measures. Third, there are several ways to report on a screener’s usefulness, including sensitivity and specificity, positive and negative predictive values, area under the curve, and point-biserial correlations with diagnosis (Kraemer, 1992). Not all studies on the detection of bipolar disorder report all of these results, limiting the ability to compare studies or measures. Furthermore, sensitivity and specificity are commonly reported, but these indices may be dependent on sample characteristics. Fourth, authors have often modified the diagnostic interviews used as a reference standard to capture milder forms of bipolar spectrum disorder, yet limited information about these modifications is available. Each of these issues makes comparisons between measures complex.
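The first point, that a screener's usefulness depends on prevalence, follows directly from Bayes' rule. The sketch below computes positive and negative predictive values from sensitivity, specificity, and prevalence. The sensitivity (.75) and specificity (.97) are the GBI figures reported above; the function name and the 50% clinic prevalence in the usage note are assumptions chosen for illustration.

```python
import math
from typing import Tuple


def predictive_values(sensitivity: float,
                      specificity: float,
                      prevalence: float) -> Tuple[float, float]:
    """Return (PPV, NPV) for a screener applied at a given prevalence."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")

    # Expected proportions of the screened population in each cell.
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)

    # Undefined when nobody screens positive (or negative); report NaN.
    ppv = true_pos / (true_pos + false_pos) if (true_pos + false_pos) > 0 else math.nan
    npv = true_neg / (true_neg + false_neg) if (true_neg + false_neg) > 0 else math.nan
    return ppv, npv
```

With these figures, `predictive_values(0.75, 0.97, 0.01)` gives a positive predictive value of roughly .20 at a 1% prevalence, so about four of every five positive screens would be false positives. In a hypothetical clinic sample where half of patients have the disorder, `predictive_values(0.75, 0.97, 0.50)` gives a value of roughly .96. The same instrument can therefore look very different across settings even when its sensitivity and specificity are unchanged.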
The most common approach to measuring the severity of manic symptoms has been clinician-rated interviews. The Young Mania Rating Scale (YMRS) and Bech-Rafaelsen Mania Rating Scale (MAS) are two of the most widely used clinician-rated scales for assessing symptom severity. These scales have been commonly used to track changes in symptoms over time as treatment progresses. We briefly review these two scales, as well as the Schedule for Affective Disorders and Schizophrenia—Change version (SADS-C) mania subscale. There has been growing recognition, though, of the need to track both clinician and patient perspectives on the course of treatment, and so we discuss available symptom severity measures that rely on self-report. Some research has focused on measures useful for case conceptualization and treatment planning, but this literature is not covered in detail here: interested readers are referred to other reviews (e.g., Johnson, Miller, & Eisner, 2008). Table 2 summarizes some of the well-supported measures for assessing symptom severity in bipolar disorder.
The YMRS (Young, Biggs, Ziegler, & Meyer, 1978) is a 15- to 30-minute interview designed to be conducted by a trained clinician. It was originally developed and tested within an inpatient population based on semi-structured interview and observation during an eight-hour period. Today, the YMRS combines the patient’s report of manic symptoms over the previous two days with the clinician’s observations during the interview. It consists of 11 items covering the “core symptoms of the manic phase”: mood, motor activity, interest in sex, sleep, irritability, speech, flight of ideas, grandiosity, aggressive behavior, appearance, and an item regarding patient insight (Carlson & Goodwin, 1973; Winokur, Clayton, & Reich, 1969). It should be noted that item 8, Bizarre Content, combines the assessment of the manic symptom of grandiosity with other psychotic symptoms, including hyperreligiosity, paranoia, ideas of reference, delusions, and hallucinations. The YMRS does not account for other DSM criteria of mania, including distractibility, increases in goal-directed activity, or excessive involvement in pleasurable activities with a high potential for painful consequences. A factor analysis of the YMRS revealed a thought disturbance factor, an overactive/aggressive behavior factor, and a factor tapping elevated mood and psychomotor symptoms (Double, 1990).
Seven items are rated on a severity scale ranging from 0 to 4, and four items are rated on a scale of 0 to 8. Four core symptoms (irritability, speech, bizarre content, and disruptive–aggressive behavior) are double-weighted to account for poor cooperation from severely ill patients. Although the weighting may make rating more complex, it has not been shown to affect the reliability, validity, or sensitivity of the scale. The YMRS has demonstrated excellent psychometric properties, including a high inter-rater reliability for total scores (intraclass correlation = .93) and for individual item scores (intraclass correlation = .66 to .92), as well as high correlations with other mania rating scales (Young et al., 1978). Scores also statistically differentiate patients before and after two weeks of treatment. The YMRS has primarily been used to assess manic symptoms in treatment trials and was the primary measure of mania in the Systematic Treatment Enhancement Program for Bipolar Disorder study, the largest study to date on the effectiveness of treatments for bipolar disorder (Sachs et al., 2003).
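The item structure just described implies a total score between 0 and 60 (seven items at 0 to 4 plus four items at 0 to 8). A minimal sketch of that arithmetic follows; the item keys are hypothetical labels derived from the symptom list above, not the official item names, and the function is an illustration and no substitute for the published rating form.

```python
from typing import Mapping

# Hypothetical item keys (assumed for this sketch).
SINGLE_WEIGHT_ITEMS = ("elevated_mood", "motor_activity", "sexual_interest",
                       "sleep", "flight_of_ideas", "appearance", "insight")
DOUBLE_WEIGHT_ITEMS = ("irritability", "speech", "bizarre_content",
                       "disruptive_aggressive_behavior")

MAX_SINGLE = 4  # seven items rated 0 to 4
MAX_DOUBLE = 8  # four double-weighted items rated 0 to 8


def ymrs_total(ratings: Mapping[str, int]) -> int:
    """Validate the 11 item ratings and return their sum (0 to 60)."""
    expected = set(SINGLE_WEIGHT_ITEMS) | set(DOUBLE_WEIGHT_ITEMS)
    if set(ratings) != expected:
        raise ValueError("ratings must contain exactly the 11 expected items")

    for item, value in ratings.items():
        ceiling = MAX_DOUBLE if item in DOUBLE_WEIGHT_ITEMS else MAX_SINGLE
        if not 0 <= value <= ceiling:
            raise ValueError(f"{item} must be between 0 and {ceiling}")

    # The double weighting is built into the wider 0 to 8 rating range,
    # so the total is a plain sum of the item ratings.
    return sum(ratings.values())
```

Because the four double-weighted items can contribute up to 32 of the 60 possible points, they account for just over half of the maximum total.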
The MAS (Bech et al., 1979) is a clinician-rated instrument that is similar in format to the YMRS. The 11 items of the MAS are rated on a five-point scale (ranging from 0 “not present” to 4 “severe”) and cover classic manic symptoms such as elevated mood, irritability, sleep, increased activity, talkativeness, flight of ideas, self-esteem, noise level, and sexual interest. Like the YMRS, it has achieved excellent internal consistency and interrater reliability, as well as strong correlations with more exhaustive measures of manic symptoms (Bech, 1988; Bech, Bolwig, Kramp, & Rafaelsen, 1979; Licht & Jensen, 1997). It has been widely used in treatment and basic research (e.g., Bech, 2002; Johnson et al., 2008; Malkoff-Schwartz et al., 1998). Scores on the MAS reliably differentiate placebo and treatment groups, as well as detect changes in symptoms associated with treatment (Bech, 2002).
The SADS-C (Spitzer & Endicott, 1978) mania subscale is a five-item interview that assesses current severity of manic symptoms. Items are rated on a six-point scale that includes behavioral anchors. Good interrater reliability has been established in a range of settings with the exception of a sample of patients referred for emergency evaluation (intraclass correlation = .63 for mania; Rogers, Jackson, Salekin, & Neumann, 2003). Expected elevations on the scale have been seen in a bipolar sample compared to patients with other psychiatric disorders, as have robust correlations with another interview to assess manic severity, the MAS (r = .89; Johnson, Magaro, & Stern, 1986). Support for the scale in factor analytic studies has been mixed. One study found that all items loaded onto a single factor distinct from dysphoria, insomnia, and psychosis (Rogers et al., 2003). However, less factor analytic support was obtained in a study that examined the item loadings for the SADS-C and a nurse observation scale for mania (Swann et al., 2001).
Two self-report measures of symptom severity have strong psychometric support: the Altman Self-Rating Mania (ASRM) Scale and the Self-Report Manic Inventory (SRMI). We will also discuss other measures under development.
The ASRM scale (Altman, Hedeker, Peterson, & Davis, 1997) is a five-item scale that assesses mood, self-confidence, sleep disturbance, speech, and activity level over the past week. Items are scored on a 0 (absent) to 4 (present nearly all the time) scale, with total scores ranging from 0 to 20. Although the brevity can be an advantage, the scale covers fewer symptoms than other mania scales. Normative data for the ASRM have been gathered across major diagnostic groups (Altman et al., 1997; Altman, Hedeker, Peterson, & Davis, 2001).
The ASRM has demonstrated good psychometric properties. A cutoff score of 5.5 is recommended, as it has shown an optimal combination of sensitivity and specificity (85% and 86%, respectively). The ASRM also shows good sensitivity to treatment, with an average decrease of five points after discharge from the hospital (Altman et al., 2001). Finally, the ASRM demonstrated adequate internal consistency and concurrent validity when compared to SADS-based diagnoses, the YMRS (Young et al., 1978), and the Clinician-Administered Rating Scale for Mania (Altman et al., 1994, 1997, 2001). It should be noted that both of the published validation studies for the ASRM were conducted by the same research group. On the other hand, the scale has been shown to demonstrate expected correlations with psychological constructs related to mania, such as poor regulation of positive emotions (Feldman, Joormann, & Johnson, 2008).
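Scoring the ASRM as described above amounts to summing five ratings and comparing the total with the recommended cutoff. The sketch below illustrates this; the function names are hypothetical, and treating a total above 5.5 as an elevated score is an assumption about how the cutoff is applied.

```python
from typing import Sequence

ASRM_ITEMS = 5        # mood, self-confidence, sleep, speech, activity level
ASRM_ITEM_MAX = 4     # each item rated 0 (absent) to 4
ASRM_CUTOFF = 5.5     # recommended cutoff (Altman et al., 1997)


def asrm_total(item_ratings: Sequence[int]) -> int:
    """Validate five item ratings and return the total score (0 to 20)."""
    if len(item_ratings) != ASRM_ITEMS:
        raise ValueError("expected exactly five item ratings")
    for rating in item_ratings:
        if not 0 <= rating <= ASRM_ITEM_MAX:
            raise ValueError("each rating must be between 0 and 4")
    return sum(item_ratings)


def asrm_elevated(item_ratings: Sequence[int]) -> bool:
    """Assumption: totals above the cutoff (6 or more) count as elevated."""
    return asrm_total(item_ratings) > ASRM_CUTOFF
```

Since totals are whole numbers, a cutoff of 5.5 separates scores of 5 or less from scores of 6 or more. For instance, ratings of `[1, 1, 1, 1, 1]` total 5 and fall below the cutoff, whereas `[2, 1, 1, 1, 1]` total 6 and fall above it.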
The Self-Report Manic Inventory (SRMI; Braunig, Shugar, & Kruger, 1996; Shugar, Schertzer, Toner, & Di Gasbarro, 1992) is a 47-item true–false inventory that assesses increased energy, increased spending, increased sexual drive, increased verbosity, elation, irritability, racing thoughts and decreased concentration, grandiosity, and paranoid or psychotic experiences during the past week, and includes an item that addresses insight. Normative data have been reported in three small studies of inpatients, and these studies each provided estimates of good internal consistency (Altman et al., 2001; Braunig et al., 1996; Shugar et al., 1992). In two studies, the SRMI was found to have good discriminant validity, differentiating people with bipolar disorder from those with other psychopathology (Braunig et al., 1996; Shugar et al., 1992). However, another study found the SRMI to have low concurrent validity as compared to the ASRM (Altman et al., 2001). The scale appears sensitive to change in symptoms. It may not be well suited for inpatient assessment, however, because seven of the SRMI items describe behaviors that would not be possible within a hospital setting (Altman et al., 2001).
The Internal State Scale (ISS; Bauer et al., 1991) is a 17-item scale that discriminates mood state and tracks manic and depressive symptoms. There are four empirically derived subscales: Activation, Well-Being, Perceived Conflict, and Depression Index. The Activation subscale (five items) assesses racing thoughts and behavioral activation, specifically feeling restless, sped-up, overactive, and impulsive. These items appear to capture general arousal more than symptoms of mania. Still, the Activation subscale correlates well with other measures of mania (Bauer, Vojta, Kinosian, Altshuler, & Glick, 2000). The overall scale has demonstrated correlations with other measures of mania ranging from .21 to .60 and rates of correct classification ranging from .55 to .78 (Altman et al., 2001; Bauer et al., 1991; Bauer et al., 2000; Cooke, Krüger, & Shugar, 1996). The measure is sensitive to symptom decreases during treatment (Altman et al., 2001; Bauer et al., 1991; Cooke et al., 1996). Despite these strengths, the ISS has low sensitivity to manic symptoms at the time of hospitalization (Altman et al., 2001). In addition, scoring algorithms vary substantially across studies, as do means and standard deviations of score distributions (Altman et al., 2001; Bauer et al., 1991; Cooke et al., 1996). Thus, the ISS is not currently recommended.
Continuous monitoring of symptoms and functioning is pivotal for people suffering from chronic, recurrent conditions like bipolar disorder (e.g., Horn et al., 2002; Schärer, Hartweg, Hoern, et al., 2002). Such frequent monitoring, however, can be expensive both economically and in terms of clinicians’ time. In addition, there is increasing consensus regarding the benefits of a collaborative care model for bipolar disorder, in which patients play an active role in managing their illness (Bauer et al., 2006a, 2006b; Sajatovic et al., 2005). Enlisting patients’ input can have numerous benefits, including reduced costs, higher patient investment in treatment, and higher validity than clinician observations alone. These benefits may be especially relevant for longitudinal data with high variability, such as may be seen with rapid-cycling patients (Lam & Wong, 2005). In addition to the tracking of bipolar symptoms such as sleep disturbance and mood, self-monitoring may also provide broader information regarding important issues such as medication adherence and psychosocial functioning. These facts have led to a growing literature supporting the use of self-monitoring tools for bipolar disorder. For instance, the NIMH prospective Life-Chart Method (NIMH-LCM-p) can provide detailed information regarding rapid fluctuations in mood (Denicoff et al., 2000, 2002).
Frequent monitoring of bipolar symptoms can produce so much data that entering and organizing it into a useful format can be very time-consuming. In response, some research has focused on the use of palmtop computers and other electronic formats for self-monitoring. Examples include a palmtop version of the NIMH-LCM (Schärer, Hartweg, Valerius, et al., 2002) as well as ChronoRecord software, the latter of which has shown significant correlations with the YMRS (Bauer et al., 2008).
Most of the research in support of self-monitoring in bipolar disorder should be considered preliminary but promising. In addition to the methods described above, many clients find it helpful to create their own self-monitoring forms or to complete brief checklists to track their progress over time. Many consumer-oriented websites, such as that maintained by the Depression and Bipolar Support Alliance, provide such forms. To increase awareness of symptoms, these self-monitoring forms can be compared to clinician-rated interviews. This is an important area for future study, and we hope that self-monitoring methods continue to be refined and validated for bipolar disorder.
At least two interview measures (the YMRS and MAS), as well as some self-report measures (e.g., the ASRM and SRMI), have received psychometric support. Self-report measures can be completed quickly, but brevity and ease of use may also result in reduced precision. Self-monitoring may also help to increase awareness of symptoms and to track progress over time, but further research is required in this domain.
This article has summarized assessment tools for screening, diagnosis, and symptom monitoring within bipolar disorder. We would note that there are many important aspects of assessment in bipolar disorder that we have not addressed. Although the symptom severity and diagnostic scales covered above predominantly address manic symptoms, we urge readers to evaluate a broader range of outcomes, including depression, quality of life, and social functioning. People with bipolar disorder experience at least some depressive symptoms during at least one-third of the weeks in a year (Judd et al., 2002; Keck & McElroy, 2003), and these subsyndromal depressive symptoms can be associated with substantial impairment across a variety of domains (Altshuler et al., 2006). High risk for suicide has been documented during depression within bipolar disorder (Angst et al., 2005); thus it will also be important to assess for depressive symptoms and suicidality. To date, there is strong evidence that bipolar and unipolar depressive symptoms are relatively similar (Johnson & Kizer, 2002), so applying the well-validated measures of depression from the unipolar literature is a reasonable strategy. Patients report that improvement in quality of life is a more important treatment goal to them than are specific symptoms, highlighting the importance of this oft-ignored domain (Michalak, Yatham, Kolesar, & Lam, 2006). Whereas measures of these constructs have been developed for other disorders such as depression and schizophrenia, this is a realm that remains largely untapped for bipolar disorder, with at least one exception (e.g., Michalak et al., 2006). In addition, there is some debate regarding the ultimate treatment goals for bipolar disorder. Given the high base rates of subsyndromal symptoms, complete recovery may be an unrealistic goal, or require levels of medication that would lead to intolerable side effects (Sachs & Rush, 2003).
Proper care must take individual needs into account, but to date little research has directly addressed this issue. Overall, we strongly recommend that researchers and clinicians attend to issues that extend far beyond the level of mania. For those who seek a more detailed review of assessment measures for bipolar disorder or psychiatric conditions more generally, we recommend comprehensive books such as the Handbook of Psychiatric Measures (Rush, First, & Blacker, 2008).
Returning to the focus of this article, though, the good news is that well-validated tools exist for the assessment of mania in adults. Reliable and valid measures are available for the diagnosis of bipolar I disorder, and indeed, the psychometric characteristics of these tools are as good as those seen for most Axis I disorders. Similarly, scales are available to measure symptoms using both interviewer and client perspectives.
On the other hand, much work remains to be done in this domain. A first goal is the refinement of diagnostic measures for bipolar II disorder and other milder forms of bipolar disorder. Ideally, research and dialogue in the near future will help to establish accepted standards for defining hypomanic episodes. A second major goal is the refinement of screening tools. With the possible exception of the GBI, no self-report measure has consistently achieved acceptable levels of sensitivity and specificity within community samples, and conclusions regarding the GBI are limited by the existence of several different versions and cutoffs. One might expect that the most pressing need would be for screening tools that are viable for community or outpatient screening, as by the time a person is hospitalized, symptoms may be so extreme as to be easily diagnosed. A third major goal is more systematic research on how to integrate clinician and self-report ratings of symptom severity, especially in the face of potentially impaired insight for those with bipolar disorder (Ghaemi, Boiman, & Goodwin, 2000). Intriguingly, although researchers have now begun to examine the relative weight to give ratings from different informants in understanding juvenile bipolar disorder (Findling et al., 2002), such research has not been conducted in adult bipolar disorder. Rather, researchers focused on adult bipolar disorder have often failed to take into account patient perspectives on severity. We are hopeful that future research will continue to refine this field, and that this review has illuminated research challenges to be tackled.
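One way to see why community screening places such demands on sensitivity and specificity is to compute the positive predictive value (the proportion of positive screens that are true cases) at different base rates. The sketch below applies Bayes' rule. The sensitivity, specificity, and base rates are hypothetical values chosen for illustration only; they are not estimates for any measure or setting reviewed here.

```python
def positive_predictive_value(sensitivity: float, specificity: float, base_rate: float) -> float:
    """Proportion of positive screens that are true cases, via Bayes' rule."""
    for value in (sensitivity, specificity, base_rate):
        if not 0.0 <= value <= 1.0:
            raise ValueError("all inputs must be proportions between 0 and 1")
    true_pos = sensitivity * base_rate                      # cases correctly flagged
    false_pos = (1.0 - specificity) * (1.0 - base_rate)     # non-cases wrongly flagged
    if true_pos + false_pos == 0.0:
        raise ValueError("no positive screens expected with these inputs")
    return true_pos / (true_pos + false_pos)


# Hypothetical base rates for three kinds of setting (assumptions, not data).
settings = [("community", 0.02), ("outpatient", 0.10), ("inpatient", 0.50)]

ppv_by_setting = {
    label: positive_predictive_value(sensitivity=0.80, specificity=0.80, base_rate=rate)
    for label, rate in settings
}

for label, ppv in ppv_by_setting.items():
    print(f"{label}: PPV = {ppv:.2f}")
```

Under these assumed values, the same screening tool yields mostly false positives when the disorder is rare in the screened population and far fewer when it is common. This is consistent with the view that community and outpatient settings pose the harder screening problem.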