Apathy is highly prevalent among neuropsychiatric populations and is associated with greater morbidity and worse functional outcomes. Despite this, it remains understudied and poorly understood, primarily due to lack of consensus definition and clear diagnostic criteria for apathy. Without a gold standard for defining and measuring apathy, the availability of empirically sound measures is imperative. This paper provides a psychometric review of the most commonly used apathy measures and provides recommendations for use and further research.
Pertinent literature databases were searched to identify all available assessment tools for apathy in adults aged 18 and older. Evidence of the reliability and validity of the scales was examined. Alternate variations of scales (e.g., non-English versions) were also evaluated if the validating articles were written in English.
Fifteen apathy scales or subscales were examined. The most psychometrically robust measures for assessing apathy across any disease population appear to be the Apathy Evaluation Scale and the apathy subscale of the Neuropsychiatric Inventory based on the criteria set in this review. For assessment in specific populations, the Dementia Apathy Interview and Rating for patients with Alzheimer’s dementia, the Positive and Negative Symptom Scale for schizophrenia populations, and the Frontal System Behavior Scale for patients with fronto-temporal deficits are reliable and valid measures.
Clinicians and researchers have numerous apathy scales for use in broad and disease-specific neuropsychiatric populations. Our understanding of apathy would be advanced by research that helps build a consensus as to the definition and diagnosis of apathy, and further refines the psychometric properties of all apathy assessment tools.
A comprehensive review of the literature on apathy found that this neurobehavioral syndrome is common across a large variety of neurological, psychiatric, and medical conditions (1). For example, the average point prevalence of apathy in outpatients with Alzheimer’s disease (AD) was roughly 60%, which was similar for adults who had suffered a traumatic brain injury (TBI; 61%) or a focal frontal lesion (60%). Patients with dementia with Lewy bodies (LBD), fronto-temporal dementia (FTD), basal ganglia lesions (e.g., Parkinson’s disease [PD]), and progressive supranuclear palsy (PSP) also develop apathy at high rates, but lower than those seen in disorders involving cortical dysfunction, such as AD and TBI (2–8). Apathy is also common in patients with thalamic lesions (9), vascular dementia (10–12), cerebrovascular accidents (CVA) (13;14), anoxic brain injury (13;14), multiple sclerosis (MS) (15;16), myotonic dystrophy (17;18), human immunodeficiency virus (19), major depressive disorder (18;20), and mild cognitive impairment (MCI) (21;22), especially when accompanied by extrapyramidal signs (1;8;16;23;24). The highest point prevalence rate found (84.1%) relates to nursing home residence (1;23).
Apathy is associated with decreased functioning (10;18;25;25–30), caregiver distress (31), cognitive decline (5;27;32;33), poor illness outcomes and response to treatments (18;25;26), and chronic apathy (10;18;21;25;26;31;34;35). There is limited data on the effectiveness of treatment for apathy. However, preliminary evidence showed partial benefits with amphetamines (36;37), dopaminergic agents (38), cholinesterase inhibitors (39;40), atypical antipsychotic medications, particularly in patients with schizophrenia, whose negative symptoms strongly resemble apathy (41;42), and recreation (43) and music therapies (44). Hence, apathy is clinically relevant and potentially amenable to treatment. Accurate assessment is important to improve our understanding and management of apathy and associated conditions. To accomplish this, clear definition and consensus diagnostic criteria for apathy are needed.
The concept of apathy is found in various sections of the fourth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV-TR) and in disorders of many types, as shown in Table 1, albeit under various terms such as “lack or loss of interest”, “avolition”, “affective flattening”, “social withdrawal”, and “indifference”. This is likely due to the lack of definition of apathy in the DSM. In fact, apathy appears as a symptom rather than a separate diagnostic entity in the DSM. Clinicians and researchers have indicated that although apathy can be experienced as a symptom secondary to mood disorder, altered level of consciousness or cognitive impairment, it is also a distinguishable syndrome (45–48), as it is common, is not always present with other neurological/psychiatric or medical conditions, and is associated with a number of adverse outcomes which require separate treatment consideration. Apathy has also been found in some elderly and adolescent persons who are otherwise normal individuals (45;47;49;50). For example, apathy has been reported in institutionalized elderly persons, prisoners, and immigrants who are prone to adopt a ‘why try’ response to environments that are lacking in reward/opportunities for control (45). In some cases, such individuals do not become depressed but simply inactive (i.e., absence of goal-directedness/interest, etc.) because their attributions are global ('nothing I do is going to make a difference') and also external ('it's not because of me, it's because of this place/them'). Furthermore, recognition of apathy as a maladaptive consequence of psychiatric, medical, and neurological disorders lowers the risk of failing to recognize the disorders, failing to treat the disorders, or losing an opportunity to understand the psychological and neurological mechanisms that mediate this loss of goal-directedness/interest or motivation. However, consensus on its definition in the field is still lacking.
Marin conceptualized apathy as a loss of motivation not attributable to emotional distress, cognitive impairment, or diminished level of consciousness (45–47;51), while Stuss and colleagues (48) conceptualized apathy as an absence of responsiveness to stimuli (external or internal) as characterized by a lack of self-initiated action. Although consensus on the definition of apathy is still lacking, there is some agreement that loss of motivation and lack of initiative and self-generated actions are important features of the neurobehavioral syndrome (52). There is also agreement that the syndrome of apathy is reflected by acquired changes in affect (mood), behavior, and cognition and requires its own diagnosis and treatment (45;48). Stuss et al (48) also proposed that apathy may not be a single syndrome, but rather may be separable states depending on which neural system is involved and which affective, behavioral, or cognitive response is affected. However, this hypothesis requires further work.
Arriving at diagnostic criteria for apathy has been the goal of a number of research groups (10;47;52). These efforts stem from the seminal paper, “Differential diagnosis and classification of apathy”, in which Marin (47) proposed that the diagnosis of apathy be based on its distinction from “the overt behavioral, cognitive, and emotional concomitants of goal-directed behavior”. Marin’s (47) criteria for apathy were later operationalized by Starkstein and colleagues (10) in which an individual was considered to have apathy if he/she experienced (self-report or observation by others) a) a lack of motivation relative to his/her previous level of functioning or the standards for his/her age and culture, b) accompanied by at least one symptom in each of three domains described as i) diminished goal-directed behaviour, ii) diminished goal-directed cognition and iii) diminished concomitants of goal-directed behaviours with resulting c) clinically significant distress and/or impairment in social, occupational, or other important areas of functioning. Importantly, the symptoms should not be a result of d) diminished level of consciousness or the direct physiological effects of a substance. These diagnostic criteria for apathy have been further revised by Robert and colleagues (52) with changes specifically to the B criterion. The revised criteria specified that along with the lack or loss of motivation from previous level of functioning or what is expected given the individual’s age and culture, only changes in two of the three domains are required but that these changes should be present for a period of at least four weeks and must be present most of the time with resulting c) clinically significant distress and/or impairment in social, occupational, or other important areas of functioning and the symptoms should not be a result of d) diminished level of consciousness or the direct physiological effects of a substance (52). 
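The difference between the two sets of criteria described above can be made concrete with a small sketch. It is only an illustration of the decision logic as summarized in this review, not a clinical instrument; the class name, field names, and function names are hypothetical, and each domain flag is assumed to mean "at least one symptom present in that domain".

```python
from dataclasses import dataclass


@dataclass
class ApathyAssessment:
    """Hypothetical container for the information the criteria refer to."""
    lacks_motivation: bool          # criterion A
    behaviour_symptom: bool         # B-i: diminished goal-directed behaviour
    cognition_symptom: bool         # B-ii: diminished goal-directed cognition
    concomitants_symptom: bool      # B-iii: diminished concomitants of goal-directed behaviour
    distress_or_impairment: bool    # criterion C
    due_to_consciousness_or_substance: bool  # criterion D (exclusion)
    weeks_present: float = 0.0      # used only by the revised criteria
    present_most_of_time: bool = False


def _domains_affected(a: ApathyAssessment) -> int:
    return sum([a.behaviour_symptom, a.cognition_symptom, a.concomitants_symptom])


def meets_starkstein_criteria(a: ApathyAssessment) -> bool:
    # Operationalization attributed to Starkstein and colleagues: all three domains required.
    return (a.lacks_motivation
            and _domains_affected(a) == 3
            and a.distress_or_impairment
            and not a.due_to_consciousness_or_substance)


def meets_revised_criteria(a: ApathyAssessment) -> bool:
    # Revision attributed to Robert and colleagues: two of three domains,
    # present most of the time for at least four weeks.
    return (a.lacks_motivation
            and _domains_affected(a) >= 2
            and a.weeks_present >= 4
            and a.present_most_of_time
            and a.distress_or_impairment
            and not a.due_to_consciousness_or_substance)
```

Under these assumptions, a patient with symptoms in only two domains that have persisted for six weeks would satisfy the revised criteria but not the earlier operationalization.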
However, given the clinical significance of apathy induced by illicit drug abuse and commonly used medications, this criterion may deserve reconsideration. Robert and colleagues (52) stated that “confusion remains regarding the essential features of apathy,” and that, inarguably, the need for consistent definitions is high. Nonetheless, their derived criteria specify as core features lack of motivation, reduced self-initiation, and emotional blunting, all of which are consistent with the approaches of Marin (47) and Starkstein and colleagues (10). Consensus on these revised diagnostic criteria has still not been reached (personal communication with Marin and van Reekum, who are part of a working group of apathy researchers currently putting together diagnostic criteria for submission to the DSM-V Taskforce for consideration in DSM-V).
Most of the clinical research to date has classified apathy on the basis of scores on various assessment tools. The development of these measures has been greatly facilitated and promoted by the existing conceptualization of apathy as a syndrome with loss or lack of motivation and self-initiated action as its prominent features and the diagnostic criteria proposed by Marin (46;47) with further modifications or refinement by other research groups (10;18;48;52;53). With this in mind, the current paper reviews the reliability and validity of the available apathy measures, identifies measures that show good reliability and validity, and discusses future research needs in this field.
Pubmed, PsycInfo, Medline and Embase, and Cinahl literature databases were searched to identify all available assessment tools for apathy in adults aged 18 and older. Reviewed measures were limited to full apathy scales or scales in which an apathy subscale was available. The search terms apathy, lack of interest, amotivational, emotional indifference, lack of motivation, lack of initiative, negative symptoms, asociality, anhedonia, measures, assessment tools, validity, reliability, psychometrics, and measurement properties were used in various combinations to identify relevant articles. Articles were limited to those published in English from 1980–2008. The criteria for good reliability and validity were based on those used by Lloyd and colleagues (54). More specifically, the reliability of the instruments was assessed by means of their internal consistency, test-retest reliability and inter-rater reliability, and validity was assessed by means of the scales’ convergent, divergent and predictive validities. Given the lack of consensus definition and diagnostic criteria for apathy, criterion validity was not set as a main criterion for good validity. However, where information on the criterion validity of a scale was available, it added to the determination of the scale’s psychometric robustness if sensitivity and specificity were greater than or equal to 80%. Per Lloyd and colleagues (54), adequate internal consistency was set at a Cronbach’s alpha of greater than or equal to 0.80 for the scale total; if the apathy measure was a single item or the subscale of a larger scale, the Cronbach’s alpha coefficient for the item, range of items or subscale had to be at least 0.80. The test-retest and inter-rater reliabilities of the scales had to be greater than or equal to 0.70, which demonstrates moderate reliability (55).
The criterion for convergent validity was set at 0.50 or higher (54), but the criterion for divergent validity was set at less than 0.50, which was slightly different from Lloyd and colleagues (54).
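The thresholds stated above can be collected into a simple lookup, which may help when checking a scale's reported statistics against the review's criteria. This is a minimal sketch: the dictionary keys and the function name are hypothetical, and taking the absolute value of the divergent correlation is an assumption, since the review does not say how negative correlations were treated.

```python
# Thresholds as stated in the review's methods; key names are hypothetical.
CRITERIA = {
    "internal_consistency": lambda v: v >= 0.80,  # Cronbach's alpha
    "test_retest": lambda v: v >= 0.70,
    "inter_rater": lambda v: v >= 0.70,
    "convergent": lambda v: v >= 0.50,
    "divergent": lambda v: abs(v) < 0.50,         # abs() is an assumption
    "sensitivity": lambda v: v >= 0.80,           # criterion validity, optional
    "specificity": lambda v: v >= 0.80,
}


def check_scale(stats):
    """Return, for each reported statistic, whether it meets the stated criterion.

    Only statistics that were actually reported should be passed in;
    unreported properties are simply left out of the result.
    """
    return {name: CRITERIA[name](value) for name, value in stats.items()}


# Values reported in this review for the AS-10.
print(check_scale({"internal_consistency": 0.92, "convergent": 0.62, "divergent": 0.09}))
```

Returning one flag per property, rather than a single pass/fail verdict, mirrors the way the review discusses each scale: most scales meet some criteria and lack data on others.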
The numbers of unique articles retrieved with the search terms from each literature database were: Pubmed (212 articles), PsycInfo (3 articles), Embase and Medline (29 articles), and Cinahl (0 articles). From these unique articles, 15 scales that were developed to ascertain and quantify apathy were identified (see Table 2). Of these, seven were full apathy scales and eight were apathy subscales embedded in larger scales. The Apathy Evaluation Scale (AES) and the Neuropsychiatric Inventory (NPI) are the most widely used scales to assess apathy. The Frontal Lobe Personality Scale (FLOPS), now known as the Frontal Systems Behavior Scale (FrSBe), has been moderately cited in the literature since its development. A number of scales had high-to-moderate use within the specific populations for which they were validated, i.e., the Scale for the Assessment of Negative Symptoms (SANS), Brief Psychiatric Rating Scale (BPRS), Positive and Negative Symptom Scale (PANSS), Unified Parkinson’s Disease Rating Scale (UPDRS), Lille Apathy Rating Scale (LARS), and the Dementia Apathy Interview and Rating (DAIR). Fewer studies incorporated the use of the Irritability-Apathy Scale (IAS), Key Behavior Change Inventory (KBCI), and the Apathy Inventory (AI) beyond the scales’ original developers. The above-mentioned scales are available in English, per the “English only” inclusion criteria for this review. Where the scales are available in other languages, these are indicated in Table 2; however, citations are not provided for these translated articles as they did not meet the current study’s inclusion criteria (i.e., written in English). The scales are copyrighted and most are available from their developers (see footnotes in Table 2).
The AES was developed to ascertain and quantify apathy within the month prior to the time of the assessment (46;47). The definition of apathy as “a syndrome of loss of motivation as reflected by acquired changes in affect (mood), behavior and cognition,” as formulated from the literature and extensive clinical experience (46;47), was used to guide the development of the scale. The scale comprises 18 core items that assess and quantify the affective, behavioral, and cognitive domains of apathy but with phrasing varying by rater (i.e., self [AES-S], informant [AES-I], or clinician [AES-C]). In addition, the AES-C incorporates a semi-structured open-ended interview with some guidelines and probes, in which the clinician elicits information from the patient about his/her typical day from the time he/she wakes up until he/she goes to bed, and about his/her usual activities or hobbies or interests including things that he/she would like to do but is physically unable to do. This information is intended to aid the clinician in providing his/her own rating of the individual’s level of apathy on each item, which is rated on a 4-point response scale (0=not at all true/characteristic to 3=very much true/characteristic). Higher scores indicate more severe apathy (46). The AES-C is said to take between 10–20 minutes to complete by a trained interviewer (46;47), but particularly if the interviewer has a minimum of a masters-level training and at least one year of experience with this clinical population or bachelor-level training and three or more years of clinical experience (56;57). The AES items have been shown to load onto three factors, with most items loading onto the apathy factor and accounting for the majority of the variance (46;57;58).
The internal consistency of the AES ranges from 0.86–0.94 for the self-rated to the clinician versions with replication in individuals with right or left CVA, AD and major depressive disorder (46;47;51); community-dwelling older adults with memory complaints (57) and first episode psychosis (58). These findings indicate good reliability for the scale. The AES-I and AES-C have been shown to have good test-retest reliabilities with assessments 25.4 days apart (46;47), but replication is needed. Good inter-rater reliabilities have been demonstrated in two studies for the AES-C (46;58).
The high rate of use of the AES-I and AES-C in various populations (e.g., normal controls, outpatients with cognitive complaints, PD, and schizophrenia) lends support to their face and construct validity. Statistically significant and fair to high correlations between the AES-I and AES-C and other measures of apathy (e.g., the NPI-apathy subscale (57), the Lille Apathy Rating Scale [LARS] (59;60), and the PANSS-negative symptom (NS) subscale, particularly the emotional withdrawal item (58)) indicated fair to good convergent validity. Understandably, the AES-I was also highly correlated with scores on the 7-item Apathy Scale (AS-7) (61), which was validated in geriatric rehabilitation inpatients, and the 10-item Apathy Scale (AS-10) (62), which was validated in geriatric nursing home residents, given seven or more common items. Good discriminant validity was also demonstrated for the AES-I and AES-C, as evident from low correlations with depression and anxiety measures (46;47;57) and the positive factor of the PANSS (58). The convergent and discriminant validities of the AES-S were not as favorable (57).
Starkstein et al’s 14-item Apathy Scale (AS-14) (53) is based on a preliminary version of the AES and therefore has 6 items in common. Resnick et al’s AS-7 (61) and Leuken et al’s AS-10 (62) represent truncated versions of the AES that result from removal of some items that were determined to be redundant (53;61) or inappropriate for the group being assessed (62). Judgment of redundancy was based on factor analyses (53;61), whereas inappropriateness was judged by experts with special knowledge of the population being assessed (61;62). Although not explicitly stated, the intent of these abridged versions of the AES was to better capture apathy in the respective populations with reliable and valid measures while possibly requiring less time. These abridged apathy scales are based on informant report of the patient’s behavior. As with the original scale, each item is rated on a 4-point scale. The total scores range from 0–21 for the AS-7 (61), 0–30 for the AS-10 (62), and 0–42 for the AS-14 (53), with higher scores indicating more severe apathy.
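The total-score ranges quoted above follow directly from the number of items, given that each item is rated from 0 to 3. The short sketch below shows the arithmetic; the function names are hypothetical and simple summation of item ratings is assumed.

```python
def score_range(n_items, max_per_item=3):
    """Possible total-score range for a scale whose items are each rated 0..max_per_item."""
    return (0, n_items * max_per_item)


def total_score(ratings, max_per_item=3):
    """Sum item ratings after checking that each lies on the 0..max_per_item scale."""
    if any(r not in range(max_per_item + 1) for r in ratings):
        raise ValueError("each rating must be an integer from 0 to %d" % max_per_item)
    return sum(ratings)


for name, n_items in (("AS-7", 7), ("AS-10", 10), ("AS-14", 14)):
    print(name, score_range(n_items))
```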
The alpha coefficients for the AS-7 (α=0.67) and AS-14 (α=0.76) were below the criterion set for adequate reliability. On the other hand, very good internal consistency was demonstrated for the AS-10 (α=0.92). However, good one-week test-retest (r=0.90) and inter-rater reliabilities (r=0.81) have been demonstrated for the AS-14 as validated in patients with PD. Studies that attempt to replicate these findings are lacking but warranted. Test-retest and inter-rater reliabilities were not assessed for the AS-7 (61) and the AS-10 (62).
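For readers less familiar with the statistic, Cronbach's alpha can be computed from item-level ratings as k/(k−1) × (1 − Σ item variances / variance of total scores), where k is the number of items. The sketch below uses the standard formula with sample variances; the function name and the example ratings are hypothetical and are not data from any of the reviewed studies.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item ratings."""
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2 or any(len(row) != k for row in rows):
        raise ValueError("need at least two items and equal-length rows")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: five respondents, four items rated 0-3.
ratings = [
    [0, 1, 0, 1],
    [1, 1, 2, 1],
    [2, 2, 2, 3],
    [3, 2, 3, 3],
    [1, 0, 1, 1],
]
print(round(cronbach_alpha(ratings), 2))
```

Because alpha tends to rise with the number of items, a short scale can fall below the 0.80 criterion even when its items are reasonably coherent.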
A cut-point of 14 on the AS-14, using an independent neurologist’s blind classification of apathy status as the “gold standard”, had 66% sensitivity and 100% specificity in categorizing individuals as apathetic versus not apathetic (53). The convergent and discriminant validity of the AS-14 was not assessed. The AS-10, intended for assessment of apathy in nursing home residents, was strongly correlated with the original 18-item AES (r=0.91) and the Neuropsychiatric Inventory (NPI) apathy subscale (r=0.62), which was above the criterion set for moderate convergent validity. The scale also demonstrated good discriminant validity, as indicated by a poor correlation with the scores on the depression subscale of the NPI (r=0.09).
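Sensitivity and specificity at a given cut-point are obtained by comparing the scale-based classification with the reference classification. The sketch below illustrates the calculation with made-up scores; the function name and data are hypothetical, and treating a score at or above the cut-point as "apathetic" is an assumption, since the direction of the inequality is not stated here.

```python
def sensitivity_specificity(scores, reference, cut_point):
    """Return (sensitivity, specificity) for classifying score >= cut_point as apathetic.

    scores: scale totals; reference: True where the gold standard says apathetic.
    """
    tp = fn = tn = fp = 0
    for score, apathetic in zip(scores, reference):
        predicted = score >= cut_point  # assumption: inclusive cut-point
        if apathetic and predicted:
            tp += 1
        elif apathetic:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one apathetic and one non-apathetic case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data, not taken from any reviewed study.
scores = [10, 15, 20, 12, 14, 8]
reference = [False, True, True, True, True, False]
print(sensitivity_specificity(scores, reference, cut_point=14))
```

In this made-up example one apathetic case scores below the cut-point, so sensitivity is reduced while specificity stays perfect, the same pattern as the AS-14 result described above.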
The AS-7 was strongly correlated (r=0.87) with the AES-C (61), as would be expected given the common items in the measures. Since higher scores on the AS-7 indicate increasing levels of apathy, the level of correlation between the AS-7 and the measure of willingness to participate in rehabilitation efforts (r=0.37–0.42) and depression (r=0.45) indicated, at most, an acceptable level of discriminant validity. This level of correlation between the AS-7 and depression is likely due to the inclusion of the loss of interest item in the measure of depression, and hence to confounding. The AS-7 was found to be significantly predictive of functional performance at discharge (beta estimate = −0.22) in older adults in rehabilitation inpatient care (61), but has not been tested in other studies.
Robert and colleagues (33) developed the 3-item AI to identify and quantify apathy and validated it in individuals with MCI. The items included in the AI (i.e., emotional blunting, lack of interest, and lack of initiative) were based on the conceptualization of apathy as a lack of emotional response, self-initiated actions, and interest in things (46–48). The AI involves a clinician-administered interview with either the caregiver (AI-caregiver) or the patient (AI-patient). In the AI-caregiver interview, the presence or absence of the three items is initially rated. If present, frequency (4-point scale) and severity (3-point scale) are then assessed. The AI-caregiver score ranges from 0–36, with higher scores indicating greater apathy. In the AI-patient interview, patients are asked about the presence or absence of the three items of the AI. If an AI item is rated as present, the patient is asked to estimate its intensity on a 12-point Likert-type scale “from mild at the left-hand end of the scale, to severe at the right-hand end,” thereby giving a score range of 0–36, with higher scores indicating more severe apathy. The length of time required to complete the AI-caregiver and AI-patient was not indicated. Validation of the scales was carried out in a mixed sample consisting of normal controls, MCI, and AD patients.
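The 0–36 range of the AI-caregiver is consistent with each of the three items contributing up to 4 × 3 = 12 points. The sketch below assumes that an item's score is the product of its frequency and severity ratings, an assumption made here for illustration because the scoring rule is not spelled out in this description; the item keys and function name are likewise hypothetical.

```python
AI_ITEMS = ("emotional_blunting", "lack_of_interest", "lack_of_initiative")


def ai_caregiver_score(ratings):
    """Total AI-caregiver score under the frequency x severity assumption.

    ratings maps an item name to a (frequency, severity) tuple when the item
    is present; items that are absent are omitted and contribute 0.
    """
    total = 0
    for item in AI_ITEMS:
        rating = ratings.get(item)
        if rating is None:
            continue  # item rated as absent
        frequency, severity = rating
        if not 1 <= frequency <= 4 or not 1 <= severity <= 3:
            raise ValueError("frequency must be 1-4 and severity 1-3")
        total += frequency * severity  # assumed scoring rule
    return total


print(ai_caregiver_score({"lack_of_interest": (2, 3), "lack_of_initiative": (4, 1)}))
```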
Good reliabilities have been demonstrated for the caregiver but not the patient version of the AI using a sample of normal, MCI, AD, and PD patients and their caregivers. The AI-caregiver had an internal consistency of 0.84, but this statistic was not reported for the AI-patient. Very good inter-rater reliability (r=0.99) was observed for the AI-caregiver, using independent clinicians to rate videotaped caregiver interviews. Test-retest reliability exceeding the criterion set for moderate reliability (r=0.97, 0.99, 0.99 for lack of initiative, lack of interest, and emotional blunting, respectively) was reported for the caregiver version of the AI. The two assessments were completed by different interviewers with the same caregivers on the same day. Given the short time interval, the high probability that caregivers still remembered their earlier rating of the patients’ apathy level in the second interview might explain the high test-retest reliability coefficients observed. Independent assessment of the test-retest reliability of the AI using a different time frame for the second interview would be useful.
More favorable validity information was observed for the AI-caregiver than the AI-patient. Convergent validity, as indicated by correlation with the NPI-apathy subscale, was moderate for the AI-caregiver lack of interest item (r=0.66) but only fair for its lack of initiative item (r=0.23). However, convergent validity of the AI-patient was poor. The AI-caregiver score discriminated controls from patients with AD. Specifically, caregivers rated patients with AD at significantly higher scores on lack of initiative and global apathy compared to controls. On the AI-patient, the PD group had higher lack of initiative and global apathy scores compared to controls. In both cases, the lack of interest item did not show good discriminant properties.
The DAIR is a 16-item clinician administered scale that was developed and validated in females with possible and probable AD (63). Items were generated based on clinical observation and interview with AD patients and their caregivers, as well as evidence from the existing literature on apathy. The scale is administered to a knowledgeable caregiver using a structured interview with inquiry about apathy in the patient over the past month. Each item is scored on a 4-point scale (0=no or almost never to 3=yes or almost always) with higher scores representing greater severity of apathy. Only items representing a change in behavior are included in the final apathy score. The average time to complete the scale is not indicated.
The DAIR has very good internal consistency whether it is conducted as an in-person or telephone interview (α =0.89–0.94) (63). The two-month test-retest reliability (r=0.85) and the inter-rater reliability (r=1.00) were also shown to be very good (63).
Statistically significant correlations (r=0.31–0.46) were observed between the DAIR total score and apathy as rated on a 12-point Likert-like scale completed by an independent clinician who was blind to the participants’ DAIR score, thus indicating only fair convergent validity (63). The researchers also found poor correlation between DAIR and the caregivers’ report of depression as rated by the depression subscale of the Consortium to Establish a Registry for Alzheimer’s Disease Behavior Rating Scale for Dementia (r=0.08) (63) showing good discriminant validity.
The LARS represents a 33-item apathy scale with nine domains that is clinician-administered using a structured interview (59;60). The scale was developed to ascertain and quantify apathy in the month prior to the assessment and validated in individuals with PD. Items and domains were generated from the apathy literature, Marin and colleagues’ (46;47) and Stuss et al’s (48) conceptualization of apathy as well as clinical experience with patients with apathy. The first three items of the LARS are scored on a 5-point Likert-like scale (0–4) and the remaining 30 items are scored on a no-versus-yes basis. Scores on the scale range from −36 to +36 with higher and more positive score indicating greater severity of apathy. Guidelines for the scoring method were not found in the available literature.
Internal consistency of the LARS was found to be good (α=0.80). The 4-month test-retest reliability was very good (r=0.95), as was the inter-rater reliability between two clinicians (ICC=0.98) (59). Similarly, good internal consistency, test-retest, and inter-rater reliabilities were observed by Dujardin et al (60) for the informant and clinician versions of the LARS in PD patients and controls.
The validity of the LARS for assessing the presence and severity of apathy has been demonstrated in patients with PD (59;60). Cut-off scores of −15 to −17 showed good sensitivities (0.87–0.94) and specificities (0.87–0.94) (59;60). In the original study, the correlation between the LARS and the AES total scores was 0.87, indicating very good convergent validity (59). Similarly, good discriminant validity was demonstrated by a lack of significant interaction between apathy as measured by the LARS and depression as determined by the MADRS (59). These findings were replicated by Dujardin and colleagues (60).
The BPRS is an international and widely used 18-item scale developed by Overall and Gorham (64) to assess psychosis in patients with schizophrenia. Fourteen of the 18 items were derived from factor and cluster analytic studies involving Lorr’s Multidimensional Scale for Rating Psychiatric Patients and the Inpatient Multidimensional Psychiatric Scale (65). Each item is rated on a 0-to-4-point scale, with higher scores indicating greater severity of problems (65;66). Consistently, the emotional withdrawal, motor retardation, and blunted affect items have been found to comprise the BPRS-NS subscale, with variations in some of the other items.
The reported internal consistency for the BPRS-NS subscale (or withdrawn-depression factor) ranges from substantial to very high (65;67;68). Good inter-rater reliability coefficients were also demonstrated for the sum score (r=0.87–0.90) of the BPRS-NS subscale across studies (65;66;69). The inter-rater reliability for the same items, as observed by Andersen and colleagues (66), was fair to substantial (r=0.36–0.68). The average test-retest reliabilities for the items of the BPRS-NS subscale across the studies reviewed by Hedlund and Vieweg (65) were good (r=0.80–0.87).
The BPRS-NS subscale items had poor-to-good convergent validity (range 0.11–0.74) and the subscale as a whole had good convergent validity (r=0.82) when compared to the PANSS-NS subscale (63). Good discriminant (67) and predictive (68) validities were also reported for the scale.
The SANS is a clinician-administered scale that was developed to ascertain and quantify negative symptoms in inpatients and outpatients with schizophrenia. The items were generated from extensive clinical experience with patients with schizophrenia (70). The SANS comprises 30 negative symptoms representing five domains including affective flattening, emotional blunting/alogia, avolition/apathy, anhedonia/asociality and attentional impairment. Each item is scored on a 6-point scale (0 to 5) with higher scores indicating greater severity of negative symptoms. Five items comprised the avolition/apathy subscale (SANS-apathy).
The SANS-apathy subscale was observed to have good internal consistency (α=0.80) (70). The inter-rater reliability was also good (r=0.86), with reliability coefficients ranging from 0.70 to 0.92 for its five items. Test-retest reliability was not assessed by Andreasen and colleagues (70) but was found to be fair (r=0.44–0.59) by McAdams et al (71), which is below the criterion for at least moderate test-retest reliability for this review.
McAdams et al (71) observed that the SANS-apathy subscale had good discriminant validity as indicated by poor correlation with the positive symptoms on the BPRS (r=0.16). The fairly high correlation between the SANS-apathy subscale and the HAM-D score is likely confounded by the inclusion of the “lack of interest item in work and activities” in the HAM-D, which is a significant symptom in apathy. Fair convergent validity was indicated by statistically significant but weak correlation between the SANS-apathy subscale and the negative symptoms of the BPRS (71). Good convergent validity was demonstrated by strong correlation between the SANS and Negative Subscale of the PANSS (r=0.77) (72;73).
The PANSS is a 30-item clinician-administered measure that assesses positive (e.g., hallucinations, delusions), negative (e.g., blunted affect, apathetic social withdrawal), and general psychiatric (e.g., depressed mood, anxiety) symptoms of schizophrenia. Eighteen items were derived from the Brief Psychiatric Rating Scale (BPRS) (64) and 12 items from the Psychopathology Rating Scale (74), and all are rated on a 7-point scale. The PANSS is available in many languages. The 7-item PANSS-NS subscale has been used to measure apathy and apathy-related behaviors in schizophrenia.
Substantial to good internal consistencies (0.68–0.83) were reported for the PANSS-NS subscale (72;73;75–77). This was also demonstrated in acute (α=0.87), chronic (α=0.78) and long-term care (α=0.82) patients with schizophrenia (75). The test-retest reliability was substantial (r=0.68) (76), and the inter-rater reliability for this apathy measure ranged from substantial (r=0.68) (72;73) to very good (r=0.94) (68), even for its individual items (i.e., 0.63–0.90) (68). Inter-rater reliability coefficients for the Spanish version of the scale ranged from 0.66–0.98, with a mean of 0.81 (78).
Compared to the BPRS, Bell and colleagues (68) found that only three items on the PANSS had Kappa weights ≥0.75 (i.e., blunted affect, hallucinations, and grandiosity). The PANSS-NS and BPRS-NS items had low-to-good correlation (range 0.11–0.74). However, correlation between the entire negative syndrome subscales on both scales was high (r=0.82). These results indicated acceptable to very good convergent validity for the individual items and very good convergent validity for the entire PANSS-NS subscale (68). High and low scores on the PANSS-NS and BPRS-NS subscales had a Kappa of 0.72, with nearly 86% agreement between them. PANSS-NS scores were highly predictive of work performance, with the greatest relationship demonstrated in Negative Total Score predicting social skills (Adj R2 = 0.54, p<0.001) and work quality (Adj R2 = 0.55, p<0.001). There was no significant association between PANSS-NS score and measure of depression (r=0.22) indicating good discriminant validity. Correlation between total score and individual items of the PANSS-S and PANSS negative scales was strong (r =0.66–0.90) (78).
The single apathy item included in the UPDRS is a quick and easy screen for apathy for neurologists in a busy clinic setting for patients with PD (79;80). The item is scored on a 5-point scale from 0–4, with higher scores indicating greater severity of apathy. The item (and scale) is clinician-administered.
The test-retest and inter-rater reliabilities of the UPDRS-apathy item have not been established.
Research studies have demonstrated the criterion, convergent and discriminant validities of the UPDRS-apathy item (79;80). The optimal cut-off for the UPDRS-apathy item was greater than or equal to 2, with sensitivity and specificity of 70% and 65% (79) and 73% and 75% (80), which is lower than the validity criteria set in this review. The UPDRS-apathy item was highly correlated with the loss of interest item of the MADRS (80) and the scores on the AS-14 scale (79), indicating good convergent validity. Poor correlation with the UPDRS-depression item (79;80) and the MADRS depression score (80) showed that the UPDRS-apathy item has good discriminant validity.
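As a rough illustration of how sensitivity and specificity follow from a cut-off, the sketch below classifies each patient as screen-positive when the item score is at or above the cut-off and compares that result with a reference classification. The function name, the data, and the reference standard are hypothetical; the cited studies defined their own criterion measures.

```python
def sensitivity_specificity(item_scores, has_apathy, cutoff=2):
    """Sensitivity and specificity of `score >= cutoff` against a reference standard."""
    tp = fp = tn = fn = 0
    for score, truth in zip(item_scores, has_apathy):
        screen_positive = score >= cutoff
        if screen_positive and truth:
            tp += 1
        elif screen_positive and not truth:
            fp += 1
        elif not screen_positive and truth:
            fn += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one apathetic and one non-apathetic case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical item scores (0-4) and reference classifications for 8 patients.
scores = [0, 1, 2, 3, 4, 2, 1, 0]
reference = [False, False, True, True, True, False, True, False]
sens, spec = sensitivity_specificity(scores, reference)  # 0.75 and 0.75
```

Raising the cut-off generally trades sensitivity for specificity, which is why an optimal cut-off is reported alongside both values.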
The IAS was developed to measure irritability and apathy and was validated in patients with Huntington’s disease and AD (81). The scale is clinician-administered to a knowledgeable informant and consists of 10 items and two subscales. The IAS-apathy subscale comprises five items, each rated on a 5-point Likert scale (1 = much more interest to 5 = much less interest). Higher scores on the IAS-apathy subscale indicate more severe apathy.
The IAS-apathy subscale was shown to be a reliable measure of apathy (81). The observed internal consistency (α=0.78) was considered good for a scale with few items (82) but slightly below the criterion set in this review. Good test-retest and inter-rater reliabilities were also observed (81).
The validity of the IAS-apathy subscale was demonstrated by Burns and colleagues (81) but has not been replicated elsewhere. More specifically, the IAS-apathy subscale was shown to have good discriminant validity and construct validity (81).
The FLOPS (now FrSBe) is a 46-item scale that was developed to assess and quantify behavioral disturbances associated with damage to the frontal lobe and frontal-subcortical brain circuits, such as apathy, disinhibition, and executive dysfunction (83;84). The FrSBe has three different versions: self, informant, and clinician-rated. Rating in the three domains is based on behaviors prior to the onset of dementia (“before” scores) and current behaviours (“after” scores). The scale was initially validated in outpatients or research participants with a “wide range of neurological disorders” (83) and later in patients with schizophrenia. The length of time taken to complete each version of the FrSBe is not indicated. The FrSBe-apathy subscale consists of 14 items, each rated on a 5-point scale ranging from 1–5.
A number of studies have reported fair to high internal consistency for the FrSBe-apathy subscales with alpha coefficients ranging from 0.72–0.88 (84–86). Grace and Malloy (84) also reported a high internal consistency for the apathy measure (α=0.78) in a normative sample using the self-report form that was slightly lower than specified in this study. Velligan et al (86) also observed that the FrSBe-apathy had high test-retest reliability when used to measure apathy in individuals with schizophrenia.
Norton and colleagues (87) demonstrated that the FrSBe-apathy subscale had fair convergent validity by its significant but somewhat weak correlation with the NPI-apathy subscale in patients with dementia. However, Ready et al (88), using a mixed sample of individuals with AD and MCI, found a statistically significant and stronger correlation between the FrSBe-apathy subscale and the loss of interest/reactivity item of the Cornell Depression Scale (r=0.53), indicating good convergent validity. Velligan and colleagues (86) also observed statistically significant and stronger correlations between the FrSBe-apathy subscale score and the emotional withdrawal and blunted affect items on the BPRS in a sample of individuals with schizophrenia. These results indicated that the FrSBe-apathy subscale had good convergent validity.
The FrSBe-apathy subscale also demonstrated good discriminant validity in samples with MCI, AD, PD, and schizophrenia. The FrSBe-apathy had little or no correlation with depression regardless of the diagnostic composition of the study sample (86;88;89). These findings supported the notion that apathy and depression are distinguishable constructs.
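Convergent and discriminant validity in this review rest on correlation coefficients: a high correlation with another apathy measure supports convergence, and a low correlation with a depression measure supports discrimination. The sketch below computes a Pearson correlation from paired scores. It is an assumption that Pearson's r is the statistic in question, since the cited studies may have used rank-based coefficients; the function name and data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical subscale scores for six participants.
apathy_scale_a = [14, 22, 30, 41, 35, 18]
apathy_scale_b = [3, 5, 8, 11, 9, 4]      # expected to track scale A (convergent)
depression_score = [9, 4, 12, 7, 10, 6]   # expected to track it weakly (discriminant)

r_convergent = pearson_r(apathy_scale_a, apathy_scale_b)
r_discriminant = pearson_r(apathy_scale_a, depression_score)
```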
The KBCI is a 64-item measure designed to assess behavioural changes following TBI with validation in both TBI (90) and AD (91) samples. The KBCI has an apathy subscale with four positively and four negatively worded items each rated on a 4-point Likert scale (false or not at all true, slightly true, mainly true, and very true). The scale, and hence its apathy subscale, is clinician-administered. The time taken to complete the apathy assessment with the KBCI-apathy subscale is not indicated.
The KBCI-apathy subscale has high internal consistency (α=0.89) in a TBI sample (90). Test-retest and inter-rater reliabilities were not assessed in the validation studies found (90;91). Studies that assess the reliability of the KBCI-apathy are needed.
Scores on the KBCI-apathy subscale were found to be higher among patients with TBI and those with MS compared to normal controls, and also higher in TBI than in those with MS thus providing evidence of the construct validity of the subscale (90). Non-significant correlations between scores on the KBCI-apathy subscale and measures of language, visuospatial memory, and global cognitive functioning in individuals with AD have been interpreted as evidence of the discriminant validity of the subscale (91).
The NPI was developed to assess and quantify neurobehavioral disturbances in dementia patients and to quantify caregiver distress caused by such behaviors (92;93). The NPI has an apathy subscale, which consists of a general screen item rated on a yes-versus-no basis. If the symptom is found to be present, 7 additional apathy questions are administered and scored on a yes-versus-no basis. The overall frequency (rated as 1–4) and severity (rated as 1–3) of apathy are then rated. Scores on the NPI apathy subscale range from 0–12, with higher scores indicating more severe apathy (92;93). The NPI, and hence the NPI-apathy subscale, is widely used and has been validated in many different samples such as ambulatory patients with dementia, outpatients with AD, multicultural samples, and nursing home residents. The scale has been translated into many languages such as Korean (94), Hellenic (43;95), and Dutch (96) and validated in normal and clinical inpatient and outpatient samples from the respective countries. A shorter version of the scale (NPI-Q) that takes only 5 minutes to complete has also been developed in which the screen items are rated first on a presence-versus-absence basis followed by the assessment of frequency (1–4) and severity (1–3), resulting in a range of scores from 0–60 (97). Scores on the NPI-Q apathy item range from 0–12, with higher scores indicating greater severity of apathy. A nursing home version of the scale (NPI-NH) has also been developed with an NPI-NH-apathy subscale (98). Scores for the NPI-NH-apathy subscale range from 0–12, with increasing scores indicative of increasing severity of apathy. The NPI-apathy subscale has been used in many studies to determine the convergent validity of other apathy measures such as the AI-caregiver (33), the three versions of the AES (57), and the AS-10 (62).
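The 0–12 range of the NPI-apathy subscale follows from combining the frequency and severity ratings. A minimal sketch is given below, assuming the conventional scoring in which the subscale score is the product of frequency and severity and is 0 when the screen item is negative; the text above gives the range but does not state the product rule explicitly, and the function name is hypothetical.

```python
def npi_apathy_score(screen_positive, frequency=None, severity=None):
    """NPI-apathy subscale score, assuming score = frequency x severity.

    frequency is rated 1-4 and severity 1-3; a negative screen scores 0.
    """
    if not screen_positive:
        return 0
    if frequency not in range(1, 5):
        raise ValueError("frequency must be an integer from 1 to 4")
    if severity not in range(1, 4):
        raise ValueError("severity must be an integer from 1 to 3")
    return frequency * severity


lowest = npi_apathy_score(False)                          # 0
highest = npi_apathy_score(True, frequency=4, severity=3)  # 12
```

Under this rule only certain values in the 0–12 range are attainable (e.g., 7 and 11 cannot occur), so the subscale is better treated as ordinal than as an evenly spaced interval scale.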
The NPI and its subscales, including apathy, have been validated in many studies. This includes the English, Chinese, Hellenic, Korean, and Portuguese as well as the nursing home versions of the scale (92;94;97–100). Cummings and colleagues (92) reported good internal consistency (α=0.87–0.88), test-retest reliability (r=0.74 for frequency), and inter-rater reliability (r=0.89 for severity and r=0.98 for frequency) for the NPI-apathy subscale. Similarly, high test-retest reliability has been reported for the apathy subscale of the nursing home (98) and Korean (94) versions. The test-retest reliability for the Portuguese version of the NPI-apathy subscale was not as high (99). High inter-rater reliability for the NPI-apathy subscale was also observed by Cummings et al (92) and Leung et al (100), while a lower value was reported by Camozzato and colleagues (99). Although the reliability of the total NPI-Q scale has been examined (97), this was not done for the apathy subscale.
Cummings and colleagues (92) assessed the content validity of the NPI and its subscales, including the apathy subscale, by using ratings from a panel of international experts in the assessment and treatment of neurobehavioral disturbances in the dementias. Other studies that have used the NPI-apathy subscale in the validation of other scales have provided information on its convergent validity. For example, Clarke et al (57), using a sample of community-dwelling individuals with memory complaints, found that the NPI-apathy subscale had statistically significant and fair correlations with the informant version but not the clinician version of the AES (AES-C, r=0.30; AES-I, r=0.50). Statistically significant and even stronger correlations were observed between the NPI-apathy subscale and the AS-10, which was validated in nursing home residents (r=0.61 and 0.62) (62), and between the NPI-apathy subscale and the AI-caregiver, which was validated in MCI patients (r=0.66) (33). Politis et al (43;95) demonstrated only fair convergent validity between the Hellenic version of the NPI-apathy subscale and the withdrawn/negative subscale of the BPRS (r=0.48). Leung et al (100) observed that the C-NPI-apathy subscale distinguished between mild/moderate and severe dementia groups. Work on the discriminant validity of the NPI-apathy subscale is needed to further support its use as a valid measure of apathy in demented groups across many cultures.
The prevalence and incidence of apathy vary across neurological and psychiatric populations. As well, the associated clinical features tend to vary across diagnoses (e.g., Parkinsonian apathy is associated with motor changes and slowing; AD apathy is associated with cognitive impairments; FTD apathy is associated with impulsivity). As such, the suitability of the measure is highly important for accurate detection and treatment planning, and a single universal measure may not be appropriate for all neuropsychiatric patients. As reviewed above, there are a large number of well-structured, valid, and reliable measures, per the criteria set for this review, for assessing apathy across a range of populations. These measures, to some extent, are grounded in the conceptualization of apathy as a loss or lack of motivation, lack of self-initiated actions or emotional blunting. Disease-specific measures, such as the DAIR (apathy in AD) and UPDRS (apathy in PD), are sensitive to symptom change and population treatment differences, making them important for understanding clinical course and prognosis.
Some apathy measures, such as the FrSBe and KBCI, are not disease-specific per se but provide tailored assessment for specific injury populations (fronto-cortical damage and TBI, respectively). Conversely, the NPI, AES and its variations (AS-7, AS-10 and AS-14), and AI are validated for broad application across dementia and cognitively impaired populations. The availability of specific and broad measures of apathy means clinicians have greater flexibility in assessing all patients, irrespective of symptomatology or diagnosis. It is possible that broad apathy measures may not sufficiently detect subtle, disease-specific variations in the presentation of apathy in certain populations (e.g., in patients with dysexecutive syndromes such as FTD; in patients with schizophrenia), but this line of inquiry requires further research. Lack of a unified definition and conceptual operationalization of “apathy” may be the foremost barrier to extending the current literature in this area. Promisingly, work is currently underway by apathy researchers to rectify these shortcomings in the field.
That apathy has no universally-accepted definition in medicine makes the availability of these measures even more important because clinicians and researchers have no gold standard upon which they can draw guidance. Clinicians and researchers do not consistently use the same terminology, with descriptors such as “flattened affect,” “amotivation,” or “avolition” given in place of or in relation to the term apathy. This lack of consensus has important implications for research and the interpretability of data from apathy measures. Naturally, when developers of assessment tools do not have a precisely defined construct around which they can base their instrument, bias can more easily occur. Those who view apathy as a symptom of other problems (e.g., dementia) may construct a scale differently and include different items than those who view apathy as its own syndrome. This underscores the importance of clearly demonstrating good discriminant and convergent validity between measures, which was the method used to demonstrate the validity of the majority of the apathy measures reviewed. This technique reflects the reliance on theoretical constructs to infer validity, which is a viable option in the measurement field when there is no gold standard against which to demonstrate criterion validity. However, the assumption inherent in using one scale to validate another is that the former measure is valid, which could be erroneous and lead to incorrect interpretations. Therefore, vigilance is needed when choosing the measures used to demonstrate the convergent and divergent validity of apathy measures, with a clear rationale for the choice of measure(s). Without a gold standard on which to validate an instrument, predictive validity is also seen as a viable option, as was done by Marin and colleagues (46) in the development and validation of the AES.
Lack of clarity about the definition of apathy also has the unfortunate consequence of making it easier for clinicians and researchers to disregard apathy altogether. Indeed, apathy is only mentioned 15 times in DSM-IV, and it is absent from the International Classification of Diseases-10.
Future research should focus primarily on building a consensus about apathy as a construct so that a gold standard assessment tool can be developed. In the absence of this, though, the most psychometrically robust and widely-used measures for assessing apathy broadly appear to be the AES and NPI, although the NPI might have slight preference, especially in busy clinical practices, given its ease of use. Modified versions of the AES (e.g., the AS-10 and AS-14) have shown promise in terms of ease of use but their observed psychometric properties have not been replicated in independent studies. Among specific populations, the DAIR for patients with AD, SANS and PANSS for schizophrenia populations, and the FrSBe for patients with fronto-temporal deficits all demonstrate promising psychometric properties. More data is needed to further establish reliability and validity characteristics across the remaining apathy measures (e.g., UPDRS, KBCI-apathy, and BPRS). For the AI, very good internal consistency, test-retest and inter-rater reliabilities have been demonstrated by the scale’s developer but replication studies are needed. In addition, further study on its validity is needed to support its use as a reliable and valid apathy measure.
Insofar as apathy remains a somewhat elusive and ill-defined concept, proper assessment becomes decidedly more complicated and more important. The wide availability of empirically-supported measures increases the likelihood that, when used, accurate detection and treatment can occur. Inappropriate selection of measurement can lead to inaccurate interpretation of research data and flawed conclusions, which stymies research progress in neuropsychiatry as a whole. However, when measures are used more appropriately, improved patient and caregiver quality of life is more likely.
Though no gold standard apathy assessment tool exists, there are several well-validated, empirically-reliable measures for general (e.g., NPI, AES) and diagnosis-specific apathy (e.g., DAIR, PANSS, FrSBe). When used in conjunction with other clinical assessment tools (e.g., mood, history), these can provide a fuller picture of prognosis and clinical course, as well as informing treatment decision-making, response, and outcome. The measures described herein provide a range of options for assessing apathy in a variety of populations, with the goal of reducing morbidity. Future research needs to focus on further establishing psychometric properties of current apathy measures. Researchers and clinicians must make it a priority to build a consensus as to the definition of apathy, which will assist in future studies of apathy treatment and assessing the impact of treatment on course of illness.
Aside from the main funding source of the American Psychiatric Association (Dr. Diana E. Clarke, Dr. Emily A. Kuhl, and Ms. Rocio Salvador), Ms. Jean Y. Ko is a Ph.D. candidate (abd) in the Department of Mental Health at Johns Hopkins Bloomberg School of Public Health and is supported by NIA F31AG030908-02. Drs. van Reekum and Marin have no additional support.
Diana E. Clarke, Division of Research, American Psychiatric Association, Arlington, Virginia, USA and Department of Mental Health, Johns Hopkins University School of Public Health, Baltimore, Maryland, USA.
Jean Y. Ko, Department of Mental Health, Johns Hopkins University School of Public Health, Baltimore, Maryland, USA.
Emily A. Kuhl, Division of Research, American Psychiatric Association, Arlington, Virginia, USA.
Robert van Reekum, Institute of Medical Science and the Department of Psychiatry, University of Toronto, Toronto, Ontario, Canada.
Rocio Salvador, Psychopathology Program Coordinator, Division of Research, American Psychiatric Association, Arlington, Virginia, USA.
Robert S. Marin, Medical Director, Hill Satellite Center; Associate Director, Center for Public Service Psychiatry, Western Psychiatric Institute and Clinic, Pittsburgh, Pennsylvania, USA.