RNS and MCM contributed equally to this work.
In order to formulate a parsimonious tool to assess empathy, we used factor analysis on a combination of self-report measures to examine consensus and developed a brief self-report measure of this common factor. The Toronto Empathy Questionnaire (TEQ) represents empathy as a primarily emotional process. In three studies, the TEQ demonstrated strong convergent validity, correlating positively with behavioral measures of social decoding, self-report measures of empathy, and negatively with a measure of Autism symptomatology. Moreover, it exhibited good internal consistency and high test-retest reliability. The TEQ is a brief, reliable, and valid instrument for the assessment of empathy.
Empathy is an important component of social cognition that contributes to our ability to understand and respond adaptively to others’ emotions, succeed in emotional communication, and promote prosocial behavior. The term “empathy” is derived from Titchener’s (1909; Wispé, 1986) translation of the German word Einfühlung, meaning “feeling into” (Wispé, 1987). Generally speaking, it refers to the consequences of perceiving the feeling state of another as well as the capacity to do so accurately. Despite the prominence of the empathy construct in developmental research (Sagi & Hoffman, 1976; Ungerer, 1990; Zahn-Waxler, Friedman & Cummings, 1983), and cross-species investigations of empathic capabilities (Masserman, Wechkin & Terris, 1964; Rice & Gainer, 1962), a clear, consensual definition of the construct of empathy remains elusive.
Recent research into empathy emphasizes the distinction between cognitive and emotional components of the construct (Preston & de Waal, 2002). These components assume various definitions. Put simply, however, emotional empathy is commonly thought of as an emotional reaction (e.g., compassion) to another’s emotional response (e.g., sadness). This reaction is not dependent on a cognitive understanding of why a person is suffering (Rankin, Kramer & Miller, 2005), although it may facilitate understanding and action. By contrast, cognitive empathy involves an intellectual or imaginative apprehension of another’s emotional state, often described as overlapping with the construct of theory of mind (understanding the thoughts and feelings of others) and used interchangeably by some authors (Lawrence, Shaw, Baker, Baron-Cohen & David, 2004). Numerous authors focus on distinguishing empathy from the related concepts of emotional contagion, sympathy and perspective taking surveyed in some self-report measures of empathy (Wispé, 1987; Wispé, 1986; Omdahl, 1995). Whereas emotional contagion (also referred to as personal distress) involves the perceiver assuming the emotional state of the target, sympathy is thought to reflect a state of “feeling sorry” for the target with or without an associated behavioral response (Preston & de Waal, 2002). Perspective taking, in contrast, involves the apprehension of another’s thought and feeling states through the assessment of visual, auditory or situational cues (Rankin, Kramer & Miller, 2005), without any personal emotional response.
Agreement among researchers and theoreticians on the interrelated processes contributing to empathy has been elusive. Although the processes described above (perspective taking, sympathy, personal distress, emotional contagion, theory of mind) are referred to as “empathic,” there is little agreement in the literature as to whether they are distinct from empathy as an accurate affective insight into the feelings of another, or are facets of a central process required for empathic responding. Indeed, the current corpus of self-report measures of empathy reflects these differing constructs, resulting in significant heterogeneity among measures (Ickes, 1997). In the face of such heterogeneity, one useful approach may be to ask what is common among these different conceptions, allowing us to examine the consensus, or core, opinion on this important process.
It is important to note that a multifaceted measure may be preferable in some situations. We are not proposing that multifactorial approaches be replaced with a unidimensional measure or that empathy itself be viewed as a single, homogenous construct. Rather, the field of empathy measurement lacks a sufficient tool for examining this construct at the broadest level, and it is this gap that we endeavour to remedy. A useful parallel may be drawn with early intelligence research, which suffered a similar period of confusion populated by multiple conceptions. When a single underlying factor was extracted from the multiple tests, this “g factor” proved a useful tool in intelligence research (Spearman, 1904). Moreover, the utility of g was not achieved at the cost of other conceptions of intelligence. With a similar aim, we sought to derive a single-factor representation of the currently heterogeneous empathy construct in order to create a useful tool for empathy research that can complement, rather than replace, current multifactorial approaches. Importantly, this consensus measure was derived statistically, using factor analysis, rather than through intuition.
The Empathy Scale (Hogan, 1969), one of the first measures to achieve widespread use, contains four separate dimensions: social self-confidence, even-temperedness, sensitivity, and nonconformity. A recent psychometric analysis of the scale, however, indicates questionable test-retest reliability and low internal consistency, along with poor replication of its previously hypothesized factor structure (Froman & Peloquin, 2001). Indeed, several authors suggest that the four factors measured by this scale are better suited to the measurement of social skills, broadly speaking, than a central tendency towards empathic behavior (Davis, 1983; Baron-Cohen & Wheelwright, 2004). Hogan’s (1969) Empathy Scale has been widely employed as a measure of cognitive empathy (e.g. Eslinger, 1998), but has recently been supplanted in popularity by the Interpersonal Reactivity Index (IRI; Davis, 1983), discussed below.
The Questionnaire Measure of Emotional Empathy (QMEE; Mehrabian & Epstein, 1972) re-emphasizes the original definition of the empathy construct (Titchener, 1909; Wispé, 1986). The scale contains seven subscales that together show high split-half reliability, indicating the presence of a single underlying factor thought to reflect affective or emotional empathy. The authors of this scale suggested more recently, however, that rather than measuring empathy per se, the scale more accurately reflects general emotional arousability (Mehrabian, Young & Sato, 1988). In response, an unpublished, revised version of the measure, the Balanced Emotional Empathy Scale (Mehrabian, 2000), taps respondents’ reactions to others’ mental states (cf. Lawrence, et al., 2004).
The IRI (Davis, 1983) contains four subscales: Perspective Taking and Fantasy in addition to Empathic Concern and Personal Distress, with each pair purported to tap cognitive and affective components of empathy, respectively. As pointed out by Baron-Cohen and colleagues (Baron-Cohen & Wheelwright, 2004), however, the Fantasy and Personal Distress subscales of this measure contain items that may more properly assess imagination (e.g., “I daydream and fantasize with some regularity about things that might happen to me”) and emotional self-control (e.g., “In emergency situations I feel apprehensive and ill at ease”), respectively, than theoretically-derived notions of empathy. Indeed, the Personal Distress subscale appears to assess feelings of anxiety, discomfort, and a loss of control in negative environments. Factor analytic and validity studies suggest that the Personal Distress subscale may not assess a central component of empathy (Cliffordson, 2001). Instead, Personal Distress may be more related to the personality trait of neuroticism, while the most robust components of empathy appear to be represented in the Empathic Concern and Perspective Taking subscales (Alterman, McDermott, Cacciola & Rutherford, 2003).
Other self-report measures of empathy have been developed to target specific populations. These include: the Scale of Ethnocultural Empathy (Wang, et al., 2003), the Jefferson Scale of Physician Empathy (Hojat, et al., 2001), the Nursing Empathy Scale (Reynolds, 2000), the Autism Quotient (Baron-Cohen, Wheelwright, Skinner, Martin & Clubley, 2001) and the Japanese Adolescent Empathy Scale (Hashimoto & Shiomi, 2002). Although these instruments were designed for use with specific groups, aspects of these scales may be suitable for assessing a general capacity for empathic responding. That is, all of these diverse scales touch upon an aspect of empathy, broadly speaking.
The Autism Quotient (Baron-Cohen, Wheelwright, Skinner et al., 2001), for example, was developed to measure Autism Spectrum Disorder symptoms. The authors viewed a deficit in theory of mind as the characteristic symptom of this disorder (Baron-Cohen, 1995), and a number of items from this measure relate to broad deficits in social processing (e.g., “I find it difficult to work out people’s intentions.”). Thus, any measure of empathy should exhibit a negative correlation with this measure. The magnitude of this relation, however, will necessarily be attenuated by the other aspects of the Autism Quotient, which measure unrelated constructs (e.g., attentional focus and local processing biases).
Additional self-report measures of social interchange appearing in the neuropsychological literature contain items tapping empathic responding, including the Dysexecutive Questionnaire (Burgess, Alderman, Evans, Wilson & Emslie, 1996) and a measure of emotion comprehension developed by Hornak and colleagues (Hornak, Rolls & Wade, 1996). These scales focus on the respondent’s ability to identify the emotional states expressed by another (e.g., “I recognize when others are feeling sad.”). Current theoretical notions of empathy emphasize the requirement for understanding of another’s emotions in order to form an empathic response (Bernieri, 2001). Only a small number of items on current measures of empathy, however, assess this ability.
The present study attempts to formulate a consensus among the many scales in use to gauge the empathy construct. Using exploratory factor analysis (EFA), we forced the items to load onto a single factor, thereby assembling a group of highly related items from across many measures of empathic responding, bringing about a unidimensional factor of empathy. Our aim was to identify what is common among different conceptions of empathy, as operationalized by published measures of this construct. In a series of three studies, we constructed the Toronto Empathy Questionnaire (TEQ), and demonstrated the TEQ’s construct validity through associations with behavioral and self-report measures of interpersonal sensitivity, as well as its internal consistency and test-retest reliability.
We began by submitting responses to every self-report measure of empathy we were able to identify to an EFA, determining what was common across these previously published measures. Items were forced to load onto a single factor, forming the basis of our questionnaire, which was then examined for factorial integrity, internal consistency and reliability.
Two hundred University of Toronto undergraduates (100 female), mean age 18.8 years (SD = 1.2), participated for course credit in a psychology course, satisfying general recommendations for sample size in factor analysis aimed at determining the stability of component patterns (Guilford, 1954; Russell, 2002). A balance of genders was carefully observed for initial scale development.
A review of the literature was conducted with the aim of collecting all available measures related, even tangentially, to the self-report of empathic processes or the assessment of deficits in empathic ability. Questions were selected from several published self-report empathy measures, including the IRI (28 items; Davis, 1983), Hogan’s Empathy Scale (15 items; Hogan, 1969), QMEE (nine items; Mehrabian & Epstein, 1972), a reworded Balanced Emotional Empathy Scale (12 items; Mehrabian, 2000), Scale of Ethnocultural Empathy (four items; Wang, et al., 2003), Jefferson Scale of Physician Empathy (six items; Hojat, et al., 2001), Nursing Empathy Scale (eight items; Reynolds, 2000), Japanese Adolescent Empathy Scale (10 items; Hashimoto & Shiomi, 2002), and the Measure of Emotional Intelligence (three items; Schutte, et al., 1998), for a total of 95 items after redundant questions were removed. An additional 36 questions were composed based on the literature concerning individuals with altered empathic responding due to neurological or psychiatric disease, with the addition of modified items from the Dysexecutive Questionnaire (four items; Burgess, et al., 1996) and a measure of emotion comprehension developed by Hornak and colleagues (seven items; Hornak, Rolls & Wade, 1996). Factor analysis with 200 participants and 142 items yielded an independent observation-to-item ratio of 1.4:1 that exceeds the minimum 1.2:1 ratio capable of recovering a population factor structure (Barrett & Kline, 1981; see MacCallum, Widaman, Zhang & Hong, 1999).
In order to ensure consistency across sampled items, questions were re-worded to assess frequency of behavior rather than to pose general statements or tendencies. Responses were given using a 5-point Likert-scale corresponding to various levels of frequency (i.e., never, rarely, sometimes, often, always), as opposed to agreement with individual statements, a method used in several of the scales described above.
Two additional self-report measures were administered in their entirety to establish convergent and discriminant validity: the IRI, comprising 4 subscales of 7 items each (Davis, 1983) and the 50-item Autism Quotient (Baron-Cohen, Wheelwright, Skinner et al., 2001). We expected the subscales of the IRI to be positively related to the TEQ, given that these subscales reflect the content of the majority of empathy measures. Within this measure, we predicted that the Empathic Concern subscale would show the strongest association with the TEQ, followed by the Perspective Taking subscale, where these subscales are thought to map closely onto emotional and cognitive constructs of empathy. We did not expect the Fantasy and Personal Distress subscales of this measure to show a strong association with the TEQ, given their close relation to imagination and emotional self-control (Baron-Cohen & Wheelwright, 2004). Finally, we predicted that the Autism Quotient would be negatively related to the TEQ, as it measures a degree of deficit in social processing. We expected this relation to be moderated, however, by the presence of items in this scale unrelated to empathic responding.
A consensus account of empathy was determined using an EFA examining the structure of inter-correlations among items. An iterated principal-axis factor analysis with squared multiple correlations of each item with all other items as the initial communality estimates was conducted on responses for each item. Items from this EFA were forced to load onto a single factor. To devise a unidimensional empathy questionnaire that maximized item-remainder coefficients and factor loadings, we eliminated items that had low item-remainder coefficients (below 0.30), those that failed to improve internal consistency, and items possessing factor loadings lower than 0.40. A second EFA was then conducted with the 16 retained items in order to more completely document the factor structure of the questionnaire.
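The item-remainder step of this culling procedure can be illustrated with a minimal sketch. It assumes responses are stored as one list of item scores per respondent; the function names and the toy data are hypothetical and are not taken from the study. Only the 0.30 item-remainder cutoff is shown; the factor-loading and internal-consistency criteria are not implemented here.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def item_remainder(responses):
    """Correlate each item with the sum of all the OTHER items."""
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    coefs = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        # Remove the item from the total so it does not correlate with itself.
        remainder = [t - row[j] for t, row in zip(totals, responses)]
        coefs.append(pearson_r(item, remainder))
    return coefs


def retain(coefs, cutoff=0.30):
    """Indices of items whose item-remainder coefficient meets the cutoff."""
    return [j for j, r in enumerate(coefs) if r >= cutoff]


# Hypothetical toy data: 6 respondents, 4 items; the last item is unrelated.
toy = [
    [1, 2, 1, 3],
    [2, 2, 3, 5],
    [3, 3, 3, 1],
    [4, 4, 5, 4],
    [5, 4, 4, 2],
    [5, 5, 5, 3],
]
kept = retain(item_remainder(toy))
```

In practice the coefficients would be recomputed after each round of removals, because dropping an item changes every remainder score.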
Convergent and discriminant validity of the newly devised 16-item TEQ was then assessed by calculating Pearson correlations with the IRI and the Autism Quotient. Gender differences in the TEQ were assessed by an independent samples t-test and by calculating the effect size with Cohen’s d. Correlations between the IRI subscales and the Autism Quotient were also determined.
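For readers unfamiliar with the effect-size computation, the following is a small sketch of Cohen’s d and the corresponding independent-samples t statistic. It assumes the pooled-standard-deviation form of d and equal-variance t; the text does not state which variant was used, and the function names are illustrative.

```python
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d using the pooled standard deviation (an assumed variant)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / pooled_var ** 0.5


def pooled_t(a, b):
    """Equal-variance independent-samples t and its degrees of freedom."""
    na, nb = len(a), len(b)
    t = cohens_d(a, b) / (1 / na + 1 / nb) ** 0.5
    return t, na + nb - 2
```

The two quantities differ only by a sample-size term, which is why a nonsignificant t can still accompany a non-trivial d in a small sample.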
Initial eigenvalues greater than one and their variance explained are provided in Table 1. Forty-one factors with an eigenvalue greater than one suggest a multiplicity of factors in the self-report of empathy and related constructs (according to the Kaiser criterion). Conducting an EFA with a forced single factor yielded 55 items with loadings above .40, drawing on items from each scale. When more than ten items load at .40 or above, a single component can be considered a stable representation of the population parameter with the present sample size (Guadagnoli & Velicer, 1988; Stevens, 2002). To form a brief scale, these 142 items were then culled to maximize internal consistency and item-remainder coefficients. This process led to the formation of the 16-item TEQ (see Appendix). The TEQ contains an equal number of positively and negatively worded/scored items from a number of different scales as well as newly composed items (Table 2). Unidimensional factor loadings ranged from .41 to .65 (mean = .51, SD = .07) (Table 2).
Item-remainder coefficients were sound, ranging from .36 to .59 (Table 2); internal consistency was also good, Cronbach’s α = .85. In a second EFA of the 16-item TEQ, the first five eigenvalues were 5.23, 1.43, 1.13, 1.06 and 0.93. There is a discontinuity between the first and second factor, consistent with a unidimensional structure. Factor coefficients are reported in Table 2 where the items were forced to load upon a single factor, ranging from .42 to .65 (mean = .53, SD = .08). This analysis yielded four items with loadings above .60, an indication that the factor is reliable regardless of sample size (Guadagnoli & Velicer, 1988; Stevens, 2002). The factor structure of the newly formed TEQ is further explored in an independent sample in Study 2.
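Cronbach’s α, reported above, can be computed from raw item scores as in the sketch below. The reverse-scoring helper assumes a bounded response scale whose endpoints are passed in explicitly; the numeric coding of the five frequency options is not given in the text, so `lo` and `hi` are placeholders, and both function names are hypothetical.

```python
from statistics import variance


def reverse_score(x, lo, hi):
    """Flip a negatively worded item on a scale running from lo to hi."""
    return lo + hi - x


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents' item-score lists.

    Negatively worded items must already be reverse-scored.
    """
    k = len(responses[0])
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

Because the TEQ mixes positively and negatively worded items, skipping the reverse-scoring step would deflate α.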
Participants’ total scores on TEQ items positively correlated with the IRI subscale Empathic Concern, r = .74, p < .001. Four items within the TEQ are reworded Empathic Concern subscale items. When these items are removed from the TEQ total score, the correlation remains high, r = .71, p < .001, suggesting that TEQ items used to measure empathy tap a construct similar to that measured by the Empathic Concern subscale of the IRI. The TEQ had a lower, but still positive, correlation with the IRI subscale of Perspective Taking, despite containing no items from this scale, r = .35, p < .001. Thus, our measure of the broadest level of empathy, while clearly closer to an emotional measure of empathy, still captures variance associated with a more cognitive measure of empathy.
The TEQ scores exhibited a negative correlation with the Autism Quotient, as hypothesized, r = -.30, p < .001. Individuals scoring highly on our measure tended to report fewer social processing and communication difficulties, as assessed by the Autism Quotient. As predicted, the magnitude of this association was not as great as that for the IRI, where the Autism Quotient measures other symptoms of this disorder not specifically related to social functioning and thus not expected to relate systematically to our measure of empathy. Relations to the Autism Quotient are intended only to demonstrate divergence with related, though conceptually quite different, measures. Means and standard deviations of all measures can be found in Table 3.
No effect of gender on TEQ scores was observed in this sample (Table 4), suggesting that males and females provide equivalent responses on our measure.
The IRI subscales also demonstrated significant associations with the Autism Quotient. Consistent with the theory of mind deficits associated with Autism Spectrum Disorder, the IRI subscale Perspective Taking was negatively associated with the Autism Quotient, r = -.23, p < .01. A positive association, however, was observed between the IRI subscale Personal Distress and the Autism Quotient, r = .36, p < .01. This association suggests that individuals reporting greater emotional arousability also report greater difficulties with social processing and communication, and that Personal Distress may not represent a core component of empathy. Additionally, there was a slight negative or no relationship with the other subscales, Empathic Concern: r = -.10, p > .10; and Fantasy: r = -.02, p > .75. The low association between the Autism Quotient and Empathic Concern suggests that the subscale’s construct of empathy is unrelated to self-reported proficiency in social processing and communication. The relationship between self-reported empathy and social processing is more explicitly examined in Study 2.
From the current corpus of heterogeneous self-report measures of empathy, we identified items that, together, assess a common construct of empathy. This led to the creation of a unidimensional empathy questionnaire, the TEQ, which possesses high internal consistency and demonstrated convergent and discriminant validity. In a second study, we aimed to further demonstrate the TEQ’s factorial integrity, internal consistency and expand upon its construct validity.
In processing interpersonal information, an empathic individual must discriminate and interpret stimuli relevant to the goals of social processing. This interpersonal information must subsequently be interpreted accurately in order to facilitate the task of responding in an empathic fashion (Bernieri, 2001). We assessed the relation of the TEQ to two behavioral measures that also require the processing of complex interpersonal stimuli: The Reading the Mind in the Eyes Test-Revised (MIE; Baron-Cohen, Wheelwright, Hill, Raste & Plumb, 2001) and the Interpersonal Perception Task-15 (IPT-15; Costanzo & Archer, 1994). Together, these measures assess processes that are described commonly in the theoretical literature surrounding empathic accuracy (e.g., emotion comprehension, perspective-taking; Sagi & Hoffman, 1976; Ungerer, 1990; Zahn-Waxler, Friedman & Cummings, 1983).
The utility of any self-report measure is improved greatly if associations can be found with task-based measures (which in this case are presumably less influenced by factors such as socially-desirable responding). Indeed, scores on a valid scale of empathy should be systematically related to the correct identification and comprehension of social stimuli, as assessed by these measures. However, most self-report measures of empathy are not systematically associated with performance on interpersonal sensitivity tasks (e.g., Ickes, 1997), except in rare instances when other factors, such as the targets’ trait expressivity, are taken into account (Zaki, Bolger, & Ochsner, 2008). Here, we predicted that, in assessing the broadest level of empathy, the TEQ would have more success in predicting empathic performance than did these earlier measures.
Seventy-nine University of Toronto students (55 female) aged, on average, 18.9 years (SD = 3.0) participated for course credit in psychology.
The MIE is an adult test of mentalizing that presents respondents with 36 still pictures of actors’ eye-regions and asks which of 4 possible mental states the person currently possesses (Baron-Cohen, Wheelwright, Hill, et al., 2001). All participants are presented with a list of terms used in the task, and are provided with the opportunity to read an explanation and example for each. This list of terms and definitions remains with each participant throughout testing for reference. Correct responses on the MIE indicate an ability to understand and pair mental-state terms with static nonverbal cues. High functioning individuals with Asperger’s syndrome or autism perform worse on this measure compared to age- and IQ-matched controls, indicating that the test is sensitive to rather subtle individual differences in social perception (Baron-Cohen, Wheelwright, Hill, et al., 2001).
The IPT-15 is a video containing 15 unscripted interactions between two or more individuals (Costanzo & Archer, 1993). Following each vignette, a multiple-choice question is presented that has an objective and true answer (e.g., “Who is the child of the two adults?”). Respondents must closely attend to dynamic nonverbal cues (e.g., prosody, posture, gesture, etc.) in order to select the correct answer. The answer to this question is never explicitly conveyed. Participants reliably score significantly above chance and scores on this measure are highly correlated with peer ratings of interpersonal sensitivity and social skills (Costanzo & Archer, 1989).
The validity of the TEQ was examined by correlating total scores with the IRI subscales, MIE and IPT-15. Gender differences were assessed by an independent samples t-test and the effect size was determined by calculating Cohen’s d. As a secondary goal, the structure of this measure was again examined by calculating item-remainder coefficients and Cronbach’s alpha. Two tests were then employed to re-examine the structural validity of the TEQ. Parallel analysis and Velicer’s minimum average partial test (O’Connor, 2000; Steger, 2006; Velicer, 1976) are statistical methods that enable one to objectively determine the number of factors in a dataset. Parallel analysis provides the eigenvalues from a factor analysis of a randomly permuted dataset. Here, random permutations were performed of raw TEQ data (matching for sample size, number of items, and scoring range). The eigenvalues of the random permutations from the 95th percentile are then plotted and compared with the real data. The number of factors present in the data is observed at the point of intersection on the Scree plot. Next, Velicer’s minimum average partial test was performed to determine the number of factors (or components) in the TEQ. In the minimum average partial test, a complete principal components analysis is performed, after which the first principal component is partialled out of the correlations among the variables and the average squared partial correlation is noted. This procedure is repeated using the first two principal components, then the first three, etc. The number of components whose partialling out resulted in the minimum average partial is the number of components related to systematic, rather than unsystematic, variance in the original correlation matrix.
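The permutation-based parallel analysis described above can be sketched as follows. This is a simplified illustration, not the authors’ procedure: it uses eigenvalues of the item correlation matrix (principal components) rather than a common factor extraction, shuffles each item column independently, and the function names, the number of permutations, and the simulated data are all assumptions.

```python
import numpy as np


def sorted_eigenvalues(data):
    """Eigenvalues of the item correlation matrix, largest first."""
    corr = np.corrcoef(data, rowvar=False)
    return np.sort(np.linalg.eigvalsh(corr))[::-1]


def parallel_analysis(data, n_perm=500, percentile=95, seed=0):
    """Count leading eigenvalues that exceed the permuted-data percentile."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    real = sorted_eigenvalues(data)
    perm_eigs = np.empty((n_perm, data.shape[1]))
    for i in range(n_perm):
        # Shuffling each column on its own keeps every item's distribution
        # (sample size, scoring range) but destroys inter-item correlations.
        shuffled = np.column_stack([rng.permutation(col) for col in data.T])
        perm_eigs[i] = sorted_eigenvalues(shuffled)
    threshold = np.percentile(perm_eigs, percentile, axis=0)
    exceeds = real > threshold
    # Retain factors up to the first eigenvalue that fails to beat chance.
    n_factors = len(real) if exceeds.all() else int(np.argmin(exceeds))
    return n_factors, real, threshold


def simulate_one_factor(n=300, k=8, loading=0.7, seed=1):
    """Hypothetical data: k items driven by one common factor plus noise."""
    rng = np.random.default_rng(seed)
    factor = rng.standard_normal((n, 1))
    noise = rng.standard_normal((n, k))
    return loading * factor + np.sqrt(1 - loading ** 2) * noise
```

Plotting `real` against `threshold` reproduces the kind of Scree comparison described in the text: the retained factors are those lying above the permuted-data curve before the first crossing.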
The TEQ correlated positively with the IRI subscales of Empathic Concern, r = .74, p < .001, Perspective Taking, r = .29, p < .01, and unlike in Study 1, Fantasy, r = .52, p < .001. Scores on the TEQ also correlated with the behavioral measures of social comprehension, MIE: r = .35, p < .01; IPT-15: r = .23, p < .05. This was true even though these two measures themselves were uncorrelated, r = .08, p > .45. The lack of correlation between the MIE and IPT-15 illustrates the problematic heterogeneity that is commonly observed with regards to empathy measurement (Ickes, 1997), and emphasizes the need for a measure that represents core empathy, or what is common among these diverse measures. Furthermore, these associations with behavioral measures of interpersonal sensitivity demonstrate validity extending beyond agreement with other self-report measures. Importantly, the magnitude of these associations is not trivial. The association with the MIE falls within the top third of all effect-sizes observed in psychology for measures that do not share method-variance, and the correlation with the IPT-15 lies within the middle third (Hemphill, 2003).
Unlike the TEQ, the IRI subscales demonstrated a slight negative or no relationship with the MIE, Empathic Concern: r = -.15, p < .05; Perspective Taking: r = -.16, p < .01; Personal Distress: r = -.14, p < .05; Fantasy: r = -.06, p > .30. Additionally, the IRI exhibits statistically nonsignificant relationships with the IPT-15, which are weaker but similar in value to the TEQ, Empathic Concern: r = .17, p > .10; Perspective Taking: r = .20, p > .05; Personal Distress: r = -.11, p > .30; Fantasy: r = .10, p > .40. Thus, although the TEQ is highly related to the Empathic Concern subscale of the IRI (Study 1), it performs better than the IRI when predicting actual social cognitive performance on measures related to empathic accuracy.
Unlike in Study 1, gender differences were observed in this sample (Table 4). Consistent with previous self-report measures of empathy (e.g. Davis, 1983), a moderate effect was observed: women scored higher than men.
Item-remainder coefficients for the TEQ were sound, ranging from .37 to .71 and indicating that all the items assess the same construct, and internal consistency was good, Cronbach’s α = .85. An examination of the Scree plot of the real and permuted data (Figure 1) indicated that the number of factors in the dataset is one. Velicer’s minimum average partial test found systematic variance in the TEQ related to a single component with the smallest average squared correlation of .0231 (Table 5). The parallel analysis and the minimum average partial test provide converging evidence that the TEQ comprises a single factor.
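The minimum average partial procedure can likewise be sketched in a few lines. The version below follows the original (squared partial correlation) form of the test as described in the Method; the function names and simulated data are hypothetical, and no attempt is made to reproduce the values in Table 5.

```python
import numpy as np


def velicer_map(data):
    """Velicer's minimum average partial test.

    Returns the number of components at which the average squared partial
    correlation is smallest, plus the full series (index 0 = none removed).
    """
    corr = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    p = corr.shape[0]
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    # Principal component loadings, largest component first.
    loadings = eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0, None))
    off_diag = ~np.eye(p, dtype=bool)
    avg_sq = [np.mean(corr[off_diag] ** 2)]
    for m in range(1, p):
        # Partial the first m components out of the correlation matrix.
        partial_cov = corr - loadings[:, :m] @ loadings[:, :m].T
        scale = 1.0 / np.sqrt(np.diag(partial_cov))
        partial_corr = partial_cov * np.outer(scale, scale)
        avg_sq.append(np.mean(partial_corr[off_diag] ** 2))
    return int(np.argmin(avg_sq)), np.array(avg_sq)


def simulate_one_factor(n=300, k=8, loading=0.7, seed=1):
    """Hypothetical data: k items driven by one common factor plus noise."""
    rng = np.random.default_rng(seed)
    factor = rng.standard_normal((n, 1))
    noise = rng.standard_normal((n, k))
    return loading * factor + np.sqrt(1 - loading ** 2) * noise
```

Calling `velicer_map(simulate_one_factor())` returns the component count together with the series of average squared partial correlations, whose minimum marks the point beyond which further components remove mostly unsystematic variance.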
In order to explore further the psychometric properties of the TEQ, we once again investigated convergent and discriminant validity through associations with self-report measures of empathy and Autism Spectrum Disorder symptomatology, as well as test-retest reliability on a second set of responses given by returning participants from Study 2. The aim of this study was to extend the findings from Study 1 by examining the relation of the TEQ to additional measures of social cognitive processing related to empathy, as well as the stability of our measure over time. We included a new self-report measure of empathy developed by Baron-Cohen and Wheelwright (2004), the Empathy Quotient. The development of this 80-item questionnaire was theoretically-driven and it was evaluated psychometrically on individuals with Asperger’s Syndrome and matched neurologically-intact controls. Because this scale was not available when Study 1 was conducted, it was not included in the original battery given to our respondents. As predicted by Baron-Cohen and Wheelwright (2004), individuals with Asperger’s Syndrome scored lower on this measure of empathy than controls. We expected TEQ scores to be positively associated with the Empathy Quotient, and negatively associated with the Autism Quotient.
Sixty-five University of Toronto students (46 female) aged, on average, 18.6 years (SD = 2.3) returned from Study 2 a mean of 66.1 days (SD = 6.35, range = 57-84) following their initial participation and received course credit for participating.
The validity of the TEQ was examined by correlating its total with the Empathy Quotient and Autism Quotient. Additionally, item-remainder coefficients and Cronbach’s alpha were calculated. Test-retest reliability was determined by calculating the correlation between returning participants’ scores attained during Study 2 and re-administration of the TEQ. In order to assess an effect of attrition, a paired-samples t-test was calculated to determine differences in TEQ score between test administrations. Gender differences in the TEQ were assessed by an independent samples t-test and Cohen’s d.
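The two retest statistics named above can be sketched as follows: a Pearson correlation between first and second administrations, and a paired-samples t on the difference scores. The function names are illustrative and the example assumes each participant contributes one total score per administration.

```python
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def paired_t(first, second):
    """Paired-samples t statistic and degrees of freedom (n - 1)."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / n ** 0.5), n - 1
```

The correlation indexes rank-order stability, while the paired t checks for a shift in mean level; with 65 returning participants the paired test has 64 degrees of freedom.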
As predicted, the TEQ correlated positively with the Empathy Quotient, r = .80, p < .001, and negatively with the Autism Quotient, r = -.33, p < .01. Item-remainder coefficients for the TEQ were sound, ranging from .34 to .71 (see Table 1). Moreover, the internal consistency of our measure remained good, α = .87. Finally, the TEQ demonstrated high test-retest reliability, r = .81, p < .001. Differences in TEQ means (Table 3) were not significant between test administrations, t(64) = 1.51, p > .10. As in Study 2, a moderate effect of gender was observed (Table 4).
The construct of empathy has assumed various definitions, as reflected by the heterogeneous nature of current self-report measures of empathy. In an EFA, we determined what was shared by the corpus of empathy questionnaires by determining a single common factor. Items forming this factor were then used to construct a new unidimensional scale, the TEQ, for the assessment of empathy. This new scale captures the underlying consensus among questionnaire measures currently in use, and may prove an important tool in capturing performance on this elusive construct. The items represented in this single factor suggest that, among current measures of empathy, the most commonly measured construct reflects primarily an emotional process, or an accurate affective insight into the feeling state of another. The results of Studies 1 through 3 demonstrate that the TEQ possesses a robust single factor structure, high internal consistency, convergent validity with existing self-report scales, as well as behavioral measures of interpersonal skills and high test-retest reliability. Overall, the TEQ is a psychometrically sound, easily administered and brief self-report measure of empathy.
Emphasis on the emotional components of empathic responding in the TEQ is consistent with the approach taken by other researchers in forming self-report measures of empathy (e.g., Mehrabian & Epstein, 1972). For example, a confirmatory factor analysis of the IRI found one general dimension of empathy at the apex, Empathic Concern; this dimension overlaps to a great extent with Perspective Taking and Fantasy (Cliffordson, 2002). Consistent with this finding, the TEQ correlated highly with the IRI subscale of Empathic Concern (Studies 1 and 2) and, to a lesser degree, with Perspective Taking (Studies 1 and 2) and Fantasy (Study 2). Taken together, these results suggest that the four-factor (i.e., multiple subscale) solution implicit in the IRI may not be necessary to capture empathic responding in self-report measures.
Cognitive accounts of empathy, although not mutually exclusive with affective accounts, emphasize aspects of social responding involving the ability to take the perspective of another (Allport, 1961; Mead, 1934), role-taking (Mead, 1934) and the ability to infer and predict another’s behavior or mental state (Baron-Cohen & Wheelwright, 2004; Dennett, 1987). The TEQ demonstrated an association with this cognitive account, correlating with the IRI subscales of Perspective Taking and Fantasy, described previously as the cognitive components of empathy (Davis, 1983). This association suggests significant overlap across the “cognitive” and “affective” components of empathy described in the literature, where inter-correlation of emotional and cognitive accounts of empathic responding may indicate shared processes (for similar accounts of Theory of Mind reasoning, see Leslie, Friedman & German, 2004). Indeed, evidence from neuroimaging and monkey research suggests that cognitive and affective empathy may be mediated in different domains but are represented by the same underlying process in viscero-motor mirror neurons, which fire in response to both executing and observing a goal-directed action or the emotional experience of another (Gallese, 2003; Gallese, Keysers & Rizzolatti, 2004).
The TEQ contains 16 questions that encompass a wide range of attributes associated with the theoretical facets of empathy. The affective aspect of empathic responding is thought to be related to such phenomena as emotional contagion (Lipps, 1903; Eisenberg & Miller, 1987), emotion comprehension (Haxby, Hoffman & Gobbini, 2000), sympathetic physiological arousal (Levenson & Ruef, 1992) and conspecific altruism (Rice, 1964), all of which are represented in TEQ items. Two items specifically target the perception of an emotional state in another that stimulates the same emotion in oneself (items 1 and 4). One item assesses emotion comprehension in others (item 8). Other items address the assessment of emotional states in others by indexing the frequency of behaviors demonstrating appropriate sensitivity (items 2, 7, 10, 12 and 15). The TEQ also contains items tapping sympathetic physiological arousal (items 3, 6, 9 and 11) and altruism (items 5, 14 and 16). Finally, one item probes the frequency of behaviors engaging higher-order empathic responding, such as pro-social helping behaviors (item 13). Eight items are negatively scored (items 2, 4, 7, 10, 11, 12, 14 and 15), reflecting the frequency of situational indifference towards another individual on the above-described parameters. Taken together, these items represent a wide variety of empathy-related behaviors described in the current literature surrounding this process.
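The scoring rule implied by the negatively scored items can be sketched as follows. The item numbers come from the paragraph above; the numeric response range (0 to 4, one point per frequency category) and the function name `score_teq` are assumptions made for illustration and should be checked against the response form actually administered.

```python
# Items described in the text as negatively scored.
REVERSED_ITEMS = {2, 4, 7, 10, 11, 12, 14, 15}
N_ITEMS = 16


def score_teq(responses, max_score=4):
    """Sum 16 item responses, reverse-scoring the negatively keyed items.

    responses: item responses in questionnaire order (item 1 first).
    max_score: highest response value; 4 is an assumed 0-4 frequency scale.
    """
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    total = 0
    for item, value in enumerate(responses, start=1):
        if not 0 <= value <= max_score:
            raise ValueError(f"item {item}: response {value} is out of range")
        # A reversed item contributes max_score - value, so frequent
        # indifference lowers the total rather than raising it.
        total += (max_score - value) if item in REVERSED_ITEMS else value
    return total


if __name__ == "__main__":
    # Hypothetical respondent who answers the midpoint on every item.
    print(score_teq([2] * N_ITEMS))
```

Under the assumed 0 to 4 range, totals run from 0 to 64, and a respondent who gives the same answer to every item lands at the midpoint because the eight reversed items offset the eight positively keyed ones.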
We predicted that the TEQ would diverge from measures surveying Autism Spectrum Disorder, since the latter taps deficits in social processing among other symptoms of this disorder. Consistent with this prediction, the TEQ shows a negative correlation with poor interpersonal and social responding, as partially assessed by the Autism Quotient, a measure of Autism Spectrum Disorder symptomatology (Baron-Cohen, Wheelwright, Skinner, et al., 2001), demonstrating concurrent validity. As expected, the magnitude of this association was modest, given that the Autism Quotient also measures other symptoms of Autism not related to social skill. The TEQ also demonstrated convergent validity in the positive correlations observed, not only with self-report measures of empathy, but also with two behavioral measures that require the processing of complex interpersonal stimuli. Interpersonal information must be interpreted accurately in order to facilitate the task of responding in an empathic fashion (Bernieri, 2001). The observed convergence contrasts with previous findings, in which empathy questionnaires and behavioral tasks often do not correlate (Ickes, 1997; cf. Zaki et al., 2008). Importantly, tasks such as the MIE and IPT that directly assess interpersonal sensitivity demonstrate a higher degree of ecological validity than do self-report tasks. Behavioral measures of interpersonal sensitivity, however, carry the disadvantage of being time- and effort-intensive. The TEQ provides a quick means of assessing interpersonal sensitivity in a manner consistent with these behavioral measures, with substantial time savings and ease of administration. Notably, the IRI, a commonly used self-report measure of empathy, demonstrated weaker and statistically unreliable associations with these same tasks in our dataset (see also Mar, Oatley, Hirsh, dela Paz & Peterson, 2006).
The TEQ also correlated highly with a considerably longer measure of empathic responding, the 80-item Empathy Quotient (Baron-Cohen & Wheelwright, 2004). Shorter questionnaires such as the TEQ are especially useful for inclusion in mass-testing packets, internet research, or any other instance where time or participant fatigue is an issue.
In developing the TEQ, we created a parsimonious scale that is short, clear and homogeneous and has strong psychometric properties, including a robust single factor structure, high internal consistency, construct validity and test-retest reliability. One limitation of this study is that our data were derived from a relatively small sample composed of college-aged students. Further work is required to assess the generalizability of our findings to a wider age range. The observed central tendency and variability of the IRI, Autism Quotient, EQ, MIE, and IPT-15 across our studies are, however, consistent with previous publications, suggesting that the current samples are generalizable. Inconsistent gender differences, with effect sizes ranging from trivial to moderate, will need to be addressed in larger samples. The TEQ, with its brevity and ease of administration, could be useful in patient populations.
Altered empathic responding has been reported in patients with Axis I (Clinical Syndromes; O’Connor, Berry, Weiss & Gilbert, 2002; Deardorff, Kendall, Finch & Sitarz, 1977) and Axis II (Developmental and Personality Disorders; Guttman & Laporte, 2000; Tantam, 1995) psychiatric disorders, as well as in neurological patients with “acquired sociopathy” (Blair & Cipolotti, 2000), frontal lobe lesions (Eslinger, 1998) and frontotemporal lobar degeneration (Rankin, Kramer & Miller, 2005). These deficits pose serious challenges to the quality of life of the patient, family members and caregivers. Work is currently underway in our laboratory to develop a second, caregiver-report measure based on the TEQ. Deficits in empathic responding may be better understood through assessment and quantification, leading to effective intervention.
We thank Ewa Munro and Pheth Sengdy for assistance in compiling the questionnaire measures. This study was supported by Canadian Institutes of Health Research (MGP-62963) and National Institute of Child Health and Human Development (HD42385-01) grants to B.L.
Below is a list of statements. Please read each statement carefully and rate how frequently you feel or act in the manner described. Circle your answer on the response form. There are no right or wrong answers or trick questions. Please answer each question as honestly as you can.