Two distinct literatures have emerged on the functionality of the anterior temporal lobes (ATL): in one field, the ATLs are conceived of as a repository for semantic or conceptual knowledge. In another field, the ATLs are thought to play some undetermined role in social-emotional functions such as Theory of Mind. Here we attempted to reconcile these distinct functions by assessing whether social semantic processing can explain ATL activation in other social cognitive tasks. Social semantic functions refer to knowledge about social concepts and rules. In a first experiment we tested the idea that social semantic representations can account for activations in the ATL to social attribution stimuli such as Heider and Simmel animations. Left ATL activations to Heider and Simmel stimuli overlapped with activations to social words. In a second experiment we assessed the putative roles of the ATLs in the processing of narratives and theory of mind content and found evidence for a role of the ATLs in the processing of theory of mind but not narrative per se. These findings indicate that the ATLs are part of a neuronal network supporting social cognition and that they are engaged when tasks demand access to social conceptual knowledge.
The function of the anterior temporal lobes (ATLs) is not well understood (Olson et al., 2007). Among the reasons for this are the technical challenges for functional imaging posed by their anatomical location. The ATL is proximal to the air-filled cavities of the nasal sinuses, resulting in inhomogeneities in the magnetic field of functional MRI (fMRI) scanners and leading to signal loss (Devlin et al., 2000). More influential have been findings from patients with semantic dementia, a variant of fronto-temporal dementia (FTD) that prominently affects the ATLs. These patients suffer from a progressive deterioration of the ATLs (Davies et al., 2004; Mummery et al., 2000) which frequently causes semantic dementia (Hodges et al., 1992; Snowden, 1989). Semantic dementia patients show impairments in tasks requiring access to semantic knowledge of words and objects (Rogers et al., 2004; Snowden, 1989; Warrington, 1975). Their deficits are observed over a wide variety of semantic categories and representational formats (e.g., pictures, words, environmental sounds) (Garrard, 2002; Patterson et al., 2007; Rogers et al., 2006b) even though other aspects of memory, cognition, and language processing remain relatively intact (Hodges et al., 1992; Snowden, 1989). Similarly, when transcranial magnetic stimulation is applied over the lateral ATLs of healthy individuals, synonym judgment as well as object naming is slowed (Pobric et al., 2007) and this pattern can be observed in both hemispheres (Lambon Ralph et al., 2009).
These and other findings have motivated scientists to propose that the ATLs are critical for amodal, domain-general aspects of semantic processing (McClelland and Rogers, 2003; Patterson et al., 2007; Rogers et al., 2006a; Rogers et al., 2004). This “semantic hub” model proposes that the ATLs mediate communication between the modality-specific sensory regions distributed throughout the cortex that encode representations of object attributes (McClelland and Rogers, 2003; Rogers et al., 2006a; Rogers et al., 2004).
A problem for the semantic hub model is that ATL activation is not evident across the vast majority of fMRI studies on semantic memory (Thompson-Schill, 2003). Semantic hub proponents have argued that this is due to signal loss in the ATLs caused by susceptibility artifacts (Devlin et al., 2000). Indeed, attaining precise BOLD imaging of the ATLs is still problematic unless an optimized pulse sequence and voxel size are used (see methods). In a recent review Visser and colleagues (Visser et al., 2009) investigated factors contributing to the inconsistencies in the imaging literature regarding the likelihood of finding ATL activation. They identified four factors: (1) the distortion artifacts in imaging studies, (2) a limited field of view (FOV), (3) the use of a high subtraction task, and (4) lack of statistical power due to the failure to use a ROI-based approach. However, even when taking these factors into consideration, imaging studies on semantic memory remain inconsistent.
However, there is a diverse class of stimuli and tasks that reliably evoke ATL activations in fMRI studies: stimuli and tasks with social or socially relevant content. ATL activations are often reported when participants attribute thought, intentions or beliefs to others (Theory of Mind: ToM) and the ATLs have therefore been suggested to be part of a neural network underlying social cognition (Frith and Frith, 2003; Olson et al., 2007).
The idea that the ATLs may have some role in social processing is rendered more credible by the fact that a wide range of ToM tasks and stimuli evoke activations in this region: brief vignettes (Saxe, 2006), cartoons (Gallagher et al., 2000), and even simple animations of geometric shapes that evoke the attribution of intentions (Castelli et al., 2000; Martin and Weisberg, 2003; Ohnishi et al., 2004; Schultz et al., 2003; Tavares et al., 2008), first created by Heider and Simmel in 1944 (Heider, 1944). Moreover, other high-level social tasks evoke activations in the ATL: moral judgments (Moll et al., 2005), socio-emotional stories (Ferstl and von Cramon, 2007) and sounds evoking a social scene such as footsteps (Saarela and Hari, 2008). The special responsiveness of the ATLs to social stimuli would also explain early findings showing ATL activation to familiar and famous faces (Sergent et al., 1992), which has previously been interpreted as evidence for the view that the ATLs' function is in representing “unique entities” (Gorno-Tempini and Price, 2001; Tranel, 2006). Converging evidence for the putative social role of the ATL is found in human neuropsychological data and in ablation studies in non-human primates (for a review see Olson et al., 2007).
There are several ways to reconcile these two literatures – one pointing towards a role of the ATL as a semantic hub, the other towards some underdetermined role in social processing. One possibility is that the ATLs contain functional subdivisions, each separately concerned with aspects of semantics and social cognition. One subdivision may be based on laterality. The majority of research on semantic memory has yielded a left-sided bias in cortical activation (Devlin et al., 2002; Joseph, 2001; Martin and Chao, 2001; Thompson-Schill, 2003). In contrast, neuroimaging studies of social cognition have reported a right hemispheric bias in favor of social stimuli (Moll et al., 2002; Zahn et al., 2009) and the right ATL is dominant in studies of familiar face perception (Grabowski et al., 2001; Kuskowski and Pardo, 1999; Nakamura et al., 2000). Likewise, resection of the right ATL can diminish the ability to recognize, or recall information about, famous or personally familiar faces (Fukatsu et al., 1999; Glosser et al., 2003; Tsukiura et al., 2003).
A second possibility is that the ATLs, along with distributed networks along the superior temporal sulcus (STS), lack a specific, fixed functional role. In this view, the activity in the ATLs is determined by coactivations of cell populations in other parts of a distributed neural network. In their recent review of experiments involving activity in the STS, Hein and Knight (Hein and Knight, 2008) proposed that the same brain region can support different cognitive functions depending on task-dependent network connections.
A third possibility, and the focus of this paper, is that the diverse stimuli and tasks that evoke ATL activity have a more general process in common. Zahn and colleagues (Zahn et al., 2007) recently proposed that the ATL is involved in the processing of social concepts, a type of semantic memory. Evidence for this view comes from an fMRI study in which participants were required to judge the semantic similarity of word pairs from two classes of lexical stimuli: words describing human social semantic concepts (honor - brave) versus words describing biological function (nutritious - useful). The results showed bilateral ATL activations when participants judged the relatedness of social concepts as compared with non-social concepts. In this view the ATLs are activated in social cognitive tasks because these tasks require the retrieval of semantic information specific to social situations. This account explains the ATL involvement in a variety of social cognitive tasks and nicely reconciles these findings with the literature on semantic processing.
The social semantic hypothesis makes a critical prediction: tasks that involve the recovery of social semantic information should show overlapping activations in the ATL even when tasks are substantially different in design, stimulus quality, and task demands. The test of this prediction is the motivation and rationale of our first experiment.
In Experiment 1 we compared the functional overlap between two perceptually and cognitively distinct tasks. In a conjunction design, it is important to choose task pairs that are as different as possible, so that activations in shared cortical areas can be convincingly attributed to the shared cognitive process and not the physical stimulus properties or task similarities (Friston et al., 1999). In the social task, a classic social attribution task, Heider and Simmel (HSim) animations were used (Heider, 1944). The perception of social agency in this task is largely dependent on the particular pattern of visual motion, which evokes the perception of animism, complete with human-like interaction and the attribution of thought and intentions. In the control condition, the motion fails to evoke the perception of agency or animism. In the lexical task, participants were required to make similarity judgments about word pairs that had either a social or non-social meaning (Zahn et al., 2009; Zahn et al., 2007). The social semantic hypothesis predicts that we should observe overlapping ATL activations in these two tasks. In contrast, if both tasks activate the ATLs but do not overlap, this would suggest that the ATLs have functional subdivisions, with distinct regions processing semantic stimuli and social stimuli.
Fifteen neurologically normal participants (7 female; mean age: 27.86; SD: 5.15; 8 male; M: 26.25; SD: 4.74) volunteered for this fMRI experiment. All participants were right-handed, native English speakers with normal or corrected-to-normal vision. Informed consent was obtained according to the guidelines of the Institutional Review Board of the University of Pennsylvania and every participant received monetary compensation for participation in the experiment. All participants except two (the experimenter and his coworker) were naïve with respect to the purpose of the experiment and were debriefed after the experiment.
The Heider and Simmel (HSim) paradigm consisted of animations of small geometric shapes like the ones used by Heider and Simmel (1944). The “social” and “non-social” movies were used in a previous fMRI study (Schultz et al., 2003) and were generously provided to us by Robert Schultz and his coworkers. We added a third, non-animated control condition to this paradigm in order to isolate effects that are solely attributable to the motion of the stimuli. Similar to the stimulation used by Heider and Simmel, social movies consisted of three white geometric shapes (circle, triangle and diamond) that moved against a black background. A white square (box) was located in the center of the screen with one wall that opened as if on a hinge, allowing the shapes to open and shut the door, and to enter, chase or drag other shapes inside. In each of the 15 sec movies the objects' motion suggested personal agency and reciprocal and contingent interactions that were easily interpreted as being of a social nature. They were scripted to follow a social story such as a hide and seek scenario, two objects conspiring against another, etc. After watching the full movie the participants were asked to decide by pushing a button if all three shapes were “friends” or not. Half of the films had interactions suggesting friendship between all three objects. The films were scripted so that task-critical interactions occurred in the final few seconds of the film. This forced participants to attend throughout the entire film in order to give a correct answer.
The control condition was as similar as possible to the social condition but without invoking the impression of a social interaction between the objects. The static appearance of the 15 sec movies was identical to that of the social movie condition. In contrast to the experimental condition, however, participants were told that they would observe three “bumper cars” (BCar) like the ones seen in amusement parks. The cars collided on their motion paths and participants were asked to monitor the movements and collisions of the cars in order to determine whether they were all the same weight.
The third control condition consisted of three still slides that were selected randomly from the frames that constituted the movies of the two animated conditions. Each slide was presented for 5 sec and the presentation of the slides was followed by a 3 sec response epoch. Participants were asked to watch the three slides passively and decide with a button press after the offset of the third slide whether all objects were outside the box in all three slides. Each of the three condition blocks of the HSim paradigm was 18 sec in duration and was preceded by a 3 sec prompt announcing the upcoming condition, task and response side (see supplement 5). Each condition was presented 3 times during one run in pseudo-random order, resulting in a run duration of 3 min and 36 sec.
The lexical stimuli and the associated psycholinguistic parameters for our experiment were used in a previous experiment (Zahn et al., 2007) and were generously provided by Roland Zahn and colleagues. From this list 60 social (Soc) (John et al., 1991) and 60 animal function (An) (McRae et al., 2005) word pairs were selected based on their rated descriptiveness. In their study, descriptiveness correlated positively with activation in the ATLs and we therefore chose the word pairs with the highest scores on this variable. As revealed by several uncorrected independent t-tests, the selected social semantic and animal function word pairs were not different regarding written word frequency (Francis and Kucera, 1982) t(118) = 0.871; p = 0.38 and descriptiveness t(118) = -1.52; p = 0.13 but differed significantly in terms of imageability t(118) = 16.75; p <0.001 and concreteness t(118) = 18.63; p <0.001. Imageability and concreteness were also highly correlated with each other with r = 0.93; p < 0.001. It is important to point out, however, that in the study by Zahn and others (Zahn et al., 2007) both concreteness and imageability were not correlated with activation in the ATLs. However, we reserved the option to test for effects of both variables by creating 5 bins of word pairs of varying concreteness and assigned them to our 5 lexical runs. We counterbalanced the order of their presentation allowing us to test for the effects of these variables on ATL activation in an additional analysis (see supplement 1). For a more comprehensive description of the psycholinguistic stimulus properties, please refer to the supplemental material in Zahn and others (Zahn et al., 2007).
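The degrees of freedom reported above, t(118), are what an independent two-sample t-test with pooled variance gives for two groups of 60 word pairs (60 + 60 - 2). The sketch below illustrates that computation. It assumes the pooled (Student) form of the test, which the text does not state explicitly, and the ratings in it are invented placeholders, not the stimulus norms used in the study.

```python
import math
import statistics


def pooled_t(sample_a, sample_b):
    """Return (t, df) for an independent two-sample t-test with pooled variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * statistics.variance(sample_a)
                  + (n_b - 1) * statistics.variance(sample_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / standard_error
    return t, df


# Hypothetical ratings (NOT the study's data): 60 values per word class.
social = [3.0 + 0.1 * (i % 7) for i in range(60)]
animal = [5.0 + 0.1 * (i % 5) for i in range(60)]

t, df = pooled_t(social, animal)
print(f"t({df}) = {t:.2f}")
```

With 60 items per class the function always returns df = 118, matching the reported tests; the t value itself depends entirely on the placeholder ratings.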
In each individual trial, word pairs were presented as white letters on a black background for the duration of 3 sec, located above and below the center of the screen respectively. After 3 sec a question mark appeared in the center of the screen for 3 sec. Participants were instructed to respond during this period as to whether the two words were semantically related or not. After the offset of the question mark, the next word pair appeared on the screen. Each block contained the presentation of 3 word pairs with 3 response periods totaling 18 sec in duration.
A third control condition was introduced that was intended to provide insight into cortical activation patterns associated with semantic retrieval per se rather than a specific semantic class. For that, we used an equally demanding non-semantic task adapted from a procedure described in Pobric and others (Pobric et al., 2007). While it is difficult, if not impossible, to create a task with stimuli that are completely devoid of semantic meaning, a previous study by Pobric and others has shown that transcranial magnetic stimulation (TMS) to the ATLs that impaired performance in a lexical semantic task did not reduce performance in a number comparison task. Here, we used a similar task that we adapted to be more amenable to our experimental setup. Instead of word pairs, participants were presented with number pairs between 0 and 99. These were presented for 3 sec and were located above and below the center of the screen. After the presentation of the numbers, a question mark was presented in the center of the screen for 3 sec, and participants had to indicate via button press whether the lower number was in the range of +/- 5 of the number above. Analogous to the lexical task, each block was 18 sec in duration and contained 3 responses and was preceded by a 2 sec prompt heralding the forthcoming task and response side assignment (see supplement 5). The number pairs were assigned pseudo-randomly to the blocks.
All participants received a standardized 15-min computer-based instruction and practice session. After placing the participant into the scanner, an approximately 10 min long high-resolution anatomical scan was obtained. The anatomical image was used to fit the volume of covered brain tissue acquired in the functional scan. Participants received 10 functional runs (5 HSim runs and 5 word runs) of 3 min 36 sec duration each (72 TRs) in a counterbalanced order. Response-side allocation was also counterbalanced between participants. As part of each individual run participants received a 15 sec instruction about the upcoming task and response-side allocation. The experimenter checked with the participant on an intermittent basis and allowed for breaks if needed. The duration of the scan session was approximately 45 min.
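The decision rule of the number comparison task can be written down in a few lines. The sketch below is only an illustration: the function name, the example pairs, and the choice to treat a difference of exactly 5 as "in range" are assumptions, since the text does not say how boundary cases were scored.

```python
def within_range(upper, lower, tolerance=5):
    """True if the lower number lies within +/- tolerance of the upper number.

    The boundary (a difference of exactly `tolerance`) is counted as in range;
    this inclusive choice is an assumption, not stated in the text.
    """
    return abs(lower - upper) <= tolerance


# Hypothetical number pairs (upper, lower), each between 0 and 99.
pairs = [(42, 46), (42, 48), (10, 5), (99, 0)]
for upper, lower in pairs:
    print(upper, lower, within_range(upper, lower))
```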
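The run length and the number of volumes are consistent with the 3 s repetition time reported for the functional sequence, as the short check below shows. The variable names are illustrative only.

```python
TR_SEC = 3            # repetition time of the functional sequence
VOLUMES_PER_RUN = 72  # volumes (TRs) acquired per run
N_RUNS = 10           # 5 HSim runs + 5 word runs

run_sec = TR_SEC * VOLUMES_PER_RUN
minutes, seconds = divmod(run_sec, 60)
total_functional_min = N_RUNS * run_sec / 60

print(f"one run: {minutes} min {seconds} sec")                 # one run: 3 min 36 sec
print(f"all functional runs: {total_functional_min:.0f} min")  # all functional runs: 36 min
```

Adding the roughly 10 min anatomical scan to the 36 min of functional scanning gives about 46 min, in line with the reported session length of approximately 45 min.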
Neuroimaging sessions were conducted at the Center for Functional Neuroimaging at the University of Pennsylvania on a 3.0 T Siemens Trio (Erlangen, Germany) using an eight-channel multiple-array Nova Medical (Wilmington, MA) head coil.
Functional T2*-weighted images sensitive to blood oxygenation level-dependent contrasts were acquired using a gradient-echo echo-planar pulse sequence (repetition time (TR), 3 s; echo time (TE), 20 msec; FOV = 240 × 240; voxel size, 2.5 × 2.5 × 2.5 mm; matrix size, 96 × 96; flip angle = 90°) and automatic shimming. This pulse sequence was chosen based on functional and anatomical data from a pilot scan. Here, one participant was scanned at rest under varying parameters of voxel size (2 mm, 2.5 mm, 3 mm), TE (20 ms, 30 ms, 32 ms) and automatic versus manual shimming to the ATLs with an axial slice orientation. Signal coverage in the ATLs varied between parameter combinations and was best for the described pulse sequence with regard to signal gain in the ATLs. Visual inspection of the co-registered functional image confirmed signal coverage in the ATLs in all participants. However, signal coverage was weaker in inferior lateral aspects of the temporal lobes, in the inferior temporal gyrus in the middle section of Brodmann area 20. Signal loss in the orbitofrontal cortex and in the anterior-most aspects of the frontal lobes above the frontal sinuses was observed and varied between participants.
Thirty interleaved axial slices with 2.5 mm thickness were acquired to cover the temporal lobes. On the basis of the anatomical information of the structural scan the lowest slice was individually fitted to cover the most ventral aspect of the inferior temporal lobes. We attained partial brain coverage. The coverage of brain tissue in the dorsal direction varied greatly between participants depending on head size. In all participants the dorsal parts of the frontal and parietal lobes were not covered.
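The acquisition parameters can be cross-checked with simple arithmetic, which also makes the extent of the partial coverage concrete. The sketch assumes that the field of view is given in millimetres and that the slices were contiguous (no inter-slice gap); neither is stated explicitly in the text.

```python
VOXEL_MM = 2.5   # isotropic voxel edge length
MATRIX = 96      # in-plane matrix size (96 x 96)
N_SLICES = 30    # axial slices per volume

# In-plane field of view implied by matrix size and voxel size.
in_plane_fov_mm = MATRIX * VOXEL_MM

# Thickness of the imaged slab, assuming contiguous slices (an assumption).
slab_thickness_mm = N_SLICES * VOXEL_MM

print(f"in-plane FOV: {in_plane_fov_mm:.0f} mm")
print(f"slab thickness: {slab_thickness_mm:.0f} mm")
```

A slab of this thickness, anchored at the most ventral aspect of the temporal lobes, is consistent with the statement that the dorsal frontal and parietal lobes were not covered.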
The 10 functional runs were preceded by a high-resolution structural scan. The T1-weighted images were acquired using a three-dimensional magnetization-prepared rapid acquisition gradient echo pulse sequence (TR, 1620 msec; TE, 3 msec; FOV = 192 × 256 mm; inversion time, 950 ms; voxel size, 0.9766 × 0.9766 × 1 mm; matrix size, 192 × 256 × 160; flip angle = 15°; 160 contiguous slices of 1.0 mm thickness). Stimuli were rear projected onto a Mylar screen at the end of the scanner bore with an Epson (Long Beach, CA) 8100 3-liquid crystal display projector equipped with a Buhl long-throw lens (Navitar, Rochester, NY). Participants viewed the stimuli through a mirror mounted to the head coil. Responses were recorded using a four-button fiber optic response pad system, of which the outer left and outer right buttons indicated responses. The stimulus delivery was controlled by E-Prime software (Psychology Software Tools Inc., Pittsburgh, PA) on a Windows laptop located in the scanner control room.
fMRI data were preprocessed and analyzed using Brain Voyager software (Goebel et al., 1998). The preprocessing of the functional data included a correction for head motion (trilinear/sinc interpolation), the removal of linear trends and frequency temporal filtering. The data were coregistered with their respective anatomical data and transformed into Talairach space (Talairach, 1988). The resulting volumetric time course data were then smoothed using a 6 mm Gaussian kernel.
For all blocks, a canonical hemodynamic response function (HRF) was modeled spanning the entire 15 sec of both movie conditions and the still image condition. The HRF for the word blocks spanned the presentation of 3 word pairs (and number pairs).
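A block predictor of this kind is commonly built by convolving a boxcar (1 during the block, 0 elsewhere) with an HRF sampled at the TR. The sketch below illustrates the idea in plain Python. It is not the analysis software's implementation: the two-gamma shape parameters and the block onsets are assumptions chosen for illustration, as the text specifies neither.

```python
import math

TR_SEC = 3.0  # repetition time of the functional sequence


def double_gamma_hrf(t, peak_shape=6.0, under_shape=16.0, under_ratio=1 / 6):
    """Two-gamma HRF at time t (sec); parameter values are assumed defaults."""
    if t <= 0:
        return 0.0
    peak = t ** (peak_shape - 1) * math.exp(-t) / math.gamma(peak_shape)
    under = t ** (under_shape - 1) * math.exp(-t) / math.gamma(under_shape)
    return peak - under_ratio * under


def block_regressor(onsets_sec, block_sec, n_volumes, tr=TR_SEC):
    """Return (boxcar, regressor): the boxcar and its convolution with the HRF."""
    boxcar = [0.0] * n_volumes
    for onset in onsets_sec:
        for v in range(n_volumes):
            if onset <= v * tr < onset + block_sec:
                boxcar[v] = 1.0
    # HRF kernel sampled once per TR over 30 sec.
    hrf = [double_gamma_hrf(k * tr) for k in range(int(30 / tr) + 1)]
    regressor = []
    for v in range(n_volumes):
        # Discrete convolution: sum over past boxcar samples weighted by the HRF.
        value = sum(boxcar[v - k] * hrf[k] for k in range(len(hrf)) if v - k >= 0)
        regressor.append(value)
    return boxcar, regressor


# Hypothetical onsets (sec) of three 15 sec blocks of one condition in a 72-volume run.
boxcar, regressor = block_regressor(onsets_sec=[18, 81, 144], block_sec=15, n_volumes=72)
print([round(x, 3) for x in regressor[:14]])
```

The predictor stays at zero before the first block, rises with a delay after block onset, and decays after block offset, which is the delayed, smoothed profile a GLM fits to each voxel's time course.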
To see whether our manipulation resulted in a typical activation in the ATLs we subjected our data to a random effects GLM on a group level, for the individual conditions in the movie and word runs respectively. The GLM for the movie runs contained the HSim condition, the bumper car condition, and the still image condition as predictors of interest. Social words, animal function words, and numbers served as predictors of interest in the word runs. We applied a false discovery rate (FDR) at α < 0.05 to explore the effects of the critical contrasts in the entire covered brain tissue. We expected the effects in the word condition to be smaller since we were unable to use a regression approach as in Zahn et al. (2007) due to the blocked design of our experiment. Therefore we reserved the option to apply an anatomically constrained region of interest (ROI) approach to analyze possible effects in word contrasts in the ATLs.
We defined ROIs in the ATLs on the basis of the HSim vs. BCar contrast image that was derived from a fixed effects GLM on a participant level. The center of the ROI was identified as the peak voxel of the activation that survived an α < 0.05 FDR on the fixed effects GLM of each individual participant. The anatomical location was constrained to BA 38 as verified by the Talairach coordinate of the peak voxels using the online Talairach applet (www.talairach.org/applet/). We expected some inter-individual variability in the cortical location of the peak response to this contrast within the ATL.
We used a cluster size with an upper limit of 125 voxels (5×5×5) of continuously significant voxels to define ROIs in the bilateral ATLs. Our principal scientific question about functional overlap between activation in social semantic networks and HSim animations warranted a more lenient threshold because we intended to avoid a type II error. Differences between social concepts and animal function concepts were then tested within these anatomically and functionally defined ROIs using a multi-subject ROI analysis based on the General Linear Model and subsequent contrasts of interest (t-statistics).
In addition we identified the peak activations for the HSim vs. BCar and the Soc vs. An contrasts on the basis of a fixed effects GLM computed for each individual participant and plotted them into a Talairach transformed brain that was rendered transparent (Figure 2). This descriptive analysis allows the inspection of the distribution of activations within the ATLs for each contrast and the assessment of inter-individual variability and the overlap between clusters.
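For readers unfamiliar with FDR thresholding, the sketch below shows the Benjamini-Hochberg step-up procedure, a widely used way of controlling the false discovery rate across many voxel-wise tests. The text does not name the specific FDR procedure that was applied, so its use here is an assumption, and the p-values are invented for illustration.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return booleans marking which tests survive FDR control at level q."""
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= k * q / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank
    # Every test ranked at or below the cutoff is declared significant.
    survivors = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            survivors[idx] = True
    return survivors


# Hypothetical voxel-wise p-values, not taken from the study.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
print(benjamini_hochberg(p_vals))
```

Unlike a Bonferroni correction, the threshold adapts to the distribution of observed p-values, so it becomes more lenient when many voxels show an effect.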
Participants understood all tasks and responded satisfactorily. In the HSim task participants correctly identified whether the “people were all friends” in 82% of the cases, correctly determined whether the bumper cars were all the same weight in 66% of the cases, and were near ceiling in the still image (SImg) task (91% correct). Accuracy differed significantly between the HSim and the BCar tasks (t(14) = 3.42; p < 0.05), which was likely because the movements of the bumper cars were slightly more ambiguous with regard to their weight than the movements of the abstract objects in the HSim task were with regard to their social interaction.
Participants also responded adequately in the word runs. The average rate of correct responses in the number comparison task was 91%. There were no objectively correct or incorrect responses in the semantic similarity task, because the semantic similarity between two words varied on a continuum.
Although our research question concerns activation in the ATLs specifically, we first report whole-brain activations to provide a comprehensive picture of the activations in our contrasts, and to enable readers to make comparisons with the extant literature. For locations of significant differences of the random effects analysis (FDR; α = 0.05), their peak coordinates in Talairach space, and p-values please refer to Table 1 (see supplement 2). A visual depiction of the FDR-corrected data is shown in Figure 1.
The HSim condition resulted in stronger and more extensive activations than the BCar condition (Figure 1, panel A). Activations to the HSim condition extended from the most posterior locations on the superior temporal sulcus (STS) into the ATLs. This activation was observed in both hemispheres but was more pronounced in the right hemisphere consistent with previous findings (Castelli et al., 2000; Gobbini et al., 2007; Schultz et al., 2003). Activations were also observed in bilateral inferior frontal gyri (IFG), the fusiform gyrus (FG), and the middle frontal gyri of both hemispheres, in line with prior findings.
The result of the statistical contrast between the BCar and the SImg condition was distinctly different from the contrast between the two main dynamic conditions (Figure 1, panel B). The BCar movies activated the primary visual cortex and spread dorsally, encompassing the lateral occipital cortex, and bordered the posterior superior temporal sulcus (pSTS) in its ventral vicinity, corresponding to the human homologue of motion-sensitive cortex MT/V5. Activation was also observed in the pulvinar.
As can be seen in Panel C in Figure 1, the social semantic word task engaged the left ATL more strongly than the animal function task (Soc > An). The cluster is located on the lateral surface of the anterior-most section of the middle temporal gyrus. The social semantic task also produced a larger activation in the right middle temporal gyrus (not seen in Figure 1). However, this cluster was smaller, the effect was of a lesser magnitude, and it was located considerably more posterior than the cluster in the left ATL. There were also brain areas that were more engaged in the An condition, most notably anterior portions of the parahippocampal gyrus, and the left posterior middle temporal gyrus.
For this analysis we collapsed the two lexical conditions and compared them to our third “non-semantic” condition, the number task (See Figure 1 Panel D). Activations were observed in the left inferior frontal gyrus (IFG), the left ATL, the left STS, bilateral insula, and the caudate nucleus.
The number task did not activate the ATL. Activations were observed in posterior aspects of the middle temporal gyrus, the precuneus, and along the posterior sections of the fusiform gyrus.
A fixed effects GLM on an individual participant basis revealed that 11 out of 15 participants showed significant effects in the social attribution contrast, HSim vs. BCar, in the left ATL and 12 in the right ATL. In the social word contrast, Soc vs. An, significant clusters were found in 11 participants in the left ATL but for only 8 participants in the right ATL. We conducted a group random effects analysis of the social word contrast within clusters defined by the social attribution contrast in each individual participant to assess the overlap between tasks. The GLM revealed significant differences within the left ATL [t(10) = 2.85; p = 0.017] but overlap only approached significance in the right ATL [t(11) = 2.03; p = 0.07].
In a more descriptive analysis we explored peak activations for both contrasts and their distance for each individual participant. The peak activations for both contrasts clustered in the rostral-most aspect of the ATL (Figure 2). The center of the activation cluster in the HSim condition was remarkably similar in both hemispheres (left: x: -44; y: 15; z: -18; right: x: 44; y: 14; z: -19) and was located in the anterior STS. The average coordinate for the word contrast was at x: -41; y: 20; z: -24 in the left hemisphere and at x: 46; y: 14; z: -21 in the right hemisphere. The distance between the centers of the clusters for the HSim and word contrasts was therefore very small. The average Euclidean distance between peak activations of each contrast, calculated for each participant and averaged, was 11 mm in the left hemisphere and 14 mm in the right hemisphere.
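The distance between the cluster centers can be computed directly from the average Talairach coordinates reported above. The sketch below does this with the standard Euclidean distance; the dictionary and function names are illustrative only.

```python
import math

# Average peak coordinates (Talairach x, y, z in mm) as reported in the text.
hsim_center = {"left": (-44, 15, -18), "right": (44, 14, -19)}
word_center = {"left": (-41, 20, -24), "right": (46, 14, -21)}


def euclidean_mm(a, b):
    """Euclidean distance between two 3D coordinates given in mm."""
    return math.dist(a, b)


for hemi in ("left", "right"):
    d = euclidean_mm(hsim_center[hemi], word_center[hemi])
    print(f"{hemi}: {d:.1f} mm")
```

These distances between group-average centers come out smaller than the per-participant averages of 11 mm and 14 mm. That is expected: averaging coordinates first cancels individual scatter, so the distance between mean locations can never exceed the mean of the individual distances.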
Experiment 1 tested the hypothesis that retrieval of social concepts, a type of semantic memory, underlies activations observed in the ATLs during social cognitive tasks. To test this hypothesis we employed a conjunction paradigm measuring overlap between a visual social attribution task and a verbal social word judgment task. The results supported our hypothesis, showing clear overlap in the left ATL. Overlap in the right ATL only approached significance despite slightly larger power of the test.
To our surprise the contrast between the collapsed semantic tasks and the number task was associated with activation in the left IFG, but not the ATLs. This may be explained by the distributed nature of cortical representations of concepts. By collapsing over categories we may have introduced additional variability that lead to the negative finding.
The question remains as to how best explain the right lateralized ATL activations observed in the Heider and Simmel task, and social attribution tasks more generally. This question was the focus of Experiment 2. A review of the literature indicated that there were two plausible hypotheses. The first hypothesis was that the right ATL activations caused by the Heider and Simmel task were due to the fact that animations that evoke social attribution also induce the formation of narratives. Several studies have linked ATL activations to the comprehension of narratives. For instance, Mazoyer and colleagues (Mazoyer, 1993) measured regional cerebral blood flow with PET while participants listened to lists of words, sentences containing pseudo words, semantically anomalous sentences and native and non-native stories. Activations were observed bilaterally in polar aspects of the ATL (for similar results see (Fletcher et al., 1995b). Maguire and colleagues (Maguire et al., 1999) found that the temporal poles were activated to coherent as compared to non-coherent narratives. The authors proposed that the temporal pole is involved when sentences are linked to form a narrative (Hickok and Poeppel, 2007).
Stories constitute narratives when sentences are tied into a structure to convey discourse level information that is not encoded in individual sentences (Xu et al., 2005). This generates a conceptual structure that makes it possible to process information in a larger context, incorporating the past knowledge and therefore access to long-term memory of the recipient. When participants are asked to report on their perception of typical HSim stimuli, they spontaneously form relatively elaborate narratives (Castelli et al., 2000; Heider, 1944). Participants relate perceived events and link them into a coherent string of causally related events enacted by the abstract protagonists. This aspect of the HSim animations is distinctly different from the BCar condition since the simple collisions between the shapes do not allow the formation of narratives. Here, events are semantically similar (collisions) and do not appear to be causally related to one another.
The second hypothesis was that the right ATL activations evoked by the HSim task were due to comprehension of the mental state of the protagonists in the movies, or in other words, ToM. HSim animations require one to extract levels of meaning that are not explicitly encoded in the stimulus. This requires the formation of inferences about the mental states of the actors. As noted earlier, ToM manipulations frequently activate the ATLs.
In Experiment 2 participants were exposed to the Heider and Simmel paradigm and three lexical conditions: ToM stories, stories without ToM content, and unlinked sentences. One goal of this experiment was to test whether the activations in the ATLs to social cognition tasks (particularly the right) can be explained by the narrative structure that accompanies many such tasks. A second goal was to test whether the activations can be explained by processes underlying ToM cognition.
Thirteen neurologically normal participants (9 female; mean age: 27.86; SD: 5.15; 3 male; mean age: 26.25; SD: 4.74) volunteered for this fMRI experiment. One participant was excluded from further analysis because of technical problems during the scan.
As in Experiment 1, the HSim paradigm and the story conditions were delivered in separate runs during one session. One session consisted of 3 HSim runs and 5 “story runs” and lasted about 45 min including the anatomical scan. Participants' responses were closely monitored but were not included in the analysis since their sole purpose was to keep the participants focused on the story content.
We employed the Heider and Simmel paradigm again (see Experiment 1 methods), but this time without the SImg condition. The HSim and the BCar condition alternated throughout one run. Three HSim runs were delivered during one session and were interleaved with 5 story runs containing the new experimental conditions. A HSim run lasted about 4.5 min and 5 HSim blocks and 5 BCar blocks were delivered during one run.
The stories consisted of short three-sentence vignettes and were constructed to match each other in story content and word count (mean word count: ToM: 36.5; nNarr: 36.1; nToM: 36.3). All stories contained social content in the form of an interaction between two people. Three stories in each condition were delivered during one run (9 stories per run). Each run lasted about 4.5 min and 5 story runs were presented during one session. There were therefore 15 stories for each condition in the experiment (45 total). All stories were delivered for 13 sec, preceded by a prompt (2 sec) and followed by a response period of 6 sec and 6 sec rest (see supplement 5). Example vignettes for all conditions can be found in supplement 3.
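The trial timing described above can be checked with simple arithmetic. The following is a minimal sketch; the variable names are ours, and any lead-in or lead-out time at the start or end of a run is not itemized in the text and is therefore not included.

```python
# Per-story timing in seconds, as stated in the text.
PROMPT_S = 2
STORY_S = 13
RESPONSE_S = 6
REST_S = 6

STORIES_PER_CONDITION_PER_RUN = 3
CONDITIONS = 3          # ToM, nToM, nNarr
STORY_RUNS = 5

trial_s = PROMPT_S + STORY_S + RESPONSE_S + REST_S
stories_per_run = STORIES_PER_CONDITION_PER_RUN * CONDITIONS
run_s = trial_s * stories_per_run
stories_per_condition = STORIES_PER_CONDITION_PER_RUN * STORY_RUNS

print("seconds per story trial:", trial_s)
print("itemized seconds per run:", run_s, "=", round(run_s / 60, 2), "min")
print("stories per condition:", stories_per_condition)
```

The itemized events sum to roughly 4 min per run, so the stated run length of about 4.5 min presumably includes time that is not broken down here.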
We used a novel approach that relied on communicative intent instead of the false belief task, which has become a standard tool for the assessment of ToM processes. We reasoned that communicative intent is a more ecologically valid means to assess ToM processes since ToM is likely to be most prevalent in everyday communication and is therefore more closely related to social interaction as is evoked in the HSim condition. Healthy adults and typically developing children usually integrate the context of a situation, prosody, nonverbal gestures, and past knowledge about the sender effortlessly to extract the meaning of a message that lies behind its explicitly stated content (Schultz von Thun, 2003).
The 15 ToM stories that were used in this experiment consisted of short narratives involving two people. The last sentence contained a remark of one person to the other that pertained to the scenario described in the previous sentences and that could potentially be interpreted in several ways (“Look ahead, the light is green.”). The context of the story was designed to evoke an interpretation of that remark that would go beyond its mere factual content (“The light is green, not red.”) to convey an appeal (“It's time to start driving.”). In order to understand the implied message of the remark the reader has to understand the context of the story as delivered by the narrative and, most importantly, the mental state of the sender of the message. These implied meanings were easy to grasp, as in the example above. ToM stories were preceded by a prompt (2 sec) to pay attention to the thoughts (“Thoughts”) and intentions of the protagonists in order to understand the statement. Each ToM story was followed by a 6 sec presentation of a statement that referred to the remark in the story. The participants were asked to indicate with a button press whether the statement reflected the implied meaning of the message correctly.
The non-theory of mind (nToM) condition consisted of 15 story vignettes that were similar to the ToM narratives including a social interaction but were missing the critical ToM manipulation described above. The non-narrative (nNarr) condition consisted of 3 sentences that were not thematically linked. The content of the sentences was closely matched to the content of the stories of the ToM and nToM conditions including a social interaction. nToM and nNarr stories were followed by the 6 sec presentation of a sentence that either contained a correct (50%) or incorrect statement about the factual content of the story. Participants were prompted to make a right or left button response to indicate whether the statement was correct.
For the story conditions, separate regressors based on a canonical hemodynamic response function (HRF) were modeled for the stories (15 sec, including the 2 sec prompt), the response periods (6 sec), and rest (6 sec) (see supplement 5).
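To illustrate how such regressors are typically built, the sketch below convolves boxcar functions for the story, response, and rest periods with a two-gamma HRF. It is not the analysis code used here: the HRF parameters (response peak near 5 sec, undershoot near 15 sec, ratio 1/6), the 1 sec sampling step, and all function names are assumptions chosen for illustration, and in the actual design the story regressor would be split by condition.

```python
import math

def double_gamma_hrf(t, peak_shape=6.0, under_shape=16.0, under_ratio=1 / 6):
    """Two-gamma HRF at time t in seconds (parameter values are assumed defaults)."""
    if t <= 0:
        return 0.0
    peak = t ** (peak_shape - 1) * math.exp(-t) / math.gamma(peak_shape)
    under = t ** (under_shape - 1) * math.exp(-t) / math.gamma(under_shape)
    return peak - under_ratio * under

def boxcar(onsets, duration, n_samples, dt=1.0):
    """Series that is 1 while an event is on and 0 otherwise, sampled every dt seconds."""
    series = [0.0] * n_samples
    for onset in onsets:
        start = int(round(onset / dt))
        stop = int(round((onset + duration) / dt))
        for i in range(start, min(stop, n_samples)):
            series[i] = 1.0
    return series

def convolve(signal, kernel):
    """Causal discrete convolution, truncated to the length of the signal."""
    out = [0.0] * len(signal)
    for i in range(len(signal)):
        acc = 0.0
        for j in range(min(i + 1, len(kernel))):
            acc += signal[i - j] * kernel[j]
        out[i] = acc
    return out

DT = 1.0                      # sampling step in seconds (assumption)
TRIAL_S = 27                  # 2 s prompt + 13 s story + 6 s response + 6 s rest
N_SAMPLES = 9 * TRIAL_S       # one story run with 9 trials
trial_onsets = [k * TRIAL_S for k in range(9)]
hrf = [double_gamma_hrf(i * DT) * DT for i in range(32)]

regressors = {
    "story": convolve(boxcar(trial_onsets, 15, N_SAMPLES, DT), hrf),
    "response": convolve(boxcar([o + 15 for o in trial_onsets], 6, N_SAMPLES, DT), hrf),
    "rest": convolve(boxcar([o + 21 for o in trial_onsets], 6, N_SAMPLES, DT), hrf),
}
```

Each entry in `regressors` is one predictor time course for the run; a GLM would then estimate one weight per predictor at every voxel.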
For the ROI analysis in Experiment 2 we followed a similar procedure to Experiment 1. We defined ROIs in the ATLs on the basis of the HSim vs. BCar contrast image on a single participant level. The center of the ROI was identified as the peak voxel of the activation that survived an FDR threshold of α < 0.05 in the fixed effects GLM of each individual participant. As in Experiment 1 the anatomical location was constrained to BA 38 and the cluster size to 125 voxels. The differences between story conditions were tested using a multi-subject ROI analysis. Otherwise, the imaging procedure and analysis were identical to those used for the first experiment.
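The ROI definition can be sketched as a peak search within an anatomical mask followed by the selection of a fixed number of voxels around that peak. The example below is illustrative only: it assumes that the 125-voxel limit corresponds to a 5 × 5 × 5 cube centered on the peak, which the text does not state, and the function names and toy values are hypothetical.

```python
def peak_voxel(stat_map, mask, threshold):
    """Return the coordinate of the highest supra-threshold voxel inside mask, or None."""
    best = None
    for coord in mask:
        value = stat_map.get(coord)
        if value is None or value < threshold:
            continue
        if best is None or value > stat_map[best]:
            best = coord
    return best

def cube_roi(center, edge=5):
    """All voxel coordinates in an edge^3 cube centered on center (edge=5 gives 125)."""
    half = edge // 2
    cx, cy, cz = center
    return [(cx + dx, cy + dy, cz + dz)
            for dx in range(-half, half + 1)
            for dy in range(-half, half + 1)
            for dz in range(-half, half + 1)]

# Toy statistical map (hypothetical t-values), keyed by voxel coordinate.
toy_map = {(10, 20, 30): 4.2, (11, 20, 30): 5.1, (40, 40, 40): 6.0}
# Stand-in for the BA 38 constraint: only these voxels are eligible.
toy_mask = {(10, 20, 30), (11, 20, 30)}

center = peak_voxel(toy_map, toy_mask, threshold=3.0)
roi = cube_roi(center)
print(center, len(roi))
```

The higher value at (40, 40, 40) is ignored because it lies outside the mask, which mirrors how the anatomical constraint keeps the ROI within the ATL.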
As in Experiment 1, participants understood all tasks and responded satisfactorily in the training session. Participants' responses were closely monitored during the experiment to ensure proper compliance and alertness. Performance on the HSim task was comparable to that in the first experiment.
For locations of significant differences of the random effects analysis (FDR; α = 0.05), their peak coordinates in Talairach space, and p-values please refer to Table 2 (supplement 4). A visual depiction of data, FDR corrected, is shown in Figure 3.
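For readers unfamiliar with FDR correction, the following sketch shows the standard Benjamini-Hochberg step-up procedure at α = 0.05. That this specific variant was applied is an assumption, since the text only states that an FDR correction was used; the function name and the example p-values are hypothetical.

```python
def fdr_bh(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up: return booleans marking which tests are rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) whose p-value satisfies p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every test whose rank is at or below the cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected

example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(fdr_bh(example_p))
```

Because the procedure is step-up, a test can be rejected even if its own p-value exceeds its rank-specific bound, as long as a larger p-value further down the ordering meets its bound.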
The visual inspection of the HSim vs. BCar contrast at the whole volume level revealed a remarkably similar pattern of activation to that observed in Experiment 1. Activations extended along the STS ending into the ATL with additional effects in the FG and IFG.
Differences between story conditions were not robust enough to manifest themselves in a random effects analysis that was FDR corrected for multiple comparisons on a whole-volume level. A more lenient fixed effects analysis (for display purposes only) is shown in Figure 3.
The most prominent activations to the theory of mind contrast were in the ATLs. No other systematic effects were found at our threshold, which is quite different from reports in the extant literature, which emphasizes activations in the TPJ (Saxe and Wexler, 2005). In contrast, the neural correlates of stories containing a narrative vs. no narrative, were found in the left supramarginal gyrus/angular gyrus. This was the only significant effect at the group level (random effects GLM; FDR-corrected; see Figure 3). When analyzed on a single participant level (fixed effects GLM, at a threshold of p < 0.001) only 5 out of 12 participants showed a significant effect in the ATLs.
We conducted a random effects GLM on the group level within predefined ATL ROIs for each individual participant (α = 0.05). There was significant overlap between the Heider and Simmel and ToM task in the right ATL [t(11) = 5.3; p = 0.000251] while the same test in the left hemisphere approached significance [t(11) = 2.09; p = 0.06].
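The logic of this group-level ROI test can be illustrated with a one-sample t-test on per-participant contrast values, which is what a random effects contrast across 12 participants amounts to (df = 11). The sketch below is an assumption about the form of the test rather than the software routine actually used; the function name is hypothetical, and only the reported t-values are taken from the text.

```python
import math

def one_sample_t(values, mu=0.0):
    """Return (t, df) for a one-sample t-test of mean(values) against mu."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    t = (mean - mu) / math.sqrt(var / n)
    return t, n - 1

# Two-tailed critical t for df = 11 at alpha = 0.05 (standard table value).
T_CRIT_DF11 = 2.201

# t-values reported in the text for the ToM contrast within the HSim-defined ROIs.
reported_t = {"right ATL": 5.3, "left ATL": 2.09}
decisions = {roi: abs(t) > T_CRIT_DF11 for roi, t in reported_t.items()}
print(decisions)
```

With these inputs the right ATL exceeds the critical value and the left ATL falls just short of it, in line with the reported p-values.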
To see whether non-social narratives overlapped with social attribution activations in the ATL, an additional analysis was performed with the nToM vs. nNarr contrast in the HSim ROI. No significant effects were found (left: [t(11) = 0.26; p = 0.8]; right: [t(11) = 0.34; p = 0.74]).
In Experiment 2 we asked whether the narrative nature or the mentalizing demands of the Heider and Simmel stimuli contributed to the engagement of the ATL. Our results did not support the narrative hypothesis because we did not observe significant overlap between activation to non-social narratives and the HSim task. Instead, narratives needed to have a ToM component to activate the ATL. In particular, ToM stories overlapped with the HSim task strongly in the right ATL, and somewhat in the left ATL.
The goal of this study was to test the hypothesis that the ATL is involved in the processing of social concepts and whether the engagement of these concepts can explain ATL activity in other social cognitive tasks. Our findings support this hypothesis by showing overlapping left ATL activations to two very different tasks whose only similarity was their underlying social semantic structure: a complex visual motion task that elicited social attribution (Heider and Simmel stimuli) and a lexical task requiring a semantic similarity judgment between abstract social and non-social words. We further tested whether ATL activation in the HSim task was attributable to its narrative structure or to its requirement to attribute mental processes, such as theory of mind. We found evidence for the latter by showing overlap between the HSim task and a ToM task in the right ATL.
The activations observed to the HSim stimuli were similar to the ones observed in prior fMRI studies using similar stimulation (e.g., Castelli et al., 2000; Gobbini et al., 2007; Schultz et al., 2003). Activations extended along the length of the STS beginning in the temporo-parietal junction (TPJ) into the ATL. The posterior STS is commonly associated with biological motion processing (Hein and Knight, 2008; Puce and Perrett, 2003) while more medial regions on the left are associated with linguistic processing (Hickok and Poeppel, 2007; Shalom and Poeppel, 2008). The anterior-most aspect of the STS, the ATL, has been functionally associated with social processing across a wide variety of stimuli and tasks (reviewed in Olson et al., 2007) with a right hemispheric lateralization evidenced in studies using non-verbal stimuli (e.g., Hari and Kujala, 2009; Siegal and Varley, 2002). In line with this, we found that activations to social animations were more pronounced in the right hemisphere. Interestingly, it appears that semantic dementia patients with right dominant lesions have more social processing deficits than their left-sided counterparts (Mychack et al., 2001; Thompson et al., 2003).
The social semantic contrast using word stimuli revealed focal ATL activations, replicating previous findings by Zahn and others (Zahn et al., 2007). Our data therefore support the notion that the ATLs contain neural representations of social semantic concepts. However, in contrast to Zahn and others' results, social semantic processing engaged the left ATL more than the right. While activations in the left ATL were located in the most rostral section of the extension of the STS and MTG, the right hemisphere differences were more apparent in the lateral sections. Interestingly, this picture was different when we inspected the clusters of peak activations of individual participants. The distributions were now more similar with a center of gravity in the rostral-most section of the extension of the MTG and STG in the bilateral ATL. This discrepancy is likely due to the inter-participant variability of the locus of activation with somewhat less variability in the left hemisphere. This reveals that the analysis on the group level can be misleading and confirms the appropriateness of a single participant-ROI approach for our experiment.
The critical part of our analysis was to determine the overlap between the HSim task and the lexical social semantic task. Indeed, activation overlapped in the left, but not the right, ATL. Our data suggest that the social attribution in HSim-like animations causes the activation of social semantic networks in the ATLs. This provides a possible explanation for ATL activation in social cognitive tasks in general, as they are likely to evoke social concepts as well. For example, stories or pictorial cartoons designed to test neural components of ToM will invariably evoke social concepts, even if not verbalized, especially since they often involve interactions between people or anthropomorphic agents (e.g., Frith, 2008; Gallagher et al., 2000; Moll et al., 2002; Moll et al., 2005). It is important to emphasize that in this view, social semantic networks represent abstract representations of concepts that can be evoked not only by lexical stimuli, such as social words or stories, but potentially through stimuli in any sensory modality.
Access to social conceptual representations may be accomplished through mechanisms also involved in language processing. The left IFG has been consistently implicated in language production and in semantic retrieval (Thompson-Schill, 2003). In line with this literature, we found that the HSim task and the lexical semantic task robustly activated the left IFG. It is a matter of current debate whether the role of the left IFG reflects effortful semantic access (Badre and Wagner, 2002; Wagner et al., 2001) or the regulatory control of selection processes needed to select between competing sources of information (e.g., Thompson-Schill et al., 1997).
Unlike several prior neuroimaging studies (Bottini et al., 1994; Castelli et al., 2000; Fletcher et al., 1995a; Humphries et al., 2006; Humphries et al., 2005; Maguire et al., 1999; Mazoyer, 1993; Vandenberghe et al., 2002) our narrative manipulation in Experiment 2 did not significantly engage the ATLs. One explanation for this discrepancy is that narrative manipulations on the paragraph-level are not as efficient as on the sentence-level (Xu et al., 2005). However, another, more likely explanation is that past imaging studies did not control for the social content of their stories and thus it is possible that the ATLs were activated because the narratives evoked social semantic knowledge. Regardless, it remains plausible that the ATLs have a role in narrative comprehension and production since narratives may evoke conceptual knowledge elicited by semantic and syntactic context.
In Experiment 2 we also examined whether the involvement of the ATLs in ToM cognition could explain their activation in complex stimulus scenarios involving social attribution as in the HSim task. Indeed, activation to the HSim task and the ToM stimuli overlapped in the right ATL and approached significance in the left ATL.
These results are in agreement with past imaging research showing activations in the ATL to various ToM tasks (Olson et al., 2007) and a different literature showing activations to comprehension of irony and sarcasm. Like our ToM manipulation, irony and sarcasm involve implicit communicative intent, but the intended meaning is the opposite of what is explicitly stated and mostly has a negative connotation. For instance, Wakusawa and others (Wakusawa et al., 2007) used photographs of two actors engaged in a communicative interaction in a rich environmental context in which one of the protagonists made an ironic or metaphoric statement to the other. Conditions involving irony engaged the right temporal pole and the medial orbitofrontal cortex (for similar results see Uchiyama et al., 2006; Wang et al., 2006).
This brings us to the question of which component process of ToM tasks is carried out in the ATLs. Considering the evidence for the involvement of the ATLs in social and semantic processing, it is likely that the ATLs contribute to the understanding of implied meaning through access to both general conceptual knowledge and to specific social conceptual knowledge. Background knowledge about social descriptors (e.g., words like friendly and devious), social rules, and social etiquette, as well as knowledge that is particular to the relationship between the sender and receiver, are critical for understanding an agent's actions and intentions. Of course, high-level ToM requires more than memory; it also requires inferential processing, and there is no evidence in our data or in the past literature for an involvement of the ATLs in inferential processing per se. We argue that inferential processing is a necessary but insufficient prerequisite for the understanding of implied meaning or false belief. A candidate region for this process is the medial prefrontal cortex (mPFC), which is commonly activated in ToM tasks (e.g., Carrington and Bailey, 2009; Mitchell et al., 2005a, b).
The TPJ is also frequently activated to ToM tasks (Saxe, 2006). Our findings show that the ATL and TPJ were each sensitive to different types of semantic material: the ATL was more sensitive to stories with ToM content while the TPJ was more sensitive to the narrative aspects of the stories. The TPJ and ATL are anatomically connected via the middle longitudinal fasciculus (Schmahmann and Pandya, 2007), and in functional imaging experiments involving complex language stimuli both regions are often activated together; thus a functional link can be assumed (Awad et al., 2007).
In sum, we propose that ToM processing requires access to social conceptual knowledge mediated by the ATLs and, for this reason, the ATLs are frequently activated in ToM tasks. Our view is distinct from prior explanations of the ToM activations in the ATL. One explanation is that the ATLs are a store for personal semantic and episodic memories that are essential in social interactions (Gallagher and Frith, 2003). A related explanation is that the ATLs store mental scripts to provide a wider semantic and emotional context for understanding social interactions (Frith and Frith, 2003). There is little evidence that the ATLs store personal episodic memories or mental scripts insofar as they can be differentiated from semantic knowledge.
Our findings are compatible with the notion of the ATLs as a repository for semantic knowledge (Lambon Ralph and Patterson, 2008; Patterson et al., 2007; Rogers et al., 2004). In an extension of this view, some semantic categories, such as social concepts, occupy more specific locations within the ATL cortex (Zahn et al., 2009; Zahn et al., 2007), which explains why the ATLs have also been shown to be engaged in seemingly different social cognitive tasks (Olson et al., 2007). The question remains, however, why so few imaging experiments on semantic processing have implicated the ATLs.
In our experiment, the comparison between a lexical semantic task and a number comparison task failed to provide evidence for this notion (for similar findings see Simmons et al., 2009). We speculate here that a possible answer lies in the different anatomical distributions of the neural representation of semantic categories within and between individuals. Some categories such as social concepts, living vs. non-living things or tools (see Devlin et al., 2002, for a review) may occupy more circumscribed and consistent cortical locations and may therefore appear in imaging studies with greater reliability. The modularity for these and possibly other semantic categories may be determined by their evolutionarily relevant history (Caramazza and Shelton, 1998) whereas other, less salient categories may be more distributed and more dependent on each individual's ontogenetic exposure and are therefore more elusive in imaging experiments. Both types of categories, however, are equally affected by broad atrophy of the ATLs, explaining why in most FTD patients semantic memory is broadly affected.
In our experiments we confirmed the role of the ATLs in social semantic processing. The engagement of social semantic representations is a likely explanation for the activation of the ATLs in other social cognitive tasks such as Heider and Simmel animations. We further tested the presumed role of the ATLs in the processing of narrative structure but could not confirm findings of past experiments. This could be explained by the fact that past investigators have not controlled for the social content of their stimulus material. Finally, we found that activation to theory of mind tasks overlapped with activation to the Heider and Simmel paradigm suggesting that tasks involving social attribution involve ToM processes. We speculate that ToM mentation may engage the ATLs by recruiting social semantic representations necessary for the understanding of mental processes of others.
We thank Roland Zahn for providing the lexical stimuli and expert advice, Bob Schultz for providing the Heider and Simmel stimuli, Marian Berryhill and David Drowos for general assistance, and Mark Elliot for assisting us with the design of the pulse sequence. We also thank the BrainVoyager team for their assistance with the analysis of the imaging data.
Funding: This work was supported by National Institutes of Health [MH071615 to I.O.] and National Institutes of Health grant [NS045839 to J.D.]. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute of Mental Health or the National Institutes of Health.