To understand and remember stories, readers integrate their knowledge of the world with information in the text. Here we present functional neuroimaging evidence that neural systems track changes in the situation described by a story. Different brain regions track different aspects of a story, such as a character’s physical location or current goals. Some of these regions mirror those involved when people perform, imagine, or observe similar real-world activities. These results support the view that readers understand a story by simulating the events in the story world and updating their simulation when features of that world change.
The information available to readers when reading a story is vastly richer than the information provided by the text alone. For example, when reading about a soccer game, readers with a rudimentary knowledge of the sport are quickly able to grasp the meaning of the sentence “The midfielder scored a goal” even though the text does not explicitly state how the goal was made, who was involved, or where the action took place. These elaborate representations of the situations described by text – situation models – arise through the integration of a reader’s knowledge of the world with information explicitly presented in text (Kintsch & van Dijk, 1978). Situation models are proposed to guide ongoing comprehension, and thereby affect later memory (van Dijk & Kintsch, 1983).
Situation models are thought to function by maintaining and updating representations of information that is presented in a story. Multiple dimensions of the situation are maintained in situation models, including the characters and objects present, the spatial and temporal layout of the narrated situation, and the characters’ goals and intentions (Gernsbacher, 1990). Readers can use these different aspects of story-relevant information to index the degree of overlap between what they are currently reading and what has happened previously in the story. Readers may update their situation models at points when overlap is low (Gernsbacher, 1990; Zwaan & Radvansky, 1998).
Recent theories of reading comprehension suggest that the representations of these various situation model dimensions are based on the activity of brain regions involved in analogous perceptions and actions in the real world (Barsalou, 1999; Glenberg, 1997; Zwaan, 2004). These theories suggest that the same representations used for making or watching a goal kick are activated when reading about a goal kick. Behavioral evidence provides some support for this claim: After reading a sentence describing an action, people are faster to recognize a picture that is consistent with the action than a picture that is inconsistent with the action (Zwaan, Stanfield, & Yaxley, 2002), and are faster to make movements consistent with the action than movements that are inconsistent with the action (Glenberg & Kaschak, 2002).
Neuroimaging studies of single word reading have also provided some initial support for the hypothesis that readers’ representations of word meaning are grounded in visual and motor representations. These studies have demonstrated that brain regions involved in reading action words are some of the same regions involved in performing analogous actions in the real world. For example, reading verbs such as “run” or “kick” activates brain regions that are selectively activated when moving one’s foot (Pulvermüller, 2005). One limitation of these studies is that they used restricted lists of single words. The processing of such stimuli may differ substantially from the processing of meaningful stories. However, these results do suggest a strong but untested prediction about the brain regions that should be active during story reading: The brain regions involved in tracking different dimensions of a reader’s situation model should correspond to regions that have a role in seeing and acting out similar activities in the real world.
To test this claim, we recorded brain activity using functional magnetic resonance imaging (fMRI) while participants read four short narratives. Each narrative was coded on six different dimensions of story information thought to be relevant to readers’ situation models (Zwaan & Radvansky, 1998): references to temporal information (e.g., “immediately”), changes in the causal relationships between narrated activities (i.e., when the activity described was not caused by an activity described previously), points when the subject of the text changed (character changes), changes in characters’ spatial locations (e.g., moving from one room to another, or from one point to another within a room), changes in characters’ interactions with objects (e.g., when characters picked something up or put something down), and points when a character initiated a new goal (see Figure 1a). We then identified brain regions whose activity significantly increased at points when each of these aspects of the story situation had changed. In this way, we were able to determine whether the regions activated at these points were similar to the regions activated when observers or actors perceive or carry out analogous activities in the real world.
All 28 participants were right-handed, native English speakers (ages 19-34, 20 women), and all gave informed consent according to the guidelines set forth by Washington University. Five participants had data from only two (n = 1) or three (n = 4) stories due to equipment malfunction or participant fatigue.
Four narratives were taken from the book One Boy’s Day (Barker & Wright, 1951), and described the everyday activities of a seven-year-old boy. The narratives described Raymond getting up and eating breakfast (“Waking up”), playing with his friends on the school ground (“Play before school”), performing an English lesson in school (“Class work”), and participating in a music lesson (“Music lesson”). For the current series of studies, all references to Raymond’s interactions with the observers who recorded his activities were deleted (these references were rare), and the scenes were shortened where necessary to keep the length of each narrative below 1,500 words (Waking up, 1368 words; Play before school, 1104 words; Class work, 1182 words; Music lesson, 1404 words). All stimuli can be downloaded from http://dcl.wustl.edu/DCL/stimuli.html.
An LCD projector was used to project stimuli onto a screen positioned at the foot of the scanner, and participants viewed the stimuli through a mirror connected to the head coil. Stimulus presentation and timing were controlled by PsyScope software (Cohen, MacWhinney, Flatt, & Provost, 1993) running on an Apple PowerMac G4 computer (Apple, Cupertino, CA). A PsyScope button box was used to record responses during the behavioral testing session.
Each narrative was presented one word at a time to minimize eye movements, with each word remaining on the screen for 200 ms, followed by a 150 ms/syllable blank delay. Participants practiced this reading method on a separate narrative prior to scanning until they reported being comfortable with word-by-word reading.
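For illustration only, the presentation timing described above can be sketched in Python. This is not the PsyScope script used in the study: the function name, the example words, and their syllable counts are hypothetical, and the sketch assumes the blank interval is computed from the syllable count of the word that was just displayed.

```python
def word_schedule(words_with_syllables, display_ms=200, blank_ms_per_syllable=150):
    """Return ([(word, onset_ms, offset_ms), ...], end_ms) for one-word-at-a-time display."""
    schedule = []
    t = 0  # running clock in milliseconds
    for word, n_syllables in words_with_syllables:
        onset, offset = t, t + display_ms
        schedule.append((word, onset, offset))
        # The blank screen lasts 150 ms for each syllable of the word just shown.
        t = offset + blank_ms_per_syllable * n_syllables
    return schedule, t  # t is the time at which the next word would appear


# Hypothetical clause with hand-assigned syllable counts.
clause = [("Raymond", 2), ("raced", 1), ("down", 1), ("the", 1), ("terrace", 2)]
schedule, total_ms = word_schedule(clause)
for word, onset, offset in schedule:
    print(f"{word:10s} on at {onset:5d} ms, off at {offset:5d} ms")
```

Under these assumptions a one-syllable word occupies 350 ms in total and a two-syllable word 500 ms, so clause durations vary with both word count and word length.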
The four narratives ranged in length from 8.5 to 10.9 minutes, and the order of the narratives was counterbalanced across participants. The first and fourth authors coded the narratives for situation changes at the level of clauses. Clauses were defined by identifying verbs together with their arguments. Complement clauses, subordinate clauses, and relative clauses that were dominated by a larger unit were grouped with those larger units.
We assessed whether or not a given clause contained a change in any of six situational dimensions (see Zacks, Speer, & Reynolds, in press). Spatial changes consisted of changes in the locations of characters of the narrative focus, such as moving from one room in a house to another or moving from one region of interaction within a room to another (e.g., “Raymond raced down the terrace”). Object changes occurred when a character interacted with an object in a new way (e.g., Raymond picking up a candy Easter egg). Character changes occurred whenever the subject of a clause was different from the subject of the previous clause. Causal changes occurred whenever a clause described an activity that was not directly caused by an activity described in the previous clause (e.g., a character initiating a new action). Goal changes occurred whenever a character started an action with a new goal. Although there were no temporal changes, each clause was coded for the presence or absence of a temporal reference (e.g., “immediately” or “slowly”). Mean inter-rater reliability across the situation changes was .77 as measured by Cohen’s Kappa, and disagreements were resolved by discussion.
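As a reminder of how the reliability statistic is defined, Cohen’s Kappa compares the observed agreement between two raters with the agreement expected by chance from each rater’s marginal frequencies. The sketch below is a minimal illustration in Python; the two code vectors are invented for the example and are not the coding data from this study.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two raters who coded the same items."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both raters must code the same non-empty set of clauses")
    n = len(codes_a)
    # Proportion of clauses on which the two raters gave the same code.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Chance agreement from the product of each rater's marginal frequencies.
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical codes for ten clauses: 1 = change present, 0 = change absent.
rater_a = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
rater_b = [1, 0, 0, 1, 0, 0, 0, 0, 1, 1]
kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 3))
```

In this invented example the raters agree on 8 of 10 clauses, but chance agreement is .52, so Kappa is considerably lower than the raw agreement rate. The function is undefined when chance agreement equals 1 (both raters use a single code throughout).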
Participants were told in advance that they would be given a comprehension test at the end of the session. Mean accuracy on this 20-item, 4-alternative multiple-choice test was 82.74% (SEM = 2.14%), indicating that participants were comprehending the narratives. Participants returned for a second, unscanned behavioral testing session in the laboratory (see Speer, Reynolds, & Zacks, 2007), but only the data from the scanning session are relevant to the current study.
Images were acquired on a 3-T Siemens Vision MRI scanner (Erlangen, Germany). High-resolution (1 × 1 × 1.25 mm) structural images were acquired using a sagittal MP-RAGE T1-weighted sequence. Blood oxygen level dependent (BOLD) functional images were acquired using a T2*-weighted asymmetric spin-echo echo-planar sequence, with 32 slices (4.0 × 4.0 mm in-plane resolution) acquired every 2.048 s. A T2-weighted fast turbo spin-echo scan was acquired in the same planes as the functional scans to map the functional data to the structural data. The functional data were pre-processed to correct for timing offsets, slice intensity differences, and participant movement, and warped to a standard stereotactic space with isotropic voxels (3 × 3 × 3 mm) (Talairach & Tournoux, 1988). Data were then smoothed with a Gaussian filter (2 mm full-width half-maximum).
Each participant’s brain response to each of the situation changes was estimated using the general linear model (GLM). Individual clauses were treated as trials in a rapid event-related data analysis. The clause start variable coded the onset of each trial. Clauses varied considerably in duration, and the interval between successive instances of each type of change varied considerably, which made it possible to accurately estimate the independent effects of each type of change (Maccotta, Zacks, & Buckner, 2001; Zacks et al., 2001). Six additional variables coded which (if any) situation changes occurred during each clause. The clause starts and situation changes were each coded as a 500-ms impulse at the beginning of the clause and convolved with a canonical hemodynamic response function with time constant = 1.25 s, delay = 2.0 s (Boynton, Engel, Glover, & Heeger, 1996) to generate regressors for the GLM. Ten additional regressors coded for effects of no interest (terminal and non-terminal punctuation, differences across each BOLD run, and the linear trend within each BOLD run). Participants with fewer than four BOLD runs had fewer regressors coding for differences across and linear trends within BOLD runs. Paired sample t-tests compared each of the situation changes to the clause start variable in order to generate maps of t-statistics for each of the six situation changes for each participant. The t-statistic maps were converted to z statistics and thresholded to control the map-wise false positive rate at p = .01 (clusters of at least 4 contiguous voxels with z values greater than 4.5; McAvoy, Ollinger, & Buckner, 2001). These maps were combined to create a composite map illustrating the voxels that responded significantly to one of the situation changes or to multiple situation changes. (A single category was used for those voxels that responded to more than one change to simplify the visualization.)
The map was projected onto the cortical surface using CARET with the PALS atlas (Van Essen et al., 2001; Van Essen, 2002; http://brainmap.wustl.edu/caret; http://brainmap.wustl.edu:8081/sums/directory.do?id=636032).
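The construction of a single regressor, as described above, can be sketched as follows. This is an illustrative reconstruction, not the analysis code used in the study. It assumes the gamma-variate form of the Boynton et al. (1996) response with a shape parameter n = 3, which the text does not state, and the function names, the 0.1-s sampling grid, and the example onset are hypothetical.

```python
import math

import numpy as np


def boynton_hrf(t, tau=1.25, delta=2.0, n=3):
    """Gamma-variate impulse response; n = 3 is an assumption, not given in the text."""
    s = np.clip((np.asarray(t, dtype=float) - delta) / tau, 0.0, None)  # zero before the delay
    return s ** (n - 1) * np.exp(-s) / (tau * math.factorial(n - 1))


def make_regressor(onsets_s, n_frames, tr=2.048, impulse_s=0.5, dt=0.1):
    """Convolve 500-ms impulses at the given onsets with the HRF; sample once per frame."""
    n_fine = int(np.ceil(n_frames * tr / dt))
    boxcar = np.zeros(n_fine)
    width = int(round(impulse_s / dt))
    for onset in onsets_s:
        start = int(round(onset / dt))
        boxcar[start:start + width] = 1.0
    kernel = boynton_hrf(np.arange(0.0, 30.0, dt))
    # Discrete approximation to the convolution integral on the fine grid.
    predicted = np.convolve(boxcar, kernel)[:n_fine] * dt
    fine_t = np.arange(n_fine) * dt
    frame_times = np.arange(n_frames) * tr
    return np.interp(frame_times, fine_t, predicted)


# Hypothetical example: one clause with a change beginning 4.0 s into a 10-frame run.
regressor = make_regressor([4.0], n_frames=10)
print(np.round(regressor, 3))
```

With these parameter values the impulse response peaks 4.5 s after an instantaneous event, so the predicted signal for the example clause is largest in the frame acquired at about 8.2 s.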
To characterize the activated regions, local maxima in the statistical map for each situation change were identified, subject to the constraint that no two maxima were closer than 20 mm. Each significant voxel was assigned to the closest local maximum to define regions of interest for reporting and for further analyses. In order to test regional selectivity, region-based analyses asked whether, after removing the variance in the BOLD data associated with the situation change used to define each region, any of the remaining situation changes accounted for substantial additional variance. We used a hierarchical regression approach. In stage one we fit linear models for each region predicting the fMRI signal for each participant from the nuisance variables, the clause start variable, and the situation change variable used to define the region. In stage two we used each of the remaining situation change variables as the sole predictors in a simple regression model of the residuals from the stage one model. This was performed separately for each participant and the regression coefficients from the stage two models were compared to zero in t-tests with subject as the random effect (df = 27). Regions for which none of the t statistics exceeded 1.0 were characterized as selective for a single situation change. For a region with an effect of one of the other variables that was conventionally “medium” in size (d = .5; Cohen, 1988), the power to detect that effect by this criterion is .89; for a region in which two situation changes have medium effects, the power is .99.
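The logic of the two-stage selectivity analysis can be sketched with synthetic data. The sketch below is an illustration under simplifying assumptions, not the study’s analysis code: the predictors are independent Gaussian noise rather than convolved event regressors, nuisance variables are omitted, and all names and effect sizes are invented.

```python
import numpy as np


def stage_one_residuals(y, design):
    """Residuals of y after a least-squares fit to the design (intercept added here)."""
    X = np.column_stack([np.ones(len(y)), design])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta


def stage_two_slope(residuals, predictor):
    """Slope from a simple regression of the stage-one residuals on one predictor."""
    x = predictor - predictor.mean()
    return float(x @ (residuals - residuals.mean()) / (x @ x))


def one_sample_t(values):
    """t statistic testing whether the mean of the values differs from zero."""
    values = np.asarray(values, dtype=float)
    return float(values.mean() / (values.std(ddof=1) / np.sqrt(len(values))))


# Synthetic time series for one hypothetical participant and region.
rng = np.random.default_rng(0)
n = 2000
clause_start, defining, other, unrelated = rng.normal(size=(4, n))
signal = 0.8 * clause_start + 1.0 * defining + 0.5 * other + rng.normal(size=n)

# Stage one removes variance tied to clause starts and the region-defining change.
resid = stage_one_residuals(signal, np.column_stack([clause_start, defining]))
# Stage two asks whether each remaining change explains what is left over.
slope_other = stage_two_slope(resid, other)
slope_unrelated = stage_two_slope(resid, unrelated)
print(round(slope_other, 2), round(slope_unrelated, 2))
```

In the analysis described above, one stage-two coefficient per participant would be passed to a test such as `one_sample_t`, and a region would be called selective only if every such t statistic stayed below 1.0.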
The regions responding to situation changes are illustrated in Figure 1B and listed in Table 1. Figure 1B shows all brain voxels that were associated with one or more situation changes, with those that were significantly associated with two or more situation changes colored pink. Activity in a number of regions changed during processing of the different types of changes. Furthermore, the neural responses to particular types of changes in the stories occurred in the vicinity of regions that increase in activity when viewing similar changes or when carrying out similar activities in the real world.
Adjacent and overlapping regions in bilateral posterior superior temporal cortex (Brodmann’s areas 22/39) responded to changes in characters and goals. These regions also increase in activation when observing goal-directed, intentional actions relative to non-goal directed, meaningless motion (Decety & Grezes, 1999). Changes in characters’ goals also were associated with increased activation in prefrontal cortex (BA 9, 44, 46), damage to which results in impaired knowledge of the typical order and structure of daily, goal-directed activities (Wood & Grafman, 2003).
Regions that increased for character-object interactions included several regions considered part of the human grasping circuit (Castiello, 2005). One region of the lateral precentral sulcus (BA 6) likely corresponds to the premotor hand area (e.g., Ehrsson, Geyer, & Naito, 2003); another region, in the postcentral cortex (BA 2/40) likely corresponds to the somatosensory hand representation (Porro et al., 1996) and adjacent anterior intraparietal cortex (Johnson et al., 2002). Consistent with these regions’ involvement during grasping, both the precentral and postcentral activations were lateralized to the left hemisphere. The character-object interactions that were associated with these increases typically referred to characters putting down or picking up objects (e.g., “Raymond laid down his pencil”).
Two bilateral superior frontal regions (BA 6) responded to changes in characters’ spatial locations. The locations of these regions fall within the 95% confidence intervals for functionally defined frontal eye fields (FEF), which increase in activation during saccadic eye movements relative to fixation (Speer, Swallow, & Zacks, 2003). Regions in right and left parahippocampal cortex, which increase in activation when processing changes in spatial location (Burgess, Maguire, & O’Keefe, 2002), also showed increased activation in relation to changes in characters’ spatial locations.
Regions that increased during temporal references included the inferior frontal gyrus (BA 45/47), insula (BA 44), intraparietal sulcus (BA 7), medial posterior cortex (precuneus and cingulate gyrus, esp. BA 23/31) and anterior cingulate gyrus (BA 32), as well as posterior and anterior white matter tracts. The neurophysiology of time perception in this range of durations (seconds to minutes) is not well understood, so there are few if any neuroimaging data with which to compare these results. However, the cortical activations do correspond well with those observed in a recent study comparing stories with temporal inconsistencies to stories with emotional inconsistencies (Ferstl, Rinck, & von Cramon, 2005). (The extensive activations in white matter were unexpected and await further empirical confirmation.)
Figure 1B suggests that a core network comprising the medial posterior cortex (precuneus, posterior cingulate cortex, the temporoparietal junction) and the lateral posterior frontal cortex was activated by multiple situation changes. Of note, all the regions that responded to causal changes also responded to other situation changes. However, Figure 1 also suggests that some brain regions were selectively activated by only one type of situation change. Given that a region shows a significant response to one situation change, the mere failure to detect significant responses to other changes is weak evidence of selectivity, particularly given the stringent statistical thresholds used here. To provide a direct assessment of selectivity for a single change, we performed a set of hierarchical regression analyses (see Imaging Data Analysis, above). The subset of regions that were determined to respond selectively to a single type of situation change are marked with asterisks in Table 1 and illustrated in Figure 2. These included responses to character changes in the posterior superior temporal sulcus and in medial frontal cortex, responses to goal changes in lateral frontal cortex, responses to object changes in premotor cortex, and responses to time changes in the left frontal operculum and anterior cingulate cortex.
An additional analysis was conducted to identify regions that might play a role in determining when perceptual and motor representations of characters, goals, etc. should be updated in a reader’s situation model. Because readers update their situation models when incoming information conflicts with information maintained in the active situation model (Zwaan & Radvansky, 1998), the more dimensions that change at a given point in the story, the more likely that the active situation model is updated. This analysis coded for the total number of situation changes present in each clause in the GLMs rather than the type of changes (0, 1, 2, or ≥3 changes). A linear contrast identified voxels whose activation linearly increased with increasing numbers of changes, and the resulting t-statistics were generated in the same manner as the t-statistics for the individual situation model changes.
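A linear contrast over the four levels of change count can be illustrated as follows. The weights (-3, -1, 1, 3) are the conventional equally spaced, zero-sum weights for four levels; the text does not report the exact weights used, so they are an assumption here, as are the function names and example values.

```python
# Conventional linear-trend weights for four ordered levels (assumed, not from the text).
LINEAR_WEIGHTS = (-3, -1, 1, 3)


def bin_change_count(n_changes):
    """Collapse a clause's change count into the levels 0, 1, 2, or 3 (meaning >= 3)."""
    return min(n_changes, 3)


def linear_contrast(level_estimates):
    """Weighted sum of the four per-level response estimates."""
    if len(level_estimates) != len(LINEAR_WEIGHTS):
        raise ValueError("expected one estimate per level (0, 1, 2, >=3 changes)")
    return sum(w * b for w, b in zip(LINEAR_WEIGHTS, level_estimates))


# Hypothetical per-level response estimates for one voxel.
flat = linear_contrast([1.0, 1.0, 1.0, 1.0])      # no trend across levels
rising = linear_contrast([0.0, 1.0, 2.0, 3.0])    # response grows with each added change
print(flat, rising)
```

Because the weights sum to zero, a voxel that responds equally at every level yields a contrast of zero, whereas a voxel whose response grows with the number of changes yields a positive value.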
The number of changes in a given clause was related to activation in many of the change-related regions, such as dorsolateral prefrontal cortex (BA 9/46), posterior parietal cortex (BA 7/40), posterior cingulate cortex (BA 7/29/31), and bilateral hippocampi (BA 36) (compare Figure 1 and Figure 3, and see Table 2). This sensitivity to the number of changes in a clause may reflect the increased processing demands at points where multiple aspects of the narrated situation are changing, the higher probability of encountering a change on a given dimension, or the process of updating the situation model.
These results suggest that readers dynamically activate specific visual, motor, and conceptual features of activities while reading about analogous changes in activities in the context of a narrative: Regions involved in processing goal-directed human activity, navigating spatial environments, and manually manipulating objects in the real world increased in activation at points when those specific aspects of the narrated situation were changing. For example, when readers processed changes in a character’s interactions with an object, precentral and parietal areas associated with grasping hand movements increased in activation. Previous studies of motor execution and motor imagery provide strong evidence that the portion of premotor cortex identified in this study performs computations that are specific to motor planning and execution (Ehrsson et al., 2003; Michelon, Vettel, & Zacks, 2006; Picard & Strick, 2001). These results suggest that readers use perceptual and motor representations in the process of comprehending narrated activity, and these representations are dynamically updated at points where relevant aspects of the situation are changing.
Several recent studies have reported modality-specific brain activation using paradigms in which participants made judgments about individual words (Hauk, Johnsrude, & Pulvermüller, 2004; Hauk & Pulvermüller, 2004; Goldberg, Perfetti, & Schneider, 2006a; Goldberg, Perfetti, & Schneider, 2006b) or phrases (Aziz-Zadeh, Wilson, Rizzolatti, & Iacoboni, 2006; Noppeney, Josephs, Kiebel, Friston, & Price, 2005). However, such paradigms leave open the possibility that evoked responses could reflect, in part, cognitive operations that are specific to the particular word or phrase judgment task. By contrast, the current paradigm used continuous reading of extended passages with no overt judgment task.
Although a number of regions responded selectively to a particular type of change, there were also a number of regions whose activity increased for more than one type of situation change (compare Figures 1 and 2). These regions may be particularly important for indicating when the representations of characters, goals, etc. should be updated in a reader’s situation model. Because readers update their situation models when incoming information conflicts with information maintained within the active situation model, increasing the number of aspects of the situation that are changing may increase the likelihood that the active situation model is updated (Gernsbacher, 1990; Zwaan & Radvansky, 1998). This updating process should be associated with the perception that a new narrative event has begun (Zacks, Speer, Swallow, Braver, & Reynolds, 2007). Indeed, previously reported analyses of these data provided evidence that when changes occur readers tend to perceive that a new event has begun (Speer et al., 2007; see also Zacks et al., in press).
Figure 3 indicates that the number of changes in a given clause was related to activation in many of the regions depicted in Figure 1, such as dorsolateral prefrontal cortex (BA 9/46), posterior parietal cortex (BA 7/40), posterior cingulate cortex (BA 7/29/31), and bilateral hippocampi (BA 36). This sensitivity to the number of changes in a clause may reflect the increased processing demands at points where multiple aspects of the narrated situation are changing, or the higher probability of encountering a change. However, a region in the anterior cingulate cortex (BA 32), which was not involved in processing any of the individual changes, also increased in activity with increasing numbers of changes. Given the role of the anterior cingulate cortex in monitoring external and internal conflict (Brown & Braver, 2005), activation in this region may serve as a cue for the reader to update the current situation model, or begin constructing a new model. Additional studies are needed to determine the reason for this relation between activation and the number of changes in a reader’s situation model.
The collection of medial brain regions associated with situation changes in the current study closely resembles a network of regions that have been recently associated with the act of projecting one’s self into a remembered, anticipated, or imagined situation (Buckner & Carroll, 2007). These regions are functionally connected to the hippocampi (Vincent et al., 2006), which were also observed to increase in activity with increasing numbers of situation changes. This convergence is consistent with the idea that readers construct simulations of situations as they read a text, and that this process is similar to those of recalling previous situations or imagining potential ones.
Overall, these data make a strong case for embodied theories of language comprehension, in which readers’ representations of situations described in language are constructed from basic sensory and motor representations (Barsalou, 1999; Glenberg, 1997; Zwaan, 2004). However, the use of perceptual and motor representations to guide story comprehension may be an example of a more general, fundamental principle of cognitive function. Brain regions involved in motor function are active when viewing another person execute an action (Rizzolatti & Craighero, 2004). When viewing a movie, somatosensory and motor cortices increase in activity during scenes showing close-ups of features such as hands and faces (Hasson, Nir, Levy, Fuhrmann, & Malach, 2004), and similar correspondences exist between the regions involved in perceiving and later remembering auditory and visual information (Wheeler & Buckner, 2004). Thus, the use of sensory and motor representations during story comprehension observed in the current study may reflect a more general neural mechanism for grounding cognition in real-world experiences. Language may have adopted this general mechanism over the course of human evolution to allow individuals to communicate experiences efficiently and vividly.
NKS is now at the Western Interstate Commission for Higher Education; JRR is now at the University of Denver; KMS is now at the University of Minnesota. This research was supported by a grant from the National Institute of Mental Health (NIH RO1-MH70674), and a dissertation research award from the American Psychological Association. We thank Rebecca Hedden and Carol McKenna for assistance with data collection, and Dave Balota and Randy Buckner for comments on a previous draft of the manuscript.