Language processing in context requires more than merely comprehending words and sentences. Important subprocesses are inferences for bridging successive utterances, the use of background knowledge and discourse context, and pragmatic interpretations. The functional neuroanatomy of these text comprehension processes has only recently been investigated. Although there is evidence for right-hemisphere contributions, reviews have implicated the left lateral prefrontal cortex, left temporal regions beyond Wernicke’s area, and the left dorso-medial prefrontal cortex (dmPFC) for text comprehension. To objectively confirm this extended language network and to evaluate the respective contribution of right hemisphere regions, meta-analyses of 23 neuroimaging studies are reported here. The analyses used replicator dynamics based on activation likelihood estimates. Independent of the baseline, the anterior temporal lobes (aTL) were active bilaterally. In addition, processing of coherent compared with incoherent text engaged the dmPFC and the posterior cingulate cortex. Right hemisphere activations were seen most notably in the analysis of contrasts testing specific subprocesses, such as metaphor comprehension. These results suggest task dependent contributions for the lateral PFC and the right hemisphere. Most importantly, they confirm the role of the aTL and the fronto-medial cortex for language processing in context.
Language comprehension requires more than just understanding words and sentences. Over and above these linguistic processes, realized in left-sided perisylvian brain regions, a variety of additional cognitive processes are recruited. For creating a coherent representation of a story or a dialogue, it is necessary to bring in general world knowledge, to integrate the current utterance with the prior context, or to check the consistency of the resulting interpretation with the communicative situation. Because of these requirements, it is not surprising that neuroimaging studies of text comprehension have unveiled an extended language network (ELN) of brain regions involved in inferencing and interpretation.
Two single studies illustrating this network are the seminal PET study by Mazoyer et al. , as well as a recent experiment using fMRI [Xu et al., 2005]. The activation patterns include the anterior temporal lobes (aTL) bilaterally, extending into the temporal poles, as well as activation all along the superior temporal sulcus (STS) on the left side. Many studies have found inferior frontal gyrus (IFG) activation. Importantly, fronto- and parieto-medial activations are reported when coherent language is read or heard. Based on reviews of imaging studies, Ferstl [in press] and Mar  have shown that these results are rather stable across studies.
There are several open questions, however. The first question concerns the overlap of the ELN with the network identified to be crucial for Theory-of-Mind processes [ToM; Frith and Frith, 2003; Gallagher and Frith, 2003]. ToM or mentalizing refers to the ability to attribute other people’s motivations, emotions, actions, and thoughts to their mental states and to their beliefs about the world. During communication, ToM processes help to understand the speaker’s or writer’s intentions, and some pragmatic theories in linguistics equate ToM and language processing in context [cf. Frith and Frith, 2003; Sperber and Wilson, 1995]. The brain regions consistently reported in ToM studies are the aTL, the temporo-parietal junction (TPJ), and most importantly, the dorso-medial prefrontal cortex (dmPFC). We have found this latter region to be involved during inference processes in text comprehension [Ferstl and von Cramon, 2001, 2002; Ferstl et al., 2005]. When inspecting the available neuroimaging literature on text comprehension, it can be seen that several studies were actually designed to test ToM processes rather than language comprehension [e.g., Fletcher et al., 1995; Vogeley et al., 2001]. Thus, it is an open question whether the apparent overlap between the ToM network and the ELN might be an artefact of including ToM studies in reviews of text comprehension research [e.g., Mar, 2004], or whether it is because of an overlap of the processes recruited in the two domains.
The second question concerns the role of the lateral prefrontal cortex for language comprehension in context. Although there is neuropsychological evidence for text comprehension deficits after left PFC lesions [Ferstl et al., 2002], many imaging studies show clear dominance of temporal regions for text comprehension [e.g., Crinion et al., 2002]. Moreover, the lateral PFC encompasses a number of subregions with dissociable functions [Bookheimer, 2002; Brass et al., 2005]. The more ventral PFC including the IFG is important for phonological, syntactic, and semantic processes. More dorsal PFC regions in the middle frontal gyrus and inferior frontal sulcus are important for executive functions and working memory. Although the linguistic processes are expected to be comparable to those on the word and sentence level, it is likely that the dorso-lateral PFC contribution, related to executive functions, increases with the task demands. Whether it is crucial for text comprehension, however, or merely a byproduct of memory or executive functions, is an open question.
The third question concerns the contribution of the right hemisphere (RH). Neuropsychological theories and clinical observations have long focussed on the importance of the RH for text comprehension and pragmatic interpretation [e.g., Beeman, 2005; Beeman and Chiarello, 1998; Brownell et al., 2000]. However, the evidence from both patient studies and imaging studies is mixed [Ferstl, in press; Lehmann and Tompkins, 2000; Mar, 2004]. For many of the subprocesses of text comprehension, there are studies showing clear RH activations and others that do not [metaphors: Rapp et al., 2004 vs. Bottini et al., 1994; situation model building: Maguire et al., 1999 vs. George et al., 1999; inferences: Mason and Just, 2004 vs. Ferstl and von Cramon, 2001; etc.]. Depending on the particular focus, the interpretations of these studies, as well as overarching reviews of this work, stress the role of the RH [e.g., Bookheimer, 2002; Mar, 2004] or the functions of the left fronto-medial and lateral PFC regions [Ferstl, in press].
To further investigate these issues, we present a quantitative review of 23 relevant text comprehension studies. The analysis was conducted using a meta-analysis method developed by Neumann et al. , which combines activation likelihood estimation [ALE; Turkeltaub et al., 2002] with replicator dynamics [Schuster and Sigmund, 1983; see also Lohmann and Bohn, 2002, for application in functional imaging]. In ALE, single activation peaks are considered evidence for underlying probability distributions. Based on these distributions, cortical regions or clusters are determined in which nonrandom activation can be assumed. The ALE result consists of a number of cortical areas that play a central role in processing the cognitive task of interest. In a second step, functionally related sub-networks are identified. The goal is to group ALE clusters together, based on their co-occurrence in the investigated studies. In order to do so, activation peaks falling within the ALE-clusters are subjected to a replicator process.
The contrasts included in the meta-analysis were collected from studies on higher level language comprehension. Studies were included in which either connected text was presented (e.g. stories) or in which pragmatic interpretations were required (e.g. metaphor comprehension). Studies on the comprehension of single sentences, in particular those concerned with syntactic or semantic processes, were excluded. Because of the small number of relevant articles, studies using fMRI as well as PET, auditory as well as visual presentation, and all languages were considered (e.g. German, Italian, French, Japanese, and English). Care was taken to exclude any contrasts in which ToM processes were targeted explicitly.
Four separate analyses were conducted based on a classification of the contrasts reported in the studies. First, all contrasts in which connected language was compared to a resting baseline were collected (Rest). Second, to subtract perceptual processes, contrasts using a nonlanguage baseline, such as speech played backward, were considered (Language). The third analysis used contrasts in which word or sentence level language processing was subtracted by comparing coherent to incoherent text (Coherence). Finally, despite the heterogeneity of the relevant studies, an exploratory analysis was conducted using coordinates reflecting specific subprocesses of pragmatic language comprehension (in particular situation model building and metaphor comprehension; Special).
The first goal of the study was to replicate the results obtained in prior reviews for language comprehension compared with a resting or nonlanguage baseline [Ferstl, in press; Mar, 2004]. In particular, the importance of the aTLs, the fronto-medial cortex, and the TPJ was to be confirmed. Second, the role of the right hemisphere was of interest. By assessing the coactivations of regions using replicator dynamics, we attempted to gain insight into the interplay between regions, and possibly between homolog regions in the two hemispheres. A comparison of the four analyses was expected to shed light on the specificity of the regions for language processing in context. Of particular interest was whether the fronto-medial cortex could be associated with coherence building.
Twenty-three available neuroimaging studies were included in which connected text was presented. Studies using single sentences were included only if the task explicitly focussed on pragmatic or interpretative features (e.g. metaphor comprehension).
To identify relevant articles, a literature search using the database Current Contents® (all editions) was conducted in the fall of 2005. Current Contents is provided by the Institute for Scientific Information-Thomson Scientific® (http://scientific.thomson.com/) and contains bibliographic references of over 7,500 journals, including medical and psychology journals from 1993 onward. The results of a search using the key words discourse, text, context, and comprehension were screened for imaging studies meeting the requirements. Because not all authors use these keywords, in particular if other issues were in the focus of the study [e.g. Giraud et al., 2000], the search results were supplemented with other relevant studies that had come to our attention. Studies were excluded that used an anatomically based region of interest analysis without providing coordinates in a standard stereotactic space. Contrasts using special populations were excluded as well (e.g., left handers, brain injured patients).
All reported contrasts were categorized according to the baseline used. First, nine contrasts from seven studies were identified in which text comprehension was compared with a resting baseline without stimulus presentation (Rest). Second, 16 contrasts from 11 studies were chosen that used a perceptual, nonlanguage baseline, such as speech played backwards or nonletter strings (Language). Third, 13 contrasts from 10 studies were identified in which coherent, comprehensible text was compared with an incoherent language baseline, such as word lists or unrelated sentences (Coherence). Finally, nine contrasts from six studies were considered in which specific processes such as metaphor comprehension or topic continuity were targeted by considering direct comparisons of qualitatively different language comprehension tasks (Special). Three of these studies concerned metaphor comprehension and moral interpretation, whereas the others were concerned with situation model building, i.e., with setting up a globally coherent representation of connected text [Kintsch, 1998]. It is important to keep in mind that this last category combines heterogeneous subprocesses that might not yield consistent, overlapping results.
Care was taken to exclude contrasts specifically probing ToM processes. For instance, from the study by Fletcher et al.  only the contrast comparing physical stories to unrelated sentences was entered into the analyses, but not the contrast comparing ToM stories or all stories to unrelated sentences. The complete list of contrasts is given in Tables I–IV.
For each of the four categories, all coordinates listed in the articles were entered. If necessary, MNI coordinates were transformed into Talairach coordinates using the formula proposed by Matthew Brett (www.mrc-cbu.cam.ac.uk). The total number of coordinates entering the analyses was 93 for Rest, 156 for Language, 76 for Coherence, and 48 for Special.
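As a rough illustration of the coordinate conversion step, the following sketch applies a piecewise affine MNI-to-Talairach mapping. The coefficients are the values commonly quoted for Brett's transform and are an assumption here, since the article does not list them; the function name `mni_to_talairach` is hypothetical.

```python
from typing import Tuple

Coord = Tuple[float, float, float]


def mni_to_talairach(mni: Coord) -> Coord:
    """Approximate MNI -> Talairach conversion (piecewise affine sketch).

    The coefficients are the values commonly quoted for Brett's transform.
    They are an assumption and should be checked against the original source.
    """
    x, y, z = mni
    if z >= 0:
        # Coordinates at or above the AC-PC plane
        return (0.9900 * x,
                0.9688 * y + 0.0460 * z,
                -0.0485 * y + 0.9189 * z)
    # Coordinates below the AC-PC plane use a different z scaling
    return (0.9900 * x,
            0.9688 * y + 0.0420 * z,
            -0.0485 * y + 0.8390 * z)
```

The split at z = 0 reflects that the transform scales the brain differently above and below the AC-PC plane.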
The analysis was performed as proposed by Neumann et al. . First, activation likelihood estimation (ALE) [Derrfuss et al., 2005; Lancaster et al., 2005; Turkeltaub et al., 2002] was applied to the list of peak coordinates of the activations reported in the original articles. Single activation peaks were represented by three-dimensional Gaussian probability distributions with a standard deviation of 4 mm (FWHM = 9.4 mm). This standard deviation was chosen to approximately match filter sizes commonly used in fMRI studies. The union of the distributions yielded empirical activation likelihood estimates for all voxels, including voxels that were not in the collection of reported peaks. A voxel in close vicinity to one or several peak coordinates thereby received a higher ALE compared with voxels far removed.
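The ALE computation described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the 1 mm³ voxel volume, the function names, and the toy peak coordinates used in any call are assumptions. Each peak contributes an isotropic Gaussian probability, and the union over peaks is computed as one minus the product of the complements.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]

SIGMA_MM = 4.0  # standard deviation stated in the text


def fwhm_from_sigma(sigma: float) -> float:
    """Full width at half maximum of a Gaussian with the given sigma."""
    return 2.0 * math.sqrt(2.0 * math.log(2.0)) * sigma


def peak_probability(voxel: Coord, peak: Coord,
                     sigma: float = SIGMA_MM,
                     voxel_volume: float = 1.0) -> float:
    """Probability that a single peak 'activates' the voxel.

    3D isotropic Gaussian density at the voxel, times the voxel volume
    (assumed here to be 1 mm^3).
    """
    d2 = sum((v - p) ** 2 for v, p in zip(voxel, peak))
    norm = (2.0 * math.pi) ** 1.5 * sigma ** 3
    return voxel_volume * math.exp(-d2 / (2.0 * sigma ** 2)) / norm


def ale(voxel: Coord, peaks: Sequence[Coord],
        sigma: float = SIGMA_MM) -> float:
    """Union of the per-peak probabilities: 1 - prod(1 - p_i)."""
    not_active = 1.0
    for peak in peaks:
        not_active *= 1.0 - peak_probability(voxel, peak, sigma)
    return 1.0 - not_active
```

With a standard deviation of 4 mm, `fwhm_from_sigma` gives roughly 9.4 mm, in line with the value stated above. A voxel close to several peaks receives a higher ALE value than one far from all of them because more of the complement terms fall noticeably below one.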
Using Monte Carlo simulations, an ALE map for randomly distributed activation peaks was computed, containing as many activation peaks as reported. Specifically, in each of 1,000 iterations, voxels of a brain volume mask were randomly chosen to represent activation peaks according to a uniform distribution, and ALE values were calculated for all voxels in the brain volume. The entire volume of the brain normalized to the standard size as provided by the software package LIPSIA [Lohmann et al., 2001] served as mask volume, but the distribution of random activation peaks was restricted to the area spanned by the minimum and maximum Talairach coordinates of the empirical activation peaks. The resulting 1,000 ALE maps were averaged to yield the map that served as null hypothesis against which the significance of the empirical ALE values was tested [Turkeltaub et al., 2002]. Voxels were considered significantly activated if they exceeded an ALE threshold corresponding to an α-level of 0.05% (uncorrected). Topologically connected areas of at least three voxels exceeding this threshold were considered as clusters of activation.
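The Monte Carlo step can be illustrated with a simplified sketch. Several choices below are assumptions and not taken from the article: random peaks are drawn from a continuous uniform distribution over the bounding box of the empirical peaks (rather than from mask voxels), the null ALE values of all iterations are pooled, and the threshold is read off as an empirical quantile. The function names are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def ale(voxel: Coord, peaks: Sequence[Coord], sigma: float = 4.0) -> float:
    """ALE value at a voxel: union of Gaussian peak probabilities."""
    norm = (2.0 * math.pi) ** 1.5 * sigma ** 3
    not_active = 1.0
    for peak in peaks:
        d2 = sum((v - p) ** 2 for v, p in zip(voxel, peak))
        not_active *= 1.0 - math.exp(-d2 / (2.0 * sigma ** 2)) / norm
    return 1.0 - not_active


def random_peaks(peaks: Sequence[Coord], rng: random.Random) -> List[Coord]:
    """Draw as many random peaks as reported, inside the bounding box
    spanned by the minimum and maximum empirical coordinates."""
    lo = [min(p[i] for p in peaks) for i in range(3)]
    hi = [max(p[i] for p in peaks) for i in range(3)]
    return [(rng.uniform(lo[0], hi[0]),
             rng.uniform(lo[1], hi[1]),
             rng.uniform(lo[2], hi[2])) for _ in peaks]


def null_threshold(peaks: Sequence[Coord], voxels: Sequence[Coord],
                   alpha: float, n_iter: int = 1000, seed: int = 0) -> float:
    """Empirical (1 - alpha) quantile of ALE values under random peaks.

    Pooling all null values and taking a quantile is a simplification
    assumed for this sketch.
    """
    rng = random.Random(seed)
    null_values: List[float] = []
    for _ in range(n_iter):
        simulated = random_peaks(peaks, rng)
        null_values.extend(ale(v, simulated) for v in voxels)
    null_values.sort()
    index = min(len(null_values) - 1, int((1.0 - alpha) * len(null_values)))
    return null_values[index]
```

Empirical ALE values above the returned threshold would then be treated as significant; the cluster-extent criterion of at least three connected voxels is not covered by this sketch.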
From the coordinates falling within the determined clusters, a co-occurrence matrix was formed, recording for each pair of clusters the number of co-occurrences in the individual contrasts. This matrix was then subjected to a replicator process. Based on the principles of natural selection, this process determines subnetworks of ALE clusters with the property that every cluster included in the network co-occurs relatively often with every other network member [Lohmann and Bohn, 2002; Neumann et al., 2005; Schuster and Sigmund, 1983]. Specifically, at the beginning of the replicator process, each ALE cluster was assigned a so-called membership value. After convergence of the replicator process, a cluster was considered a member of the so-called dominant network, if its membership value exceeded the average membership value 1/n, where n is the number of ALE clusters included in the analysis. This way, a network of cortical areas that are likely to be most relevant for the cognitive task was chosen. After identifying this dominant network, its clusters were removed from the co-occurrence matrix and the replicator process was repeated to yield a second network from the remaining clusters [Lancaster et al., 2005; Lohmann and Bohn, 2002; Neumann et al., 2006]. For each of the resulting networks, the contributing contrasts were inspected. To uncover findings that have been replicated at least once, only those replicator networks are reported which co-occurred in at least two studies.
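The replicator step can be sketched as follows, under stated assumptions: the discrete replicator update x_i ← x_i (Wx)_i / (xᵀWx) is the standard textbook form, the diagonal of the co-occurrence matrix is set to zero, and the function names are hypothetical. Contrasts are represented as sets of cluster indices.

```python
from itertools import combinations
from typing import List, Sequence, Set


def cooccurrence_matrix(contrasts: Sequence[Set[int]], n: int) -> List[List[float]]:
    """Count, for each pair of clusters, the contrasts containing both.

    The diagonal is left at zero (an assumption of this sketch).
    """
    w = [[0.0] * n for _ in range(n)]
    for clusters in contrasts:
        for i, j in combinations(sorted(clusters), 2):
            w[i][j] += 1.0
            w[j][i] += 1.0
    return w


def dominant_network(w: List[List[float]],
                     max_iter: int = 10000, tol: float = 1e-12) -> List[int]:
    """Run the replicator process and return the dominant network.

    Every cluster starts with membership 1/n; after convergence, clusters
    whose membership exceeds 1/n form the dominant network.
    """
    n = len(w)
    if n == 0:
        return []
    x = [1.0 / n] * n
    for _ in range(max_iter):
        fitness = [sum(w[i][j] * x[j] for j in range(n)) for i in range(n)]
        mean_fitness = sum(x[i] * fitness[i] for i in range(n))
        if mean_fitness == 0.0:
            break  # no co-occurrences at all, nothing to select
        new_x = [x[i] * fitness[i] / mean_fitness for i in range(n)]
        converged = max(abs(a - b) for a, b in zip(new_x, x)) < tol
        x = new_x
        if converged:
            break
    return [i for i in range(n) if x[i] > 1.0 / n]
```

For example, with the toy contrasts `[{0, 1}, {0, 1, 2}, {0, 1}, {2, 3}]`, clusters 0 and 1 co-occur three times and end up as the dominant network. To obtain a second network, the rows and columns of the dominant clusters would be removed from the matrix and the process repeated on the remainder.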
The contrasts in which language was compared with a resting baseline yielded a bilateral, fronto-temporal network of regions listed in Table V and displayed in Figure 1. In addition to large mid superior temporal gyrus activations, there were significant homolog regions in the lateral aTLs, reaching from the STS into the temporal pole. Taken together, the sizes of the temporal activations were comparable in the two hemispheres. Lateral prefrontal activations included the frontal operculum in both hemispheres, a left-sided premotor region in the precentral gyrus and a right-sided area in the middle frontal gyrus close to the junction of the precentral and inferior frontal sulci. The only medial region showing a significant contribution was the presupplementary motor area (pre-SMA; BA 6/8).
The results of the replicator dynamics yielded a dominant network consisting of the two anterior temporal foci (R1, R2) and the left STG region (R3). These regions were concurrently activated in three of the seven studies.
The language contrasts, compared with a perceptual baseline, uncovered a large network of fronto-temporal regions, remarkably similar to that found for the Rest contrasts. These results are shown in Table VI and Figure 2. The most obvious difference is that the language network is more clearly left-lateralized. Once more, there were bilateral contributions of the aTLs, and left-dominant middle and posterior temporal regions. In addition to these regions, which were only slightly more ventral than the STG regions reported in the Rest analysis, a focus in the inferior temporal lobe and a contribution of the right collateral sulcus emerged. Finally, there were prefrontal activations in the triangular part of the left IFG and in the right precentral sulcus, as well as a focus in the left lateral cerebellum.
The replicator dynamics identified the bilateral anterior temporal regions to be dominant (L1, L2). Seven contrasts from five studies contributed to this result. When removing this first replicator solution, the left IFG (L8) and the left posterior STS (L3) proved to form a subnetwork. These regions might be considered part of the perisylvian language cortex and thus to reflect word and sentence level processes.
The contrasts reflecting the comprehension of coherent, comprehensible text showed bilateral aTL activation. The foci are more medial than those in the two previous analyses. All other regions were in the left hemisphere. As before, the mid and posterior middle temporal gyrus, as well as the left IFG, were part of the significant network. In addition, four left-medial areas proved to be important for coherence building: two regions lay in the posterior cingulate cortex and inferior precuneus (PCC/prec, BA 23/31), one in the dmPFC (BA 10), and one in the ventro-medial prefrontal cortex (BA 11). In the recently introduced terminology by Ramnani and Owen , these areas are the polar and rostral sections of BA 10.
Based on two studies, the replicator dynamics identified the left aTL(C1) and the mid MTG (C5) as coactivated. The second subnetwork consisted of the dmPFC (C8) and the more posterior of the two PCC/prec regions (C9). Although this latter result resembles the findings from the coherence judgment task [Ferstl and von Cramon, 2001, 2002], two studies from other laboratories gave rise to this finding [Bottini et al., 1994; Xu et al., 2005] (Fig. 3, Table VII).
The results for the contrasts reflecting special text interpretation processes, such as metaphor comprehension or the understanding of the moral of a story, are to be taken with caution. Only nine contrasts from six studies entered into this analysis, and only 48 coordinates were used. Nevertheless, the results once more yield bilateral anterior temporal regions. However, the RH area is now larger, and the LH peak lies more ventrally, at the anterior end of the inferior temporal gyrus. In addition, there were three regions in the left IFG (BA 45/47/11). Right-sided foci included two dorso-lateral PFC regions, close to the inferior frontal and precentral sulci (BA 9/46), and the TPJ.
Because of the heterogeneity of the contrasts, only the two largest regions, the right aTL and the right IFS, were based on coordinates from more than one study. Similarly, none of the replicator dynamics solutions reflected co-occurrence in more than one study (Fig. 4, Table VIII).
To visually illustrate the stability of the anterior temporal contribution, an overlap map of the ALE maps was calculated. There was no overlapping result for all four analyses. Excluding the Special network, only one voxel was common to the Rest, Language, and Coherence comparisons. The location of this voxel in the left aTL is shown in Figure 5.
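The overlap computation amounts to intersecting the sets of significant voxels from the individual analyses. The sketch below uses made-up toy voxel coordinates purely for illustration; they are not the reported results, and the function name `overlap` is hypothetical.

```python
from typing import Dict, Sequence, Set, Tuple

Voxel = Tuple[int, int, int]


def overlap(maps: Dict[str, Set[Voxel]], names: Sequence[str]) -> Set[Voxel]:
    """Voxels that are significant in every one of the named analyses."""
    selected = [maps[name] for name in names]
    return set.intersection(*selected) if selected else set()


# Toy significance maps (hypothetical coordinates, not the article's data)
toy_maps: Dict[str, Set[Voxel]] = {
    "Rest": {(-50, 10, -20), (-48, 12, -18)},
    "Language": {(-50, 10, -20), (-52, 8, -22)},
    "Coherence": {(-50, 10, -20), (-44, 14, -24)},
    "Special": {(-46, 16, -28)},
}

three_way = overlap(toy_maps, ["Rest", "Language", "Coherence"])
four_way = overlap(toy_maps, ["Rest", "Language", "Coherence", "Special"])
```

In this toy setup the three-way intersection contains a single voxel and the four-way intersection is empty, mirroring the qualitative pattern described above.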
The meta-analysis of language processing in context took advantage of a two-step procedure. In the first step, significant clusters of activation were determined based on activation likelihood estimation. In the second step, replicator dynamics were used to identify subnetworks of coactivated regions. Previous studies using similar numbers of peak coordinates as used here have established the method as reliable [Lancaster et al., 2005; Neumann et al., 2005]. Moreover, the analyses are based on a larger number of studies than prior qualitative reviews [e.g., Gernsbacher and Kaschak, 2003; Mar, 2004] so that we can be confident about their results.
The most striking result of the meta-analysis is the stability of activation in the aTLs. In all four analyses, which were in part based on different studies from different laboratories using different scanning methods, modalities, and materials, bilateral involvement of this region emerged. An illustration of the overlap between the results uncovered a small region in the left aTL that contributed to three of the four results. The main difference between the analyses was that for the coherence analysis, the activation was more medial than for the Rest, Language, and Special analyses.
In all four analyses, the coordinates of the homolog regions in the left and right hemispheres were in remarkably close vicinity. In the Rest and Language analyses, the replicator dynamics solution identified the left and right aTL regions as the dominant subnetwork. This result suggests a bilateral function of the aTL in which the homolog regions are closely interrelated.
The meta-analyses corroborate the role of the aTLs for text comprehension. Although their specific function is still under debate, a crucial contribution to sentence comprehension has been appreciated recently [for review see Stowe et al., 2005]. The theoretical proposals for aTL functionality include memory processes, in particular for autobiographical and emotional, episodic memory, and semantic processes, in particular category specific retrieval of proper names or animate entities [e.g., Damasio et al., 1996; Leveroni et al., 2000; Maratos et al., 2001; Martin and Chao, 2001]. The finding of sensitivity to syntactic violations [Friederici et al., 2003] might either be explained by resulting difficulties with semantic integration or by slight shifts in the anatomical location. In studies on language interpretation, the activations tend to be at the anteriormost end of the STS, reaching into the temporal pole, whereas activations related to syntactic features lie more dorsally, in the anterior temporal plane. Attempts to dissociate these anatomical regions [Humphries et al., 2005] are often difficult because a lack or violation of syntactic structure immediately affects the ease of semantic integration [Ferstl, in press].
Because the temporal lobes are multimodal association areas, it is likely that syntactic, semantic, and episodic information sources are indeed integrated to transform the language input into a meaning based representation. The most parsimonious account, in the framework of text comprehension, is that the aTL implements propositionalization, the process required for combining words into semantically based content units [Kintsch, 1998]. This process, similar to the concept of semantic encoding [Stowe et al., 2005], clearly utilizes prosodic, syntactic, and lexical information to derive a semantic representation. It is an open question whether the more medial anterior temporal regions uncovered in the coherence contrasts have a different functionality, for example related to autobiographical memory [Fink et al., 1996].
In all four analyses, middle and posterior temporal activations were significant. These activations were in the mid portion of the superior temporal gyrus in the Rest analysis, reflecting the fact that the majority of contrasts entering into this analysis used auditory presentation. Turkeltaub et al. , in their meta-analysis of word reading, reported bilateral peaks very close to ours. These middle temporal activations are slightly more ventral, in the STS, when a perceptual baseline is subtracted [cf. Dehaene et al., 1997]. In the Language analysis, there were three foci along the left STS and the adjacent part of the middle temporal gyrus. For more specific contrasts, the activation peaks appear to lie more posteriorly and dorsally. In particular, the Coherence analysis and the Special analysis yielded foci in the left posterior middle temporal gyrus and the right TPJ, respectively. Bavelier et al. , in an early study on sentence processing, analyzed the time course of temporal activation. They found a spread of activation from mid superior temporal lobe both anteriorly into the temporal pole and posteriorly into the TPJ. Thus, the mid portion is likely to implement basic language perception, whereas the anterior and posterior temporal lobes are concerned with integration and interpretation.
The activations in the lateral PFC lay predominantly in the IFG. One activation fell into the dorsal triangular part close to the inferior frontal sulcus (BA 44/45), traditionally named Broca’s area. This region proved important for language processing compared with a perceptual baseline. Because this region co-occurred with activation in the superior temporal gyrus (or Wernicke’s area), this result is likely to reflect language processing on the word and sentence level.
Most of the IFG activations were in the opercular part or reaching into the orbito-frontal cortex (BA47/12). Three of these activations were found in the Special analysis, and come from one study only [Ferstl et al., 2005]. More readily interpretable are the bilateral activations in the frontal operculum for the Rest analysis. This finding has often been reported for auditory language processing, independent of context or comprehensibility [Meyer et al., 2000]. However, the left-sided opercular region was also significant in the analysis of Coherence. Thus, it is likely that its function goes beyond language perception. Proposals include the processing of sentence level context [Baumgärtner et al., 2002] or the need to make decisions based on recently encountered stimuli [Petrides et al., 2002; Ferstl et al., 2005].
Interestingly, the meta-analyses reported here reflect the lack of large dorso-lateral prefrontal activations in many of the studies. There were two right-sided foci. In the Rest analysis, the right middle frontal gyrus proved active, whereas in the Special analysis a slightly more ventral region in the inferior frontal sulcus emerged.
Medial activations proved reliable in two of the four analyses. An area in the presupplementary motor area (pre-SMA) appeared when language processing was compared with a resting baseline. The peak of this activation is almost identical to that found in the meta-analysis of word reading [Turkeltaub et al., 2002]. Thus, it is likely that the activation is related to inner speech used for encoding both auditory and written language.
Consistent with previous reports on inferencing, we also found medial activations in the meta-analysis of coherence building. There was a pair of activations in the fronto- and parieto-medial cortices, which replicator dynamics suggested to be coactivated. Because of cortico-cortical connections of these two regions [Barbas, 1992], coactivation of fronto- and parieto-medial areas has been reported in numerous studies [e.g., Zysset et al., 2002, 2003], and in particular in studies on inferencing [Ferstl and von Cramon, 2001, 2002]. The replicator solution reported here was based on coordinates from two studies from other laboratories, though [Bottini et al., 1994; Xu et al., 2005]. This finding shows that the results obtained with the coherence judgment task [Ferstl and von Cramon, 2001, 2002] generalize to other materials and tasks.
The small ventro-medial region in BA 11 (or, more specifically, in rostral BA 10 in the terminology recently suggested by Ramnani and Owen, 2004) was also significant in the Coherence analysis. A number of recent results have implicated this region in emotion processing during verbal tasks and in the processing of verbal humor [Ferstl et al., 2005; Goel and Dolan, 2001, 2004; Siebörger et al., 2004]. Indeed, among the studies contributing to this result is an experiment in which funny cartoons provide the background knowledge for loosely connected sentences [Maguire et al., 1999]. The activation uncovered by the meta-analysis might thus reflect concurrent affective reactions elicited by the language stimuli.
There were right hemisphere activations in all analyses. Besides the aforementioned regions in the right temporal lobe, right prefrontal activations were reliable in the Rest and Special analyses. For auditory presentation, the finding of bilateral frontal operculum activation replicates results from sentence processing studies [e.g., Meyer et al., 2000]. The more dorsal right PFC activation found for Special contrasts was based mostly on the study by Caplan and Dapretto . However, these authors reported bilateral patterns, including left PFC activations.
The prediction of the RH being important for inference processes [Beeman, 2005; Beeman and Chiarello, 1998; Mason and Just, 2004] was not supported. The analysis of coherence building yielded a clearly left dominant network, with the right aTL as the only significant RH region. It is important to note, however, that some imaging studies reporting RH activation for inference processes had to be excluded because of the lack of reported coordinates [George et al., 1999; Mason and Just, 2004]. As argued elsewhere [Ferstl, in press], it might be that a region of interest analysis, as conducted in these studies, compensates for the higher anatomical variability of the RH, and is thus more powerful to detect RH contributions.
It would be desirable to conduct a meta-analysis of ToM studies using the mathematical methods presented here for an objective comparison. However, the results presented strongly suggest an overlap between the ELN and the regions implicated for ToM processes in qualitative reviews [Frith and Frith, 2003]. Although, in contrast to other reviews, specific contrasts testing for ToM using verbal materials were excluded, the aTL, TPJ, and dmPFC regions were clearly significant in several analyses. The most striking result was the network obtained in the Coherence analysis. In addition to the aTLs, the dmPFC, and the posterior STS, this network included the IFG (BA 45/47) and the lower precuneus, regions closely connected to the dmPFC [Ramnani and Owen, 2004]. In particular, the Language and the Coherence analyses yielded left posterior STS activations, slightly more ventral than the TPJ region identified by Saxe and Kanwisher  for ToM story comprehension. Thus, the dmPFC seems to be specific for the successful interpretation of coherent language, rather than having a role in language comprehension in general.
Despite this overlap between the networks, there is no obvious causal direction. Although any communication might entail a ToM component [cf. Frith and Frith, 2003; Sperber and Wilson, 1995], many ToM tasks are explicitly verbal. For instance, the ToM-stories used by Saxe and Kanwisher, but not their control texts, require elaborate inference processes. Even nonverbal ToM tasks [e.g., Castelli et al., 2000] are likely to elicit verbalization or narrativization [Bruner and Feldman, 1993]. In the only imaging study specifically designed to dissociate verbal inference from mentalizing processes [Ferstl and von Cramon, 2002], we found both coherence and ToM to be sufficient, but neither of them necessary, for engaging the dmPFC. Thus, we have argued for a domain-independent general process encompassing inferencing, evaluation, and ToM [Ferstl and von Cramon, 2002; Zysset et al., 2002, 2003], going beyond the processing of self-relevant stimulus materials [cf. Northoff and Bermpohl, 2004].
One serious limitation of the method is that only those coordinates that the authors choose to report can be included. Studies using anatomically defined regions of interest rather than standard stereotactic coordinates have to be excluded altogether. Some studies include only one peak coordinate for each connected region, even if it is rather large, whereas others report submaxima within these regions. Some authors use a masking procedure, and some do not use whole-head measurement but scan only a predefined section of the brain. Thus, the reported results might cover only a subset of the relevant regions.
Note further that the included studies have not been weighted for the number of subjects or scans, or for the statistical threshold reported. Although the possible influence of such weighting has been discussed in the literature for some time [e.g., Lancaster et al., 2005; Neumann et al., 2005; Nielsen et al., 2004; Turkeltaub et al., 2002], to our knowledge no agreed-upon solution exists to date. This is because of the complex relationship between the relevant factors [see Lancaster et al., 2005 for a discussion of this issue]. Thus, altering the data as little as possible seemed to be the most objective choice.
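To make concrete what "unweighted" means here, the following is a minimal sketch of an activation likelihood estimate at a single location, in the commonly described form where each reported focus is modeled as a three-dimensional Gaussian and the estimate is the probability that at least one focus lies at that location. It is an illustration under stated assumptions, not the implementation used for the analyses reported here: the function name `ale_at_point`, the 10 mm FWHM, the 1 mm³ voxel volume, and the example coordinates are all hypothetical.

```python
import math


def ale_at_point(point, foci, fwhm_mm=10.0):
    """Unweighted activation likelihood estimate at one location.

    point and foci are (x, y, z) tuples in mm. Every focus gets the same
    Gaussian width, regardless of sample size, number of scans, or the
    statistical threshold of the study it came from.
    The FWHM value and the 1 mm^3 voxel volume are illustrative assumptions.
    """
    # Convert the full width at half maximum to a standard deviation.
    sigma = fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    # Peak density of an isotropic 3D Gaussian (per mm^3).
    norm = (2.0 * math.pi * sigma ** 2) ** -1.5

    log_prob_none = 0.0
    for focus in foci:
        d2 = sum((a - b) ** 2 for a, b in zip(point, focus))
        # Probability that this focus lies in a 1 mm^3 voxel at `point`.
        p = norm * math.exp(-d2 / (2.0 * sigma ** 2))
        # Accumulate log(1 - p) to avoid rounding for very small p.
        log_prob_none += math.log1p(-p)

    # Probability that at least one focus lies at `point`.
    return -math.expm1(log_prob_none)


# Hypothetical foci from two studies; both contribute identically.
example_foci = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
example_ale = ale_at_point((2.0, 0.0, 0.0), example_foci)
```

A weighting scheme of the kind discussed in the literature would, for instance, make the Gaussian width or the contribution of each focus depend on the sample size of its study. The sketch deliberately omits any such factor, so each reported coordinate counts equally.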
A second problem, specific to the issue at hand, is that there still is a lack of relevant empirical data. Particularly for the Special analysis, the large variety of tasks, methods, and presentation modalities does not allow us to attribute the resulting network to specific subprocesses of text comprehension. For example, lateral prefrontal activation can be thoroughly evaluated only when the number of available studies makes a comparison across tasks of differing complexity feasible. Further experimentation is needed so that the suggestions reported here become more definite. In particular, it is necessary to conduct well-designed studies to evaluate specific subprocesses of comprehension and pragmatic interpretation.
The application of recently developed meta-analysis methods to the functional neuroanatomy of text comprehension has identified a network of fronto-temporal brain regions, extending beyond the perisylvian language cortex. In addition to left inferior frontal and posterior superior temporal regions, there was an extremely stable contribution of the aTLs. Importantly, the specific role of fronto- as well as parieto-medial regions for inference processes was strengthened. In contrast to previous reviews of text comprehension studies [e.g., Gernsbacher and Kaschak, 2003; Mar, 2004], the meta-analysis did not provide evidence for a special role of the right hemisphere for inference processes. An attribution of right prefrontal and right temporal functions to special comprehension tasks [e.g., Caplan and Dapretto, 2001; Ferstl et al., 2005] requires further empirical replications.
1 For lack of a better term, and consistent with conventional usage, the term “network” is used throughout the manuscript to refer to a set of cortical regions contributing to the same functional tasks. We do not claim that the meta-analysis method provides insight into whether these regions are anatomically or even functionally connected.
2 Following the neurological convention, the terms “dorsal” (high) and “ventral” (low) are used when referring to the inferior-superior dimension of brain regions, corresponding to the z-axis in stereotactic space.