Localization of language areas prior to brain surgery can help determine the risk of post-operative aphasia and may be useful for modifying surgical procedures to minimize such risk. Anterior temporal lobe resection is a common and highly effective treatment for intractable epilepsy (Wiebe et al., 2001; Tellez-Zenteno et al., 2005), but carries a 30–50% risk of decline in naming ability when performed on the left temporal lobe (Hermann et al., 1994; Langfitt & Rausch, 1996; Bell et al., 2000; Sabsevitz et al., 2003). In addition to retrieval of names, the left temporal lobe is classically associated with speech comprehension (Wernicke, 1874). These seemingly different language functions both depend on common systems for processing speech sounds (phonology) and word meanings (lexical semantics), both of which are located largely in the temporal lobe (Indefrey & Levelt, 2004; Awad et al., 2007). Identification of these phonological and lexical-semantic systems is therefore an important goal in the presurgical mapping of language functions.
Functional magnetic resonance imaging (fMRI) is used increasingly for this purpose (Binder, 2006). FMRI is a safe, non-invasive procedure for localizing hemodynamic changes associated with neural activity. Many fMRI studies conducted on healthy adults have investigated the brain correlates of speech comprehension, though with a variety of activation procedures and widely varying results. There has been little systematic, quantitative comparison of these activation protocols. There is at present little agreement, for example, on which type of procedure produces the strongest activation, which is most specific for detecting processes of interest, and which is associated with the greatest degree of hemispheric lateralization.
Speech comprehension protocols can be categorized in general terms according to stimulus and task factors (see the table below). Speech is an acoustically complex stimulus that engages much of the auditory cortex bilaterally in pre-linguistic processing of spectral and temporal information (Binder et al., 2000; Poeppel, 2001). Most models of speech perception posit a late stage of auditory perception in which this complex acoustic information activates long-term representations for consonant and vowel phonemes (Pisoni, 1973; Stevens & Blumstein, 1981; Klatt, 1989). Many fMRI studies of speech perception have aimed to isolate this phonemic stage of the speech perception process by contrasting spoken words with non-speech auditory sounds that presumably 'subtract out' earlier stages of auditory processing. These non-speech control sounds have varied in terms of their similarity to speech, ranging from steady-state 'noise' with no spectral or temporal information (Binder et al., 2000), to 'tones' possessing relatively simple spectral and temporal information (Démonet et al., 1992; Binder et al., 2000; Desai et al., 2005), to various synthetic speech-like sounds with spectrotemporal complexity comparable to speech (Scott et al., 2000; Dehaene-Lambertz et al., 2005; Liebenthal et al., 2005; Möttönen et al., 2006). As expected, the more similar the control sounds are to the speech sounds in terms of acoustic complexity, the less activation occurs in primary and association auditory areas in the superior temporal lobe (Binder et al., 2000; Scott et al., 2000; Davis & Johnsrude, 2003; Specht & Reul, 2003; Uppenkamp et al., 2006).

[Table: Five types of protocols for mapping speech comprehension areas.]
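The 'subtraction' logic described above can be sketched in a few lines of code. This is a minimal illustration, not the study's actual analysis pipeline: it assumes per-voxel response amplitudes (e.g., GLM beta estimates) for a speech and a control condition, and the variable names and threshold are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated per-voxel response amplitudes for a speech condition and a
# non-speech control condition (illustrative values, not real data).
n_voxels = 1000
beta_speech = rng.normal(loc=0.5, scale=1.0, size=n_voxels)
beta_control = rng.normal(loc=0.0, scale=1.0, size=n_voxels)

# Cognitive subtraction: the speech-minus-control contrast is assumed to
# remove activation shared by both conditions (early auditory processing),
# leaving responses attributable to later, speech-specific stages.
contrast = beta_speech - beta_control

# Voxels exceeding an arbitrary threshold are labeled 'active'.
threshold = 1.0
active = contrast > threshold
print(int(active.sum()), "suprathreshold voxels")
```

The key design choice, as the paragraph above notes, is the control condition: the more acoustically similar `beta_control`'s generating stimulus is to speech, the more of the shared auditory response is subtracted out.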
Another principle useful for categorizing speech comprehension studies is whether or not an active task is requested of the participants. Many studies employed passive listening to speech, whereas others required participants to respond to the speech sounds according to particular criteria. Active tasks focus participants' attention on a specific aspect of a stimulus, such as its form or meaning, which is assumed to cause 'top-down' activation of the neural systems relevant for processing the attended information. For example, some prior studies of speech comprehension sought to identify the brain regions specifically involved in processing word meanings (lexical-semantic system) by contrasting a semantic task using speech sounds with a phonological task using speech sounds (Démonet et al., 1992; Mummery et al., 1996; Binder et al., 1999).
A final factor to consider in categorizing speech comprehension studies is whether or not an active task is used for the baseline condition. Performing a task of any kind requires a variety of general functions such as focusing attention on the relevant aspects of the stimulus, holding the task instructions in mind, making a decision, and generating a motor response. Active control tasks are used to 'subtract out' these and other general processes that are considered non-linguistic in nature and therefore irrelevant for the purpose of language mapping. Another potential benefit of such tasks is that they provide a better-controlled baseline state than resting (Démonet et al., 1992). Evidence suggests that the conscious resting state is characterized by ongoing mental activity experienced as 'daydreams', 'mental imagery', 'inner speech' and the like, which is interrupted when an overt task is performed (Antrobus et al., 1966; Pope & Singer, 1976; Singer, 1993; Teasdale et al., 1993; Binder et al., 1999; McKiernan et al., 2006). It has been proposed that this 'task-unrelated thought' depends on the same conceptual knowledge systems that underlie language comprehension and production of propositional language (Binder et al., 1999). Thus, resting states and states in which stimuli are presented with no specific task demands may actually be conditions in which there is continuous processing of conceptual knowledge and mental production of meaningful 'inner speech'.
The table illustrates five types of contrasts that have been used to map speech comprehension systems. The first four are obtained by crossing either a passive or active task state with a control condition using either silence or a non-speech auditory stimulus. The last contrast uses speech sounds in both conditions while contrasting an active semantic task with an active phonological task. Though there are other possible contrasts not listed in the table (e.g., Active Speech vs. Passive Non-Speech), most prior imaging studies on this topic can be classified into one of these five general types. Within each type, of course, are many possible variations on both stimulus content (e.g., relative concreteness, grammatical class, or semantic category of words; sentences vs. single words) and specific task requirements.
Our aim in the current study was to provide a meaningful comparison among these five types of protocols through controlled manipulation of the factors listed in the table. Interpreting differences between any two of these types requires that the same or comparable word stimuli be used in each case. We used single words for each protocol, as these lend themselves readily to use in semantic tasks. Similarly, we used the same non-speech tone stimuli for the passive and active control conditions in protocols 2 and 4. Finally, because fMRI results are known to be highly variable across individuals and scanning sessions, we scanned the same 26 individuals on all five protocols in the same imaging session. The results were characterized quantitatively in terms of extent and magnitude of activation, specific brain regions activated, and lateralization of activation. The results provide the first clear picture of the relative differences and similarities, advantages and disadvantages of these speech comprehension mapping protocols.
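One of the quantitative measures mentioned above, lateralization of activation, is conventionally summarized with a laterality index, LI = (L − R) / (L + R), where L and R are activation measures (e.g., suprathreshold voxel counts) in the left and right hemispheres. The sketch below illustrates that standard convention; it is not necessarily the exact computation used in this study, and the example counts are hypothetical.

```python
def lateralization_index(left_count: float, right_count: float) -> float:
    """Standard laterality index: +1 means fully left-lateralized,
    -1 fully right-lateralized, 0 perfectly bilateral."""
    total = left_count + right_count
    if total == 0:
        return 0.0  # no suprathreshold activation in either hemisphere
    return (left_count - right_count) / total

# Hypothetical example: 800 suprathreshold voxels on the left, 200 on
# the right, yielding a strongly left-lateralized index.
li = lateralization_index(left_count=800, right_count=200)
print(li)  # 0.6
```

Extent (voxel counts) and magnitude (mean signal change) can each be plugged into the same formula, which is one reason LI values are sensitive to the threshold used to define 'active' voxels.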