The ability of written and spoken words to access the same semantic meaning provides a test case for the multimodal convergence of information from sensory to associative areas. Using anatomically-constrained magnetoencephalography (aMEG), the present study investigated the stages of word comprehension in real time in the auditory and visual modalities, as subjects participated in a semantic judgment task. Activity spread from the primary sensory areas along the respective ventral processing streams and converged in anterior temporal and inferior prefrontal regions, primarily on the left, at around 400 ms. Comparison of repetition priming effects between the two modalities suggests that responses are initiated by modality-specific memory systems but are eventually elaborated mainly in supramodal areas.
Human interaction and thought crucially depend on words, which can take either auditory or visual form. During initial word processing stages, the acoustic signal or letter string is analyzed in its respective sensory modality, followed by mapping of letter symbols or phonemes onto a word lexicon and, finally, semantic access and integration. The fact that the same semantic knowledge can be accessed by symbols in two different modalities allows exploration of the brain substrate that underlies retrieval of supramodal meaning.
Evidence from functional brain imaging suggests that language comprehension is subserved by modality-specific distributed networks (Cabeza and Nyberg, 2000). Spoken language as well as complex nonspeech stimuli evoke bilateral activation in the superior temporal cortices (Binder et al., 1997; Wise et al., 1991; Zatorre et al., 1992). Leftward speech-related asymmetry has been observed in temporal and left inferior prefrontal cortex (LIPC) (Friederici et al., 2000; Frost et al., 1999; Petersen et al., 1988; Price et al., 1996; Scott et al., 2000). Neuroimaging studies of reading, however, suggest a more clearly left-lateralized activity in ventral temporal and inferior prefrontal regions (Fiez and Petersen, 1998; Gabrieli et al., 1998).
The importance of the left anterior temporal cortex for language processing and semantic memory has been documented with PET (Devlin et al., 2002; Perani et al., 1999), in studies of patients with semantic dementia (Hodges et al., 1992) and with intracranial recordings (Halgren et al., 1994b; Nobre and McCarthy, 1995; Smith et al., 1986). Imaging studies suggest that the LIPC is involved in semantic language tasks and is sensitive to mnemonic manipulations such as priming (Bokde et al., 2001; Gabrieli et al., 1998; Roskies et al., 2001; Wagner et al., 2000). Indeed, the importance of both areas, the left temporal and left inferior prefrontal cortices, for supramodal semantic processing has been underscored by their contributions to repetition priming effects across both auditory and visual modalities (Buckner et al., 2000; Chee et al., 1999). However, the timing and the sequence of their involvement are not clear.
Benefiting from the excellent temporal resolution provided by event-related potentials (ERPs), language studies have described a negativity peaking at ~400 ms (N400) which is evoked by potentially meaningful material including spoken, written or signed words and pictures (Kutas and Federmeier, 2000). The N400 is modulated by priming and is commonly viewed as reflecting attempts to access and integrate a semantic representation into a current context (Brown and Hagoort, 1993; Halgren, 1990; Holcomb, 1993; Rugg and Doyle, 1994). The overlapping scalp distribution of the N400 evoked by auditory and visual stimuli suggests that it may reflect access to a supramodal semantic network, as a final common pathway originating in respective sensory-specific processing areas (Domalski et al., 1991; Gomes et al., 1997; Hagoort and Brown, 2000; Rugg and Nieto-Vegas, 1999). MEG studies of the N400 relying on the equivalent current dipole modeling method estimated the source of the N400m (the magnetic equivalent of the N400) in the vicinity of the superior temporal sulcus (STS) (Helenius et al., 1998; Helenius et al., 2002; Makela et al., 2001; Sekiguchi et al., 2001; Simos et al., 1997). Intracranial recordings suggest a widespread distribution of local N400 generators, particularly in the anteroventral temporal lobe (Halgren et al., 1994b; McCarthy et al., 1995; Nobre and McCarthy, 1995; Smith et al., 1986), but also in the inferolateral prefrontal cortex, occipito-temporal cortex, middle and superior temporal gyri, and supramarginal gyrus (Halgren et al., 1994a; Halgren et al., 1994b).
Taken together, the available neuroimaging evidence suggests that the prefrontal and temporal regions contribute to semantic and mnemonic processing of words presented in both spoken and written form (Buckner et al., 2000; Petersen et al., 1988), and may represent the neural basis of supramodal processing. Conversely, ERPs and MEG provide complementary evidence suggesting that access to the central semantic store is reflected in the N400 component (Kutas and Federmeier, 2000). The present study attempts to integrate the two lines of evidence by comparing brain activity evoked by a semantic judgment task presented in auditory and visual modalities. Our underlying hypothesis is that the left inferior prefrontal and temporal areas are primarily engaged while accessing supramodal semantic knowledge at approximately 400 ms post-stimulus, subsequent to modality-specific and transitional processing stages.
The aim of this study was to examine the spatio-temporal characteristics (“where and when”) of word comprehension in real time in the auditory and visual modalities. High-resolution structural MRI was combined with temporally-precise whole-head high-density MEG and a distributed source modeling approach to estimate the anatomical distribution and hierarchical interdependence of the underlying neural networks (Dale et al., 2000; Dale and Sereno, 1993; Dhond et al., 2001). The task required subjects to estimate the physical size of an animal or object denoted by a word. By using repetition priming, this study also examined the brain areas showing memory-related differential responses to primed items and investigated their time-courses. Finally, the question of the nature of semantic processing and its dependence on the presentation modality was considered by comparing patterns of brain activity obtained in the same subjects across both auditory and visual forms of the same task.
Adequate data for both versions of the task were obtained in 9 subjects. Response times and performance accuracy on the task were monitored continuously. Good agreement between the subjects’ judgment and the pre-specified size criteria was suggested by high performance levels. Performance indices (d’, percent correct scores and reaction times) were analyzed with a repeated-measures ANOVA with factors of task (visual, auditory) and repetition (new and repeated stimuli). Based on d’, a bias-free measure of stimulus discriminability, performance was significantly better on repeated than new items, F(1,8) = 14.4, p < 0.005; d’ = 2.64 and 3.96 for new and repeated items, respectively. In contrast, d’ values did not differ between the two task modalities, F(1,8) = 0.1, p > 0.5. Examination of the percent correct scores revealed slightly better performance on the visual form of the task (92.3% vs. 87.2% correct hits), F(1,8) = 5.38, p < 0.049 (uncorrected for multiple comparisons). However, there were also more false alarms in the visual version of the task (means = 10.4 vs. 3.4), F(1,8) = 6.02, p < 0.04 (uncorrected).
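The d’ values reported above follow the standard signal-detection definition, d’ = z(hit rate) − z(false-alarm rate). A minimal sketch of that computation (the 92.3% hit and 10.4% false-alarm figures are reused here purely as illustrative rates; the text does not state that d’ was computed from exactly these numbers):

```python
from scipy.stats import norm

def d_prime(hit_rate, fa_rate):
    """Bias-free discriminability index: d' = z(H) - z(FA).
    Rates of exactly 0 or 1 are clipped so the z-transform stays finite."""
    eps = 1e-3
    hit_rate = min(max(hit_rate, eps), 1 - eps)
    fa_rate = min(max(fa_rate, eps), 1 - eps)
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)

# Illustrative rates only (loosely based on the visual-task percentages)
print(round(d_prime(0.923, 0.104), 2))
```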
Reaction times were faster in the visual than in the auditory task, F(1,8) = 67.8, p < 0.0001, and faster to repeated than to new items across both tasks, F(1,8) = 39.9, p < 0.0005. Mean reaction times (with standard errors of the mean) were 960.0 (40.5) and 759.8 (26.9) ms for novel and repeated written words, and 1063.7 (27.2) and 981.8 (32.9) ms for novel and repeated spoken words, respectively.
The cortical surface of each individual served as the solution space for current dipoles generating the MEG signal. Estimated dipole strengths across locations were transformed using noise estimates into dynamic statistical parametric maps (dSPMs) (Dale et al., 2000). These maps indicate the statistical significance of estimated activity at each latency and cortical location. The maps were averaged across all subjects using cortical surface alignment of corresponding anatomical features (Fischl et al., 1999b). Cortical activity was displayed on “inflated” views of an averaged cortical surface, permitting a view of the activity estimated to lie within the sulci. Inspection of the overall activity patterns averaged across all nine subjects suggested that the earliest activity to novel words was confined to the visual or auditory sensory areas during tasks in the two respective modalities, followed by activity in association areas of the anterior temporal and prefrontal cortices. Estimated activity to novel words for selected latencies is presented in Figure 1. The overall activity pattern is described here for each modality seriatim.
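The noise normalization behind the dSPMs can be sketched as follows: each minimum-norm dipole estimate is divided by the noise standard deviation projected through the same inverse operator, yielding a z-like statistic per cortical location and latency. This is a simplified, single-orientation reading of Dale et al. (2000); the matrices below are random placeholders, not real MEG data:

```python
import numpy as np

def dspm(inverse_op, sensor_data, noise_cov):
    """Noise-normalized minimum-norm estimate (simplified dSPM):
    divide each source timecourse by its projected noise std."""
    # Linear source estimate: one timecourse per cortical location
    q = inverse_op @ sensor_data
    # Noise variance projected through the inverse operator: diag(W C W^T)
    noise_var = np.einsum('ij,jk,ik->i', inverse_op, noise_cov, inverse_op)
    return q / np.sqrt(noise_var)[:, None]

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 10))      # hypothetical inverse operator (4 sources, 10 sensors)
C = np.eye(10)                        # hypothetical (white) noise covariance
data = rng.standard_normal((10, 50))  # 50 time samples of sensor data
z = dspm(W, data, C)
print(z.shape)                        # (4, 50)
```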
In the auditory task, the earliest reliable activity was seen bilaterally at 55 ms in the superior temporal area, as a magnetic analogue of the event-related P50 component measured on the scalp (Figure 1). Activity estimated to lie in the subcentral gyrus most likely reflects blurring of the superior temporal activity; although these areas appear to be distinct on the inflated surface presentation, they are immediately adjoining in the actual brain volume. This initial activity was followed by a second peak at ~100 ms (N100m) in the perisylvian/superior temporal plane region. At ~200 ms activity spread forward, and by 250 ms activity was estimated to lie in the anterior regions of the temporal lobe (AT), the perisylvian area and posterior inferior prefrontal regions bilaterally. In addition to a sustained contribution from the bilateral perisylvian region, activity was left-dominant after 300 ms, relying primarily on the AT, anterior LIPC (aLIPC) and bilateral ventromedial prefrontal (VMPF) areas. Timecourses of the estimated activity were plotted for selected locations. These waveforms (Figure 4) suggest sustained contributions from the left-dominant AT initially peaking at ~410 ms, followed by the aLIPC (at ~490 ms) until ~700 ms.
In the visual task, activity started bilaterally in the occipital area and spread along the ventral visual pathway (Figure 1). The left-dominant activity followed the posterior-anterior axis, peaking in the ventral occipitotemporal area at ~170 ms, at ~230 ms in the STS and inferolateral temporal area, and at ~350 ms in the AT area, encompassing the LIPC and orbitofrontal cortex bilaterally by 400 ms. This left temporo-frontal activity peaking at ~400 ms may be the magnetic equivalent of the N400, or N400m. Activity seen in the posterior occipitotemporal region at ~420 ms appears to be partially due to the visual offset response and is superimposed over the N400m effects.
The overall activity patterns were analogous across modalities. In both tasks, activity started in modality-specific areas and progressed anteriorly via the respective ventral streams, engaging the left STS and left AT areas after ~250 ms. The anterior LIPC was recruited in both tasks particularly during the N400m period, although its contribution started earlier and was sustained for longer in the auditory version. Both tasks elicited similar patterns of activity during the N400m, including the left AT and LIPC as well as bilateral VMPF areas. Whereas activity was strongly left-lateralized during visual task presentation, the auditory version resulted in bilateral perisylvian activity.
Waveforms evoked by repeated words were subtracted from the waveforms obtained on novel trials for each individual subject. Dynamic SPMs of these differences were averaged across the nine participants (Dale et al., 2000; Dhond et al., 2001). Selected snapshots illustrate the early (~250 ms) and late repetition effects (after 300 ms) in Figures 2 and 3, respectively. Timecourses of the estimated noise-normalized dipole strengths (Figure 4) offer an alternative insight into the temporal dynamics of the activity estimated for each cortical location. Using within-subject statistical comparisons (Woodward et al., 1990), significance of the novel – repeat differences for the “early” (225–250 ms) and “late” (300–500 ms) estimated activity was additionally tested for several relevant regions of interest (ROI). Because they account for the variation of the activity within the sample, ROI-ANOVAs as performed here allow inferences that can be generalized to the population and are equivalent to a random-effects analysis in fMRI. Additional stringency of this test derives from the requirement that the differences in estimated activity coincide in both space and time across subjects: in space because exactly the same ROIs were used across all subjects, not allowing any individual variations in localization estimates; in time because the same time window was tested for all subjects.
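With only two conditions (novel, repeated), a within-subject ROI-ANOVA of this kind is equivalent to a paired t-test, with F(1, n−1) = t². A sketch with hypothetical per-subject ROI values (illustrative numbers, not the study’s data):

```python
import numpy as np
from scipy import stats

def roi_anova(novel, repeated):
    """Within-subject repetition effect for one ROI/time window.
    With two conditions, a repeated-measures ANOVA reduces to a
    paired t-test, and F(1, n-1) = t**2."""
    t, p = stats.ttest_rel(np.asarray(novel), np.asarray(repeated))
    return t**2, p

# Hypothetical mean dSPM values for 9 subjects in one ROI/time window
novel    = [3.1, 2.8, 3.5, 2.9, 3.3, 3.0, 2.7, 3.4, 3.2]
repeated = [2.4, 2.2, 2.9, 2.3, 2.8, 2.6, 2.1, 2.7, 2.5]
F, p = roi_anova(novel, repeated)
print(F > 10, p < 0.01)  # → True True
```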
Repetition effects in the auditory modality were due to stronger responses to novel words (Figure 2, left panel). Within-subject statistical comparisons (ROI-ANOVA) of the estimated noise-normalized dipole strengths within the 225–250 ms latency window suggested significant differences in the superior temporal plane, F(1,8) = 28.2, p < 0.001, the STS, F(1,8) = 17.1, p < 0.01, and the temporopolar area, F(1,8) = 7.9, p < 0.05.
In contrast, in the visual modality it was the repeated words that evoked stronger and earlier activity than novel words, particularly in the left inferior temporal region and LIPC (Figure 2, right panel). ROI-ANOVA for the 225–250 ms latency window confirmed the repeat > novel effect only in the left inferolateral temporal, F(1,8) = 6.2, p < 0.05, and aLIPC, F(1,8) = 8.0, p < 0.05, regions. Furthermore, responses evoked by repeated words had shorter peak latencies than novel words in both the anterior temporal, F(1,8) = 10.7, p < 0.01 (means = 229 and 239 ms for repeated and novel words, respectively), and LIPC areas, F(1,8) = 7.1, p < 0.05 (means = 237 and 248 ms). These effects can be observed in the estimated timecourses (Figure 4). The repeat > novel effects were apparently limited to these regions in the left hemisphere, because the differences were not significant in the neighboring regions, nor were they significant in the prefrontal or temporal areas on the right.
Activity subsequent to these early effects was significantly stronger to novel words in both modalities, and the N400m was reflected in the prominent engagement of the left temporal, followed by the left prefrontal, region. Dynamic SPMs of the difference between the novel and repeated words presented in Figure 3 suggest a considerable overlap in the estimated substrates of the N400m in the two modalities. In addition, ROI-ANOVAs in the 300–500 ms latency window confirmed significant repetition effects in the left anterior temporal region for both modalities, including the temporopolar area, F(1,8) = 20.4, p < 0.01 for the auditory and F(1,8) = 11.3, p < 0.01 for the visual version, and the anterior STS region, F(1,8) = 11.9, p < 0.01 (auditory) and F(1,8) = 12.8, p < 0.01 (visual). Similarly, significant repetition effects were seen in the anterior LIPC for the auditory, F(1,8) = 21.7, p < 0.01, and visual tasks, F(1,8) = 10.6, p < 0.01, as well as in the left prefrontal area superior to the LIPC in both the auditory, F(1,8) = 5.6, p < 0.05, and visual task versions, F(1,8) = 14.4, p < 0.01. Significant interactions between the factors of modality and repetition in the right hemisphere in the inferior prefrontal, F(1,8) = 9.0, p < 0.05, temporopolar, F(1,8) = 14.9, p < 0.01, and anterior STS areas, F(1,8) = 9.6, p < 0.05, suggested a bilateral nature of processing in the auditory and left-lateralized effects in the visual modality at this latency. Indeed, the repetition effects in the right hemisphere were significant for the auditory task in the inferior prefrontal, F(1,8) = 7.0, p < 0.05, temporopolar, F(1,8) = 17.3, p < 0.01, and anterior STS areas, F(1,8) = 20.7, p < 0.005. In contrast, no difference was seen for the visual task in the right prefrontal, F(1,8) = 0.5, p > 0.5, right temporopolar, F(1,8) = 0.2, p > 0.5, or anterior STS regions, F(1,8) = 0.6, p > 0.5.
For the auditory task, the average peak latency of the activity estimated to lie in the anterior temporal region was 460 ms, followed by two prominent peaks estimated in the anterior LIPC with latencies of 490 and 600 ms (Figure 4). In contrast, the visual task evoked a bifurcated peak with latencies of 350 ms and 410 ms in the AT region, followed by activity in the anterior LIPC peaking at 450 ms.
In sum, whereas repetition effects differed between the auditory and visual tasks in the earlier processing stages (before 300 ms), their substrates overlapped increasingly at later latencies, particularly during the N400m time window. The main overlapping activity was observed in the left temporal region, with a very similar pattern appearing already at ~350 ms, as well as the anterior LIPC and ventromedial prefrontal regions at later latencies, after ~400 ms. The temporo-prefrontal contributions were strongly left-lateralized in the visual task version only.
The view that language relies on distributed but interactive brain areas has been supported by a large number of studies (for reviews see Cabeza and Nyberg, 2000; Fiez and Petersen, 1998; Geschwind, 1965; Ingvar, 1983; Mesulam, 1998; Raichle, 1996; Warburton et al., 1996). This view assumes modality-specific lexical components accessing a central semantic system. In the current study, the auditory task utilized the phonological input route for understanding spoken words, in contrast to the orthographic route subserving reading. Neural substrates of these modality-specific routes were quite distinct during initial processing, but overlapping areas were subsequently activated during stages of semantic and contextual integration. This is particularly evident for the N400m repetition effect. The present evidence favors the claim that modulations in N400 amplitude reflect a supramodal semantic process with primary overlapping contributions from left inferior prefrontal, left temporal and medial prefrontal areas bilaterally. Additional contributions were estimated to originate in the right prefrontal and temporal areas during the auditory task only.
The spatiotemporal characteristics of the responses to spoken words observed in our study concur with other evidence. The earliest response was estimated to lie in the superior temporal area bilaterally at 55 ms after stimulus onset (Figure 1) and then spread to the auditory “belt” in the perisylvian region as reflected in N100m. Subsequently, activity spread along the auditory “ventral” stream into anterior and lateral areas of the STG, concurring with the analogy to the visual pattern recognition or “what” stream (Binder et al., 1997; Rauschecker and Tian, 2000). Ensuing engagement of the ventral prefrontal areas concurs with other evidence obtained during auditory pattern recognition in human (Rauschecker, 1998) and primate studies (Romanski and Goldman-Rakic, 2002).
In contrast, processing of written words started in the primary visual cortex (Figure 1) and proceeded anteriorly in the ventral visual stream, encompassing middle and superior temporal (Wernicke’s) areas, and finally the LIPC region. The activity pattern observed here corresponds closely to activations reported elsewhere (Dale et al., 2000; Dhond et al., 2001; Fiez and Petersen, 1998; Halgren et al., 2002) and agrees with classical models of language processing (Benson, 1979; Geschwind, 1965).
Whereas processing of written words was strongly left-lateralized in all processing stages subsequent to the visual cortex, hearing spoken words resulted in bilateral perisylvian activity, agreeing with other evidence (Belin et al., 2000; Binder et al., 1997; Cabeza and Nyberg, 2000; Helenius et al., 2002; Hickok and Poeppel, 2000; Wise et al., 1991). After ~300 ms, the response was left-biased in the auditory task version, especially in the anterior LIPC and anteroventral temporal regions, possibly indicating the access of semantic networks predominantly in the left hemisphere (Scott et al., 2000).
Faster and more accurate responses to repeated stimuli were observed in both task versions, replicating a well-established behavioral repetition priming effect (Schacter and Buckner, 1998). The earliest repetition priming effects were estimated primarily to the modality-specific areas (Figure 2). During auditory word presentation, novel items evoked stronger responses in the perisylvian areas after ~230 ms, suggesting that the auditory system could clearly distinguish the first phoneme of the highly familiar repeated words (total of ten) from the novel words that were presented only once. This may reflect perceptual priming based on mnemonic functions of the auditory association cortex (Näätänen et al., 2001).
In contrast, repeated written words showed a brief tendency towards increased activity of the ventral visual stream starting at ~230 ms, including left occipito-temporal and left posterolateral temporal regions (Figure 2). A similar effect in a stem-completion task was reported in a comparable area (the left ventrotemporal area) and at a similar latency (200–245 ms) using the same aMEG technique as the one reported here (Dhond et al., 2001). In an ERP study of word repetition, Nagy et al. (1989) reported a transient repeated > novel effect with an onset at ~200 ms. Using fMRI, Poldrack et al. (1998) observed increased activation in lateral inferior temporal regions to well-practiced stimuli under mirror-reading conditions. This evidence suggests that priming may result in more rapid and preferential access to certain structures along the ventral stream, presumably subserving a transitional stage between perceptual and conceptual function and potentially embodying direct lexical access (Humphreys and Evett, 1985). The latency and anatomical distribution of this effect are consistent with a suggestion placing the word-form processing stage after the early visual processing and before the phonological and semantic stages (Tulving and Schacter, 1990; Warrington and Shallice, 1980). The apparent activation of the more anterior areas, including the AT and LIPC, may represent fast processing of the highly familiar word-form, providing an “early pass” through the relevant parts of the network, the purpose of which may be to prime them for the impending input. The temporal resolution of the aMEG method made it possible to distinguish an early and brief increase to repeated items from a robust subsequent novel > repeat effect.
Thus, processing of written words may benefit from prior exposure through multiple stages, including both perceptual (possibly form-based) and conceptual (semantically mediated) priming (Schacter and Buckner, 1998), resulting in a faster and more accurate stimulus identification.
The strongest priming effects were seen during the N400m time period in both auditory and visual tasks (Figure 3). Responses were larger to novel stimuli, and encompassed primarily the left temporal, LIPC and medial prefrontal areas in both modalities. Additional responses in the right temporal and prefrontal areas were observed in the auditory task version.
One of the goals of this study was to compare the N400m priming effect to that observed with hemodynamic methods. Indeed, overall activation patterns evoked by semantic tasks in fMRI studies are similar to the patterns observed in the present study. They encompass the left inferior prefrontal regions, and left temporal (word reading) or bilateral temporal (word hearing) areas, but commonly exclude the anterior temporal region (Binder et al., 1997; Buckner et al., 2000; Fiez and Petersen, 1998). In contrast, in the present study strong contributions to the N400m in both modalities were estimated to originate in the anterior temporal lobes. This observation is in accord with PET studies that reliably detect activation in the anterior temporal area during semantic tasks (Devlin et al., 2002; Perani et al., 1999; Tzourio et al., 1998). Based on direct comparisons of the activation patterns obtained with PET and fMRI with semantic tasks, it has been suggested that the absence of the anterior temporal activation in fMRI is most likely due to susceptibility artifacts in that region (Devlin et al., 2000; Veltman et al., 2000). Additional support for the role of the anterior temporal area in semantic and mnemonic functions comes from intracranial recordings. They indicate large locally-generated responses within the ~300–700 ms latency range that are similar for faces and words, and maximal in anteroventral temporal areas (Halgren et al., 1994b; McCarthy et al., 1995). Converging evidence is provided by lesion studies. Atrophy in the left anterolateral temporal cortex seen in patients with semantic dementia correlates with impairments in semantic processing (Hodges et al., 1992).
Engagement of the aLIPC occurred subsequent to the temporal area in both auditory and visual versions of the task, though its contributions seemed stronger and more sustained in the auditory version. It was highly sensitive to item repetition, as the activity was significantly reduced to repeated words. Even though the present results are consistent with the view that the aLIPC subserves guided semantic access (Kapur et al., 1994; Poldrack et al., 1999; Tagamets et al., 2000; Wagner et al., 2001), it is not possible to determine whether this activity also represents a phonological analysis (Hagoort et al., 1999), or even a more broadly conceptualized mediation of selection among competing alternatives (Thompson-Schill et al., 1997). Intracranial studies using cognitive tasks have reported a locally generated N400 in the ventrolateral prefrontal cortex (Halgren et al., 1994a), which is consistent with our estimated localizations.
The initial semantic access takes place in overlapping regions and at comparable latencies in both modalities. The observed lag in peak latency of ~50 ms and the protracted activity profile to spoken words estimated in the temporo-frontal areas suggest that the semantic integration of a spoken word may commence based on incomplete auditory input and proceed by sustained integration of the unfolding acoustic stream. This is consistent with previous models of spoken word recognition (Hagoort and Brown, 2000; Marslen-Wilson, 1987), whose main idea is that the comprehension of a spoken word may emerge from a sustained interplay between the continual acoustic input and “top-down” facilitation by the higher association areas (Van Petten et al., 1999). In this study, reaction times to spoken novel words were delayed by ~100 ms on average. In addition to a delay in the onset of semantic integration, other processes, such as applying the task criterion to the results of semantic access/contextual integration and mapping and releasing the motor response, may have contributed to that difference.
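The incremental recognition posited by such models can be illustrated with a toy cohort filter, in which the candidate set shrinks as each successive phoneme arrives (letters stand in for phonemes here, and the lexicon is purely illustrative):

```python
def cohort(lexicon, phonemes):
    """Incremental word recognition: the candidate cohort narrows with
    each incoming phoneme (a toy version of the cohort-model idea)."""
    candidates = list(lexicon)
    history = []
    for i, ph in enumerate(phonemes):
        candidates = [w for w in candidates if len(w) > i and w[i] == ph]
        history.append(list(candidates))
    return history

lex = ['tiger', 'tight', 'table', 'cricket']
print(cohort(lex, 'tig')[-1])  # → ['tiger', 'tight']
```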
Results of other MEG studies of N400 (Helenius et al., 1998; Helenius et al., 2002; Makela et al., 2001; Sekiguchi et al., 2001; Simos et al., 1997) generally estimate the N400m source to the superior temporal sulcus, overlapping partially with our estimates. The differences could be explained by the adopted source modeling technique. A recent study by Halgren et al., (2002) analyzed the same sentence-terminal word data using two methods: fitting of an equivalent current dipole (ECD) that best matches the observed field pattern, as well as a distributed source modeling technique employed in the present study. The first method estimated the N400m ECD in the vicinity of the left superior temporal sulcus, replicating results of other similar studies (Helenius et al., 1998; Simos et al., 1997). However, the distributed solution additionally included anteroventral temporal, orbitofrontal and posteroventral prefrontal cortices on the left, consistent with the present findings and evidence from other neuroimaging techniques.
In the present study the N400m, as an index of supramodal contextual integration (Kutas and Federmeier, 2000), was estimated to originate primarily in the left inferior prefrontal and anterior temporal areas in both modalities. This apparently supramodal network that is engaged during contextual integration corresponds to neuroimaging studies of auditory vs. visual priming (Buckner et al., 2000; Chee et al., 1999). Its sensitivity to repetition priming implicates the N400m in the close mutually-reinforcing interaction between semantic and mnemonic processing. Our results are consistent with the “unitary semantic hypothesis” (Caramazza et al., 1990) stipulating that supramodal semantic stores can be accessed from any modality, after appropriate lexical access through the phonological or orthographic lexicon. Thus, behavioral priming is a result of “savings” in the multiple stages of processing starting relatively early during a modality-specific phase and continuing through the stages of accessing supramodal semantic stores. Even though our results suggest that the observed activity is largely overlapping between the two modalities, they are consistent with the view that the sensory input continues to contribute to the comprehension process.
Nine healthy males participated in MEG/EEG recordings during auditory and visual versions of the task on two separate occasions, in addition to the structural MRI scan. Subjects were all right-handed native English speakers between 22 and 30 years of age (mean = 24.7, SE = 0.77), without hearing or other neurological impairments. No structural brain abnormalities were apparent on their MRI scans. Signed statements of consent were obtained from all participants. They were monetarily reimbursed for their participation.
During a “size” judgment task, participants were presented with words denoting objects, animals, or body parts and were instructed to respond to those larger than one foot (e.g. tiger, shirt), and to refrain from responding to those smaller (e.g. cricket, medal). During practice, ten words were presented repeatedly and subsequently became “repeats”, presented on half of the trials and randomly mixed among “novel”, nonrepeated words that were presented only once. Most likely, this task was carried out by accessing propositional information rather than visual imagery. It presumably imposed rather low-level imagery demands because only the stimuli that were clearly larger or smaller than the reference standard were included in the lists and because all other visual characteristics were irrelevant to the performance.
Two parallel versions of the task with no word overlap were administered to the same participants in the auditory and visual modalities on two separate occasions, 4 months apart on average. Word lists used in the auditory and visual tasks were balanced for word frequency, with means of 13.28 and 12.51 per million, respectively (Francis and Kucera, 1982). The words used in the auditory task were slightly shorter (1.4 syllables and 5.2 letters) than the words used in the visual task (2.1 syllables and 6.6 letters on average), due to presentation timing constraints. The repeated words were chosen to be representative of their respective category with respect to word frequency and length.
Word stimuli were recorded by a native male speaker and were equated for sound onset and duration (500 ms), as well as amplitude level by digitally editing the recorded waveforms. A total of 780 nouns were presented binaurally through plastic tubes at a comfortable level every 2.2 sec. During practice, ten words were presented 6 times and they subsequently became “repeats”, presented on half of the trials and randomly mixed among the 390 “novel”, nonrepeated words that appeared only once during the experiment. The response hand was switched midway through the experiment, in a balanced manner across subjects.
Words were presented on a computer-driven projection screen in front of the subject and subtended a visual angle of less than 5°. The words were presented for 300 ms in Helvetica font as white letters on a black background every 2 sec. During practice, ten words were presented four times each and subsequently became “repeats”, randomly presented on half of the trials. Six of the nine subjects were presented with a total of 320 words, one subject performed the task with a total of 480 words, and two subjects received a total of 640 words. The subjects responded with their left hand throughout the experiment.
MEG was recorded from 204 channels (102 pairs of planar gradiometers) over the entire head with a planar dc-SQUID Neuromag Vectorview system in a magnetically and electrically shielded chamber. The signals were recorded continuously at a 600 Hz sampling rate with minimal filtering (0.1 to 200 Hz). Category-based averages were constructed on-line from trials free of eyeblinks or other artifacts and were low-pass filtered at 30 Hz. Averaged waveforms from a representative subject are shown in Figure 5. The positions of magnetic coils attached to the skull and of the main fiducial points (the nose, nasion, and preauricular points) were digitized with a 3Space Isotrak II system for subsequent precise co-registration with the MRI images.
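As an illustration only, the on-line trial selection and condition averaging described above can be sketched as follows; the array names, epoch limits, and peak-to-peak rejection threshold here are our assumptions, not parameters reported by the study:

```python
import numpy as np

def average_epochs(raw, events, sfreq=600.0, tmin=-0.2, tmax=0.8,
                   reject_peak_to_peak=150e-6):
    """Average event-locked epochs, discarding trials whose peak-to-peak
    amplitude exceeds a threshold (a simple eyeblink/artifact criterion).

    raw    : (n_channels, n_samples) continuous recording
    events : sample indices of stimulus onsets
    Returns the averaged epoch and the number of trials retained.
    """
    n0, n1 = int(tmin * sfreq), int(tmax * sfreq)
    kept = []
    for ev in events:
        epoch = raw[:, ev + n0 : ev + n1]
        # Reject the whole trial if any channel exceeds the threshold
        if np.ptp(epoch, axis=1).max() < reject_peak_to_peak:
            kept.append(epoch)
    return np.mean(kept, axis=0), len(kept)
```

In practice one such average would be formed per condition (novel vs. repeated words), and the result low-pass filtered at 30 Hz as described above.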
MEG signals directly reflect the magnetic fields associated with synaptic currents with millisecond precision. However, the spatial configuration of the intracranial generators cannot be uniquely determined from extracranial measurements without prior assumptions (Hämäläinen et al., 1993). Anatomically-constrained MEG (aMEG) uses the anatomical reconstruction of each individual’s cortical surface from their MRI to constrain the inverse solution (Dale and Halgren, 2001). This approach assumes that the synaptic currents giving rise to the summated MEG signal lie in the cortical gray matter of each subject, as reconstructed from a high-resolution anatomical MRI (Dale et al., 2000; Dale and Sereno, 1993).
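In the standard linear formulation underlying this approach (notation ours, following the linear estimation framework of Dale and Sereno, 1993), the measured sensor data are modeled as a gain matrix applied to the cortical source currents plus noise, and the minimum-norm estimate is the corresponding regularized linear inverse:

```latex
\mathbf{m}(t) = \mathbf{G}\,\mathbf{s}(t) + \mathbf{n}(t), \qquad
\hat{\mathbf{s}}(t) = \mathbf{R}\,\mathbf{G}^{\mathsf{T}}
  \bigl(\mathbf{G}\,\mathbf{R}\,\mathbf{G}^{\mathsf{T}} + \mathbf{C}\bigr)^{-1}\,\mathbf{m}(t),
```

where \(\mathbf{m}(t)\) is the vector of sensor measurements, \(\mathbf{G}\) the forward (gain) matrix for dipoles constrained to the cortical surface, \(\mathbf{s}(t)\) the dipole strengths, \(\mathbf{R}\) the source covariance prior, and \(\mathbf{C}\) the sensor noise covariance.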
The cortical surface reconstructed for each individual from high-resolution 3-D T1-weighted MRI structural images (1.5T Picker Eclipse) was subsampled to ~2500 dipole locations per hemisphere (Dale et al., 1999; Fischl et al., 1999a). This cortical surface served as the solution space for the estimated current generators (dipoles), constraining the MEG solution to the gray matter. Dipole orientation was unconstrained. The forward solution was calculated using a boundary element model (Oostendorp and van Oosterom, 1991). Using a linear estimation minimum norm approach (Dale and Sereno, 1993; Hämäläinen and Ilmoniemi, 1994), dipole strength power was estimated at each cortical location every 5 ms and divided by the predicted noise power (Dale et al., 2000). This has the effect of transforming the power maps into dynamic statistical parametric maps (dSPMs), as well as making the point-spread function relatively uniform across the cortical surface (Liu et al., 2002). These noise-normalized estimates of the local current dipole power at each location follow an F-distribution and can be viewed as “brain movies”, i.e., a series of SPM frames unfolding across time. Intersubject averaging involves morphing each subject’s reconstructed surface into an average representation after aligning their cortical sulcal-gyral patterns (Fischl et al., 1999b) and averaging the individual inverse solutions (Dhond et al., 2001). In addition to the overall activity patterns presented in Figure 1, repetition priming effects are presented as group average dSPMs of the differences between the waveforms evoked by the novel and repeated word conditions. Significance of the average estimated activations is displayed on the images.
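A minimal numerical sketch of the noise normalization step is given below. This assumes a precomputed linear inverse operator and noise covariance (the function and variable names are ours); the estimated dipole power at each location is divided by the noise power that the same operator would project onto that location:

```python
import numpy as np

def dspm_maps(W, data, noise_cov, n_orient=3):
    """Noise-normalized minimum-norm estimates (dSPM-style maps).

    W         : (n_sources * n_orient, n_channels) linear inverse operator
    data      : (n_channels, n_times) averaged MEG
    noise_cov : (n_channels, n_channels) sensor noise covariance
    Returns an (n_sources, n_times) map of estimated dipole power
    divided by the predicted noise power at each source location.
    """
    s = W @ data                       # estimated dipole time courses
    # Dipole power, summed over the (unconstrained) orientation components
    power = (s ** 2).reshape(-1, n_orient, data.shape[1]).sum(axis=1)
    # Predicted noise power per component: diag(W C W^T)
    noise_power = np.einsum('ij,jk,ik->i', W, noise_cov, W)
    noise_power = noise_power.reshape(-1, n_orient).sum(axis=1)
    return power / noise_power[:, None]
```

Because the same noise power appears in the denominator for every condition, these maps can be thresholded and displayed as the significance values shown in the figures.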
An alternative view of the temporal dynamics of the activity estimated for each cortical location is offered by plotting the estimated noise-normalized dipole strength across all time points at selected locations (Figure 4). To further explore the statistical significance of particular comparisons, regions of interest (ROIs) were chosen for the relevant areas on the cortical surface (Figure 4). The same set of ROIs was applied to all subjects by an automatic spherical morphing procedure (Fischl et al., 1999b). Since the same noise estimate is used across conditions, differences in significance at a given location indicate differences in activity, permitting a direct statistical comparison of the activity to novel vs. repeated words. For the same reason, however, estimated dipole strengths cannot be directly compared across different locations. Average estimated noise-normalized dipole strength values were calculated from the estimated timecourses of the cortical points contained in each ROI, for each subject, for both the auditory and visual tasks and for both the novel and repeated conditions, within the selected latency windows. These values were submitted to within-subject ANOVAs for the “early” (225–250 ms) and “late” (300–500 ms) repetition effects.
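The per-subject values entering these ANOVAs can be sketched as follows (a minimal illustration with hypothetical array names; the actual ROI definitions and preprocessing are those described above):

```python
import numpy as np

def roi_window_mean(timecourses, roi_vertices, times, window):
    """Mean noise-normalized dipole strength within one ROI and one
    latency window, i.e., one cell of the within-subject ANOVA design.

    timecourses  : (n_sources, n_times) dSPM values, one subject/condition
    roi_vertices : indices of the cortical locations making up the ROI
    times        : (n_times,) latencies in ms
    window       : (start_ms, end_ms), e.g. (225, 250) or (300, 500)
    """
    t_mask = (times >= window[0]) & (times <= window[1])
    return float(timecourses[roi_vertices][:, t_mask].mean())
```

One such value per subject, modality (auditory, visual), and condition (novel, repeated) would then be entered into the repeated-measures ANOVA for each latency window.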
We are grateful to Thomas Witzel, Sharelle Baldwin, Bruce Fischl, Jeremy Jordin, John Klopp, Jeffrey Lewine, Arthur Liu, Dave Post, Kim Paulson and Bruce Rosen. Supported by NIH grants NS18741 (EH), AA13402 (KM), NS39581 (AMD), EB00790 (AMD), EB00307 (AMD) and the MIND Institute (DOE grant DE-FG03-99ER62764).