Consistent with previous studies, the fMRI localizers revealed a focus of activity in the posterior STS that responded to both auditory and visual speech. There was a high degree of intersubject variability in the standard coordinates of the STS multisensory area, especially in the anterior-to-posterior direction (mean ± SD: x = -56 ± 4 mm; y = -27 ± 12 mm; z = 8 ± 9 mm).
fMRI-guided TMS of congruent and incongruent auditory-visual speech.
To verify the effectiveness of our stimuli, we examined subjects' percepts without TMS. The non-McGurk control stimuli consisted of an auditory syllable (experiment 1) or a congruent auditory-visual syllable (experiment 2). When presented with these stimuli, subjects almost always reported a percept that matched the auditory stimulus (mean likelihoods and reaction times in Table 1). The McGurk stimuli consisted of an incongruent auditory-visual syllable. For these stimuli, subjects rarely reported a percept that matched the auditory stimulus, instead reporting the McGurk percept of a fused syllable different from both the auditory and visual syllables.
Table 1. Likelihood of the McGurk percept and the reaction time in milliseconds (± SEM) for each condition of experiments 1 and 2. Audio-only: auditory-only control stimulus. McG AV: McGurk auditory-visual stimulus. Cong AV: Congruent auditory-visual control
When TMS was delivered to the STS, subjects were significantly less likely to report the McGurk effect (experiment 1: P = 5e-5; experiment 2: P = 0.004). One concern was that non-specific effects of TMS could confound this result. For instance, the brief click of the TMS pulse could interfere with auditory perception. To address this concern, we stimulated a control TMS site dorsal to the STS, which produced a similar behavioral experience for the subject. The mean coordinates of the control site in standard space were (x, y, z) = (-42, -19, 46), a distance of 39 ± 12 mm (SD) from the STS site (-60, -35, 16). TMS of the control site did not reduce the likelihood of perceiving the McGurk effect (experiment 1: P = 0.2; experiment 2: P = 0.5). A second concern was that TMS of the STS might interfere with speech perception in general. However, TMS of the STS did not affect discrimination of the control stimuli (experiment 1: P = 0.5; experiment 2: P = 0.3).
If multisensory integration in the STS is the basis of the McGurk effect, the relevant neural computation must occur in a relatively narrow time window after the auditory and visual stimuli are delivered but before perception occurs. To test this idea, in the third experiment the likelihood of the McGurk effect was measured while single-pulse TMS was delivered to the STS at a range of times. There was a significant effect of stimulation time on the McGurk effect [F(10,50) = 4.66, P = 0.0001]. This was driven by a reduction in the McGurk effect at four time points, spanning 100 ms before to 100 ms after onset of the auditory stimulus (P < 0.05 by Mann-Whitney U test). At other times, STS TMS did not significantly change the McGurk percept.
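As a quick check on the site geometry reported above, the separation between the mean control-site and mean STS-site coordinates can be computed directly. The following is a minimal sketch in Python; the function name `euclidean_distance_mm` and the constant names are ours, introduced only for illustration. It gives the distance between the group-mean coordinates, which need not equal the mean of the per-subject distances (the reported 39 ± 12 mm).

```python
import math

# Mean standard-space coordinates (x, y, z) in mm, as reported in the text.
CONTROL_SITE = (-42, -19, 46)
STS_SITE = (-60, -35, 16)


def euclidean_distance_mm(a, b):
    """Straight-line distance between two 3D coordinates given in mm."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


# sqrt(18^2 + 16^2 + 30^2) = sqrt(1480)
distance = euclidean_distance_mm(CONTROL_SITE, STS_SITE)
print(f"{distance:.1f} mm")  # prints "38.5 mm"
```

The value of about 38.5 mm is consistent with the reported mean separation of 39 mm.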
Subjects reported a variety of percepts during auditory-visual trials in which STS TMS disrupted the McGurk effect. The most common experience, reported 66% of the time, was a percept similar to auditory-only trials (e.g. TMS delivered with auditory “ba” + visual “ga” resulted in the percept “ba” instead of the McGurk percept “da”). The second most common experience was a percept between the auditory and McGurk percepts (e.g. between “ba” and “da”). Other reports were of a hybrid percept (e.g. “b-da”) or a completely different syllable (e.g. “ha”).