Behaviorally, we tested whether a co-occurring sound could enhance visual target detection sensitivity (d’), even though the presence of a sound carried no information about whether a visual target was present or absent (since sounds were equally likely on target and non-target trials). Given some prior multisensory research (e.g. McDonald et al., 2000; Vroomen and de Gelder, 2000; Frassinetti et al., 2002; Noesselt et al., 2008), and past proposals associated with the possible function of the putative principle of inverse-effectiveness (e.g. Stein et al., 1988; Stein and Meredith, 1993; Kayser et al., 2008), we predicted that co-occurrence of a sound should benefit detection sensitivity (d’) for the lower-intensity but not higher-intensity visual targets, as compared with their respective no-sound conditions.
plots the critical visual sensitivity scores, in formal signal-detection terms (i.e., d’ scores). As predicted, co-occurrence of a sound enhanced visual detection sensitivity, but only for lower-intensity not higher-intensity visual targets. This led to a significant interaction between visual intensity-level and sound presence (F(1,25) = 8.97; p=0.006) in a repeated-measures ANOVA. Visual detection sensitivity (d’) was affected by sound-presence only for the lower-intensity visual targets (t(25) = 5.49; p<0.001).
shows a comparable outcome for raw accuracy data, rather than signal detection d’. Supplementary Fig. S1a also shows the accuracy data when separated for subjects tested inside or outside the scanner, who did not differ (as further confirmed by a mixed-effects ANOVA that found no impact of the between-subject inside/outside scanner factor, p>.1; and likewise for all our other behavioral measures). As with visual d’, accuracy in the visual detection task revealed that the co-occurrence of a sound enhanced detection for the lower-intensity but not for the higher-intensity visual targets (even though the latter were not completely at ceiling). This led once again to a significant interaction between visual intensity-level and sound presence (F(1,25) = 20.79; p<0.001) in a repeated-measures ANOVA. Accuracy increased only when a sound was paired with a lower-intensity visual target (t(25)=7.16, p<.001).
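For illustration, a 2 x 2 repeated-measures interaction of this kind is equivalent to a one-sample t-test on each subject's difference of differences, with F(1, n-1) equal to the squared t value. The following minimal Python sketch shows that equivalence under stated assumptions: the function name and the four-subject scores are hypothetical, and this is not the analysis code or data from the study.

```python
from math import sqrt
from statistics import mean, stdev


def interaction_t(low_sound, low_alone, high_sound, high_alone):
    """One-sample t-test on the per-subject difference of differences.

    Each argument is a list with one score per subject (e.g. d' or accuracy).
    Returns the t statistic and its degrees of freedom.
    """
    # Sound benefit at low intensity minus sound benefit at high intensity.
    diffs = [
        (ls - la) - (hs - ha)
        for ls, la, hs, ha in zip(low_sound, low_alone, high_sound, high_alone)
    ]
    n = len(diffs)
    t_value = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_value, n - 1


# Hypothetical scores for four subjects (illustration only, not study data).
low_sound = [1.9, 2.1, 1.7, 2.4]
low_alone = [1.2, 1.5, 1.3, 1.6]
high_sound = [3.0, 3.1, 2.8, 3.3]
high_alone = [2.9, 3.2, 2.8, 3.2]

t_value, df = interaction_t(low_sound, low_alone, high_sound, high_alone)
f_value = t_value ** 2  # F(1, df) for the 2 x 2 within-subject interaction
print(f"t({df}) = {t_value:.2f}, F(1,{df}) = {f_value:.2f}")
```

With 26 subjects, as implied by the reported degrees of freedom, the same computation would yield t(25) and F(1,25).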
For completeness, we also analyzed criterion and reaction time measures (see supplementary material, Fig. S1b-c). Interestingly, these showed a very different pattern from our more critical measures of sensitivity (d’) and hit-rate. Co-occurrence of a sound merely speeded responses overall (see supplementary Fig. S1b), regardless of visual intensity and even of the presence/absence of a visual target (see supplementary Fig. S1b legend). Thus RTs were simply faster in the presence of a potentially alerting sound, regardless of visual condition. Hence this particular RT result need not, strictly speaking, be considered multisensory in nature, as only the auditory factor influenced the RT pattern. In terms of possible fMRI analogues, the RT pattern would therefore correspond simply to the main effect of sound presence (which we found, as reported below, to activate auditory cortex, STS, and MGB, as would be expected). Hence we did not explore the RT effect any further. Nevertheless, we note that the overall speeding due to a sound is broadly consistent with a wide literature showing that a range of visual tasks, both manual and saccadic, can be speeded by sound occurrence (e.g. Hughes et al., 1994; Doyle and Snowden, 2001). In the past such overall speeding by a sound has often been discussed in the context of possible non-specific alerting effects (Posner, 1978). Some behavioral studies have found more complex RT-patterns when varying both visual and auditory stimulus intensities (Marks et al., 1986; Marks, 1987), but here with a constant (relatively high) auditory intensity, we found only overall speeding of manual RTs in the visual task, regardless of visual condition (see supplementary Fig. S1b).
Turning to the final behavioral measure of criterion, we found that participants adopted a higher criterion for reporting low-intensity target-presence (see supplementary Fig. S1c). But since within the signal-detection framework criterion is strictly independent of sensitivity, d’, this criterion effect cannot contaminate our critical d’ results. Moreover the criterion effect as a function of visual intensity applied here regardless of auditory condition (see supplementary Fig. S1c), so unlike the d’ and accuracy results need not be interpreted as reflecting any multisensory phenomenon.
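As a reminder of how the two signal-detection measures are separated, the standard equal-variance definitions are d’ = z(hit rate) - z(false-alarm rate) and c = -0.5 [z(hit rate) + z(false-alarm rate)]. The Python sketch below computes both from trial counts. The function name is hypothetical, and the log-linear correction for rates of 0 or 1 is an assumption of this sketch; the text does not state how extreme rates were handled in the study.

```python
from statistics import NormalDist


def dprime_and_criterion(hits, misses, false_alarms, correct_rejections):
    """Return (d', c) under the equal-variance Gaussian signal-detection model."""
    # Log-linear correction (assumption of this sketch): adding 0.5 to each
    # cell keeps the rates away from 0 and 1, where z would be infinite.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)

    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)            # sensitivity
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # response bias
    return d_prime, criterion


# Hypothetical counts (illustration only): an unbiased observer.
d_unbiased, c_unbiased = dprime_and_criterion(40, 10, 10, 40)
# Same observer type with a more conservative response tendency.
d_conservative, c_conservative = dprime_and_criterion(30, 20, 4, 46)
print(d_unbiased, c_unbiased, d_conservative, c_conservative)
```

A positive c indicates a more conservative (higher) criterion, which is the direction reported here for low-intensity targets.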
Thus, only the critical behavioral measures of d’ and accuracy showed differential multisensory effects (i.e. that depended on both auditory and visual conditions), with co-occurrence of a sound genuinely enhancing perceptual sensitivity (d’) and accuracy for lower-intensity but not higher-intensity visual targets. This pattern of multisensory outcome for detection sensitivity and accuracy appears compatible with the idea, long associated with the putative PoIE for multisensory integration, that co-occurrence of events in multiple modalities might particularly benefit near-threshold detection (as for the lower-intensity, but not higher-intensity, visual targets here). Our analyses of fMRI data below test for neural consequences of the co-occurring sounds, for visual targets of lower- versus higher-intensity.
Modulation of local BOLD response to lower-intensity (versus higher-intensity) visual events by concurrent sound
For the event-related results from the main fMRI experiment, the most important contrast concerns a greater enhancing impact of the sound on lower-intensity than higher-intensity visual targets, analogous to the behavioral effect on visual detection d’ and accuracy. The critical interaction contrast is as follows:
[(Lower-Intensity Light with Sound) minus (Lower-Intensity Light alone)] >
[(Higher-Intensity Light with Sound) minus (Higher-Intensity Light alone)]
This two-way interaction-contrast subtracts out any trivial effects due to visual-intensity per se, due to visual-frame presentation that signalled when a response was required on every trial, or due to sound-presence per se. See supplementary table S2 for details of the outcome for the visual-intensity or sound-presence contrasts, which all turned out as expected, i.e. higher activation of visual cortex due to increased visual intensity (main effect of high intensity > low intensity; supplementary table S2a); and of auditory cortex due to sound presence (main effect of sound > no sound conditions; see supplementary table S2b).
The critical two-way interaction contrast (as per the formula above) is analogous to the critical behavioral interaction that affected accuracy and d’ (), so may reveal the neural analogue of the sound-induced boosting of visual processing. We interrogated the visual, auditory, and candidate heteromodal audio-visual regions (via SPM inclusive masking) that had been defined independently of the main experiment by the separate localizers. We found the critical interaction to be significant not only in STS ( top right; and ), a known multisensory brain region; but also in extrastriate visual regions contralateral to the visual target (see , top left; and ), plus in posterior insula/Heschl’s Gyrus, i.e. likely to correspond with low-level auditory cortex ( top middle; see also ). Note that while showing the critical interaction effect, the response patterns within visual and auditory cortex also show the overall modality-preferences one would expect for high-intensity visual or auditory stimuli, as confirmed also by the independent localizers. lists further areas showing an interaction outside the visual, auditory and heteromodal regions of main interest, for completeness.
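The interaction contrast stated above can be expressed as a weight vector over the four event types. The Python sketch below is illustrative only: the condition ordering, variable names, and beta values are hypothetical, and it simply shows that the interaction weights are orthogonal to the two main-effect contrasts, which is why those main effects are subtracted out.

```python
# Condition order is an assumption of this sketch.
CONDITIONS = ["low_sound", "low_alone", "high_sound", "high_alone"]

# (low+sound - low alone) - (high+sound - high alone)
INTERACTION = [1, -1, -1, 1]
# Main effect of sound: sound > no sound
MAIN_SOUND = [1, -1, 1, -1]
# Main effect of visual intensity: high > low
MAIN_INTENSITY = [-1, -1, 1, 1]


def dot(a, b):
    """Inner product of two equal-length weight vectors."""
    return sum(x * y for x, y in zip(a, b))


def apply_contrast(weights, betas):
    """Weighted sum of condition-wise parameter estimates."""
    return dot(weights, betas)


# Hypothetical beta estimates for one voxel (illustration only).
betas = [2.0, 1.0, 3.0, 3.0]
interaction_value = apply_contrast(INTERACTION, betas)
print(interaction_value)
```

A positive value indicates that the sound increased the response more for the lower-intensity than for the higher-intensity visual target.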
Group-results: fMRI-results specific to co-occurring sounds boosting the response for lower-intensity visual targets more than higher-intensity targets
fMRI interaction effect (mirroring the behavioral benefit)
The plot for the group interaction contrast within voxel-wise normalized space in (top middle plot) shows for insula/Heschl’s Gyrus not only the anticipated (PoIE-like) increase in response when the sound co-occurs with a low-intensity visual target, but also an apparent lack of auditory response when the same sound is paired with a high-intensity visual target (although please note that zero on the y-axis in the plots of represents the session mean, rather than absolute zero). While in principle the latter unexpected outcome might potentially reflect sub-additive responses for high-intensity pairings (cf. Angelaki et al., 2009; Sadaghiani et al., 2009; Stevenson and James, 2009; Kayser et al., 2010), alternatively it might reflect the inevitable tendency for SPM interaction-contrasts to highlight the most significant voxels showing the strongest interaction pattern (so in the present context, not only an enhanced response to the sound when paired with a lower-intensity visual target; but also some reduction in this response when paired with a higher-intensity visual target, at the peak interaction voxel in SPM).
Accordingly we next extracted the beta weights from the main experiment for subject-specific, individually-defined A1 regions of interest (ROI), as derived from the independent localizer runs (see , bottom), thereby circumventing any selection bias. This individual ROI analysis confirmed the interaction pattern for A1, showing significantly increased BOLD signal for low-intensity visual targets when paired with sounds versus without sounds there (F=8.33, p<0.05 for the interaction; post-hoc t =2.3, p<0.05, for the pairwise contrast; see , bottom). In this more sensitive individual ROI analysis of A1, free from any voxelwise selection biases, the ROI results showed robust auditory responses from primary auditory cortex, even for the significantly reduced response found when paired with a higher-intensity visual stimulus (, bottom).
fMRI responses in subject-specific, individually-defined visual and auditory thalamus plus A1/Heschl’s gyrus
Importantly, the critical sound-induced enhancement of visual responses was also found subcortically in the LGB (main bottom left; and ) and in the MGB ( bottom right; and ) in the group voxel-wise SPM analysis. Thus, in addition to multisensory STS, not only did visual fusiform and auditory A1 / Heschl’s gyrus show the critical interaction pattern cortically, but so did subcortical thalamic stages of the visual and auditory pathways.
As shown in the plots of from the group voxel-wise SPM analysis, for all of the affected areas (i.e. STS, visual cortex, auditory cortex, LGB and MGB) the co-occurrence of a sound enhanced the BOLD response for the lower-intensity visual-target condition more than for the higher-intensity visual target condition. In principle, one must consider whether the latter outcome could reflect some ‘ceiling’ effect for the BOLD signal in the high-intensity condition. However, the bar graphs in show that the BOLD-responses in the affected regions (with the exception of the fusiform gyrus) were typically higher for low-intensity stimuli paired with sounds than for any of the high-intensity conditions, which argues against any ‘ceiling’ concerns in terms of BOLD level per se. Moreover, even our higher-intensity visual stimuli had modest absolute intensities. Other work (e.g. Buracas et al., 2005) suggests that visual BOLD signals typically saturate only for much higher luminance levels than those used here. Nonetheless, we found the expected pattern of enhanced BOLD-responses for higher- vs. lower-intensity visual stimuli in fusiform, LGB, plus further visual regions (see Table S2) when presented without sounds.
One aspect of the specific pattern of BOLD responses in group-normalized LGB (bottom left of main ) may appear somewhat counterintuitive, with an apparent decrease for high-intensity visual stimuli when paired with sounds (as had been found in group-results for the interaction pattern in A1 above).
To address this point and also provide further validation of the novel results at the thalamic level, we again supplemented the group voxel-wise analyses by identifying visual and auditory thalamic body ROIs in each individual subject (supplementary Fig. S2; see above for the rationale of using complementary ROI-analyses). We then assessed the experimental effects in these individual thalamic ROIs (i.e. which could now correspond to somewhat different voxels in different subjects, but for the same defined area). Since these ROIs were defined independently of the main experiment, via the separate localizers, this again circumvents any selection bias for the interaction contrast. This individual approach corroborated our group voxel-wise SPM results, while also removing the one unexpected outcome (for the LGB) with this more unbiased ROI approach. Again, we found enhanced BOLD signals when a sound is added to a lower-intensity visual target; but no significant change in response when the same sound is added to a higher-intensity visual target (see , top left, for LGB ROI results from all subjects; see also supplementary Fig. S5, its upper plot, for a confirmatory analysis on the subset of subjects who showed the most unequivocal LGB and MGB localization).
The pattern of activity for the fusiform interaction peak in the voxel-wise SPM analysis (top left of main ) showed one unexpected trend, namely a tendency for lower activation in the absence of a sound for a low-intensity visual target versus none. But this trend was nonsignificant (p>.1) so need not be considered further. In any case it may again simply reflect a selection bias for peaks of SPM interaction contrasts to highlight voxels that show apparent ‘crossover’ patterns, as explained above for other regions.
To summarize the BOLD activation results so far (Figs and ), predefined (inclusively masked) heteromodal cortex in STS showed a pattern of enhanced BOLD signal by co-occurring sounds only for lower-intensity but not higher-intensity visual targets (thus analogous to the impact on visual detection d’ found behaviourally, cf. ). Similar patterns were found in sensory-specific visual cortex, in sensory-specific insula/Heschl’s gyrus, and even in sensory-specific thalamus (LGB and MGB). Further analysis of individually-defined ROIs confirmed the interaction pattern in primary auditory cortex, MGB, and LGB (see , plus supplementary Figure S5). These ROI analyses also confirmed that the few unexpected aspects of the interaction results from group-normalized space (i.e. apparent crossover interaction pattern for LGB; apparent loss of auditory response in presence of higher-intensity visual targets for insula/Heschl’s gyrus) were no longer evident for the more sensitive individual analyses of independently-defined ROIs. Those unexpected aspects of the group-normalized results should thus be treated with caution. By contrast, all of our critical activations were found robustly in the individual ROIs, as well as for the voxel-wise group-normalized analysis.
While the observed patterns for the MGB and A1 ROIs (, top right and bottom) were not identical, both showed the critical interaction, with strongest responses when the sound was paired with a low-intensity visual target. It appears that the MGB ROI tended to be somewhat more responsive than the A1 ROI to high-intensity visual targets in the absence of sound (albeit only as a nonsignificant trend). This may reflect the fact that some subnuclei within the MGB receive visual inputs (Linke et al., 2000), given that the BOLD signal will be aggregated across different subnuclei; and/or it could reflect possible feedback signals from heteromodal STS.
Mechanistically, at the level of neuronal firing rates, audio-visual integration may be rather complex, as different frequency bands of neural response can be differentially modulated by audiovisual stimuli (e.g. in the STS; see Chandrasekaran and Ghazanfar, 2009). More generally, one mechanism potentially underlying multisensory integration in the time-frequency domain was proposed by Schroeder/Lakatos and colleagues in recent influential work (e.g. Lakatos et al., 2007; Lakatos et al., 2008; Schroeder et al., 2008; Schroeder and Lakatos, 2009) that primarily concerned tactile-auditory situations, rather than audio-visual as here. Tactile stimulation can phase-reset neural signalling in auditory cortex, thereby enhancing response to synchronous auditory inputs. Moreover, some overlapping audiotactile representations in the thalamus (Cappe et al., 2009) have now been reported, as have learning-induced plastic changes of auditory and tactile processing due to (musical) training (Schulz et al., 2003; Musacchia et al., 2007). Related effects might conceivably impact on an audio-visual situation like our own, but potential phase-resetting would seem to require visual signals to precede auditory signals sufficiently to overcome the different transduction times (Musacchia et al., 2006; Schroeder et al., 2008; Schroeder and Lakatos, 2009). This seems somewhat unlikely for the present concurrent audio-visual pairings. To our knowledge, the earliest impact of concurrent visual stimuli on auditory ERP components has been found to emerge at ~50 ms poststimulus, well beyond the initial phase of auditory processing (Giard and Peronnet, 1999; Molholm et al., 2004). We note that Lakatos et al. (2009) report phase-resets in macaque auditory cortex due to visual stimulation only after the initial activation.
The modulations we observe in visual cortex (and LGB) might in principle reflect phase-resetting there (cf. Lakatos et al., 2008 for attention-related phase-resetting of visual cortex), and/or involve projections from auditory or multisensory cortex, that serve to increase the signal-to-noise ratio for the trials pairing a concurrent sound with low-intensity visual targets. In accord, Romei and colleagues recently reported an enhancement of TMS-induced ‘phosphene perception’ when sounds were combined with near-threshold TMS over visual cortex (Romei et al., 2007). Such phase-resetting could potentially be the underlying mechanism of our regional fMRI-effects and may reflect the functional coupling of distant brain regions.
Effects specific to pairing lower-intensity (versus higher-intensity) visual stimuli with sound on inter-regional effective connectivity
To assess functional coupling between brain regions, we next tested for potential condition-dependent changes in ‘effective connectivity’ between areas (i.e. inter-regional coupling), for the affected thalamic bodies with cortical sensory-specific and heteromodal structures. Note that possible changes in inter-regional coupling are logically distinct from effects on local BOLD activations as described above, so can produce a different outcome. We tested for inter-regional coupling using the relatively assumption-free ‘psychophysiological interaction’ (PPI) approach (Friston et al., 1997). We seeded the PPI analyses in (individually defined) left LGB or left MGB, and tested for enhanced ‘coupling’ with other regions, that arose specifically in the context of a lower- rather than higher-intensity visual target being paired with a sound (i.e. analogous interaction pattern to that found for behavioral sensitivity, d’; and for local BOLD activations above; but now testing for analogously condition-dependent changes in the strength of functional inter-regional coupling, rather than for local activations as in the preceding fMRI results section).
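To make the logic of the PPI approach concrete, the interaction regressor is in essence the element-wise product of the seed region's time course (the physiological variable) and a contrast-coded task vector (the psychological variable). The Python sketch below builds such a regressor under simplifying assumptions: the function name and example values are hypothetical, the product is formed directly on the measured signal, and any deconvolution step that some implementations apply before forming the product is omitted. It is not the pipeline used in this study.

```python
from statistics import mean


def ppi_regressor(seed_timecourse, psych_vector):
    """Product of the mean-centered seed signal and the task contrast vector.

    seed_timecourse: one value per scan from the seed ROI (e.g. LGB or MGB).
    psych_vector: one contrast weight per scan coding the task context.
    """
    if len(seed_timecourse) != len(psych_vector):
        raise ValueError("seed and psychological vectors must have equal length")
    seed_mean = mean(seed_timecourse)
    # Centering the seed keeps the interaction term from simply duplicating
    # the psychological regressor.
    return [(s - seed_mean) * p for s, p in zip(seed_timecourse, psych_vector)]


# Hypothetical four-scan example (illustration only).
seed = [1.0, 3.0, 2.0, 2.0]
psych = [1, -1, -1, 1]  # interaction-coded task context per scan
ppi = ppi_regressor(seed, psych)
print(ppi)
```

In a full model, the seed time course and the psychological vector would be entered alongside the interaction regressor, so that the PPI term captures only condition-dependent changes in coupling.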
We found such enhanced coupling for the critical interaction effect (see and ), between left LGB and ipsilateral occipital areas including primary visual cortex (consistent with the visual nature of the LGB, and thus providing further confirmation of that functional localization). Analogously, we also found such condition-dependent enhanced coupling of left MGB with ipsilateral Heschl’s gyrus (consistent with the auditory nature of MGB, see and ). Beyond these sensory-specific coupling results for LGB or MGB seeds, we also found enhanced coupling of both MGB and LGB with STS and putative MT+ (Campana et al., 2006; Eckert et al., 2008; see and ). This enhanced coupling (i.e. higher covariation in residuals) with STS and putative MT+ was again specific to the context of a sound being paired with a lower-intensity (rather than higher-intensity) visual target, i.e. to the very condition that had led to the critical behavioral enhancements of d’ and accuracy.
Enhanced inter-regional functional coupling for lower-intensity visual targets when paired with a concurrent sound (versus without a sound)
Effects for psychophysiological interaction (PPI analysis)
These new effects for MGB are in line with anatomical and electrophysiological studies reporting that subnuclei within the MGB receive some visual inputs (Linke et al., 2000) and can respond to visual stimulation (Wepsic, 1966; Benedek et al., 1997; Komura et al., 2005); plus demonstrations that the MGB is connected with STS (Burton and Jones, 1976; Yeterian and Pandya, 1989). To our knowledge, no direct connections of auditory regions or of STS with LGB have been reported to date, though there is some evidence for direct connections of LGB with extrastriate regions (Yukie and Iwai, 1981). Alternatively, the observed modulations in LGB and its condition-dependent coupling with other areas might in principle involve early visual cortex, which is anatomically linked with posterior STS (Falchier et al., 2002; Ghazanfar et al., 2005; Kayser and Logothetis, 2009) and reciprocally connected with LGB.
Thus far, we have shown that: (i) co-occurrence of a sound significantly enhances perceptual sensitivity (d’) and detection accuracy for a lower-intensity but not a higher-intensity visual target, in apparent accord with the principle of inverse effectiveness; (ii) a related interaction pattern is observed for BOLD activations in STS, visual cortex (plus LGB), and auditory cortex (plus MGB); and (iii) a logically analogous interaction pattern is found for inter-regional coupling. Specifically on this latter point, we found enhanced coupling of the two thalamic sites (LGB or MGB) with their respective sensory-specific cortices, and also between both of these thalamic sites and STS (plus lateral occipital cortex possibly corresponding to MT+), for the particular context that led to enhanced behavioral sensitivity, i.e. with this effective connectivity being most pronounced when a lower-intensity visual target is paired with a sound.
Relation of subject-by-subject behavioral benefits to increased local brain activations and (separately) to increased strength of inter-regional coupling specifically when a lower- rather than higher-intensity visual target is paired with a sound
To test for an even closer link between brain activity and behavior, we next assessed whether our independently localized brain regions (i.e. the visually-responsive, auditorily-responsive, and candidate heteromodal areas identified by the separate blocked localizers) showed BOLD signals for which the critical interaction pattern correlated with subject-by-subject behavioral benefits for the impact of adding sound to a lower- rather than higher-intensity visual target. We first tested for subject-by-subject brain-behaviour relations for the regional BOLD activations (i.e. for the basic contrasts of conditions). We regressed the subject-specific BOLD-interaction differences against the analogous behavioural difference. This revealed significant subject-by-subject brain-behavior regression coefficients in left visual cortex, contralateral to the visual target ( and ); plus left auditory cortex ( and ) and heteromodal STS ( and ). See also supplementary Figure S4 for a confirmatory brain-behavior regression with behavioral outliers removed from the analysis.
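The subject-by-subject regression described above can be sketched as an ordinary least-squares fit of each subject's BOLD interaction value on the corresponding behavioral interaction value. The Python example below is a minimal illustration only: the function name and the four data points are hypothetical, and the actual analysis was run voxel-wise within the localizer-defined regions rather than on a single summary value per subject.

```python
from math import sqrt
from statistics import mean


def simple_regression(behavior, bold):
    """Least-squares slope, intercept and Pearson r of bold on behavior."""
    mx, my = mean(behavior), mean(bold)
    sxx = sum((x - mx) ** 2 for x in behavior)
    syy = sum((y - my) ** 2 for y in bold)
    sxy = sum((x - mx) * (y - my) for x, y in zip(behavior, bold))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical per-subject interaction scores (illustration only):
# behavioral benefit (low minus high intensity sound effect) and the
# analogous BOLD interaction contrast value.
behavior_interaction = [0.0, 1.0, 2.0, 3.0]
bold_interaction = [1.0, 3.0, 5.0, 7.0]

slope, intercept, r = simple_regression(behavior_interaction, bold_interaction)
print(slope, intercept, r)
```

A positive slope means that subjects with a larger sound-induced behavioral benefit for lower-intensity targets also show a larger BOLD interaction effect.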
Subject-by-subject brain-behavior relations
Correlation of subject-specific behavioral effects with BOLD-response
Next we tested whether changes in the inter-regional ‘coupling’ of these thalamic structures with cortical areas (analogous to ) might relate to the subject-by-subject behavioral interaction outcome. When weighting the PPI analyses (seeded either in left LGB or left MGB, as individually defined) by the parametric, subject-by-subject size of the critical behavioral interaction, we found that both the LGB and the MGB independently showed stronger enhancement of coupling with bilateral STS (see and ) in the specific context of the sound-plus-lower-intensity-light condition, in relation to the impact on performance. The outcome of this weighted PPI analysis reveals that these thalamic-cortical neural coupling effects (for LGB-STS, and also separately replicated for MGB-STS) have some parametric relation to the corresponding behavioral effect in psychophysics.
Correlation of psychophysiological interactions (PPI) of LGB and MGB with subject-specific behavioral effects
When taken together the different aspects of our fMRI results clearly identify a functional corticothalamic network of visual, auditory, and multisensory regions. These regions are activated more strongly, and become more functionally integrated as shown by the inter-regional coupling data, when a task-irrelevant sound co-occurs with a lower-intensity visual target; the very same condition that led behaviorally to enhanced d’ and hit-rate in the visual detection task. This link is further strengthened by the brain-behavior relations we observed.