Neuroimage. Author manuscript; available in PMC 2008 January 28.
Published in final edited form as:
PMCID: PMC2214902

Multisensory Integration for Timing Engages Different Brain Networks


How does the brain integrate information from different senses into a unitary percept? What factors influence such multisensory integration? Using a rhythmic behavioral paradigm and functional magnetic resonance imaging, we identified networks of brain regions for perceptions of physically synchronous and asynchronous auditory-visual events. Measures of behavioral performance revealed the existence of three distinct perceptual states. Perception of asynchrony activated a network of the primary sensory, prefrontal, and inferior parietal cortices, perception of synchrony disengaged the inferior parietal cortex and further recruited the superior colliculus, and when no clear percept was established, only the residual areas comprised of prefrontal and sensory areas were active. These results indicate that distinct percepts arise within specific brain sub-networks, the components of which are differentially engaged and disengaged depending on the timing of environmental signals.


Integration of information received by the individual sensory systems provides us with a coherent percept of the external environment (Stein and Meredith, 1993). Such multisensory integration in the brain enhances our ability to detect, locate and discriminate external objects and events. Sensory inputs that are temporally, spatially and/or contextually congruent are more effective in eliciting reliable behavioral responses than incongruent inputs (Stein et al., 1989, Sekuler et al., 1997, Calvert, 2001, Frens et al., 1995, and Calvert et al., 2004). Incongruities can sometimes lead to dramatic perceptual illusions (e.g. the ventriloquist effect) (Bertelson, 1981). The relative timing of different sensory input signals is thus an important factor in multisensory integration and perception. Despite a few important studies suggesting that timing parameters influence multisensory processing (Bushara et al., 2001; see Calvert et al., 2004 for more references), our understanding of the underlying neural processes involved in multisensory integration for timing is still limited. To investigate the effects of the relative timing of different sensory signals on multisensory phenomena, we developed a rhythmic multisensory paradigm in the audio-visual domain, motivated by theoretical considerations of the dynamical states of interacting phase oscillators. In this rhythmic paradigm, temporally congruent multisensory stimuli can be expected to cause a percept of synchrony, and incongruent stimuli to cause either a percept of asynchrony or a third possible perceptual state, the non-phase-locked state, also known as phase wrapping or drift (Kelso, 1995). This third perceptual state will be referred to as drift or the neutral percept because it represents a failure of multisensory integration and is qualitatively different from the percept of asynchrony. 
This notion of multisensory integration in a rhythmic paradigm is in contrast with the traditional view, which we wish to elaborate in the subsequent theoretical discussion of weakly interacting oscillators.

Following Kuramoto (1984), we represent the relative phase dynamics of weakly interacting periodic oscillators by the equation:

$$\dot{\Phi} = \Delta\omega + \sum_{n=1}^{\infty} a_n \sin(n\Phi) + \sum_{m=1}^{\infty} b_m \cos(m\Phi) \qquad (1)$$

where Φ is the phase difference of the two oscillators, Φ̇ is its time derivative, and a_n, b_m and Δω are constant parameters. Here the mechanism by which the limit cycle is generated is of no relevance; what matters is that the oscillation has a fixed period and is stable, i.e. it returns to its cycle after perturbations. The oscillator amplitudes will generally vary as a function of the individual phases. When a coupling is introduced into the two-oscillator system, the oscillator phases are no longer independent, but are captured by equation (1). All the dynamic properties, including the limit-cycle properties and the nature of the coupling, are absorbed in the coefficients a_n, b_m and Δω. The latter two coefficients characterize the symmetry breaking intrinsic to the two-oscillator system. Hence for two identical oscillators with symmetric coupling, b_m and Δω are zero (see also Haken, Kelso and Bunz, 1985). In most applied cases (Haken, 1996; Kelso, 1995), b_m is close to zero and Δω captures the majority of the symmetry-breaking influence (Kelso, DelColle and Schöner, 1990). The coefficients a_n determine the phase dynamics in equation (1) and are limited to the first and/or second term, because the higher-order contributions of the terms with coefficients a_n for n > 2 generally average out (rotating wave approximation; Haken, 1983). From this it follows that in-phase (Φ = 0) and anti-phase (Φ = π) solutions always exist, whereas their stability is determined by the coefficients a_n. Following this reasoning and rescaling the temporal units to eliminate one of the coefficients, we obtain a minimal equation for the dynamics of two interacting periodic oscillators


with only two free parameters a and Δω. When we solve equation (2) numerically for randomly distributed initial conditions, we obtain the stable steady solutions illustrated in figure 1. Four distinct regimes can be identified: monostable anti-phase, monostable in-phase, bistable in- and anti-phase, and a drift regime. The drift regime is obtained when the phases of the two oscillators do not lock and their phase difference Φ changes continually. The bistable regime is defined by the overlap of the stability regimes of the in-phase solution (bottom panel of figure 1) and the anti-phase solution (top panel of figure 1). The major part of the outline of this region is indicated by the dashed line in figure 1. Here the resulting steady solution depends only on the initial condition. The remaining areas of the parameter space in figure 1 contain only one possible steady solution.
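The regime structure described above can be explored with a short numerical sketch. Because the minimal equation itself is not reproduced in this text, the code assumes a Haken-Kelso-Bunz-type form, dΦ/dt = Δω − a sin Φ − sin 2Φ. This form and its sign convention, the function names, and the integration settings are all assumptions made for illustration; they are not the authors' implementation.

```python
import math
import random

TWO_PI = 2.0 * math.pi


def phase_velocity(phi, a, delta_omega):
    """Assumed minimal form (not taken from the paper):
    dPhi/dt = delta_omega - a*sin(Phi) - sin(2*Phi)."""
    return delta_omega - a * math.sin(phi) - math.sin(2.0 * phi)


def integrate(phi0, a, delta_omega, dt=0.02, steps=5000):
    """Integrate the phase equation with a fixed-step RK4 scheme."""
    phi = phi0
    for _ in range(steps):
        k1 = phase_velocity(phi, a, delta_omega)
        k2 = phase_velocity(phi + 0.5 * dt * k1, a, delta_omega)
        k3 = phase_velocity(phi + 0.5 * dt * k2, a, delta_omega)
        k4 = phase_velocity(phi + dt * k3, a, delta_omega)
        phi += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return phi


def classify_regime(a, delta_omega, n_init=20, seed=0, tol=1e-3):
    """Label one point (a, delta_omega) of the parameter space.

    Random initial phases are integrated forward. If the phase velocity
    has not settled to zero, the point is labelled 'drift'. Otherwise the
    locked phase is assigned to whichever of 0 or pi it lies closer to.
    """
    rng = random.Random(seed)
    found = set()
    for _ in range(n_init):
        phi = integrate(rng.uniform(0.0, TWO_PI), a, delta_omega)
        if abs(phase_velocity(phi, a, delta_omega)) > tol:
            return "drift"
        wrapped = phi % TWO_PI
        dist_in = min(wrapped, TWO_PI - wrapped)
        dist_anti = abs(wrapped - math.pi)
        found.add("in-phase" if dist_in < dist_anti else "anti-phase")
    if len(found) == 2:
        return "bistable"
    return "monostable " + found.pop()


if __name__ == "__main__":
    for a, dw in [(0.0, 0.0), (4.0, 0.0), (-4.0, 0.0), (0.0, 5.0)]:
        print(a, dw, classify_regime(a, dw))
```

Under the assumed form, sweeping `classify_regime` over a grid of a and Δω values yields a map with the same four kinds of region as described in the text; the exact boundaries depend on the assumed equation and should not be read as a reproduction of figure 1.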

Figure 1
The steady solutions of equation (1) are plotted as isoclines dependent on the parameters a and Δω for initial conditions close to in-phase (top) and anti-phase (bottom). Both anti-phase and in-phase solutions exist up to an upper bound ...

The above discussion of the phase dynamics of two weakly coupled periodic oscillators shows that in any periodic experimental paradigm we expect the existence of anti- and in-phase solutions, a bistable regime and a drift regime (Kelso et al., 1990). The details of the oscillator system under investigation will determine the extent of the stability regimes and of the symmetry breaking, but the general topology of the stability regimes in figure 1 will be conserved, though the actual structure can be distorted, reduced or increased in size. The only two free parameters, a and Δω, must then contain the entire information on the details of the system under investigation. In the context of a rhythmic multisensory paradigm involving two periodically driven sensory modalities, such as auditory and visual, the question arises whether the present theory is applicable and whether multisensory integration actually involves the interaction of weakly coupled dynamic systems. Our theoretical considerations suggest the existence of two distinct perceptual states, a bistable regime and a regime (drift) in which both of these percepts are lost and no other clear percept is formed. Naturally, the multisensory context suggests identifying the in-phase state with the traditional notion of multisensory integration and the anti-phase state with segregation (Jirsa and Kelso, 2004). It is conceptually intriguing to realize that our theoretical discussion implies an equivalence of perceptual integration and segregation in the sense that both represent stable steady states. The loss of a stable steady state results in a neutral percept and corresponds to ‘drift’. This is in sharp contrast to traditional interpretations, which rest on the hypothesis that segregation is a loss of integration. 
This rhythmic paradigm affords not only a comparison of dichotomous conditions corresponding to congruent and incongruent stimuli, but also allows us to ask the more subtle question of which brain networks underlie the formation and dissolution of these various multimodal percepts as stimulus timing parameters change.

It is well known that multisensory processing is mediated by a widely distributed network of brain regions involving multisensory convergence zones in cortical regions such as the superior temporal sulcus, intraparietal sulcus, posterior insula and frontal regions including premotor, prefrontal and anterior cingulate cortices (Jones and Powell, 1970, Seltzer and Pandya, 1989 and Mesulam and Mufson, 1982), as well as subcortical structures including the claustrum, the superior colliculus and the hippocampus (Mesulam and Mufson, 1982, Pearson et al., 1982, Mufson and Mesulam, 1989, Desimone and Gross, 1979, Vaadia et al., 1986, Duhamel et al., 1991 and Meredith and Stein, 1996) (for a review see Calvert et al., 2004). Several recent studies suggest that multisensory interactions occur in unisensory areas and may not depend on feedback from higher cortical and subcortical areas (see Calvert et al., 2004, Schroeder and Foxe, 2005 and Macaluso and Driver, 2005 for reviews). Together, these results may indicate that multisensory integration involves not only higher-level association cortices but also sensory-specific cortices.

Here we use functional magnetic resonance imaging and confirm the involvement of a distributed brain network for the multisensory processing of periodic auditory-visual stimuli. The relative timings of audio-visual stimuli play a crucial role in the formation and dissolution of stable percepts. By establishing several distinct perceptual states in timing parameter space of stimulus onset asynchrony (SOA) and stimulation rate, we further identify sub-networks responsive to multisensory integration in the perceptions of synchrony and asynchrony, and also to the failure of integration when no fixed percept is formed. The present results also highlight the specific roles of brain areas in different networks correlated with different perceptual states under periodic multisensory stimulation.

Materials and Methods


Eighteen subjects, 25 to 37 years of age, participated in the behavioral experiment. All the subjects were in good health with no past history of psychiatric or neurological diseases. Informed consent was collected from each subject prior to the experiment, and the study was approved by the Florida Atlantic University Institutional Review Board. Sixteen subjects with robust percepts were included in the final behavioral response analysis; the others, who reported the percept of synchrony throughout the entire timing parameter space, were excluded. Thirteen of those sixteen subjects participated in the fMRI experiment.

Experimental tasks and stimuli

Behavioral experiment

The behavioral experiment consisted of two sessions: a training session with 5 conditions, in which the participants became familiar with the task, and a subsequent session with a total of 59 conditions, each presented in a 40-second block. Figure 2(a) shows the presentation of stimuli as a function of two timing parameters: stimulus onset asynchrony (SOA) (Δt) and stimulation rate (f). Stimulation rates were 0.5, 1, 1.5, 2, 3, and 3.5 Hz and SOAs were −200, −150, −100, −50, 0, 50, 100, 150, and 200 ms. The participants were presented with auditory and visual stimuli (a 30 ms, 440 Hz tone from a speaker placed in front of them at a distance of 2.5 meters and a 30 ms flash from a red LED placed at the same location) at different timing parameters and asked to report their perception of each stimulus event. At the end of each run, the subjects were required to categorize the percept as ‘simultaneous’ (S) if they had perceived the tone and the flash as synchronized events throughout the run. If the perception of synchrony was not established, subjects were instructed to classify their percept according to the following categories: ‘AV (auditory stimuli precede visual stimuli)’, ‘VA (visual stimuli precede auditory stimuli)’, or ‘Can’t tell’. The last category was described by the participants as equivalent to the phenomenon of phase drift (“the stimuli seemed to be together, then drifted and got back together again”) or streaming (“the two stimuli were two separate entities, unrelated to each other”). In the remainder of the paper, we refer to this state of no clear percept as ‘Drift’.
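To make the two timing parameters concrete, the following minimal sketch generates stimulus onset times for one block and expresses an SOA as a fraction of the stimulus cycle. The function names are hypothetical, and the convention that a positive SOA means the tone leads the flash is taken from the description of Δt in this section; the sketch is an illustration, not the stimulus-delivery code used in the study.

```python
def onset_times(rate_hz, soa_ms, duration_s=40.0):
    """Return (auditory, visual) onset times in seconds for one block.

    Positive soa_ms means the auditory tone leads the visual flash.
    With a negative SOA the first visual onset falls before t = 0;
    only the relative timing matters for this illustration.
    """
    period = 1.0 / rate_hz
    n_events = round(duration_s * rate_hz)
    auditory = [k * period for k in range(n_events)]
    visual = [t + soa_ms / 1000.0 for t in auditory]
    return auditory, visual


def relative_phase_deg(rate_hz, soa_ms):
    """Express the SOA as a phase lag of the stimulus cycle, in degrees."""
    return 360.0 * rate_hz * soa_ms / 1000.0


if __name__ == "__main__":
    aud, vis = onset_times(rate_hz=2.0, soa_ms=50.0)
    print(len(aud), aud[:3], vis[:3])
    print(relative_phase_deg(3.0, 100.0))   # 108 degrees of the cycle
    print(relative_phase_deg(0.5, 200.0))   # 36 degrees of the cycle
```

The same SOA therefore corresponds to a larger phase lag at higher stimulation rates: 100 ms is 108° of the cycle at 3 Hz, whereas 200 ms is only 36° at 0.5 Hz.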

Figure 2
(a). Experimental design. A visual (V) stimulus is presented after an auditory stimulus (A) at a time interval of Δt (SOA) and repeated at a rate f (stimulation rate). Behavioral response in the space of Δt and f: the negative values of ...

fMRI Experiment

After the analysis of subjects’ behavioral performance, we conducted the fMRI experiment using only those conditions that had produced highly consistent effects (stable percept or no fixed percept) among all the subjects (see Figure 2 (a) shaded boxes). Before each fMRI session, subjects were presented with stimuli used in the training session of behavioral experiment as a reminder of the task. This is how we made certain that significantly consistent perceptual effects would be obtained inside the fMRI scanner.

The fMRI experiment consisted of three functional runs of sensory (auditory, visual and auditory-visual) stimulation and rest conditions in an on-off block design. In the stimulation conditions, a series of 30 ms, 440 Hz auditory tones and/or 30 ms red visual flashes was presented through a pair of earphones and goggles in the scanner. The first and second runs were each 27.3 minutes long, with a 24-second-on and 18-second-off block repeated 3 times per condition. In the first and second runs, there were 6 unimodal conditions and 7 bimodal conditions at the stimulation rates of 0.5, 1.5 and 3.0 Hz, with SOAs of (−200, 0, 200), 0, and (−100, 0, 100) ms, respectively, between the onsets of a pair of auditory tone and visual flash. Only the combinations of SOAs and rates shown by shaded boxes in Fig. 2(a) were used. Subjects were instructed to perceive a pair of auditory and visual stimuli as simultaneous events, maintaining the same level of attention toward either modality. The third run was 9.0 minutes long, with a 3-second visual instruction followed by a 24-second-on and 18-second-off block, repeated 6 times per condition. In this run, there were 2 bimodal conditions, each with an SOA of 100 ms presented at 1.0 Hz. In these two additional conditions, subjects were instructed with a visual cue at the beginning to perceive the stimuli either as simultaneous or as occurring in a sequence of auditory-visual (AV) stimuli. All of these conditions were presented in random order. Stimuli were presented with the software Presentation (Neurobehavioral Systems, Inc., San Francisco, CA).

fMRI Image Acquisition

A 1.5 Tesla GE Signa scanner was used to acquire T1-weighted structural images and functional EPI images for the measurement of the blood oxygenation level-dependent (BOLD) effect (Kwong et al., 1992 and Ogawa et al., 1992). The acquisition scheme and parameters used for the functional scans (546 scans in the first and second sessions and 180 scans in the third session) were as follows: echo-planar imaging, gradient recalled echo, TR = 3000 ms, TE = 40 ms, flip angle = 90°, 64 × 64 matrix, 30 axial slices, each 5 mm thick, acquired parallel to the anterior-posterior commissural line.
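The run lengths and scan counts quoted above are mutually consistent, as the following arithmetic sketch shows. It assumes that every repetition of a condition occupies one full block (an optional instruction cue, then the on period, then the off period) and that no additional scans were acquired; the function name is hypothetical.

```python
TR_S = 3.0  # repetition time in seconds (TR = 3000 ms)


def run_duration_s(n_conditions, repeats, on_s=24.0, off_s=18.0, cue_s=0.0):
    """Total run length, assuming each repetition is cue + on + off."""
    return n_conditions * repeats * (cue_s + on_s + off_s)


# Runs 1 and 2: 6 unimodal + 7 bimodal conditions, 3 repetitions each.
runs_1_2_s = run_duration_s(n_conditions=13, repeats=3)
# Run 3: 2 bimodal conditions, 6 repetitions each, with a 3 s visual cue.
run_3_s = run_duration_s(n_conditions=2, repeats=6, cue_s=3.0)

if __name__ == "__main__":
    for label, seconds in [("runs 1 and 2", runs_1_2_s), ("run 3", run_3_s)]:
        print(label, seconds / 60.0, "min,", seconds / TR_S, "scans")
```

Under these assumptions the first two runs last 1638 s (27.3 minutes, 546 scans) and the third lasts 540 s (9.0 minutes, 180 scans), matching the values reported in the text.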

fMRI Data Analysis

The data were preprocessed and analyzed using Statistical Parametric Mapping (SPM2; Wellcome Department of Cognitive Neurology, London, UK) (Friston et al., 1995; 1999). Motion correction to the first functional scan was performed within subject using a six-parameter rigid-body transformation. All 13 subjects included in this analysis had less than 4 mm of translation in all directions and less than 2.0 degrees of rotation about the three axes. The mean of the motion-corrected images was then coregistered to the individual’s 30-slice structural image using a 12-parameter affine transformation. The images were then spatially normalized to the Montreal Neurological Institute (MNI) template (Talairach and Tournoux, 1988) by applying a 12-parameter affine transformation, followed by a nonlinear warping using basis functions. These normalized images were interpolated to 2 mm isotropic voxels and subsequently smoothed with a 4 mm isotropic Gaussian kernel. A random-effects, model-based statistical analysis was performed with SPM2 (Friston et al., 1995 and Friston et al., 1999) in a two-level procedure. At the first level, a separate general linear model of the form $Y = X\beta + \varepsilon$ was specified for each subject, where $X = [1\ X_1\ X_2\ \ldots\ X_i\ \ldots]$, $\beta = [1\ \beta_1\ \beta_2\ \ldots\ \beta_i\ \ldots]^T$, and $\varepsilon \sim N(0, \sigma^2)$. Each $X_1, X_2, \ldots, X_i$ consists of a series of zeros for the off-block and ones for the on-block, and represents different stimulation conditions in each functional run and 6 motion parameters obtained from the realignment. Thus, $\beta = (X^{*T} X^{*})^{-1} X^{*T} K Y$, where $X^{*} = KX$, and $K$ is a filter matrix dealing with non-sphericity due to possible serial correlations in the data $Y$. The individual contrast images were then entered into a second-level analysis, using a separate one-sample t-test for each term in the general linear model. Resulting summary statistical maps were then thresholded at p < 0.001 (uncorrected for multiple comparisons). 
These maps were overlaid on a high-resolution structural image in the Montreal Neurological Institute (MNI) orientation. We performed region-of-interest (ROI) analyses after anatomically defining the brain regions.
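As a rough illustration of the first-level model, the sketch below fits a single on-off regressor to one voxel time series by ordinary least squares. It is a deliberate simplification and not the SPM2 procedure: it uses one condition only, takes the filter matrix K to be the identity, omits the motion regressors, and assumes that each block starts with the on period. The function names and the synthetic data are hypothetical.

```python
def boxcar(n_blocks, on_scans=8, off_scans=6):
    """On-off regressor: 24 s on / 18 s off at TR = 3 s is 8 + 6 scans.

    The on-first ordering within a block is an assumption.
    """
    return ([1.0] * on_scans + [0.0] * off_scans) * n_blocks


def fit_single_regressor(y, x):
    """Ordinary least squares for y = b0 + b1 * x + error.

    With an intercept and one regressor, the normal equations reduce to
    b1 = cov(x, y) / var(x) and b0 = mean(y) - b1 * mean(x).
    """
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


if __name__ == "__main__":
    x = boxcar(n_blocks=3)
    # Synthetic, noise-free voxel signal: baseline 100, on-block effect 0.3.
    y = [100.0 + 0.3 * xi for xi in x]
    print(fit_single_regressor(y, x))
```

For a binary regressor such as this one, b1 equals the difference between the mean signal in the on scans and the mean signal in the off scans, which is why a negative estimate corresponds to a signal that is higher during rest than during stimulation.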

The masks covering ROIs were made by using SPM2 toolboxes: (i) MARINA (Masks for Region of Interest Analysis) (Walter et al., 2003), and (ii) AAL (automated anatomical labeling) (Tzourio-Mazoyer et al., 2002), and these masks were also cross-checked with a human brain atlas. We used (i) WFU-Pickatlas (Maldjian et al., 2003) to perform the ROI analysis, and (ii) Analysis of Functional Neural Images (AFNI) software (Cox, 1996) to extract time series from the ROIs. An interregional correlation analysis was performed to determine the networks of interdependent brain areas underlying different perceptual states.


Behavioral Performance

The number of subjects reporting a given percept allowed for a quantification of perceptual strength in the timing parameter space. The subjects’ behavioral performance indicated that the perception of auditory-visual stimuli changed qualitatively according to the values of the timing parameters. As shown in Fig. 2(a), four distinct perceptions were observed: perception of (i) synchrony (S), (ii) auditory leading visual stimuli (AV), (iii) visual leading auditory stimuli (VA), and (iv) changing order of stimuli (drift, D), in which subjects could report no clear percept. Grouping AV and VA together as asynchrony leaves three perceptual states: asynchrony (A), synchrony (S) and drift (D). Notice in Fig. 2(b) that the perceptual region of synchrony is asymmetrical around zero SOA, extending more toward auditory leading visual stimuli (p < 0.02). Below the stimulation rates of 2.5 Hz, perceptions of synchrony and asynchrony persisted, whereas above 2.0 Hz, there is a region of drift, or no fixed percept. Thus, in the timing parameter space of stimulus onset asynchrony and stimulation rate explored here, there exist three distinct percepts: asynchrony, synchrony and drift.

Brain Activations

The main effect of crossmodal processing was obtained by combining all the bimodal stimulation conditions. Random-effects analysis of combined bimodal versus rest conditions showed bilateral activations at p < 0.001 in the inferior frontal gyrus (I), superior temporal gyrus (II), middle occipital gyrus (III) and inferior parietal lobule (IV) (a sagittal slice at x = −47.75 is shown in Fig. 3). Interestingly, a negative contrast of bimodal conditions relative to rest was revealed in the posterior midbrain in the region of the superior colliculus (V), as shown in the coronal slice at y = −30 in Fig. 3. These results indicate that the network for multisensory processing of auditory-visual events is composed of a collection of cortical (sensory, parietal, and prefrontal) and subcortical (posterior midbrain) areas. Not all of these areas were active during unimodal stimulation alone. Table I shows all the activations (significance p < 0.001, uncorrected for multiple comparisons, cluster size > 10 voxels) due to bimodal stimulation. Individual contrasts for bimodal conditions showed significant activations in the inferior parietal lobule for the conditions in which the stimuli were perceived as asynchronous: (+200 ms, 0.5 Hz), (−200 ms, 0.5 Hz) and (+100 ms, 1.0 Hz) (shown in Fig. 4 a–c). The associated mean time courses from the left inferior parietal lobule also showed significantly greater signal change (0.3%, p < 0.01) (not shown here). Activation of the inferior parietal lobule at (+100 ms, 1.0 Hz) corresponded to the percept of audition preceding vision. However, in the perceptual drift regime at (±100 ms, 3.0 Hz), there was no significant activation in the inferior parietal cortex. Our results indicate that a network composed of frontal, auditory, visual and inferior parietal areas is crucial for the formation of the percept of asynchrony. On the other hand, the disengagement of the inferior parietal areas from the activated network was related to perceptual drift. 
Nine subjects were able to establish the percept of synchrony between auditory and visual stimuli for the timing parameters (+100 ms, 1.0 Hz). A negative contrast of task versus rest revealed activations in the posterior midbrain in the region of the superior colliculus (shown in Fig. 4 (d)). Once again, a network of areas is seen to support the perception of auditory-visual synchrony; this time, however, the network includes the superior colliculus and not the inferior parietal lobule.

Figure 3
Activations related to crossmodal processing (p<0.001). Areas: inferior frontal (I), superior temporal gyrus (II), middle occipital gyrus (III), inferior parietal lobule (IV), and posterior midbrain (V). The activation in the posterior midbrain ...
Figure 4
(a–c). Activations related to the perception of asynchrony (p<0.001) for the following conditions: (A) (Δt, f) = (200 ms, 0.5 Hz), (B) (−200 ms, 0.5 Hz), (C) (100 ms, 1.0 Hz). The inferior parietal lobule (IPL) showed ...
Table I
Task-versus-Rest Contrasts

We combined the following conditions of run 1 and run 2: AV (±200 ms, 0.5 Hz) for asynchrony, AV (0 ms, 0.5–3 Hz) for synchrony, and AV (±100 ms, 3 Hz; 0 ms, 3 Hz) for drift, and performed the ROI analysis on the left and right inferior parietal areas and superior colliculus. With the same masks, we performed the ROI analysis on the conditions of asynchrony and synchrony from run 3. Table II shows that fMRI activations were significant (family-wise correction for multiple comparisons, P_FWE-corr < 0.01) in the inferior parietal cortex during asynchrony and in the superior colliculus during synchrony, but were not significant in either area during drift. We further grouped all the conditions in all runs according to the observed perceptual states, that is, AV (±200 ms, 0.5 Hz; 100 ms, 1 Hz) for asynchrony, AV (0 ms, 0.5–3 Hz; 100 ms, 1 Hz) for synchrony, and AV (±100 ms, 3 Hz; 0 ms, 3 Hz) for drift, and performed an interregional correlation analysis between frontal (F), auditory (A), visual (V), parietal (P) and superior colliculus (S or Sc) areas. The conditions corresponding to asynchrony, considered altogether, include FAVP activations, those for synchrony include FAVS, and those for drift just FAV (Table I). Activations due to difference contrasts between conditions were also computed (see supplementary online material). For the interregional correlation analysis, we first made masks consisting of the Frontal area (L+R inferior, middle, superior), Auditory area (L+R Heschl’s gyrus, superior temporal gyri/sulci), Visual area (L+R cuneus, inferior and middle occipital gyri), Parietal area (L+R inferior), and Superior colliculus (L+R). Using this combined (FAVPS) mask, we extracted time series from each voxel and then performed statistical tests to detect significant signal changes. The voxels that survived a significance level of p < 0.001 were entered into further correlation analysis. 
We next computed correlations of all the remaining voxel time series with on-off waveforms corresponding to the conditions for asynchrony, synchrony and drift. We further reduced the number of voxels by choosing a significance level for the correlation values of p < 0.01 and calculated the average time series representative of the average activity in an area. We finally calculated pairwise cross-correlations between these average time series for asynchrony, synchrony and drift. The cross-correlation reflects the level of interdependence between activated areas. Figure 5 shows (i) the brain areas positively and negatively correlated with the on-off waveform for asynchrony, synchrony and drift (first row), (ii) the correlation values (second row), and (iii) the networks of brain areas based on the significant correlations (p < 0.01, except for S, which is p < 0.05; the correlation of S with A was not significant during synchrony) (third row). These results show how the components of the FAVPS network change and reorganize with perceptual states.
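The last two steps of this procedure, averaging the selected voxel time series within each area and correlating the area averages pairwise, can be sketched as follows. The sketch assumes a zero-lag Pearson correlation as the measure of cross-correlation, which the text does not specify; the voxel selection and significance testing steps are omitted. The function names and the toy data are hypothetical.

```python
import math


def mean_timeseries(voxel_series):
    """Average a list of equal-length voxel time series, scan by scan."""
    n_voxels = len(voxel_series)
    return [sum(scan) / n_voxels for scan in zip(*voxel_series)]


def pearson(x, y):
    """Zero-lag Pearson correlation between two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pairwise_correlations(area_series):
    """Correlate every pair of area-average time series."""
    names = sorted(area_series)
    return {
        (p, q): pearson(area_series[p], area_series[q])
        for i, p in enumerate(names)
        for q in names[i + 1:]
    }


if __name__ == "__main__":
    on_off = ([1.0] * 8 + [0.0] * 6) * 3
    areas = {
        # Two toy voxels per area; F follows the on-off waveform.
        "F": mean_timeseries([[2.0 * v for v in on_off], [4.0 * v for v in on_off]]),
        # S is higher during rest than during stimulation.
        "S": mean_timeseries([[1.0 - v for v in on_off], [3.0 - v for v in on_off]]),
    }
    print(pairwise_correlations(areas))
```

In this toy example the area that follows the on-off waveform and the area that is higher during rest are perfectly anticorrelated, which is the kind of negative relationship shown in cold colors in Figure 5.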

Figure 5
Interregional correlation analysis: cross-correlations and the resulting networks during asynchrony, synchrony and drift. Significant positive (hot color) and negative (cold color) cross-correlation values of time-series with the on-off waveform during ...
Table II
Region-of-Interest Analysis


To gain a better understanding of the neurophysiological basis of how different perceptual states arise in a multisensory context, we developed a rhythmic multisensory paradigm motivated by ideas from the theory of informationally coupled dynamic systems (see also Lagarde and Kelso, 2005). We found, at the behavioral level, the existence of three perceptual states (asynchrony, synchrony and drift) as a function of the timing parameters SOA and stimulation rate. The notions of perceptual states adopted here were based on how we describe the dynamical states of coupled and hence interacting oscillations. Behaviorally, either a percept (synchrony or asynchrony) was formed and remained the same (steady state), or no clear percept was formed or the percept changed (non-steady state) throughout the duration of stimulation. We regarded the loss of a stable steady state as equivalent to the onset of a perceptually neutral state and referred to it as ‘drift’. The perception of synchrony was identified with multisensory integration and asynchrony with multisensory segregation, both characterized as stable steady states. Our fMRI results showed that prefrontal, auditory, visual and parietal cortices and midbrain regions re-grouped to form sub-networks responsive to these different perceptual states. These findings support the notion of a distributed network of brain areas in multisensory integration and segregation for timing, and further highlight the specific roles of the inferior parietal cortex and superior colliculus in multisensory processing (Bushara et al., 2001, Calvert et al., 2000).

The existing imaging, electrophysiological and anatomical literature points toward the fact that networks of brain areas, rather than any individual site, are involved in crossmodal processing although the components of these networks may be differentially responsive to synthesizing different types of crossmodal information (Calvert, 2001). Studies of non-rhythmic tasks (Bushara et al., 2001 and Assmus et al., 2003) have shown that the inferior parietal cortex is activated in the detection of asynchrony and in integrating spatial and temporal information. Primate studies also suggest that parietal cortex plays a role in time estimation (Leon and Shadlen, 2003), and functions together with the prefrontal cortex in the monitoring of temporal intervals (Onoe et al., 2001). Moreover, inferior parietal cortex appears to be involved in the perceptual analysis of gestures and complex actions (Hermsdorfer et al., 2001). Anatomical studies in primates indicate that the auditory cortex projects to inferior parietal and prefrontal cortices (Poremba et al., 2003) the cells of which have been observed to associate with visual and auditory stimuli across time (Fuster et al., 2000).

Consistent with the foregoing research, our results suggest that a network of areas comprising prefrontal, sensory and parietal cortices establishes the perception of asynchrony, whereas just the sense of the presence of a timing association (without any specific relation, synchrony or asynchrony) activates only sensory and prefrontal areas. Brain activations associated with a negative contrast showed that the posterior midbrain (in the region of the superior colliculus) is involved in the perception of synchrony. The superior colliculus is known to have bimodal neurons that can receive inputs from the auditory and visual systems. Such bimodal neurons have been shown to exhibit additively more enhanced firing rates for synchronous than for asynchronous inputs (Stein and Meredith, 1993). In our findings, the superior colliculus and cuneus activations were revealed by a negative contrast, indicating their involvement by a higher than baseline activity. The time series of fMRI signal change extracted from the posterior midbrain and the cuneus (not shown here) also showed higher percent signal changes for the off-conditions than for the on-conditions. Although this reversed activity occurs reliably, interpretations are difficult based on our current understanding of the origin of the BOLD effect. The existing literature indicates three possible mechanisms for a negative BOLD response (NBR): (i) vascular stealing of blood (a hemodynamic effect) from unactivated regions, (ii) neuronal suppression, and (iii) increased neuronal activity and decreased oxygen supply (as in the ‘initial dip’ (Hu, 1997)). Some reports of NBR in visual cortex (Shmuel et al., 2002 and Smith et al., 2004) and a recent study (Shmuel et al., 2006) point toward the second mechanism. The contraction of blood vessels following a neuronal response (Cauli et al., 2004) is also a possible cause of NBR. 
Hemodynamic signals have been shown to correlate tightly with synchronized gamma oscillations (Niessing, 2005). The superior colliculus NBR observed here could be due to a decrease in underlying gamma-band neuronal synchrony and an increase in low-frequency oscillations in the superior colliculi. This type of relationship has been shown heuristically in hemodynamic correlates of EEG (Kilner et al., 2005). A further possibility that cannot be ruled out is a greater recruitment of attentional resources during baseline conditions than during the task itself. However, this seems unlikely in the present task design, in which participants were required to fixate throughout the functional run. Even though the exact mechanism of NBR cannot be resolved by fMRI experiments alone, our results indicate that the superior colliculus has a definite involvement in the perception of synchrony.

The perception of synchrony overall involved a network of prefrontal and sensory areas along with the superior colliculus, consistent with evidence of the presence of bimodal neurons in the superior colliculus. We examined all possible unimodal and bimodal conditions and their comparison contrasts. The difference contrasts AV − (A+V) and (A+V) − AV did not show any significant activations (p < 0.001, uncorrected). Thus we did not observe superadditivity effects in the present fMRI data. Although superadditivity is usually observed in recordings of single neurons during multisensory integration (Wallace et al., 1996), this effect may not necessarily be translatable into the neuronal population response measured by BOLD fMRI. In some other recent fMRI studies (Beauchamp, 2004; Atteveldt et al., 2004), for example, superadditivity has not been observed either.

Within the limits of our fMRI experiment, we cannot infer how areas temporally interact to form multisensory percepts. It is not clearly established whether neural activity in these areas synchronizes or whether signals converge from unisensory to multimodal areas for the observed perceptual states. However, recall that our experimental paradigm was motivated by temporally interacting, coupled dynamic systems in the first place. The topological congruence of the parameter spaces in Figures 1 and 2 is intriguing and suggests that the three networks summarized in Fig. 5 emerge as a consequence of the interaction of oscillatory units. The rudimentary network present for ‘drift’ is a subset of the other two networks, those of ‘synchrony’ and ‘asynchrony’. The latter two networks lose one of their participating nodes (superior colliculus or parietal lobule) as the stimulus frequency increases, a phenomenon that can be interpreted as a non-equilibrium phase transition (Kelso, 1995; Haken, 1996) and is most frequently identified with a so-called saddle-node bifurcation in periodic scenarios. These considerations are very general but will hold for most situations involving weakly coupled limit-cycle oscillators. In the current context of multisensory integration, we showed that the solution space of a hypothetical system of two coupled oscillators (Fig. 1) corresponds well to the perceptual solution space of human multisensory integration (Fig. 2) and to selected activations of brain regions (Fig. 5). In this line of thought, we are still missing the conceptual link between the hypothetical oscillator system and the temporal dynamics of the brain networks involved, on which we speculate as follows: if the regions in Fig. 5 do indeed represent areas with oscillatory activity within a network, then the above-mentioned saddle-node bifurcation will result in a reorganization of the network as a function of the timing parameters.
In particular, when the initial percept of ‘asynchrony’ or ‘synchrony’ is lost via an increase in stimulus frequency, this loss should have its correlate in a reduced coherence (as a measure of the degree of synchronization) in the temporal domain. We expect to see the reduced coherence in electroencephalography and/or magnetoencephalography, but only for the initial percept of ‘asynchrony’ and not for ‘synchrony’, owing to the selective spatial activation illustrated in Fig. 5. Specifically, we expect parietal areas (present for ‘asynchrony’) to produce a detectable signal in the brain topographies.
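The loss of phase locking through a saddle-node bifurcation, and the accompanying drop in a synchronization measure, can be sketched with a generic model of two weakly coupled phase oscillators. The relative-phase equation d(phi)/dt = delta_omega - coupling * sin(phi) used below is an assumption chosen for illustration (a standard textbook form), not necessarily the equations behind Fig. 1; the parameter values and function names are likewise hypothetical.

```python
import numpy as np

def simulate_relative_phase(delta_omega, coupling, phi0=0.5, dt=2e-3, steps=100_000):
    """Euler integration of d(phi)/dt = delta_omega - coupling * sin(phi)."""
    phi = np.empty(steps)
    phi[0] = phi0
    for i in range(1, steps):
        phi[i] = phi[i - 1] + dt * (delta_omega - coupling * np.sin(phi[i - 1]))
    return phi

def phase_locking_index(phi):
    """Length of the mean of exp(i*phi): close to 1 for a locked phase,
    clearly lower when the relative phase drifts (phase wrapping)."""
    return float(np.abs(np.mean(np.exp(1j * phi))))

def has_locked_state(delta_omega, coupling):
    """Fixed points exist only for |delta_omega| <= |coupling|; they vanish
    in a saddle-node bifurcation when the detuning exceeds the coupling."""
    return abs(delta_omega) <= abs(coupling)

# Discard the first half of each run as a transient before measuring locking.
locked = simulate_relative_phase(delta_omega=0.5, coupling=1.0)
drift = simulate_relative_phase(delta_omega=2.0, coupling=1.0)
locked_pli = phase_locking_index(locked[len(locked) // 2:])
drift_pli = phase_locking_index(drift[len(drift) // 2:])
```

Below the bifurcation the relative phase settles to a constant value and the index stays near 1; beyond it the phase wraps continuously and the index falls well below 1. This is the kind of reduction in coherence that the argument above predicts for electrophysiological recordings.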


Different distributed networks are activated when perceptually synchronous and asynchronous stimuli are processed. In particular, the superior colliculus is associated with the perception of synchrony and the inferior parietal cortex with the perception of asynchrony of auditory-visual signals. However, when there is no clear percept, these network components are disengaged and only a residual network comprised of sensory and frontal areas remains active. The present results demonstrate that the processes of perceptual integration and segregation engage and disengage different brain subnetworks of a potentially oscillatory nature, but leave only a rudimentary network in place if no clear percept is formed.



This work was supported by DARPA grant NBCH1020010, NIMH Grants MH42900 and MH01386, and NINDS Grant NS48220.


Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.


  • Ashburner J, Friston KJ. Nonlinear spatial normalization using basis functions. Hum Brain Mapp. 1999;7:254–266. [PubMed]
  • Van Atteveldt N, Formisano E, Goebel R, Blomert L. Integration of letters and speech sounds in the human brain. Neuron. 2004;43:271–82. [PubMed]
  • Assmus A, Marshall JC, Ritzl A, North J, Zilles K, Fink GR. Left inferior parietal cortex integrates time and space during collision judgments. NeuroImage. 2003;20:S82–S88. [PubMed]
  • Beauchamp MS, Argall BD, Bodurka J, Duyn JH, Martin A. Unraveling multisensory integration: patchy organization within human STS multisensory cortex. Nature Neuroscience. 2004;7:1190–1192. [PubMed]
  • Bushara KO, Grafman J, Hallett M. Neural correlates of auditory-visual stimulus onset asynchrony detection. J Neurosci. 2001;21:300–304. [PubMed]
  • Bertelson P, Radeau M. Cross-modal bias and perceptual fusion with auditory-visual spatial discordance. Percept Psychophys. 1981;29:578–584. [PubMed]
  • Calvert GA. Crossmodal Processing in the human brain: insights from functional neuroimaging studies. Cereb Cortex. 2001;11:1110–1123. [PubMed]
  • Calvert GA, Campbell R, Brammer MJ. Evidence from functional magnetic resonance imaging of crossmodal binding in the human heteromodal cortex. Curr Biol. 2000;10:649–657. [PubMed]
  • Calvert GA, Spence C, Stein BE. The Handbook of Multisensory Processes. MIT Press; Cambridge, MA: 2004.
  • Cauli B, Tong XK, Rancillac A, Serluca N, Lambolez B, Rossier J, Hamel E. Cortical GABA interneurons in neurovascular coupling: relays for subcortical vasoactive pathways. J Neurosci. 2004;24:8940–8949. [PubMed]
  • Cox RW. AFNI: software for analysis and visualization of functional magnetic resonance neuroimages. Comput Biomed Res. 1996;29:162–173. [PubMed]
  • Desimone R, Gross CG. Visual areas in the temporal cortex of the macaque. Brain Res. 1979;178:363–380. [PubMed]
  • Duhamel JR, Colby CL, Goldberg ME. In: Brain and Space. Paillard J, editor. Oxford University Press; New York: 1991. p. 223.
  • Frens MA, Van Opstal AJ, Van der Willigen RF. Visual-auditory interactions modulate saccade-related activity in monkey superior colliculus. Brain Res Bull. 1995;46:211–224. [PubMed]
  • Friston KJ, Holmes AP, Worsley KJ. How many subjects constitute a study? NeuroImage. 1999;10:1–5. [PubMed]
  • Friston KJ, Holmes AP, Worsley KJ, Poline J-B, Frith CD, Frackowiak RSJ. Statistical parametric maps in functional imaging: a general linear approach. Hum Brain Mapp. 1995;2:189–210.
  • Fuster JM, Bodner M, Kroger JK. Cross-modal and cross-temporal association in neurons of frontal cortex. Nature. 2000;405:347–351. [PubMed]
  • Haken H. Synergetics: An Introduction. 3rd ed. Berlin, Heidelberg, New York: Springer; 1983.
  • Haken H. Principles of brain functioning. Berlin, Heidelberg, New York: Springer; 1996.
  • Haken H, Kelso JAS, Bunz HH. A theoretical model of phase transitions in human hand movements. Biological Cybernetics. 1985;51:347–356. [PubMed]
  • Hermsdorfer J, Goldenberg G, Wachsmuth C, Conrad B, Ceballos-Baumann AO, Bartenstein P, Schwaiger M, Boecker H. Cortical correlates of gesture processing: clues to the cerebral mechanisms underlying apraxia during the imitation of meaningless gestures. NeuroImage. 2001;14:149–161. [PubMed]
  • Hu X, Le TH, Ugurbil K. Evaluation of the early response in fMRI in individual subjects using short stimulus duration. Magn Reson Med. 1997;37:877–884. [PubMed]
  • Jirsa VK, Kelso JAS. Integration and segregation of perceptual and motor behavior. In: Jirsa VK, Kelso JAS, editors. Coordination Dynamics: Issues and Trends Vol. 1. Springer Series in Understanding Complex Systems; Berlin-Heidelberg: 2004.
  • Jones EG, Powell TP. An anatomical study of converging sensory pathways within the cerebral cortex of the monkey. Brain. 1970;93:793–820. [PubMed]
  • Kelso JAS. Dynamic Patterns: Self-Organization of Brain and Behavior. MIT Press; Cambridge, MA: 1995.
  • Kelso JAS, DelColle J, Schöner G. Action-Perception as a pattern formation process. In: Jeannerod M, editor. Attention and Performance XIII. Hillsdale, NJ: Erlbaum; 1990. pp. 139–169.
  • Kilner JM, Mattout J, Henson R, Friston KJ. Hemodynamic correlates of EEG: a heuristic. NeuroImage. 2005;28:280–286. [PubMed]
  • Kuramoto Y. Chemical oscillations, waves, and turbulence. Berlin, Heidelberg, New York: Springer; 1984.
  • Kwong K, Belliveau JW, Chesler DA, Goldberg IE, Weisskoff RM, Poncelet BP, Kennedy DN, Hoppel BE, Cohen MS, Turner R, Cheng HM, Brady TJ, Rosen BR. Dynamic magnetic resonance imaging of human brain activity during primary sensory stimulation. Proc Natl Acad Sci, USA. 1992;89:5675–5679. [PubMed]
  • Lagarde J, Kelso JAS. Binding of movement, sound and touch: Multimodal coordination dynamics. Experimental Brain Research. in press. [PubMed]
  • Leon MI, Shadlen MN. Representation of time by neurons in the posterior parietal cortex of the macaque. Neuron. 2003;38:317–327. [PubMed]
  • Macaluso E, Driver J. Multisensory spatial interactions: a window onto functional integration in the human brain. Trends Neurosci. 2005;28:264–271. [PubMed]
  • Maldjian JA, Laurienti PJ, Kraft RA, Burdette JH. An automated method for neuroanatomic and cytoarchitectonic atlas-based interrogation of fMRI data sets. NeuroImage. 2003;19:1233–1239. [PubMed]
  • Meredith MA, Stein BE. Spatial determinants of multisensory integration in cat superior colliculus neurons. J Neurophysiol. 1996;75:1843–1857. [PubMed]
  • Mesulam MM, Mufson EJ. Insula of the old world monkey. III: Efferent cortical output and comments on function. J Comp Neurol. 1982;212:38–52. [PubMed]
  • Mufson EJ, Mesulam MM. Thalamic connections of the insula in the rhesus monkey and comments on the paralimbic connectivity of the medial pulvinar nucleus. J Comp Neurol. 1989;227:109–120. [PubMed]
  • Niessing J, Ebisch B, Schmidt KE, Niessing M, Singer W, Galuske RAW. Hemodynamic signals correlate tightly with synchronized gamma oscillations. Science. 2005;309:948–951. [PubMed]
  • Ogawa S, Tank DW, Menon R, Ellermann JM, Kim SG, Merkle H, Ugurbil KN. Intrinsic signal changes accompanying sensory stimulation: functional brain mapping with magnetic resonance imaging. Proc Natl Acad Sci, USA. 1992;89:5951–5955. [PubMed]
  • Onoe H, Komori M, Onoe K, Takechi H, Tsukada H, Watanabe Y. Cortical networks recruited for time perception: a monkey positron emission tomography (PET) study. NeuroImage. 2001;13:37–45. [PubMed]
  • Pearson RC, Brodal P, Gatter KC, Powell TP. The organization of the connections between the cortex and the claustrum in the monkey. Brain Res. 1982;234:435–441. [PubMed]
  • Poremba A, Saunders RC, Crane AM, Cook M, Sokoloff L, Mishkin M. Functional Mapping of the Primate Auditory System. Science. 2003;299:568–571. [PubMed]
  • Schroeder CE, Foxe J. Multisensory contributions to low-level, ‘unisensory’ processing. Curr Opin Neurobiol. 2005;15:454–458. [PubMed]
  • Shmuel A, Augath M, Oeltermann A, Logothetis NK. Negative functional MRI response correlates with decreases in neuronal activity in monkey visual area V1. Nat Neurosci. 2006;9:569–577. [PubMed]
  • Shmuel A, Yacoub E, Pfeuffer J, Van de Moortele P-F. Sustained negative BOLD, blood flow and oxygen consumption response and its coupling to the positive response in the human brain. Neuron. 2002;36:1195–1210. [PubMed]
  • Sekuler R, Sekuler AB, Lau R. Sound alters visual motion perception. Nature. 1997;385:308–308. [PubMed]
  • Seltzer B, Pandya DN. Frontal lobe connections of the superior temporal sulcus in the rhesus monkey. J Comp Neurol. 1989;281:97–113. [PubMed]
  • Smith AT, Williams AL, Singh KD. Negative BOLD in the visual cortex: evidence against blood stealing. Hum Brain Mapp. 2004;21:213–220. [PubMed]
  • Stein BE, Meredith MA. The Merging of the Senses. MIT Press; Cambridge, MA: 1993.
  • Stein BE, Meredith MA, Huneycutt WS, McDade L. Behavioral indices of multisensory integration: Orientation to visual cues is affected by auditory stimuli. J Cog Neurosci. 1989;1:12–24. [PubMed]
  • Talairach J, Tournoux P. Co-planar stereotaxic atlas of the brain. Thieme; New York: 1988.
  • Tzourio-Mazoyer N, Landeau B, Papathanassiou D, Crivello F, Etard O, Delcroix N, et al. Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single subject brain. NeuroImage. 2002;15:273–289. [PubMed]
  • Vaadia E, Benson DA, Hienz RD, Goldstein MHJ. Unit study of monkey frontal cortex: active localization of auditory and of visual stimuli. J Neurophysiol. 1986;56:934–952. [PubMed]
  • Wallace MT, McHaffie JG, Stein BE. Visual response properties and visuotopic representation in the newborn monkey superior colliculus. J Neurophysiol. 1997;78:2732–2741. [PubMed]
  • Walter B, Blecker C, Kirsch P, Sammer G, Schienle A, Stark R, Vaitl D. MARINA: An easy to use tool for the creation of MAsks for Region of Interest Analyses [Abstract]. Presented at the 9th International Conference on Functional Mapping of the Human Brain; June 19–22; 2003. CD-ROM in NeuroImage.