The sensory signals that drive movement planning arrive in a variety of “reference frames”, so integrating or comparing them requires sensory transformations. We propose a model where the statistical properties of sensory signals and their transformations determine how these signals are used. This model captures the patterns of gaze-dependent errors found in our human psychophysics experiment when the sensory signals available for reach planning are varied. These results challenge two widely held ideas: error patterns directly reflect the reference frame of the underlying neural representation, and it is preferable to use a single common reference frame for movement planning. We show that gaze-dependent error patterns, often cited as evidence for retinotopic reach planning, can be explained by a transformation bias and are not exclusively linked to retinotopic representations. Further, the presence of multiple reference frames allows for optimal use of available sensory information and explains task-dependent reweighting of sensory signals.
Human psychophysical studies have described multisensory perceptual benefits such as enhanced detection rates and faster reaction times in great detail. However, the neural circuits and mechanism underlying multisensory integration remain difficult to study in the primate brain. While rodents offer the advantage of a range of experimental methodologies to study the neural basis of multisensory processing, rodent studies are still limited due to the small number of available multisensory protocols. We here demonstrate the feasibility of an audio-visual stimulus detection task for rats, in which the animals detect lateralized uni- and multi-sensory stimuli in a two-response forced choice paradigm. We show that animals reliably learn and perform this task. Reaction times were significantly faster and behavioral performance levels higher in multisensory compared to unisensory conditions. This benefit was strongest for dim visual targets, in agreement with classical patterns of multisensory integration, and was specific to task-informative sounds, while uninformative sounds speeded reaction times with little costs for detection performance. Importantly, multisensory benefits for stimulus detection and reaction times appeared at different levels of task proficiency and training experience, suggesting distinct mechanisms inducing these two multisensory benefits. Our results demonstrate behavioral multisensory enhancement in rats in analogy to behavioral patterns known from other species, such as humans. In addition, our paradigm enriches the set of behavioral tasks on which future studies can rely, for example to combine behavioral measurements with imaging or pharmacological studies in the behaving animal or to study changes of integration properties in disease models.
In this paper, we present two neural network models – devoted to two specific and widely investigated aspects of multisensory integration – in order to evidence the potentialities of computational models to gain insight into the neural mechanisms underlying organization, development, and plasticity of multisensory integration in the brain. The first model considers visual–auditory interaction in a midbrain structure named superior colliculus (SC). The model is able to reproduce and explain the main physiological features of multisensory integration in SC neurons and to describe how SC integrative capability – not present at birth – develops gradually during postnatal life depending on sensory experience with cross-modal stimuli. The second model tackles the problem of how tactile stimuli on a body part and visual (or auditory) stimuli close to the same body part are integrated in multimodal parietal neurons to form the perception of peripersonal (i.e., near) space. The model investigates how the extension of peripersonal space – where multimodal integration occurs – may be modified by experience such as use of a tool to interact with the far space. The utility of the modeling approach relies on several aspects: (i) The two models, although devoted to different problems and simulating different brain regions, share some common mechanisms (lateral inhibition and excitation, non-linear neuron characteristics, recurrent connections, competition, Hebbian rules of potentiation and depression) that may govern more generally the fusion of senses in the brain, and the learning and plasticity of multisensory integration. (ii) The models may help interpretation of behavioral and psychophysical responses in terms of neural activity and synaptic connections. (iii) The models can make testable predictions that can help guiding future experiments in order to validate, reject, or modify the main assumptions.
neural network modeling; multimodal neurons; superior colliculus; peripersonal space; neural mechanisms; learning and plasticity; behavior
Sensory processing in the brain includes three key operations: multisensory integration—the task of combining cues into a single estimate of a common underlying stimulus; coordinate transformations—the change of reference frame for a stimulus (e.g., retinotopic to body-centered) effected through knowledge about an intervening variable (e.g., gaze position); and the incorporation of prior information. Statistically optimal sensory processing requires that each of these operations maintains the correct posterior distribution over the stimulus. Elements of this optimality have been demonstrated in many behavioral contexts in humans and other animals, suggesting that the neural computations are indeed optimal. That the relationships between sensory modalities are complex and plastic further suggests that these computations are learned—but how? We provide a principled answer, by treating the acquisition of these mappings as a case of density estimation, a well-studied problem in machine learning and statistics, in which the distribution of observed data is modeled in terms of a set of fixed parameters and a set of latent variables. In our case, the observed data are unisensory-population activities, the fixed parameters are synaptic connections, and the latent variables are multisensory-population activities. In particular, we train a restricted Boltzmann machine with the biologically plausible contrastive-divergence rule to learn a range of neural computations not previously demonstrated under a single approach: optimal integration; encoding of priors; hierarchical integration of cues; learning when not to integrate; and coordinate transformation. The model makes testable predictions about the nature of multisensory representations.
Over the first few years of their lives, humans (and other animals) appear to learn how to combine signals from multiple sense modalities: when to “integrate” them into a single percept, as with visual and proprioceptive information about one's body; when not to integrate them (e.g., when looking somewhere else); how they vary over longer time scales (e.g., where in physical space my hand tends to be); as well as more complicated manipulations, like subtracting gaze angle from the visually-perceived position of an object to compute the position of that object with respect to the head—i.e., “coordinate transformation.” Learning which sensory signals to integrate, or which to manipulate in other ways, does not appear to require an additional supervisory signal; we learn to do so, rather, based on structure in the sensory signals themselves. We present a biologically plausible artificial neural network that learns all of the above in just this way, but by training it for a much more general statistical task: “density estimation”—essentially, learning to be able to reproduce the data on which it was trained. This also links coordinate transformation and multisensory integration to other cortical operations, especially in early sensory areas, that have have been modeled as density estimators.
Determining when, if, and how information from separate sensory channels has been combined is a fundamental goal of research on multisensory processing in the brain. This can be a particular challenge in psychophysical data, as there is no direct recording of neural output. The most common way to characterize multisensory interactions in behavioral data is to compare responses to multisensory stimulation with the race model, a model of parallel, independent processing constructed from the probability of responses to the two unisensory stimuli which make up the multisensory stimulus. If observed multisensory reaction times are faster than those predicted by the model, it is inferred that information from the two channels is being combined rather than processed independently. Recently, behavioral research has been published employing capacity analyses where comparisons between two conditions are carried out at the level of the integrated hazard function. Capacity analyses seem to be particularly appealing technique for evaluating multisensory functioning, as they describe relationships between conditions across the entire distribution curve, are relatively easy and intuitive to interpret. The current paper presents capacity analysis of a behavioral data set previously analyzed using the race model. While applications of capacity analyses are still somewhat limited due to their novelty, it is hoped that this exploration of capacity and race model analyses will encourage the use of this promising new technique both in multisensory research and other applicable fields.
capacity; hazard analysis; human aging; multisensory integration; psychophysics; race model
Fundamental observations and principles derived from traditional physiological studies of multisensory integration have been difficult to reconcile with computational and psychophysical studies that share the foundation of probabilistic (Bayesian) inference. We review recent work on multisensory integration, focusing on experiments that bridge single-cell electrophysiology, psychophysics, and computational principles. These studies show that multisensory (visual-vestibular) neurons can account for near-optimal cue integration during perception of self-motion. Unlike the nonlinear (super-additive) interactions emphasized in some previous studies, visual-vestibular neurons accomplish near-optimal cue integration through sub-additive linear summation of their inputs, consistent with recent computational theories. Important issues remain to be resolved, including the observation that variations in cue reliability appear to change the weights that neurons apply to their different sensory inputs.
The integration of multisensory information is essential to forming meaningful representations of the environment. Adults benefit from related multisensory stimuli but the extent to which the ability to optimally integrate multisensory inputs for functional purposes is present in children has not been extensively examined. Using a cross-sectional approach, high-density electrical mapping of event-related potentials (ERPs) was combined with behavioral measures to characterize neurodevelopmental changes in basic audiovisual (AV) integration from middle childhood through early adulthood. The data indicated a gradual fine-tuning of multisensory facilitation of performance on an AV simple reaction time task (as indexed by race model violation), which reaches mature levels by about 14 years of age. They also revealed a systematic relationship between age and the brain processes underlying multisensory integration (MSI) in the time frame of the auditory N1 ERP component (∼120 ms). A significant positive correlation between behavioral and neurophysiological measures of MSI suggested that the underlying brain processes contributed to the fine-tuning of multisensory facilitation of behavior that was observed over middle childhood. These findings are consistent with protracted plasticity in a dynamic system and provide a starting point from which future studies can begin to examine the developmental course of multisensory processing in clinical populations.
children; cross-modal; development, electrophysiology; ERP; multisensory integration
C. elegans is widely used to dissect how neural circuits and genes generate behavior. During locomotion, worms initiate backward movement to change locomotion direction spontaneously or in response to sensory cues; however, the underlying neural circuits are not well defined. We applied a multidisciplinary approach to map neural circuits in freely-behaving worms by integrating functional imaging, optogenetic interrogation, genetic manipulation, laser ablation, and electrophysiology. We found that a disinhibitory circuit and a stimulatory circuit together promote the initiation of backward movement, and that circuitry dynamics is differentially regulated by sensory cues. Both circuits require glutamatergic transmission but depend on distinct glutamate receptors. This dual mode of motor initiation control is found in mammals, suggesting that distantly related organisms with anatomically distinct nervous systems may adopt similar strategies for motor control. Additionally, our studies illustrate how a multidisciplinary approach facilitates dissection of circuit and synaptic mechanisms underlying behavior in a genetic model organism.
In everyday life our brain often receives information about events and objects in the real world via several sensory modalities, because natural objects often stimulate more than one sense. These different types of information are processed in our brain along different sensory-specific pathways, but are finally integrated into a unified percept. During the last years, studies provided compelling evidence that the neural basis of multisensory integration is not restricted to higher association areas of the cortex, but can already occur at low-level stages of sensory cortical processing and even in subcortical structures. In this article we will review the potential role of several thalamic structures in multisensory interplay and discuss their extensive anatomical connections with sensory-specific and multisensory cortical structures. We conclude that sensory-specific thalamic structures may act as a crucial processing node of multisensory interplay in addition to their traditional role as sensory relaying structure.
multisensory integration; cortex; fMRI; thalamus; anatomy
As sensory systems deteriorate in aging or disease, the brain must relearn the appropriate weights to assign each modality during multisensory integration. Using blood-oxygen level dependent functional magnetic resonance imaging of human subjects, we tested a model for the neural mechanisms of sensory weighting, termed “weighted connections.” This model holds that the connection weights between early and late areas vary depending on the reliability of the modality, independent of the level of early sensory cortex activity. When subjects detected viewed and felt touches to the hand, a network of brain areas was active, including visual areas in lateral occipital cortex, somatosensory areas in inferior parietal lobe, and multisensory areas in the intraparietal sulcus (IPS). In agreement with the weighted connection model, the connection weight measured with structural equation modeling between somatosensory cortex and IPS increased for somatosensory-reliable stimuli, and the connection weight between visual cortex and IPS increased for visual-reliable stimuli. This double dissociation of connection strengths was similar to the pattern of behavioral responses during incongruent multisensory stimulation, suggesting that weighted connections may be a neural mechanism for behavioral reliability weighting.
effective connectivity; intraparietal cortex; BOLD fMRI; structural equation modeling; weighted connections; area MT
To control targeted movements, such as reaching to grasp an object or hammering a nail, the brain can use divers sources of sensory information, such as vision and proprioception. Although a variety of studies have shown that sensory signals are optimally combined according to principles of maximum likelihood, increasing evidence indicates that the CNS does not compute a single, optimal estimation of the target's position to be compared with a single optimal estimation of the hand. Rather, it employs a more modular approach in which the overall behavior is built by computing multiple concurrent comparisons carried out simultaneously in a number of different reference frames. The results of these individual comparisons are then optimally combined in order to drive the hand. In this article we examine at a computational level two formulations of concurrent models for sensory integration and compare this to the more conventional model of converging multi-sensory signals. Through a review of published studies, both our own and those performed by others, we produce evidence favoring the concurrent formulations. We then examine in detail the effects of additive signal noise as information flows through the sensorimotor system. By taking into account the noise added by sensorimotor transformations, one can explain why the CNS may shift its reliance on one sensory modality toward a greater reliance on another and investigate under what conditions those sensory transformations occur. Careful consideration of how transformed signals will co-vary with the original source also provides insight into how the CNS chooses one sensory modality over another. These concepts can be used to explain why the CNS might, for instance, create a visual representation of a task that is otherwise limited to the kinesthetic domain (e.g., pointing with one hand to a finger on the other) and why the CNS might choose to recode sensory information in an external reference frame.
sensory integration; motor control; maximum likelihood; reference frames
Humans are remarkably adept at understanding speech, even when it is contaminated by noise. Multisensory integration may explain some of this ability: combining independent information from the auditory modality (vocalizations) and the visual modality (mouth movements) reduces noise and increases accuracy. Converging evidence suggests that the superior temporal sulcus (STS) is a critical brain area for multisensory integration, but little is known about its role in the perception of noisy speech. Behavioral studies have shown that perceptual judgments are weighted by the reliability of the sensory modality: more reliable modalities are weighted more strongly, even if the reliability changes rapidly. We hypothesized that changes in the functional connectivity of STS with auditory and visual cortex could provide a neural mechanism for perceptual reliability-weighting. To test this idea, we performed five blood oxygenation level dependent (BOLD) fMRI and behavioral experiments in 34 healthy subjects. We found increased functional connectivity between the STS and auditory cortex when the auditory modality was more reliable (less noisy) and increased functional connectivity between the STS and visual cortex when the visual modality was more reliable, even when the reliability changed rapidly during presentation of successive words. This finding matched the results of a behavioral experiment in which the perception of incongruent audiovisual syllables was biased toward the more reliable modality, even with rapidly changing reliability. Changes in STS functional connectivity may be an important neural mechanism underlying the perception of noisy speech.
audiovisual; speech; functional connectivity; STS; fMRI
Sensory processing disorder (SPD) is characterized by anomalous reactions to, and integration of, sensory cues. Although the underlying etiology of SPD is unknown, one brain region likely to reflect these sensory and behavioral anomalies is the superior colliculus (SC), a structure involved in the synthesis of information from multiple sensory modalities and the control of overt orientation responses. In the present review we describe normal functional properties of this structure, the manner in which its individual neurons integrate cues from different senses, and the overt SC-mediated behaviors that are believed to manifest this “multisensory integration.” Of particular interest here is how SC neurons develop their capacity to engage in multisensory integration during early postnatal life as a consequence of early sensory experience, and the intimate communication between cortex and the midbrain that makes this developmental process possible.
multisensory; integration; plasticity; sensory processing disorder; superior colliculus
Perception of our environment is a multisensory experience; information from different sensory systems like the auditory, visual and tactile is constantly integrated. Complex tasks that require high temporal and spatial precision of multisensory integration put strong demands on the underlying networks but it is largely unknown how task experience shapes multisensory processing. Long-term musical training is an excellent model for brain plasticity because it shapes the human brain at functional and structural levels, affecting a network of brain areas. In the present study we used magnetoencephalography (MEG) to investigate how audio-tactile perception is integrated in the human brain and if musicians show enhancement of the corresponding activation compared to non-musicians. Using a paradigm that allowed the investigation of combined and separate auditory and tactile processing, we found a multisensory incongruency response, generated in frontal, cingulate and cerebellar regions, an auditory mismatch response generated mainly in the auditory cortex and a tactile mismatch response generated in frontal and cerebellar regions. The influence of musical training was seen in the audio-tactile as well as in the auditory condition, indicating enhanced higher-order processing in musicians, while the sources of the tactile MMN were not influenced by long-term musical training. Consistent with the predictive coding model, more basic, bottom-up sensory processing was relatively stable and less affected by expertise, whereas areas for top-down models of multisensory expectancies were modulated by training.
A sensory stimulus evokes activity in many neurons, creating a population response that must be “decoded” by the brain to estimate the parameters of that stimulus. Most decoding models have suggested complex neural circuits that compute optimal estimates of sensory parameters on the basis of responses in many sensory neurons. We propose a slightly suboptimal but practically simpler decoder. Decoding neurons integrate their inputs across 100 ms; incoming spikes are weighted by the preferred stimulus of the neuron of origin; and a local, cellular non-linearity approximates divisive normalization without dividing explicitly. The suboptimal decoder includes two simplifying approximations. It uses estimates of firing rate across the population rather than computing the total population response, and it implements divisive normalization with local cellular mechanisms of single neurons rather than more complicated neural circuit mechanisms. When applied to the practical problem of estimating target speed from a realistic simulation of the population response in extrastriate visual area MT, the suboptimal decoder has almost the same accuracy and precision as traditional decoding models. It succeeds in predicting the precision and imprecision of motor behavior using a suboptimal decoding computation because it adds only a small amount of imprecision to the code for target speed in MT, which is itself imprecise.
population decoding; divisive normalization; spike timing; MT; vector averaging
During coordinated eye– hand movements, saccade reaction times (SRTs) and reach reaction times (RRTs) are correlated in humans and monkeys. Reaction times (RTs) measure the degree of movement preparation and can correlate with movement speed and accuracy. However, RTs can also reflect effector nonspecific influences, such as motivation and arousal. We use a combination of behavioral psychophysics and computational modeling to identify plausible mechanisms for correlations in SRTs and RRTs. To disambiguate nonspecific mechanisms from mechanisms specific to movement coordination, we introduce a dual-task paradigm in which a reach and a saccade are cued with a stimulus onset asynchrony (SOA). We then develop several variants of integrate-to-threshold models of RT, which postulate that responses are initiated when the neural activity encoding effector-specific movement preparation reaches a threshold. The integrator models formalize hypotheses about RT correlations and make predictions for how each RT should vary with SOA. To test these hypotheses, we trained three monkeys to perform the eye– hand SOA task and analyzed their SRTs and RRTs. In all three subjects, RT correlations decreased with increasing SOA duration. Additionally, mean SRT decreased with decreasing SOA, revealing facilitation of saccades with simultaneous reaches, as predicted by the model. These results are not consistent with the predictions of the models with common modulation or common input but are compatible with the predictions of a model with mutual excitation between two effector-specific integrators. We propose that RT correlations are not simply attributable to motivation and arousal and are a signature of coordination.
When planning target-directed reaching movements, human subjects combine visual and proprioceptive feedback to form two estimates of the arm’s position: one to plan the reach direction, and another to convert that direction into a motor command. These position estimates are based on the same sensory signals but rely on different combinations of visual and proprioceptive input, suggesting that the brain weights sensory inputs differently depending on the computation being performed. Here we show that the relative weighting of vision and proprioception depends on both the sensory modality of the target and the information content of the visual feedback, and that these factors affect the two stages of planning independently. The observed diversity of weightings demonstrates the flexibility of sensory integration, and suggests a unifying principle by which the brain chooses sensory inputs in order to minimize errors arising from the transformation of sensory signals between coordinate frames.
Cross-modality, or the interaction between the different senses, has emerged as a fundamental concept in perceptual neuroscience and psychology. The traditional idea of five separate senses with independent neural substrates has been invalidated by both psychophysical findings of sensory integration and neurophysiological discoveries of multi-modal neurons in many areas of the brain. Even areas previously thought to be unimodal have been shown to be influenced by other senses, thus establishing multisensory integration as a key principle of perceptual neuroscience.
There are several obstacles to students’ understanding of the concept. First, everyday subjective experience is modal: one sees, hears, smells the world and is rarely aware that these seemingly separate impressions are in reality fully integrated with each other. Second, standard content in undergraduate classes and textbooks still emphasizes the modal model of the senses and their corresponding brain areas and rarely mentions cross-modal phenomena. Third, feasible classroom demonstrations of cross-modality are few, making it difficult to provide students with first-hand experience that would aid their understanding of the principle.
This article describes an accessible and effective classroom demonstration of cross-modality between low-level vision, touch and proprioception. It consists in the illusion of eyelid droop in one eye when the other eye has been dark-adapted and when both eyes are exposed to the dark. The perceptual effect is dramatic and reliable. It illustrates cross-modality at a fundamental level of perception and might provide a means to help integrate the teaching of the concept into the standard content of undergraduate classes.
multisensory integration; cross-modality; illusions; class demonstrations
We perceive our surrounding environment by using different sense organs. However, it is not clear how the brain estimates information from our surroundings from the multisensory stimuli it receives. While Bayesian inference provides a normative account of the computational principle at work in the brain, it does not provide information on how the nervous system actually implements the computation. To provide an insight into how the neural dynamics are related to multisensory integration, we constructed a recurrent network model that can implement computations related to multisensory integration. Our model not only extracts information from noisy neural activity patterns, it also estimates a causal structure; i.e., it can infer whether the different stimuli came from the same source or different sources. We show that our model can reproduce the results of psychophysical experiments on spatial unity and localization bias which indicate that a shift occurs in the perceived position of a stimulus through the effect of another simultaneous stimulus. The experimental data have been reproduced in previous studies using Bayesian models. By comparing the Bayesian model and our neural network model, we investigated how the Bayesian prior is represented in neural circuits.
causality inference; multisensory integration; spatial orientation; recurrent neural network; Mexican-hat type interaction
In humans, the implementation of multijoint tasks of the arm implies a highly complex integration of sensory information, sensorimotor transformations and motor planning. Computational models can be profitably used to better understand the mechanisms sub-serving motor control, thus providing useful perspectives and investigating different control hypotheses. To this purpose, the use of Artificial Neural Networks has been proposed to represent and interpret the movement of upper limb. In this paper, a neural network approach to the modelling of the motor control of a human arm during planar ballistic movements is presented.
The developed system is composed of three main computational blocks: 1) a parallel distributed learning scheme that aims at simulating the internal inverse model in the trajectory formation process; 2) a pulse generator, which is responsible for the creation of muscular synergies; and 3) a limb model based on two joints (two degrees of freedom) and six muscle-like actuators, that can accommodate for the biomechanical parameters of the arm. The learning paradigm of the neural controller is based on a pure exploration of the working space with no feedback signal. Kinematics provided by the system have been compared with those obtained in literature from experimental data of humans.
The model reproduces kinematics of arm movements, with bell-shaped wrist velocity profiles and approximately straight trajectories, and gives rise to the generation of synergies for the execution of movements. The model allows achieving amplitude and direction errors of respectively 0.52 cm and 0.2 radians.
Curvature values are similar to those encountered in experimental measures with humans.
The neural controller also manages environmental modifications such as the insertion of different force fields acting on the end-effector.
The proposed system has been shown to properly simulate the development of internal models and to control the generation and execution of ballistic planar arm movements. Since the neural controller learns to manage movements on the basis of kinematic information and arm characteristics, it could in perspective command a neuroprosthesis instead of a biomechanical model of a human upper limb, and it could thus give rise to novel rehabilitation techniques.
Studies of reaching suggest that humans adapt to novel arm dynamics by building internal models that transform planned sensory states of the limb, e.g., desired limb position and its derivatives, into motor commands, e.g., joint torques. Earlier work modeled this computation via a population of basis elements and used system identification techniques to estimate the tuning properties of the bases from the patterns of generalization. Here we hypothesized that the neural representation of planned sensory states in the internal model might resemble the signals from the peripheral sensors. These sensors normally encode the limb's actual sensory state in which movement errors occurred. We developed a set of equations based on properties of muscle spindles that estimated spindle discharge as a function of the limb's state during reaching and drawing of circles. We then implemented a simulation of a two-link arm that learned to move in various force fields using these spindle-like bases. The system produced a pattern of adaptation and generalization that accounted for a wide range of previously reported behavioral results. In particular, the bases showed gain-field interactions between encoding of limb position and velocity, very similar to the gain fields inferred from behavioral studies. The poor sensitivity of the bases to limb acceleration predicted behavioral results that were confirmed by experiment. We suggest that the internal model of limb dynamics is computed by the brain with neurons that encode the state of the limb in a manner similar to that expected of muscle spindle afferents.
reaching; arm movements; adaptation; force fields; computational models; motor control; motor learning
There is increasing evidence that the brain relies on a set of canonical neural computations, repeating them across brain regions and modalities to apply similar operations to different problems. A promising candidate for such a computation is normalization, in which the responses of neurons are divided by a common factor that typically includes the summed activity of a pool of neurons. Normalization was developed to explain responses in the primary visual cortex and is now thought to operate throughout the visual system, and in many other sensory modalities and brain regions. Normalization may underlie operations such as the representation of odours, the modulatory effects of visual attention, the encoding of value and the integration of multisensory information. Its presence in such a diversity of neural systems in multiple species, from invertebrates to mammals, suggests that it serves as a canonical neural computation.
The frequency of environmental vibrations is sampled by two of the major sensory systems, audition and touch, notwithstanding that these signals are transduced through very different physical media and entirely separate sensory epithelia. Psychophysical studies have shown that manipulating frequency in audition or touch can have a significant cross-sensory impact on perceived frequency in the other sensory system, pointing to intimate links between these senses during computation of frequency. In this regard, the frequency of a vibratory event can be thought of as a multisensory perceptual construct. In turn, electrophysiological studies point to temporally early multisensory interactions that occur in hierarchically early sensory regions where convergent inputs from the auditory and somatosensory systems are to be found. A key question pertains to the level of processing at which the multisensory integration of featural information such as frequency occurs. Do the sensory systems calculate frequency independently before this information is combined, or is this feature calculated in an integrated fashion during pre-attentive sensory processing? The well-characterized mismatch negativity, an electrophysiological response that indexes pre-attentive detection of a change within the context of a regular pattern of stimulation, served as our dependent measure. High-density electrophysiological recordings were made in humans while they were presented with separate blocks of somatosensory, auditory, and audio-somatosensory “standards” and “deviants”, where the deviant differed in frequency. Multisensory effects were identified beginning at ~200ms, with the multisensory MMN significantly different from the sum of the unisensory MMNs. This provides compelling evidence for preattentive coupling between the somatosensory and auditory channels in the cortical representation of frequency.
somatosensory; auditory; multisensory; frequency processing; mismatch negativity
The comprehension of auditory-visual (AV) speech integration has greatly benefited from recent advances in neurosciences and multisensory research. AV speech integration raises numerous questions relevant to the computational rules needed for binding information (within and across sensory modalities), the representational format in which speech information is encoded in the brain (e.g., auditory vs. articulatory), or how AV speech ultimately interfaces with the linguistic system. The following non-exhaustive review provides a set of empirical findings and theoretical questions that have fed the original proposal for predictive coding in AV speech processing. More recently, predictive coding has pervaded many fields of inquiries and positively reinforced the need to refine the notion of internal models in the brain together with their implications for the interpretation of neural activity recorded with various neuroimaging techniques. However, it is argued here that the strength of predictive coding frameworks reside in the specificity of the generative internal models not in their generality; specifically, internal models come with a set of rules applied on particular representational formats themselves depending on the levels and the network structure at which predictive operations occur. As such, predictive coding in AV speech owes to specify the level(s) and the kinds of internal predictions that are necessary to account for the perceptual benefits or illusions observed in the field. Among those specifications, the actual content of a prediction comes first and foremost, followed by the representational granularity of that prediction in time. This review specifically presents a focused discussion on these issues.
analysis-by-synthesis; predictive coding; multisensory integration; Bayesian priors
Multisensory Integration describes a process by which information from different sensory systems is combined to influence perception, decisions, and overt behavior. Despite a widespread appreciation of its utility in the adult, its developmental antecedents have received relatively little attention. Here we review what is known about the development of multisensory integration, with a focus on the circuitry and experiential antecedents of its development in the model system of the multisensory (i.e., deep) layers of the superior colliculus. Of particular interest here are two sets of experimental observations: 1) cortical influences appear essential for multisensory integration in the SC, and 2) postnatal experience guides its maturation. The current belief is that the experience normally gained during early life is instantiated in the cortico-SC projection, and that this is the primary route by which ecological pressures adapt SC multisensory integration to the particular environment in which it will be used.