Sensory processing in the brain includes three key operations: multisensory integration—the task of combining cues into a single estimate of a common underlying stimulus; coordinate transformations—the change of reference frame for a stimulus (e.g., retinotopic to body-centered) effected through knowledge about an intervening variable (e.g., gaze position); and the incorporation of prior information. Statistically optimal sensory processing requires that each of these operations maintains the correct posterior distribution over the stimulus. Elements of this optimality have been demonstrated in many behavioral contexts in humans and other animals, suggesting that the neural computations are indeed optimal. That the relationships between sensory modalities are complex and plastic further suggests that these computations are learned—but how? We provide a principled answer, by treating the acquisition of these mappings as a case of density estimation, a well-studied problem in machine learning and statistics, in which the distribution of observed data is modeled in terms of a set of fixed parameters and a set of latent variables. In our case, the observed data are unisensory-population activities, the fixed parameters are synaptic connections, and the latent variables are multisensory-population activities. In particular, we train a restricted Boltzmann machine with the biologically plausible contrastive-divergence rule to learn a range of neural computations not previously demonstrated under a single approach: optimal integration; encoding of priors; hierarchical integration of cues; learning when not to integrate; and coordinate transformation. The model makes testable predictions about the nature of multisensory representations.
Over the first few years of their lives, humans (and other animals) appear to learn how to combine signals from multiple sense modalities: when to “integrate” them into a single percept, as with visual and proprioceptive information about one's body; when not to integrate them (e.g., when looking somewhere else); how they vary over longer time scales (e.g., where in physical space my hand tends to be); as well as more complicated manipulations, like subtracting gaze angle from the visually-perceived position of an object to compute the position of that object with respect to the head, i.e., “coordinate transformation.” Learning which sensory signals to integrate, or which to manipulate in other ways, does not appear to require an additional supervisory signal; we learn to do so, rather, based on structure in the sensory signals themselves. We present a biologically plausible artificial neural network that learns all of the above in just this way, but by training it for a much more general statistical task: “density estimation,” which is essentially learning to reproduce the data on which it was trained. This also links coordinate transformation and multisensory integration to other cortical operations, especially in early sensory areas, that have been modeled as density estimators.
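As a rough illustration of the contrastive-divergence rule mentioned above, the following minimal sketch performs one-step contrastive divergence (CD-1) on a tiny restricted Boltzmann machine. It assumes binary (Bernoulli) units, a toy four-unit input, and an arbitrary learning rate; the unit types, population sizes, and training details of the actual model are not given in this summary, so every name and number here (`cd1_update`, `data`, `lr`) is hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v, W, b_h):
    """P(h_j = 1 | v) for each hidden (here: 'multisensory') unit."""
    return [sigmoid(b_h[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(b_h))]


def visible_probs(h, W, b_v):
    """P(v_i = 1 | h) for each visible (here: 'unisensory') unit."""
    return [sigmoid(b_v[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b_v))]


def sample(probs, rng):
    """Draw binary states from independent Bernoulli probabilities."""
    return [1.0 if rng.random() < p else 0.0 for p in probs]


def cd1_update(v0, W, b_v, b_h, lr, rng):
    """One CD-1 step: data-driven statistics minus one-step reconstruction statistics."""
    ph0 = hidden_probs(v0, W, b_h)                # positive phase
    h0 = sample(ph0, rng)
    v1 = sample(visible_probs(h0, W, b_v), rng)   # one Gibbs step back down
    ph1 = hidden_probs(v1, W, b_h)                # negative phase
    for i in range(len(b_v)):
        for j in range(len(b_h)):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        b_v[i] += lr * (v0[i] - v1[i])
    for j in range(len(b_h)):
        b_h[j] += lr * (ph0[j] - ph1[j])


# Toy data (assumed): two binary input patterns standing in for population activity.
rng = random.Random(0)
data = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
W = [[rng.gauss(0.0, 0.01) for _ in range(2)] for _ in range(4)]
b_v = [0.0] * 4
b_h = [0.0] * 2
for _ in range(200):
    for v in data:
        cd1_update(v, W, b_v, b_h, lr=0.1, rng=rng)
```

The update uses only quantities local to each connection (pre- and post-synaptic activity in the two phases), which is the sense in which such a rule is often described as biologically plausible.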
According to a prominent view of sensorimotor processing in primates, selection and specification of possible actions are not sequential operations. Rather, a decision for an action emerges from competition between different movement plans, which are specified and selected in parallel. For action choices which are based on ambiguous sensory input, the frontoparietal sensorimotor areas are considered part of the common underlying neural substrate for selection and specification of action. These areas have been shown capable of encoding alternative spatial motor goals in parallel during movement planning, and show signatures of competitive value-based selection among these goals. Since the same network is also involved in learning sensorimotor associations, competitive action selection (decision making) should not only be driven by the sensory evidence and expected reward in favor of either action, but also by the subject's learning history of different sensorimotor associations. Previous computational models of competitive neural decision making used predefined associations between sensory input and corresponding motor output. Such hard-wiring does not allow modeling of how decisions are influenced by sensorimotor learning or by changing reward contingencies. We present a dynamic neural field model which learns arbitrary sensorimotor associations with a reward-driven Hebbian learning algorithm. We show that the model accurately simulates the dynamics of action selection with different reward contingencies, as observed in monkey cortical recordings, and that it correctly predicted the pattern of choice errors in a control experiment. With our adaptive model we demonstrate how network plasticity, which is required for association learning and adaptation to new reward contingencies, can influence choice behavior. 
The field model provides an integrated and dynamic account for the operations of sensorimotor integration, working memory and action selection required for decision making in ambiguous choice situations.
Decision making requires the selection between alternative actions. It has been suggested that action selection is not separate from motor preparation of the corresponding actions, but rather that the selection emerges from the competition between different movement plans. We expand on this idea, and ask how action selection mechanisms interact with the learning of new action choices. We present a neurodynamic model that provides an integrated account of action selection and the learning of sensorimotor associations. The model explains recent electrophysiological findings from monkeys' sensorimotor cortex, and correctly predicted a newly described characteristic pattern of their choice errors. Based on the model, we present a theory of how geometrical sensorimotor mapping rules can be learned by association without the need for an explicit representation of the transformation rule, and how the learning history of these associations can have a direct influence on later decision making.
Human psychophysical studies have described multisensory perceptual benefits such as enhanced detection rates and faster reaction times in great detail. However, the neural circuits and mechanisms underlying multisensory integration remain difficult to study in the primate brain. While rodents offer the advantage of a range of experimental methodologies to study the neural basis of multisensory processing, rodent studies are still limited due to the small number of available multisensory protocols. We here demonstrate the feasibility of an audio-visual stimulus detection task for rats, in which the animals detect lateralized uni- and multi-sensory stimuli in a two-response forced choice paradigm. We show that animals reliably learn and perform this task. Reaction times were significantly faster and behavioral performance levels higher in multisensory compared to unisensory conditions. This benefit was strongest for dim visual targets, in agreement with classical patterns of multisensory integration, and was specific to task-informative sounds, while uninformative sounds speeded reaction times with little cost for detection performance. Importantly, multisensory benefits for stimulus detection and reaction times appeared at different levels of task proficiency and training experience, suggesting distinct mechanisms inducing these two multisensory benefits. Our results demonstrate behavioral multisensory enhancement in rats in analogy to behavioral patterns known from other species, such as humans. In addition, our paradigm enriches the set of behavioral tasks on which future studies can rely, for example to combine behavioral measurements with imaging or pharmacological studies in the behaving animal or to study changes of integration properties in disease models.
To control targeted movements, such as reaching to grasp an object or hammering a nail, the brain can use diverse sources of sensory information, such as vision and proprioception. Although a variety of studies have shown that sensory signals are optimally combined according to principles of maximum likelihood, increasing evidence indicates that the CNS does not compute a single, optimal estimation of the target's position to be compared with a single optimal estimation of the hand. Rather, it employs a more modular approach in which the overall behavior is built by computing multiple concurrent comparisons carried out simultaneously in a number of different reference frames. The results of these individual comparisons are then optimally combined in order to drive the hand. In this article we examine at a computational level two formulations of concurrent models for sensory integration and compare these to the more conventional model of converging multi-sensory signals. Through a review of published studies, both our own and those performed by others, we produce evidence favoring the concurrent formulations. We then examine in detail the effects of additive signal noise as information flows through the sensorimotor system. By taking into account the noise added by sensorimotor transformations, one can explain why the CNS may shift its reliance on one sensory modality toward a greater reliance on another and investigate under what conditions those sensory transformations occur. Careful consideration of how transformed signals will co-vary with the original source also provides insight into how the CNS chooses one sensory modality over another. These concepts can be used to explain why the CNS might, for instance, create a visual representation of a task that is otherwise limited to the kinesthetic domain (e.g., pointing with one hand to a finger on the other) and why the CNS might choose to recode sensory information in an external reference frame.
sensory integration; motor control; maximum likelihood; reference frames
Many perceptual cue combination studies have shown that humans can integrate sensory information across modalities as well as within a modality in a manner that is close to optimal. While the limits of sensory cue integration have been extensively studied in the context of perceptual decision tasks, the evidence obtained in the context of motor decisions provides a less consistent picture. Here, we studied the combination of visual and haptic information in the context of human arm movement control. We implemented a pointing task in which human subjects pointed at an invisible unknown target position whose vertical position varied randomly across trials. In each trial, we presented a haptic and a visual cue that provided noisy information about the target position half-way through the reach. We measured pointing accuracy as a function of haptic and visual cue onset and compared pointing performance to the predictions of a multisensory decision model. Our model accounts for pointing performance by computing the maximum a posteriori estimate, assuming minimum variance combination of uncertain sensory cues. Synchronicity of cue onset has previously been demonstrated to facilitate the integration of sensory information. We tested this in trials in which visual and haptic information was presented with temporal disparity. We found that for our sensorimotor task temporal disparity between visual and haptic cue had no effect. Sensorimotor learning appears to use all available information and to apply the same near-optimal rules for cue combination that are used by perception.
Multisensory integration; Hand movement control; Motor learning; Cue integration; Vision; Haptics
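The minimum-variance combination of cues referred to in the abstract above can be sketched as follows. This is a textbook illustration, not the authors' model: it assumes two independent Gaussian cues and a flat prior (so the maximum a posteriori and maximum-likelihood estimates coincide), and the function name and numerical values are hypothetical.

```python
def combine_cues(x_vis, var_vis, x_hap, var_hap):
    """Inverse-variance-weighted combination of two independent Gaussian cues.

    Each cue is weighted by its reliability (1 / variance). The combined
    variance is never larger than that of the more reliable single cue.
    """
    reliability_vis = 1.0 / var_vis
    reliability_hap = 1.0 / var_hap
    total = reliability_vis + reliability_hap
    w_vis = reliability_vis / total
    w_hap = reliability_hap / total
    estimate = w_vis * x_vis + w_hap * x_hap
    variance = 1.0 / total
    return estimate, variance


# Hypothetical example: the visual cue is three times more reliable than the haptic cue.
est, var = combine_cues(x_vis=2.0, var_vis=1.0, x_hap=4.0, var_hap=3.0)
```

With these assumed values the combined estimate lies closer to the visual cue, and its variance is below that of either cue alone.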
In this paper, we present two neural network models – devoted to two specific and widely investigated aspects of multisensory integration – in order to demonstrate the potential of computational models to provide insight into the neural mechanisms underlying organization, development, and plasticity of multisensory integration in the brain. The first model considers visual–auditory interaction in a midbrain structure named superior colliculus (SC). The model is able to reproduce and explain the main physiological features of multisensory integration in SC neurons and to describe how SC integrative capability – not present at birth – develops gradually during postnatal life depending on sensory experience with cross-modal stimuli. The second model tackles the problem of how tactile stimuli on a body part and visual (or auditory) stimuli close to the same body part are integrated in multimodal parietal neurons to form the perception of peripersonal (i.e., near) space. The model investigates how the extension of peripersonal space – where multimodal integration occurs – may be modified by experience such as use of a tool to interact with the far space. The utility of the modeling approach relies on several aspects: (i) The two models, although devoted to different problems and simulating different brain regions, share some common mechanisms (lateral inhibition and excitation, non-linear neuron characteristics, recurrent connections, competition, Hebbian rules of potentiation and depression) that may govern more generally the fusion of senses in the brain, and the learning and plasticity of multisensory integration. (ii) The models may help interpretation of behavioral and psychophysical responses in terms of neural activity and synaptic connections. (iii) The models can make testable predictions that can help guide future experiments in order to validate, reject, or modify the main assumptions.
neural network modeling; multimodal neurons; superior colliculus; peripersonal space; neural mechanisms; learning and plasticity; behavior
When a perturbation is applied in a sensorimotor transformation task, subjects can adapt and maintain performance by either relying on sensory feedback, or, in the absence of such feedback, on information provided by rewards. For example, in a classical rotation task where movement endpoints must be rotated to reach a fixed target, human subjects can successfully adapt their reaching movements solely on the basis of binary rewards, although this proves much more difficult than with visual feedback. Here, we investigate such a reward-driven sensorimotor adaptation process in a minimal computational model of the task. The key assumption of the model is that synaptic plasticity is gated by the reward. We study how the learning dynamics depend on the target size, the movement variability, the rotation angle and the number of targets. We show that when the movement is perturbed for multiple targets, the adaptation process for the different targets can interfere destructively or constructively depending on the similarities between the sensory stimuli (the targets) and the overlap in their neuronal representations. Destructive interferences can result in a drastic slowdown of the adaptation. As a result of interference, the time to adapt varies non-linearly with the number of targets. Our analysis shows that these interferences are weaker if the reward varies smoothly with the subject's performance instead of being binary. We demonstrate how shaping the reward or shaping the task can accelerate the adaptation dramatically by reducing the destructive interferences. We argue that experimentally investigating the dynamics of reward-driven sensorimotor adaptation for more than one sensory stimulus can shed light on the underlying learning rules.
The brain has a robust ability to adapt to external perturbations imposed on acquired sensorimotor transformations. Here, we used a mathematical model to investigate the reward-based component in sensorimotor adaptations. We show that the shape of the delivered reward signal, which in experiments is usually binary to indicate success or failure, affects the adaptation dynamics. We demonstrate how the ability to adapt to perturbations by relying solely on binary rewards depends on motor variability, size of perturbation and the threshold for delivering the reward. When adapting motor responses to multiple sensory stimuli simultaneously, on-line interferences between the motor performance in response to the different stimuli occur as a result of the overlap in the neural representation of the sensory stimuli, as well as the physical distance between them. Adaptation may be extremely slow when perturbations are induced to a few stimuli that are physically different from each other because of destructive interferences. When intermediate stimuli are introduced, the physical distance between neighbor stimuli is reduced, and constructive interferences can emerge, resulting in faster adaptation. Remarkably, adaptation to a widespread sensorimotor perturbation is accelerated by increasing the number of sensory stimuli during training, i.e. learning is faster if one learns more.
Optical and electrophysiological tools were used to map out the neural circuits within and between cortical layers in three different brain regions, and the results suggest regional specializations for sensory versus motor information processing.
Rodents move their whiskers to locate and identify objects. Cortical areas involved in vibrissal somatosensation and sensorimotor integration include the vibrissal area of the primary motor cortex (vM1), primary somatosensory cortex (vS1; barrel cortex), and secondary somatosensory cortex (S2). We mapped local excitatory pathways in each area across all cortical layers using glutamate uncaging and laser scanning photostimulation. We analyzed these maps to derive laminar connectivity matrices describing the average strengths of pathways between individual neurons in different layers and between entire cortical layers. In vM1, the strongest projection was L2/3→L5. In vS1, strong projections were L2/3→L5 and L4→L3. L6 input and output were weak in both areas. In S2, L2/3→L5 exceeded the strength of the ascending L4→L3 projection, and local input to L6 was prominent. The most conserved pathways were L2/3→L5, and the most variable were L4→L2/3 and pathways involving L6. Local excitatory circuits in different cortical areas are organized around a prominent descending pathway from L2/3→L5, suggesting that sensory cortices are elaborations on a basic motor cortex-like plan.
The neocortex of the mammalian brain is divided into different regions that serve specific functions. These include sensory areas for vision, hearing, and touch, and motor areas for directing aspects of movement. However, the similarities and differences in local circuit organization between these areas are not well understood. The cortex is a layered structure numbered in an outside-in fashion, such that layer 1 is closest to the cortical surface and layer 6 is deepest. Each layer harbors distinct cell types. The precise circuit wiring within and between these layers allows for specific functions performed by particular cortical regions. To directly compare circuits from distinct cortical areas, we combined optical and electrophysiological tools to map connections between layers in different brain regions. We examined three regions of mouse neocortex that are involved in active whisker sensation: vibrissal motor cortex (vM1), primary somatosensory cortex (vS1), and secondary somatosensory cortex (S2). Our results demonstrate that excitatory connections from layer 2/3 to layer 5 are prominent in all three regions. In contrast, strong ascending pathways from middle layers (layer 4) to superficial ones (layer 3) and local inputs to layer 6 were prominent only in the two sensory cortical areas. These results indicate that cortical circuits employ regional specializations when processing motor versus sensory information. Moreover, our data suggest that sensory cortices are elaborations on a basic motor cortical plan involving layer 2/3 to layer 5 pathways.
Experimental manipulations of sensory feedback during complex behavior have provided valuable insights into the computations underlying motor control and sensorimotor plasticity [1]. Consistent sensory perturbations result in compensatory changes in motor output, reflecting changes in feedforward motor control that reduce the experienced feedback error. By quantifying how different sensory feedback errors affect human behavior, prior studies have explored how visual signals are used to recalibrate arm movements [2,3] and auditory feedback is used to modify speech production [4-7]. The strength of this approach rests on the ability to mimic naturalistic errors in behavior, allowing the experimenter to observe how experienced errors in production are used to recalibrate motor output.
Songbirds provide an excellent animal model for investigating the neural basis of sensorimotor control and plasticity [8,9]. The songbird brain provides a well-defined circuit in which the areas necessary for song learning are spatially separated from those required for song production, and neural recording and lesion studies have made significant advances in understanding how different brain areas contribute to vocal behavior [9-12]. However, the lack of a naturalistic error-correction paradigm - in which a known acoustic parameter is perturbed by the experimenter and then corrected by the songbird - has made it difficult to understand the computations underlying vocal learning or how different elements of the neural circuit contribute to the correction of vocal errors [13].
The technique described here gives the experimenter precise control over auditory feedback errors in singing birds, allowing the introduction of arbitrary sensory errors that can be used to drive vocal learning. Online sound-processing equipment is used to introduce a known perturbation to the acoustics of song, and a miniaturized headphones apparatus is used to replace a songbird's natural auditory feedback with the perturbed signal in real time. We have used this paradigm to perturb the fundamental frequency (pitch) of auditory feedback in adult songbirds, providing the first demonstration that adult birds maintain vocal performance using error correction [14]. The present protocol can be used to implement a wide range of sensory feedback perturbations (including but not limited to pitch shifts) to investigate the computational and neurophysiological basis of vocal learning.
Neuroscience; Issue 69; Anatomy; Physiology; Zoology; Behavior; Songbird; psychophysics; auditory feedback; biology; sensorimotor learning
Classical analytical approaches for examining multisensory processing in individual neurons have relied heavily on changes in mean firing rate to assess the presence and magnitude of multisensory interaction. However, neurophysiological studies within individual sensory systems have illustrated that important sensory and perceptual information is encoded in forms that go beyond these traditional spike-based measures. Here we review analytical tools as they are used within individual sensory systems (auditory, somatosensory, and visual) to advance our understanding of how sensory cues are effectively integrated across modalities (e.g., audiovisual cues facilitating speech processing). Specifically, we discuss how methods used to assess response variability (Fano factor, or FF), local field potentials (LFPs), current source density (CSD), oscillatory coherence, spike synchrony, and receiver operating characteristics (ROC) represent particularly promising tools for understanding the neural encoding of multisensory stimulus features. The utility of each approach and how it might optimally be applied toward understanding multisensory processing is placed within the context of exciting new data that is just beginning to be generated. Finally, we address how underlying encoding mechanisms might shape, and be tested alongside, the known behavioral and perceptual benefits that accompany multisensory processing.
electrophysiology; multisensory; oscillations; receiver operating characteristics; spike synchrony
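Of the measures listed in the review above, the Fano factor is the simplest to state: the variance of the spike count across repeated trials divided by its mean. The sketch below is a generic illustration under assumed inputs (one spike count per trial in a fixed counting window); the function name and the example counts are hypothetical and not taken from the review.

```python
from statistics import mean, variance


def fano_factor(spike_counts):
    """Variance-to-mean ratio of spike counts across repeated trials.

    Uses the sample variance (n - 1 denominator). A Poisson process gives a
    value near 1; values below 1 indicate more regular firing.
    """
    if len(spike_counts) < 2:
        raise ValueError("need at least two trials to estimate variance")
    m = mean(spike_counts)
    if m == 0:
        raise ValueError("mean spike count is zero; Fano factor is undefined")
    return variance(spike_counts) / m


# Hypothetical spike counts from five repeated presentations of one stimulus.
counts = [4, 6, 5, 7, 3]
ff = fano_factor(counts)
```

For these assumed counts the mean is 5 and the sample variance is 2.5, so the ratio is 0.5.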
Trial-by-trial covariations between neural activity and perceptual decisions (quantified by choice probability, CP) have been used to probe the contribution of sensory neurons to perceptual decisions. CPs are thought to be determined by both selective decoding of neural activity and by the structure of correlated noise among neurons, but the respective roles of these factors in creating CPs have been controversial. We used biologically-constrained simulations to explore this issue, taking advantage of a peculiar pattern of CPs exhibited by multisensory neurons in area MSTd that represent self-motion. Although models that relied on correlated noise or selective decoding could both account for the peculiar pattern of CPs, predictions of the selective decoding model were substantially more consistent with various features of the neural and behavioral data. While correlated noise is essential to observe CPs, our findings suggest that selective decoding of neuronal signals also plays important roles.
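Choice probability is conventionally computed as the area under an ROC curve comparing a neuron's responses on trials ending in one choice with its responses on trials ending in the other, for a fixed stimulus. The sketch below uses the equivalent pairwise-comparison form of that area; it is a generic illustration with hypothetical firing rates, not the simulation procedure used in the study.

```python
def choice_probability(rates_pref_choice, rates_null_choice):
    """Area under the ROC curve for two sets of single-trial responses.

    Equals the probability that a randomly drawn response from a
    preferred-choice trial exceeds one from a null-choice trial, counting
    ties as one half. A value of 0.5 means no choice-related signal.
    """
    if not rates_pref_choice or not rates_null_choice:
        raise ValueError("both choice groups must contain at least one trial")
    score = 0.0
    for a in rates_pref_choice:
        for b in rates_null_choice:
            if a > b:
                score += 1.0
            elif a == b:
                score += 0.5
    return score / (len(rates_pref_choice) * len(rates_null_choice))


# Hypothetical firing rates (spikes/s) for an identical, ambiguous stimulus.
pref = [12, 15, 11, 14]   # trials on which the animal chose the neuron's preferred direction
null = [10, 13, 9, 12]    # trials on which it chose the opposite direction
cp = choice_probability(pref, null)
```

Responses are grouped by the animal's choice rather than by the stimulus, which is what distinguishes this quantity from an ordinary neurometric ROC analysis.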
Even the simplest tasks require the brain to process vast amounts of information. To take a step forward, for example, the brain must process information about the orientation of the animal's body and what the animal is seeing, hearing and feeling in order to determine whether any obstacles stand in the way. The brain must integrate all this information to make decisions about how to proceed. And once a decision is made, the brain must send signals via the nervous system to the muscles to physically move the foot forward.
Specialized brain cells called sensory neurons help to process this sensory information. For example, visual neurons process information about what the animal sees, while auditory neurons process information about what it hears. Other sensory neurons—called multisensory neurons—can process information coming from more than one of an animal's senses.
For more than two decades, researchers have known that the firing of an individual sensory neuron can be linked to the decision that an animal makes about the meaning of the sensory information it has received. The ability to predict whether an animal will make a given decision based on the firing of individual sensory neurons is often referred to as a ‘choice probability’. Measurements of single neurons have often been used to try to work out how the brain decodes the sensory information that is needed to carry out a specific task. However, it remains unclear whether choice probabilities really reflect how sensory information is decoded in the brain, or whether these measurements are just reflecting coordinated patterns of background ‘noise’ among the neurons as the decisions are being made.
Gu et al. set out to help resolve this debate by examining choice probabilities in the multisensory neurons in one area of the brain. A series of experiments was conducted to see how these neurons process information, both from the eyes and the part of the inner ear that helps control balance, to work out the direction in which an animal was moving. By performing computer simulations of the activity of groups of neurons, Gu et al. found that choice probability measurements are better explained by models in which these measurements reflect the strategy that is used to decode the sensory information. Models based solely on patterns of correlated noise did not explain the data as well, though Gu et al. suggest that this noise is likely to also contribute to the observed effects.
Following on from the work of Gu et al., a major challenge will be to see if it is possible to infer how the brain extracts the relevant information from the different sensory neurons. This may require recordings from large groups of neurons, but it might help us to decipher how patterns of activity in the brain lead to decisions about the world around us.
macaque; neural coding; decision; computational; other
Stimuli from different sensory modalities occurring on or close to the body are integrated in a multisensory representation of the space surrounding the body, i.e., peripersonal space (PPS). PPS is dynamically modified by experience, e.g., it extends after using a tool to reach far objects. However, the neural mechanism underlying PPS plasticity after tool use is largely unknown. Here we use a combined computational-behavioral approach to propose and test a possible mechanism accounting for PPS extension. We first present a neural network model simulating audio-tactile representation in the PPS around one hand. Simulation experiments showed that our model reproduced the main property of PPS neurons, i.e., selective multisensory response for stimuli occurring close to the hand. We used the neural network model to simulate the effects of tool-use training. In terms of sensory inputs, tool use was conceptualized as a concurrent tactile stimulation from the hand, due to holding the tool, and an auditory stimulation from the far space, due to tool-mediated action. Results showed that after exposure to those inputs, PPS neurons responded also to multisensory stimuli far from the hand. The model thus suggests that synchronous pairing of tactile hand stimulation and auditory stimulation from the far space is sufficient to extend PPS, such as after tool-use. This prediction was confirmed by a behavioral experiment, where we used an audio-tactile interaction paradigm to measure the boundaries of PPS representation. We found that PPS extended after synchronous tactile-hand stimulation and auditory-far stimulation in a group of healthy volunteers. Control experiments both in simulation and behavioral settings showed that the same amount of tactile and auditory inputs administered out of synchrony did not change PPS representation.
We conclude by proposing a simple, biologically plausible model to explain plasticity in PPS representation after tool-use, which is supported by computational and behavioral data.
peripersonal space; tool-use; neural network model; multisensory processing; plasticity
Determining when, if, and how information from separate sensory channels has been combined is a fundamental goal of research on multisensory processing in the brain. This can be a particular challenge in psychophysical data, as there is no direct recording of neural output. The most common way to characterize multisensory interactions in behavioral data is to compare responses to multisensory stimulation with the race model, a model of parallel, independent processing constructed from the probability of responses to the two unisensory stimuli which make up the multisensory stimulus. If observed multisensory reaction times are faster than those predicted by the model, it is inferred that information from the two channels is being combined rather than processed independently. Recently, behavioral research has been published employing capacity analyses where comparisons between two conditions are carried out at the level of the integrated hazard function. Capacity analyses seem to be a particularly appealing technique for evaluating multisensory functioning, as they describe relationships between conditions across the entire distribution curve and are relatively easy and intuitive to interpret. The current paper presents a capacity analysis of a behavioral data set previously analyzed using the race model. While applications of capacity analyses are still somewhat limited due to their novelty, it is hoped that this exploration of capacity and race model analyses will encourage the use of this promising new technique both in multisensory research and other applicable fields.
capacity; hazard analysis; human aging; multisensory integration; psychophysics; race model
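The two analyses contrasted in the abstract above can be sketched on empirical reaction-time distributions. The race-model prediction below assumes independent channels, F_race(t) = F_A(t) + F_V(t) - F_A(t)F_V(t), and the capacity measure assumes the commonly used ratio of integrated hazards, C(t) = H_AV(t) / (H_A(t) + H_V(t)) with H(t) = -ln(1 - F(t)). Whether the paper used exactly these forms is an assumption, and all function names and reaction times are hypothetical.

```python
import math


def ecdf(rts, t):
    """Empirical probability that a response has occurred by time t."""
    return sum(1 for rt in rts if rt <= t) / len(rts)


def race_prediction(rts_a, rts_v, t):
    """Independent race model: probability that either channel has finished by t."""
    fa, fv = ecdf(rts_a, t), ecdf(rts_v, t)
    return fa + fv - fa * fv


def integrated_hazard(rts, t):
    """H(t) = -ln S(t), where S(t) = 1 - F(t) is the survivor function."""
    survivor = 1.0 - ecdf(rts, t)
    if survivor <= 0.0:
        raise ValueError("survivor function is zero at t; choose an earlier t")
    return -math.log(survivor)


def capacity(rts_av, rts_a, rts_v, t):
    """Ratio of multisensory to summed unisensory integrated hazards at time t."""
    denom = integrated_hazard(rts_a, t) + integrated_hazard(rts_v, t)
    if denom == 0.0:
        raise ValueError("no unisensory responses by t; choose a later t")
    return integrated_hazard(rts_av, t) / denom


# Hypothetical reaction times in milliseconds.
rts_a = [300, 340, 380, 420]    # auditory alone
rts_v = [320, 360, 400, 440]    # visual alone
rts_av = [260, 290, 330, 390]   # audiovisual

observed = ecdf(rts_av, 350)
predicted = race_prediction(rts_a, rts_v, 350)
c_350 = capacity(rts_av, rts_a, rts_v, 350)
```

Under these assumptions a capacity value of 1 corresponds to the independent race prediction, and values above 1 indicate faster multisensory responding than that model allows. In practice both quantities would be evaluated across a range of t rather than at a single time point.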
The sensory signals that drive movement planning arrive in a variety of “reference frames”, so integrating or comparing them requires sensory transformations. We propose a model where the statistical properties of sensory signals and their transformations determine how these signals are used. This model captures the patterns of gaze-dependent errors found in our human psychophysics experiment when the sensory signals available for reach planning are varied. These results challenge two widely held ideas: error patterns directly reflect the reference frame of the underlying neural representation, and it is preferable to use a single common reference frame for movement planning. We show that gaze-dependent error patterns, often cited as evidence for retinotopic reach planning, can be explained by a transformation bias and are not exclusively linked to retinotopic representations. Further, the presence of multiple reference frames allows for optimal use of available sensory information and explains task-dependent reweighting of sensory signals.
Compelling behavioral evidence suggests that humans can make optimal decisions despite the uncertainty inherent in perceptual or motor tasks. A key question in neuroscience is how populations of spiking neurons can implement such probabilistic computations. In this article, we develop a comprehensive framework for optimal, spike-based sensory integration and working memory in a dynamic environment. We propose that probability distributions are inferred spike-per-spike in recurrently connected networks of integrate-and-fire neurons. As a result, these networks can combine sensory cues optimally, track the state of a time-varying stimulus and memorize accumulated evidence over periods much longer than the time constant of single neurons. Importantly, we propose that population responses and persistent working memory states represent entire probability distributions and not only single stimulus values. These memories are reflected by sustained, asynchronous patterns of activity which make relevant information available to downstream neurons within their short time window of integration. Model neurons act as predictive encoders, only firing spikes which account for new information that has not yet been signaled. Thus, spike times signal deterministically a prediction error, contrary to rate codes in which spike times are considered to be random samples of an underlying firing rate. As a consequence of this coding scheme, a multitude of spike patterns can reliably encode the same information. This results in weakly correlated, Poisson-like spike trains that are sensitive to initial conditions but robust to even high levels of external neural noise. This spike train variability reproduces the one observed in cortical sensory spike trains, but cannot be equated to noise. On the contrary, it is a consequence of optimal spike-based inference. In contrast, we show that rate-based models perform poorly when implemented with stochastically spiking neurons.
Most of our daily actions are subject to uncertainty. Behavioral studies have confirmed that humans handle this uncertainty in a statistically optimal manner. A key question then is what neural mechanisms underlie this optimality, i.e. how can neurons represent and compute with probability distributions. Previous approaches have proposed that probabilities are encoded in the firing rates of neural populations. However, such rate codes appear poorly suited to understand perception in a constantly changing environment. In particular, it is unclear how probabilistic computations could be implemented by biologically plausible spiking neurons. Here, we propose a network of spiking neurons that can optimally combine uncertain information from different sensory modalities and keep this information available for a long time. This implies that neural memories not only represent the most likely value of a stimulus but rather a whole probability distribution over it. Furthermore, our model suggests that each spike conveys new, essential information. Consequently, the observed variability of neural responses cannot simply be understood as noise but rather as a necessary consequence of optimal sensory integration. Our results therefore question strongly held beliefs about the nature of neural “signal” and “noise”.
Zebrafish larvae show characteristic prey capture behavior in response to small moving objects. The neural mechanism used to recognize objects as prey remains largely unknown. We devised a machine learning behavior classification system to quantify hunting kinematics in semi-restrained animals exposed to a range of virtual stimuli. Two-photon calcium imaging revealed a small visual area, AF7, that was activated specifically by the optimal prey stimulus. This pretectal region is innervated by two types of retinal ganglion cells, which also send collaterals to the optic tectum. Laser ablation of AF7 markedly reduced prey capture behavior. We identified neurons with arbors in AF7 and found that they projected to multiple sensory and premotor areas: the optic tectum, the nucleus of the medial longitudinal fasciculus (nMLF) and the hindbrain. These findings indicate that computations in the retina give rise to a visual stream which transforms sensory information into a directed prey capture response.
Our ability to recognize objects, and to respond instinctively to them, is something that is not fully understood. For example, seeing your favorite dessert could trigger an irresistible urge to eat it. Yet precisely how the image of the dessert could trigger an inner desire to indulge is a question that has so far eluded scientists. This compelling question also applies to the animal kingdom. Predators often demonstrate a typical hunting behavior upon seeing their prey from a distance. But just how the image of the prey triggers this hunting behavior is not known.
Semmelhack et al. have now investigated this question by looking at the hunting behavior of zebrafish larvae. The larvae's prey is a tiny microbe that resembles a small moving dot. When the larvae encounter something that looks like their prey, they demonstrate a hardwired hunting response towards it. The hunting behavior consists of a series of swimming maneuvers to help the larvae successfully capture their prey.
Semmelhack et al. used prey decoys to lure the zebrafish larvae, and video recordings to monitor the larvae's response. During the recordings, the larvae were embedded in a bed of jelly with only their tails free to move. The larvae's tail movements were recorded, and because the larvae are completely transparent, their brain activity could be visually monitored at the same time using calcium dyes.
Using this approach, Semmelhack et al. identified a specific area of the brain that is responsible for triggering the larvae's hunting behavior. It turns out that this brain region forms a circuit that directly connects the retina at the back of the eye to nerve centers that control hunting maneuvers. So when the larva sees its prey, this circuit could directly trigger the larva's hunting behavior. When the circuit was specifically destroyed with a laser, this instinctive hunting response was impaired.
These findings suggest that predators have a distinct brain circuit that hardwires their hunting response to images of their prey. Future studies would involve understanding precisely how this circuit coordinates the larvae's complex hunting behavior.
visual system; behavior; retinal ganglion cells; pretectum; optic tectum; Zebrafish
Sensorimotor learning has been shown to depend on both prior expectations and sensory evidence in a way that is consistent with Bayesian integration. Thus, prior beliefs play a key role during the learning process, especially when only ambiguous sensory information is available. Here we develop a novel technique to estimate the covariance structure of the prior over visuomotor transformations (the mapping between actual and visual location of the hand) during a learning task. Subjects performed reaching movements under multiple visuomotor transformations in which they received visual feedback of their hand position only at the end of the movement. After experiencing a particular transformation for one reach, subjects have insufficient information to determine the exact transformation, and so their second reach reflects a combination of their prior over visuomotor transformations and the sensory evidence from the first reach. We developed a Bayesian observer model in order to infer the covariance structure of the subjects' prior, which was found to give high probability to parameter settings consistent with visuomotor rotations. Therefore, although the set of visuomotor transformations experienced had little structure, the subjects had a strong tendency to interpret ambiguous sensory evidence as arising from rotation-like transformations. We then exposed the same subjects to a highly structured set of visuomotor transformations, designed to be very different from the set of visuomotor rotations. During this exposure the prior was found to have changed significantly, acquiring a covariance structure that no longer favored rotation-like transformations. In summary, we have developed a technique that can estimate the full covariance structure of a prior in a sensorimotor task and have shown that the prior over visuomotor transformations favors a rotation-like structure. Moreover, through experience of a novel task structure, participants can appropriately alter the covariance structure of their prior.
When learning a new skill, such as riding a bicycle, we can adjust the commands we send to our muscles based on two sources of information. First, we can use sensory inputs to inform us how the bike is behaving. Second, we can use prior knowledge about the properties of bikes and how they behave in general. This prior knowledge is represented as a probability distribution over the properties of bikes. These two sources of information can then be combined by a process known as Bayes rule to identify optimally the properties of a particular bike. Here, we develop a novel technique to identify the probability distribution of a prior in a visuomotor learning task in which the visual location of the hand is transformed from the actual hand location, similar to when using a computer mouse. We show that subjects have a prior that tends to interpret ambiguous information about the task as arising from a visuomotor rotation but that experience of a particular set of visuomotor transformations can alter the prior.
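The role of the prior's covariance when one reach leaves the transformation ambiguous can be illustrated with a small sketch. It rests on assumptions that are not taken from the study: the transformation is a 2×2 matrix T with parameters (T11, T12, T21, T22), the visual endpoint is v = T h plus Gaussian noise, and the prior over the parameters is Gaussian, so the posterior mean follows from the standard linear-Gaussian update. All names and numbers are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(row) for row in zip(*a)]

def inv2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def posterior_mean(prior_mean, prior_cov, hand, visual, obs_var):
    """Posterior mean of (T11, T12, T21, T22) after one observed reach."""
    h1, h2 = hand
    # Observation matrix: visual = H @ params, with params the row-major entries of T.
    H = [[h1, h2, 0.0, 0.0], [0.0, 0.0, h1, h2]]
    Ht = transpose(H)
    S = matmul(matmul(H, prior_cov), Ht)        # predicted observation covariance
    S[0][0] += obs_var
    S[1][1] += obs_var
    K = matmul(matmul(prior_cov, Ht), inv2(S))  # gain, 4x2
    predicted = [sum(row[j] * prior_mean[j] for j in range(4)) for row in H]
    innovation = [[visual[i] - predicted[i]] for i in range(2)]
    correction = matmul(K, innovation)
    return [prior_mean[j] + correction[j][0] for j in range(4)]

identity = [1.0, 0.0, 0.0, 1.0]   # prior mean: no transformation
e = 0.1
# Isotropic prior: the four parameters are uncorrelated.
iso_cov = [[1 + e if i == j else 0.0 for j in range(4)] for i in range(4)]
# Rotation-like prior: T11 covaries with T22, and T12 with -T21.
rot_cov = [[1 + e, 0.0, 0.0, 1.0],
           [0.0, 1 + e, -1.0, 0.0],
           [0.0, -1.0, 1 + e, 0.0],
           [1.0, 0.0, 0.0, 1 + e]]

hand, visual = (1.0, 0.0), (0.0, 1.0)   # one reach; says nothing about column 2 of T
iso_post = posterior_mean(identity, iso_cov, hand, visual, obs_var=0.01)
rot_post = posterior_mean(identity, rot_cov, hand, visual, obs_var=0.01)
print("isotropic prior:    ", [round(x, 3) for x in iso_post])
print("rotation-like prior:", [round(x, 3) for x in rot_post])
```

With the isotropic prior, the unobserved parameters T12 and T22 stay at their prior values. With the rotation-like covariance, the same single reach pulls the estimate toward a 90° rotation. This mirrors, in a simplified forward direction, why the second reach is informative about the prior's covariance structure.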
Our senses interact in daily life through multisensory integration, facilitating perceptual processes and behavioral responses. The neural mechanisms proposed to underlie this multisensory facilitation include anatomical connections directly linking early sensory areas, indirect connections to higher-order multisensory regions, as well as thalamic connections. Here we examine the relationship between white matter connectivity, as assessed with diffusion tensor imaging, and individual differences in multisensory facilitation and provide the first demonstration of a relationship between anatomical connectivity and multisensory processing in typically developed individuals. Using a whole-brain analysis and contrasting anatomical models of multisensory processing we found that increased connectivity between parietal regions and early sensory areas was associated with the facilitation of reaction times to multisensory (auditory-visual) stimuli. Furthermore, building on prior animal work suggesting the involvement of the superior colliculus in this process, using probabilistic tractography we determined that the strongest cortical projection area connected with the superior colliculus includes the region of connectivity implicated in our independent whole-brain analysis.
diffusion tensor; cross-modal; bimodal; redundant target; redundant signals; superior colliculus
Haptic perception is an active process that provides an awareness of objects that are encountered as an organism scans its environment. In contrast to the sensation of touch produced by contact with an object, the perception of object location arises from the interpretation of tactile signals in the context of the changing configuration of the body. A discrete sensory representation and a low number of degrees of freedom in the motor plant make the ethologically prominent rat vibrissa system an ideal model for the study of the neuronal computations that underlie this perception. We found that rats with only a single vibrissa can combine touch and movement to distinguish the location of objects that vary in angle along the sweep of vibrissa motion. The patterns of this motion and of the corresponding behavioral responses show that rats can scan potential locations and decide which location contains a stimulus within 150 ms. This interval is consistent with just one to two whisk cycles and provides constraints on the underlying perceptual computation. Our data argue against strategies that do not require the integration of sensory and motor modalities. The ability to judge angular position with a single vibrissa thus connects previously described, motion-sensitive neurophysiological signals to perception in the behaving animal.
Rats explore the world with their whiskers (vibrissae). Although the sensations of touch that an animal experiences while exploring an object either in front of its head or to its side can be similar, the two sensations tell the animal different things about its nearby environment. The translation from passive touch to knowledge of an object's location requires that the nervous system keep track of the location of the animal's body as it moves. We studied this process by restricting a rat's whisking information to that provided by a single actively moving vibrissa. We found that even with such limited information, rats can search for, locate, and differentiate objects near their heads with astonishing speed. Their behavior during this search reflects the computations performed by their nervous systems to locate objects based on touch, and this behavior demonstrates that rats keep track of their vibrissa motion with a resolution of less than 0.1 s. Understanding how these computations are performed will bring us closer to understanding how the brain integrates the sense of touch with its sense of self.
Rats can localize objects with their specialized vibrissa system by the integration of feed-forward sensory events and motor feedback. This discovery provides a behavioral model for understanding how sensorimotor loops derive the perception of space from the sensation of touch.
For processing and segmenting visual scenes, the brain is required to combine a multitude of features and sensory channels. It is not known whether these complex tasks involve optimal integration of information, nor what objectives the computations might serve. Here, we investigate whether optimal inference can explain contour integration in human subjects. We performed experiments where observers detected contours of curvilinearly aligned edge configurations embedded into randomly oriented distractors. The key feature of our framework is to use a generative process for creating the contours, for which it is possible to derive a class of ideal detection models. This allowed us to compare human detection for contours with different statistical properties to the corresponding ideal detection models for the same stimuli. We then subjected the detection models to realistic constraints and required them to reproduce human decisions for every stimulus as well as possible. By independently varying the four model parameters, we identify a single detection model which quantitatively captures all correlations of human decision behaviour for more than 2000 stimuli from 42 contour ensembles with greatly varying statistical properties. This model reveals specific interactions between edges closely matching independent findings from physiology and psychophysics. These interactions imply a statistics of contours for which edge stimuli are indeed optimally integrated by the visual system, with the objective of inferring the presence of contours in cluttered scenes. The recurrent algorithm of our model makes testable predictions about the temporal dynamics of neuronal populations engaged in contour integration, and it suggests a strong directionality of the underlying functional anatomy.
Since Helmholtz put forward his concept that the brain performs inference on its sensory input for building an internal representation of the outside world, it has been a puzzle for neuroscientific research whether visual perception can indeed be understood from first principles. An important part of vision is the integration of colinearly aligned edge elements into contours, which is required for the detection of object boundaries. We show that this visual function can be fully explained in a probabilistic model with a well-defined statistical objective. For this purpose, we developed a novel method to adapt models to correlations in human behaviour, and applied this technique to tightly link psychophysical experiments and numerical simulations of contour integration. The results not only demonstrate that complex neuronal computations can be elegantly described in terms of constrained probabilistic inference, but also reveal previously unknown neural mechanisms underlying early visual information processing.
The integration of multisensory information is essential to forming meaningful representations of the environment. Adults benefit from related multisensory stimuli but the extent to which the ability to optimally integrate multisensory inputs for functional purposes is present in children has not been extensively examined. Using a cross-sectional approach, high-density electrical mapping of event-related potentials (ERPs) was combined with behavioral measures to characterize neurodevelopmental changes in basic audiovisual (AV) integration from middle childhood through early adulthood. The data indicated a gradual fine-tuning of multisensory facilitation of performance on an AV simple reaction time task (as indexed by race model violation), which reaches mature levels by about 14 years of age. They also revealed a systematic relationship between age and the brain processes underlying multisensory integration (MSI) in the time frame of the auditory N1 ERP component (∼120 ms). A significant positive correlation between behavioral and neurophysiological measures of MSI suggested that the underlying brain processes contributed to the fine-tuning of multisensory facilitation of behavior that was observed over middle childhood. These findings are consistent with protracted plasticity in a dynamic system and provide a starting point from which future studies can begin to examine the developmental course of multisensory processing in clinical populations.
children; cross-modal; development; electrophysiology; ERP; multisensory integration
• A new parietal multisensory area integrates lower body and lower visual field.
• Rearrangement of parietal areas in human and non-human primates is rationalized.
• In vivo myelin mapping outlines some parietal multisensory areas.
• Multisensory parietal areas transform visual maps into non-retinocentric coordinates.
Parietal cortex has long been known to be a site of sensorimotor integration. Recent findings in humans have shown that it is divided up into a number of small areas somewhat specialized for eye movements, reaching, and hand movements, but also face-related movements (avoidance, eating), lower body movements, and movements coordinating multiple body parts. The majority of these areas contain rough sensory (receptotopic) maps, including a substantial multisensory representation of the lower body and lower visual field immediately medial to face VIP. There is strong evidence for retinotopic remapping in LIP and face-centered remapping in VIP, and weaker evidence for hand-centered remapping. The larger size of the functionally distinct inferior parietal default mode network in humans compared to monkeys results in a superior and medial displacement of middle parietal areas (e.g., the saccade-related LIP's). Multisensory superior parietal areas located anterior to the angular gyrus such as AIP and VIP are less medially displaced relative to macaque monkeys, so that human LIP paradoxically ends up medial to human VIP.
A sensory stimulus evokes activity in many neurons, creating a population response that must be “decoded” by the brain to estimate the parameters of that stimulus. Most decoding models have suggested complex neural circuits that compute optimal estimates of sensory parameters on the basis of responses in many sensory neurons. We propose a slightly suboptimal but practically simpler decoder. Decoding neurons integrate their inputs across 100 ms; incoming spikes are weighted by the preferred stimulus of the neuron of origin; and a local, cellular non-linearity approximates divisive normalization without dividing explicitly. The suboptimal decoder includes two simplifying approximations. It uses estimates of firing rate across the population rather than computing the total population response, and it implements divisive normalization with local cellular mechanisms of single neurons rather than more complicated neural circuit mechanisms. When applied to the practical problem of estimating target speed from a realistic simulation of the population response in extrastriate visual area MT, the suboptimal decoder has almost the same accuracy and precision as traditional decoding models. It succeeds in predicting the precision and imprecision of motor behavior using a suboptimal decoding computation because it adds only a small amount of imprecision to the code for target speed in MT, which is itself imprecise.
population decoding; divisive normalization; spike timing; MT; vector averaging
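The vector-averaging idea behind this decoder can be sketched as follows. This is a plain, explicitly dividing vector average, not the authors' decoder: it does not include their 100 ms integration window or the cellular non-linearity that approximates the division. The Gaussian tuning on a log2 speed axis, the tuning width, the peak rate, and the preferred speeds are all illustrative assumptions.

```python
import math

def tuning_response(speed, preferred, width_octaves=1.5, peak_rate=50.0):
    """Assumed noise-free tuning curve: Gaussian in log2(speed)."""
    d = math.log2(speed) - math.log2(preferred)
    return peak_rate * math.exp(-d * d / (2.0 * width_octaves ** 2))

def vector_average(rates, preferred_speeds):
    """Weight each neuron's preferred log-speed by its rate, then normalize."""
    total = sum(rates)
    weighted = sum(r * math.log2(p) for r, p in zip(rates, preferred_speeds))
    return 2.0 ** (weighted / total)

# Hypothetical population: preferred speeds from 0.5 to 128 deg/s in octave steps.
preferred_speeds = [2.0 ** k for k in range(-1, 8)]
true_speed = 8.0
rates = [tuning_response(true_speed, p) for p in preferred_speeds]
estimate = vector_average(rates, preferred_speeds)
print(f"decoded speed: {estimate:.3f} deg/s")
```

Dividing by the summed rate is the normalization step that the abstract says can be approximated by local cellular mechanisms. The estimate is unbiased here only because the population is noise-free and symmetric around the true speed on the log axis.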
Multisensory learning and resulting neural brain plasticity have recently become a topic of renewed interest in human cognitive neuroscience. Music notation reading is an ideal stimulus to study multisensory learning, as it allows studying the integration of visual, auditory and sensorimotor information processing. The present study aimed at answering whether multisensory learning alters uni-sensory structures, interconnections of uni-sensory structures or specific multisensory areas. In a short-term piano training procedure musically naive subjects were trained to play tone sequences from visually presented patterns in a music notation-like system [Auditory-Visual-Somatosensory group (AVS)], while another group received audio-visual training only that involved viewing the patterns and attentively listening to the recordings of the AVS training sessions [Auditory-Visual group (AV)]. Training-related changes in cortical networks were assessed by pre- and post-training magnetoencephalographic (MEG) recordings of an auditory, a visual and an integrated audio-visual mismatch negativity (MMN). The two groups (AVS and AV) were differently affected by the training. The results suggest that multisensory training alters the function of multisensory structures, and not the uni-sensory ones along with their interconnections, and thus provide an answer to an important question presented by cognitive models of multisensory training.
The perception of self-motion direction, or heading, relies on integration of multiple sensory cues, especially from the visual and vestibular systems. However, the reliability of sensory information can vary rapidly and unpredictably, and it remains unclear how the brain integrates multiple sensory signals given this dynamic uncertainty. Human psychophysical studies have shown that observers combine cues by weighting them in proportion to their reliability, consistent with statistically optimal integration schemes derived from Bayesian probability theory. Remarkably, because cue reliability is varied randomly across trials, the perceptual weight assigned to each cue must change from trial to trial. Dynamic cue re-weighting has not been examined for combinations of visual and vestibular cues, nor has the Bayesian cue integration approach been applied to laboratory animals, an important step toward understanding the neural basis of cue integration. To address these issues, we tested human and monkey subjects in a heading discrimination task involving visual (optic flow) and vestibular (translational motion) cues. The cues were placed in conflict on a subset of trials, and their relative reliability was varied to assess the weights that subjects gave to each cue in their heading judgments. We found that monkeys can rapidly re-weight visual and vestibular cues according to their reliability, the first such demonstration in a non-human species. However, some monkeys and humans tended to over-weight vestibular cues, inconsistent with simple predictions of a Bayesian model. Nonetheless, our findings establish a robust model system for studying the neural mechanisms of dynamic cue re-weighting in multisensory perception.
multisensory; psychophysics; spatial orientation; macaque; visual motion; decision
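The reliability-weighted combination rule referred to in the abstract above can be written out directly. This is a minimal sketch of the standard prediction for two independent Gaussian cues, with reliability defined as inverse variance. The function name and the heading values are hypothetical and are not data from the study.

```python
def combine_cues(mu_visual, sigma_visual, mu_vestibular, sigma_vestibular):
    """Optimal combination of two independent Gaussian cues.

    Returns the combined estimate, its standard deviation, and the visual weight.
    """
    r_visual = 1.0 / sigma_visual ** 2          # reliability = inverse variance
    r_vestibular = 1.0 / sigma_vestibular ** 2
    w_visual = r_visual / (r_visual + r_vestibular)
    w_vestibular = 1.0 - w_visual
    mu = w_visual * mu_visual + w_vestibular * mu_vestibular
    sigma = (1.0 / (r_visual + r_vestibular)) ** 0.5
    return mu, sigma, w_visual

# Hypothetical cue-conflict trial: headings in degrees, visual cue more reliable.
mu, sigma, w_visual = combine_cues(mu_visual=4.0, sigma_visual=2.0,
                                   mu_vestibular=-4.0, sigma_vestibular=4.0)
print(f"combined heading {mu:.2f} deg, sd {sigma:.2f} deg, visual weight {w_visual:.2f}")
```

On cue-conflict trials the combined estimate shifts toward the more reliable cue, which is how perceptual weights can be read out from behavior. A subject who over-weights vestibular cues would show a visual weight below this prediction.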