Redundancies and correlations in the responses of sensory neurons may seem to waste neural resources, but they can also carry cues about structured stimuli and may help the brain to correct for response errors. To investigate the effect of stimulus structure on redundancy in the retina, we measured simultaneous responses from populations of retinal ganglion cells presented with natural and artificial stimuli that varied greatly in correlation structure; these stimuli and recordings are publicly available online. In response to spatio-temporally structured stimuli such as natural movies, pairs of ganglion cells were modestly more correlated than in response to white-noise checkerboards, but they were much less correlated than predicted by a non-adapting functional model of retinal response. In response to stimuli with purely spatial correlations, by contrast, pairs of ganglion cells showed increased correlations consistent with a static, non-adapting receptive field and nonlinearity. We found that in response to spatio-temporally correlated stimuli, ganglion cells had faster temporal kernels and tended to have stronger surrounds. These properties of individual cells, along with gain changes that opposed changes in effective contrast at the ganglion cell input, largely explained the pattern of pairwise correlations across stimuli where receptive field measurements were possible.
An influential theory of early sensory processing argues that sensory circuits should conserve scarce resources in their outputs by reducing correlations present in their inputs. Measuring simultaneous responses from large numbers of retinal ganglion cells responding to widely different classes of visual stimuli, we find that output correlations increase when we present stimuli with spatial, but not temporal, correlations. On the other hand, we find evidence that the retina adjusts to spatio-temporal structure so that retinal output correlations change less than input correlations would predict. Changes in the receptive field properties of individual cells, along with gain changes, largely explain this relative constancy of correlations over the population.
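The pairwise correlation measurements described above can be sketched in a few lines. In the toy example below, a shared Poisson drive stands in for common stimulus structure; the numbers are illustrative assumptions, not data from the recordings.

```python
import numpy as np

def pairwise_correlations(counts):
    """Pairwise Pearson correlations of spike counts (cells x time bins),
    returned as the upper-triangle values of the correlation matrix."""
    C = np.corrcoef(counts)
    iu = np.triu_indices_from(C, k=1)
    return C[iu]

rng = np.random.default_rng(4)
shared = rng.poisson(2.0, size=1000)                  # shared stimulus drive
counts = rng.poisson(1.0, size=(5, 1000)) + shared    # five correlated cells
print(pairwise_correlations(counts).mean())           # positive mean correlation
```

A shared drive with variance comparable to the private noise yields substantial positive pairwise correlations, which is the quantity compared across stimulus classes in the study.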
Generalized linear models (GLMs) represent a popular choice for the probabilistic characterization of neural spike responses. While GLMs are attractive for their computational tractability, they also impose strong assumptions and thus only allow for a limited range of stimulus-response relationships to be discovered. Alternative approaches exist that make only very weak assumptions but scale poorly to high-dimensional stimulus spaces. Here we seek an approach which can gracefully interpolate between the two extremes. We extend two frequently used special cases of the GLM—a linear and a quadratic model—by assuming that the spike-triggered and non-spike-triggered distributions can be adequately represented using Gaussian mixtures. Because we derive the model from a generative perspective, its components are easy to interpret as they correspond to, for example, the spike-triggered distribution and the interspike interval distribution. The model is able to capture complex dependencies on high-dimensional stimuli with far fewer parameters than other approaches such as histogram-based methods. The added flexibility comes at the cost of a non-concave log-likelihood. We show that in practice this does not have to be an issue and the mixture-based model is able to outperform generalized linear and quadratic models.
An essential goal of sensory systems neuroscience is to characterize the functional relationship between neural responses and external stimuli. Of particular interest are the nonlinear response properties of single cells. Inherently linear approaches such as generalized linear modeling can nevertheless be used to fit nonlinear behavior by choosing an appropriate feature space for the stimulus. This requires, however, that one has already obtained a good understanding of a cell's nonlinear properties, whereas more flexible approaches are necessary for the characterization of unexpected nonlinear behavior. In this work, we present a generalization of some frequently used generalized linear models which enables us to automatically extract complex stimulus-response relationships from recorded data. We show that our model can lead to substantial quantitative and qualitative improvements over generalized linear and quadratic models, which we illustrate using the example of primary afferents of the rat whisker system.
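The generative idea behind the mixture extension can be sketched in one dimension: if the spike-triggered and non-spike-triggered stimulus distributions are each modeled as Gaussian mixtures, Bayes' rule gives the spike probability for any stimulus. The mixture parameters below are illustrative, not fitted to data.

```python
import numpy as np

def gauss_pdf(x, mu, var):
    # 1-D Gaussian density
    return np.exp(-(x - mu)**2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def mixture_pdf(x, weights, mus, vars_):
    # density of a Gaussian mixture evaluated at x
    return sum(w * gauss_pdf(x, m, v) for w, m, v in zip(weights, mus, vars_))

def spike_probability(s, p_spike, spike_mix, nospike_mix):
    # Bayes' rule: P(spike | s) = p(s | spike) P(spike) / p(s)
    num = mixture_pdf(s, *spike_mix) * p_spike
    den = num + mixture_pdf(s, *nospike_mix) * (1 - p_spike)
    return num / den

# hypothetical bimodal spike-triggered distribution (weights, means, variances)
spike_mix = ([0.5, 0.5], [-2.0, 2.0], [0.5, 0.5])
nospike_mix = ([1.0], [0.0], [1.0])
print(spike_probability(0.0, 0.2, spike_mix, nospike_mix))  # low
print(spike_probability(2.0, 0.2, spike_mix, nospike_mix))  # substantially higher
```

Because the mixture can be multimodal, this construction captures nonlinear dependencies (here, spiking driven by either large positive or large negative stimulus values) that a single linear filter cannot.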
Extensive electrophysiology studies have shown that many V1 simple cells respond nonlinearly to stimuli within their classical receptive field (CRF) and are modulated by contextual stimuli outside the CRF. Models seeking to explain these non-classical receptive field (nCRF) effects in terms of circuit mechanisms, input-output descriptions, or individual visual tasks provide limited insight into the functional significance of these response properties, because they do not connect the full range of nCRF effects to optimal sensory coding strategies. The (population) sparse coding hypothesis conjectures an optimal sensory coding approach where a neural population uses as few active units as possible to represent a stimulus. We demonstrate that a wide variety of nCRF effects are emergent properties of a single sparse coding model implemented in a neurally plausible network structure (requiring no parameter tuning to produce different effects). Specifically, we replicate a wide variety of nCRF electrophysiology experiments (e.g., end-stopping, surround suppression, contrast invariance of orientation tuning, cross-orientation suppression, etc.) on a dynamical system implementing sparse coding, showing that this model produces individual units that reproduce the canonical nCRF effects. Furthermore, when the population diversity of an nCRF effect has also been reported in the literature, we show that this model produces many of the same population characteristics. These results show that the sparse coding hypothesis, when coupled with a biophysically plausible implementation, can provide a unified high-level functional interpretation to many response properties that have generally been viewed through distinct mechanistic or phenomenological models.
Simple cells in the primary visual cortex (V1) demonstrate many response properties that are either nonlinear or involve response modulations (i.e., stimuli that do not cause a response in isolation alter the cell's response to other stimuli). These non-classical receptive field (nCRF) effects are generally modeled individually and their collective role in biological vision is not well understood. Previous work has shown that classical receptive field (CRF) properties of V1 cells (i.e., the spatial structure of the visual field responsive to stimuli) could be explained by the sparse coding hypothesis, which is an optimal coding model that conjectures that a neural population should use as few cells as possible simultaneously to represent each stimulus. In this paper, we have performed extensive simulated physiology experiments to show that many nCRF response properties are simply emergent effects of a dynamical system implementing this same sparse coding model. These results suggest that rather than representing disparate information processing operations themselves, these nCRF effects could be consequences of an optimal sensory coding strategy that attempts to represent each stimulus most efficiently. This interpretation provides a potentially unifying high-level functional interpretation to many response properties that have generally been viewed through distinct models.
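The kind of dynamical system referred to above can be sketched with the Locally Competitive Algorithm (LCA), a standard neurally plausible implementation of sparse coding. The dictionary and stimulus below are synthetic, and the hyperparameters are illustrative assumptions:

```python
import numpy as np

def lca(stimulus, D, lam=0.2, dt=0.1, steps=200):
    """Locally Competitive Algorithm: a dynamical system whose fixed points
    solve the sparse coding problem min ||s - D a||^2 / 2 + lam * ||a||_1."""
    n_units = D.shape[1]
    u = np.zeros(n_units)                   # membrane potentials
    G = D.T @ D - np.eye(n_units)           # lateral inhibition weights
    b = D.T @ stimulus                      # feedforward drive
    for _ in range(steps):
        a = np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)  # soft threshold
        u += dt * (b - u - G @ a)           # leaky integration + competition
    return a

rng = np.random.default_rng(0)
D = rng.normal(size=(16, 32))
D /= np.linalg.norm(D, axis=0)              # unit-norm dictionary elements
s = D[:, 3] + 0.05 * rng.normal(size=16)    # stimulus close to one element
a = lca(s, D)
print(np.count_nonzero(np.abs(a) > 1e-3))   # only a few units stay active
```

The competition term G suppresses units whose dictionary elements overlap with already-active ones, which is the mechanism that gives rise to suppression-like effects in the simulated physiology experiments.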
The computation represented by a sensory neuron's response to stimuli is constructed from an array of physiological processes both belonging to that neuron and inherited from its inputs. Although many of these physiological processes are known to be nonlinear, linear approximations are commonly used to describe the stimulus selectivity of sensory neurons (i.e., linear receptive fields). Here we present an approach for modeling sensory processing, termed the Nonlinear Input Model (NIM), which is based on the hypothesis that the dominant nonlinearities imposed by physiological mechanisms arise from rectification of a neuron's inputs. Incorporating such ‘upstream nonlinearities’ within the standard linear-nonlinear (LN) cascade modeling structure implicitly allows for the identification of multiple stimulus features driving a neuron's response, which become directly interpretable as either excitatory or inhibitory. Because its form is analogous to an integrate-and-fire neuron receiving excitatory and inhibitory inputs, model fitting can be guided by prior knowledge about the inputs to a given neuron, and elements of the resulting model can often result in specific physiological predictions. Furthermore, by providing an explicit probabilistic model with a relatively simple nonlinear structure, its parameters can be efficiently optimized and appropriately regularized. Parameter estimation is robust and efficient even with large numbers of model components and in the context of high-dimensional stimuli with complex statistical structure (e.g. natural stimuli). We describe detailed methods for estimating the model parameters, and illustrate the advantages of the NIM using a range of example sensory neurons in the visual and auditory systems. We thus present a modeling framework that can capture a broad range of nonlinear response functions while providing physiologically interpretable descriptions of neural computation.
Sensory neurons are capable of representing a wide array of computations on sensory stimuli. Such complex computations are thought to arise in large part from the accumulation of relatively simple nonlinear operations across the sensory processing hierarchies. However, models of sensory processing typically rely on mathematical approximations of the overall relationship between stimulus and response, such as linear or quadratic expansions, which can overlook critical elements of sensory computation and miss opportunities to reveal how the underlying inputs contribute to a neuron's response. Here we present a physiologically inspired nonlinear modeling framework, the ‘Nonlinear Input Model’ (NIM), which instead assumes that neuronal computation can be approximated as a sum of excitatory and suppressive ‘neuronal inputs’. We show that this structure is successful at explaining neuronal responses in a variety of sensory areas. Furthermore, model fitting can be guided by prior knowledge about the inputs to a given neuron, and its results can often suggest specific physiological predictions. We illustrate the advantages of the proposed model and demonstrate specific parameter estimation procedures using a range of example sensory neurons in both the visual and auditory systems.
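The model structure described above, a weighted sum of rectified filter outputs passed through a spiking nonlinearity, can be sketched as follows; the filters, weights, and choice of softplus output nonlinearity are hypothetical.

```python
import numpy as np

def nim_rate(stim, filters, weights, offset=0.0):
    """Nonlinear-input-model style rate: each input is a rectified filter
    output, combined with +1 (excitatory) / -1 (suppressive) weights and
    passed through a spiking nonlinearity (here a softplus)."""
    g = np.array([np.maximum(k @ stim, 0.0) for k in filters])  # upstream rectification
    drive = weights @ g + offset
    return np.log1p(np.exp(drive))  # softplus spiking nonlinearity

# hypothetical 1-D example: one excitatory and one shifted suppressive filter
k_exc = np.array([0.0, 0.2, 1.0, 0.2, 0.0])
k_sup = np.array([0.2, 1.0, 0.2, 0.0, 0.0])
stim = np.array([0.0, 0.0, 1.0, 0.0, 0.0])
rate = nim_rate(stim, [k_exc, k_sup], weights=np.array([1.0, -1.0]))
print(rate)
```

Because each rectified input enters with a fixed sign, the fitted filters can be read directly as excitatory or inhibitory, which is the interpretability argument made above.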
Receptive fields acquired through unsupervised learning of sparse representations of natural scenes have similar properties to primary visual cortex (V1) simple cell receptive fields. However, what drives in vivo development of receptive fields remains controversial. The strongest evidence for the importance of sensory experience in visual development comes from receptive field changes in animals reared with abnormal visual input. However, most sparse coding accounts have considered only normal visual input and the development of monocular receptive fields. Here, we applied three sparse coding models to binocular receptive field development across six abnormal rearing conditions. In every condition, the changes in receptive field properties previously observed experimentally were matched, to a similarly high degree of fidelity, by all the models, suggesting that early sensory development can indeed be understood in terms of an impetus towards sparsity. As previously predicted in the literature, we found that asymmetries in inter-ocular correlation across orientations lead to orientation-specific binocular receptive fields. Finally, we used our models to design a novel stimulus that, if present during rearing, is predicted by the sparsity principle to lead robustly to radically abnormal receptive fields.
The responses of neurons in the primary visual cortex (V1), a region of the brain involved in encoding visual input, are modified by the visual experience of the animal during development. For example, most neurons in animals reared viewing stripes of a particular orientation only respond to the orientation that the animal experienced. The responses of V1 cells in normal animals are similar to responses that simple optimisation algorithms can learn when trained on images. However, whether the similarity between these algorithms and V1 responses is merely coincidental has been unclear. Here, we used the results of a number of experiments where animals were reared with modified visual experience to test the explanatory power of three related optimisation algorithms. We did this by filtering the images for the algorithms in ways that mimicked the visual experience of the animals. This allowed us to show that the experimentally observed changes in V1 responses were consistent with the algorithms. This is evidence that the precepts of the algorithms, notably sparsity, can be used to understand the development of V1 responses. Further, we used our model to propose a novel rearing condition which we expect to have a dramatic effect on development.
Orientation tuning has been a classic model for understanding single-neuron computation in the neocortex. However, little is known about how orientation can be read out from the activity of neural populations, in particular in alert animals. Our study is a first step towards that goal. We recorded simultaneously from up to 20 well-isolated single neurons in the primary visual cortex of alert macaques and applied a simple, neurally plausible decoder to read out the population code. We focus on two questions: First, what are the time course and the time scale at which orientation can be read out from the population response? Second, how complex does the decoding mechanism in a downstream neuron have to be in order to reliably discriminate between visual stimuli with different orientations? We show that the neural ensembles in primary visual cortex of awake macaques represent orientation in a way that facilitates a fast and simple read-out mechanism: with an average latency of 30–80 ms, the population code can be read out instantaneously with a short integration time of only tens of milliseconds, and neither stimulus contrast nor correlations need to be taken into account to compute the optimal synaptic weight pattern. Our study shows that, similar to the case of single-neuron computation, the representation of orientation in the spike patterns of neural populations can serve as an exemplary case for understanding the computations performed by neural ensembles underlying visual processing during behavior.
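A simple fixed-weight readout of orientation from a population can be sketched with a population-vector estimator on doubled angles (orientation is periodic over 180 degrees). This is a generic sketch with made-up tuning curves, not the decoder used in the study:

```python
import numpy as np

def linear_readout(rates, prefs):
    """Population-vector readout: project spike counts onto the doubled
    preferred-orientation components and halve the resultant angle."""
    c = rates @ np.cos(2 * prefs)
    s = rates @ np.sin(2 * prefs)
    return 0.5 * np.arctan2(s, c)

rng = np.random.default_rng(3)
prefs = np.linspace(0, np.pi, 20, endpoint=False)    # preferred orientations
theta = np.pi / 3                                    # stimulus orientation (60 deg)
tuning = np.exp(2.0 * np.cos(2 * (theta - prefs)))   # von-Mises-like tuning
rates = rng.poisson(tuning * 5)                      # noisy spike counts
print(np.degrees(linear_readout(rates, prefs)))      # close to 60 degrees
```

The weights depend only on the preferred orientations, not on stimulus contrast or noise correlations, illustrating how a downstream neuron with a fixed synaptic weight pattern could perform the discrimination.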
Divisive normalization in primary visual cortex has been linked to adaptation to natural image statistics in accordance with Barlow's redundancy reduction hypothesis. Using recent advances in natural image modeling, we show that the previously studied static model of divisive normalization is rather inefficient in reducing local contrast correlations, but that a simple temporal contrast adaptation mechanism of the half-saturation constant can substantially increase its efficiency. Our findings reveal the experimentally observed temporal dynamics of divisive normalization to be critical for redundancy reduction.
The redundancy reduction hypothesis postulates that neural representations adapt to sensory input statistics such that their responses become as statistically independent as possible. Based on this hypothesis, many properties of early visual neurons—like orientation selectivity or divisive normalization—have been linked to natural image statistics. Divisive normalization, in particular, models a widely observed neural response property: The divisive inhibition of a single neuron by a pool of others. This mechanism has been shown to reduce the redundancy among neural responses to typical contrast dependencies in natural images. Here, we show that the standard model of divisive normalization achieves substantially less redundancy reduction than a theoretically optimal mechanism called radial factorization. On the other hand, we find that radial factorization is inconsistent with existing neurophysiological observations. As a solution we suggest a new physiologically plausible modification of the standard model which accounts for the dynamics of the visual input by adapting to local contrasts during fixations. In this way the dynamic version of the standard model achieves almost optimal redundancy reduction performance. Our results imply that the dynamics of natural viewing conditions are critical for testing the role of divisive normalization for redundancy reduction.
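The standard model divides each (signed, squared) filter response by a weighted pool of the others plus a half-saturation constant; the dynamic variant lets that constant track local contrast over time. The sketch below uses simplified pooling weights and a first-order adaptation rule as stand-ins, not the paper's fitted model:

```python
import numpy as np

def divisive_normalization(responses, weights, sigma, gamma=2.0):
    """Standard normalization model: each filter response is divided by a
    weighted pool of the others plus a half-saturation constant sigma."""
    pooled = weights @ np.abs(responses)**gamma
    return np.sign(responses) * np.abs(responses)**gamma / (sigma**gamma + pooled)

def adapt_sigma(sigma, local_contrast, tau=0.1):
    # simplified assumption: exponential adaptation of the half-saturation
    # constant toward the contrast of the current fixation
    return sigma + tau * (local_contrast - sigma)

x = np.array([1.0, 0.5, -0.3])
W = np.full((3, 3), 0.5) - 0.5 * np.eye(3)   # pool over the other units only
print(divisive_normalization(x, W, sigma=0.5))
```

In the dynamic version, `adapt_sigma` would be applied between fixations so that the normalization operating point follows the local contrast statistics, which is what restores near-optimal redundancy reduction.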
A key hypothesis in sensory system neuroscience is that sensory representations are adapted to the statistical regularities in sensory signals and thereby incorporate knowledge about the outside world. Supporting this hypothesis, several probabilistic models of local natural image regularities have been proposed that reproduce neural response properties. Although many such physiological links have been made, these models have not been linked directly to visual sensitivity. Previous psychophysical studies of sensitivity to natural image regularities focus on global perception of large images, but much less is known about sensitivity to local natural image regularities. We present a new paradigm for controlled psychophysical studies of local natural image regularities and compare how well such models capture perceptually relevant image content. To produce stimuli with precise statistics, we start with a set of patches cut from natural images and alter their content to generate a matched set whose joint statistics are equally likely under a probabilistic natural image model. The task is a forced-choice discrimination of natural patches from model patches. The results show that human observers can learn to discriminate the higher-order regularities in natural images from those of model samples after very few exposures, and that no current model is perfect even for patches as small as 5 by 5 pixels. Discrimination performance was accurately predicted by model likelihood, an information-theoretic measure of model efficacy, indicating that the visual system possesses a surprisingly detailed knowledge of natural image higher-order correlations, much more so than current image models. We also perform three cue identification experiments to interpret how model features correspond to perceptually relevant image features.
Several aspects of primate visual physiology have been identified as adaptations to local regularities of natural images. However, much less work has measured visual sensitivity to local natural image regularities. Most previous work focuses on global perception of large images and shows that observers are more sensitive to visual information when image properties resemble those of natural images. In this work we measure human sensitivity to local natural image regularities using stimuli generated by patch-based probabilistic natural image models that have been related to primate visual physiology. We find that human observers can learn to discriminate the statistical regularities of natural image patches from those represented by current natural image models after very few exposures, and that discriminability depends on the degree of regularities captured by the model. The quick learning we observed suggests that the human visual system is biased towards processing natural images, even at very fine spatial scales, and that it has surprisingly detailed knowledge of the regularities in natural images, at least in comparison to state-of-the-art statistical models of natural images.
Although data from longitudinal studies are sparse, effort-reward imbalance (ERI) seems to affect work ability. However, the potential pathway from restricted work ability to ERI must also be considered. Therefore, the aim of our study was to analyse cross-sectional and longitudinal associations between ERI and work ability and vice versa.
Data come from the Second German Sociomedical Panel of Employees. Logistic regression models were estimated to determine cross-sectional and longitudinal associations. The sample used to predict new cases of poor or moderate work ability was restricted to cases with good or excellent work ability at baseline. The sample used to predict new cases of ERI was restricted to persons without ERI at baseline.
The cross-sectional analysis included 1501 full-time employed persons. The longitudinal analyses considered 600 participants with good or excellent baseline work ability and 666 participants without baseline ERI, respectively. After adjustment for socio-demographic variables, health-related behaviour and factors of the work environment, ERI was cross-sectionally associated with poor or moderate work ability (OR = 1.980; 95% CI: 1.428 to 2.747). Longitudinally, persons with ERI had 2.1 times higher odds of poor or moderate work ability after one year (OR = 2.093; 95% CI: 1.047 to 4.183). Conversely, persons with poor or moderate work ability had 2.6 times higher odds of an ERI after one year (OR = 2.573; 95% CI: 1.314 to 5.041).
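For readers unfamiliar with the reported statistics: an odds ratio is the exponential of a logistic regression coefficient, and the 95% Wald interval is exp(beta ± 1.96·SE). The sketch below is purely illustrative; the standard error is back-calculated from the reported longitudinal interval and is not taken from the study.

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and 95% Wald confidence interval from a logistic
    regression coefficient and its standard error."""
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))

# illustrative only: beta chosen to reproduce the longitudinal OR of 2.093,
# se back-calculated from the reported interval (1.047 to 4.183)
or_, lo, hi = odds_ratio_ci(math.log(2.093), se=0.353)
print(round(or_, 3), round(lo, 3), round(hi, 3))
```

Note that an interval whose lower bound sits just above 1.0, as here, indicates an association that is statistically significant but estimated with considerable uncertainty.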
Interventions that enable workers to cope with ERI or address indicators of ERI directly could promote the maintenance of work ability. Integration management programmes for persons with poor work ability should also consider their psychosocial demands.
We present a probabilistic model for natural images that is based on mixtures of Gaussian scale mixtures and a simple multiscale representation. We show that it is able to generate images with interesting higher-order correlations when trained on natural images or samples from an occlusion-based model. More importantly, our multiscale model allows for a principled evaluation. While it is easy to generate visually appealing images, we demonstrate that our model also yields the best performance reported to date when evaluated with respect to the cross-entropy rate, a measure tightly linked to the average log-likelihood. The ability to quantitatively evaluate our model differentiates it from other multiscale models, for which evaluation of these kinds of measures is usually intractable.
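A Gaussian scale mixture, the building block named above, multiplies a correlated Gaussian vector by a random scale, which produces the heavy-tailed, contrast-coupled statistics typical of natural image filter responses. The scale distribution and covariance below are arbitrary illustrations:

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_gsm(n, cov, scale_vals, scale_probs):
    """Draw samples from a Gaussian scale mixture: x = sqrt(z) * g with
    g ~ N(0, cov) and a discrete scale variable z."""
    z = rng.choice(scale_vals, size=n, p=scale_probs)
    g = rng.multivariate_normal(np.zeros(cov.shape[0]), cov, size=n)
    return np.sqrt(z)[:, None] * g

cov = np.array([[1.0, 0.8], [0.8, 1.0]])     # local filter correlations
x = sample_gsm(50000, cov, [0.1, 1.0, 10.0], [0.3, 0.4, 0.3])
# heavy tails: excess kurtosis of each marginal is well above the Gaussian value of 0
kurt = np.mean(x[:, 0]**4) / np.mean(x[:, 0]**2)**2 - 3
print(kurt)
```

Mixing several such components, as the model above does, lets different image regions be explained by different covariances while keeping the likelihood tractable, which is what makes the quantitative cross-entropy evaluation possible.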
Several studies have reported optimal population decoding of sensory responses in two-alternative visual discrimination tasks. Such decoding involves integrating noisy neural responses into a more reliable representation of the likelihood that the stimuli under consideration evoked the observed responses. Importantly, an ideal observer must be able to evaluate likelihood with high precision and only consider the likelihood of the two relevant stimuli involved in the discrimination task. We report a new perceptual bias suggesting that observers read out the likelihood representation with remarkably low precision when discriminating grating spatial frequencies. Using spectrally filtered noise, we induced an asymmetry in the likelihood function of spatial frequency. This manipulation mainly affects the likelihood of spatial frequencies that are irrelevant to the task at hand. Nevertheless, we find a significant shift in perceived grating frequency, indicating that observers evaluate likelihoods of a broad range of irrelevant frequencies and discard prior knowledge of stimulus alternatives when performing two-alternative discrimination.
An attractive view on human information processing proposes that inference problems are dealt with in a statistically optimal fashion. This hypothesis can explain aspects of perception, movement planning, cognition and decision making. In the present study, I use a new psychophysical paradigm that reveals surprisingly suboptimal perceptual decision making. Observers discriminate between two sinusoidal gratings of a different spatial frequency. Making use of visual noise, I induce an asymmetry in neural population responses to the gratings and find this asymmetry to effectively bias perceptual decision making. A simple ideal observer model, uninformed about the presence of visual noise but only considering the two grating spatial frequencies relevant to the task at hand, manages to avoid such a bias. I conclude that observers are limited in their ability to make use of prior knowledge of relevant visual features when performing this task. These results are in line with a growing number of findings suggesting that near-optimal decoders, although straightforward to implement and achieving near-maximal performance, consistently overestimate empirical performance in simple perceptual tasks.
The amount of information encoded by networks of neurons critically depends on the correlation structure of their activity. Neurons with similar stimulus preferences tend to have higher noise correlations than others. In homogeneous populations of neurons, this limited-range correlation structure is highly detrimental to the accuracy of a population code. Therefore, reduced spike count correlations under attention, after adaptation or after learning have been interpreted as evidence for a more efficient population code. Here we analyze the role of limited-range correlations in more realistic, heterogeneous population models. We use Fisher information and maximum likelihood decoding to show that reduced correlations do not necessarily improve encoding accuracy. In fact, in populations with more than a few hundred neurons, increasing the level of limited-range correlations can substantially improve encoding accuracy. We find that this improvement results from a decrease in noise entropy that is associated with increasing correlations if the marginal distributions are unchanged. Surprisingly, for constant noise entropy and in the limit of large populations, the encoding accuracy is independent of both the structure and the magnitude of noise correlations.
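Linear Fisher information with limited-range correlations can be computed directly from the tuning curves and noise covariance. The tuning curves, correlation parameters, and homogeneous population below are illustrative assumptions, and this toy example makes no claim about which correlation level wins (that depends on population heterogeneity and size, as discussed above):

```python
import numpy as np

def fisher_info(theta, prefs, c, tau=0.5, width=0.5, amp=10.0):
    """Linear Fisher information f'(theta)^T C^{-1} f'(theta) for Gaussian
    tuning curves with limited-range noise correlations
    c_ij = c * exp(-|phi_i - phi_j| / tau) and Poisson-like variances."""
    f = amp * np.exp(-(theta - prefs)**2 / (2 * width**2))   # tuning curves
    fp = f * (prefs - theta) / width**2                       # derivatives
    R = c * np.exp(-np.abs(prefs[:, None] - prefs[None, :]) / tau)
    np.fill_diagonal(R, 1.0)
    C = np.sqrt(f[:, None] * f[None, :]) * R                  # noise covariance
    return fp @ np.linalg.solve(C, fp)

prefs = np.linspace(-1, 1, 100)
print(fisher_info(0.0, prefs, c=0.0))   # independent noise
print(fisher_info(0.0, prefs, c=0.3))   # limited-range correlations
```

Sweeping `c` in models like this, with and without heterogeneity in the tuning parameters, is the kind of comparison underlying the conclusions above.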
Reconstructing stimuli from the spike trains of neurons is an important approach for understanding the neural code. One of the difficulties associated with this task is that signals that vary continuously in time are encoded into sequences of discrete events or spikes. An important problem is to determine how much information about the continuously varying stimulus can be extracted from the time points at which spikes were observed, especially if these time points are subject to some sort of randomness. For the special case of spike trains generated by leaky integrate-and-fire neurons, noise can be introduced by allowing variations in the threshold each time a spike is emitted. A simple decoding algorithm previously derived for the noiseless case can be extended to the stochastic case, but turns out to be biased. Here, we review a solution to this problem by presenting a simple yet efficient algorithm which greatly reduces the bias, and therefore leads to better decoding performance in the stochastic case.
decoding; spiking neurons; Bayesian inference; population coding; leaky integrate-and-fire
Generalized Linear Models (GLMs) are commonly used statistical methods for modelling the relationship between neural population activity and presented stimuli. When the dimension of the parameter space is large, strong regularization has to be used in order to fit GLMs to datasets of realistic size without overfitting. By imposing properly chosen priors over parameters, Bayesian inference provides an effective and principled approach for achieving regularization. Here we show how the posterior distribution over model parameters of GLMs can be approximated by a Gaussian using the Expectation Propagation algorithm. In this way, we obtain an estimate of the posterior mean and posterior covariance, allowing us to calculate Bayesian confidence intervals that characterize the uncertainty about the optimal solution. From the posterior we also obtain a different point estimate, namely the posterior mean, as opposed to the commonly used maximum a posteriori estimate. We systematically compare the different inference techniques on simulated as well as on multi-electrode recordings of retinal ganglion cells, and explore the effects of the chosen prior and the performance measure used. We find that good performance can be achieved by choosing a Laplace prior together with the posterior mean estimate.
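A Laplace prior corresponds to an L1 penalty on the Poisson GLM log-likelihood. As a point of reference for the Bayesian estimates discussed above, the MAP estimate under that prior can be sketched with proximal gradient ascent; this is a generic sketch on synthetic data, not the paper's Expectation Propagation procedure, and the step size and penalty are arbitrary:

```python
import numpy as np

def fit_glm_map(X, y, lam=1.0, lr=0.05, steps=2000):
    """MAP estimate of a Poisson GLM with exponential nonlinearity and a
    Laplace (L1) prior, via proximal gradient ascent on the log posterior."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        rate = np.exp(X @ w)
        grad = X.T @ (y - rate)              # Poisson log-likelihood gradient
        w += lr * grad / len(y)
        # proximal step for the L1 (Laplace prior) term: soft thresholding
        w = np.sign(w) * np.maximum(np.abs(w) - lr * lam / len(y), 0.0)
    return w

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 5))
w_true = np.array([1.0, -0.5, 0.0, 0.0, 0.0])   # sparse ground truth
y = rng.poisson(np.exp(X @ w_true))
w_hat = fit_glm_map(X, y, lam=2.0)
print(np.round(w_hat, 2))
```

The soft-thresholding step is what encodes the sparsity preference of the Laplace prior; the posterior mean estimate compared in the paper instead averages over the full (EP-approximated) posterior rather than taking its mode.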
spiking neurons; Bayesian inference; population coding; sparsity; multielectrode recordings; receptive field; GLM; functional connectivity
Orientation selectivity is the most striking feature of simple cell coding in V1, and it has been shown to emerge from the reduction of higher-order correlations in natural images in a large variety of statistical image models. The most parsimonious model among these is linear Independent Component Analysis (ICA), whereas second-order decorrelation transformations such as Principal Component Analysis (PCA) do not yield oriented filters. Because of this finding, it has been suggested that the emergence of orientation selectivity may be explained by higher-order redundancy reduction. To assess the tenability of this hypothesis, it is an important empirical question how much more redundancy can be removed with ICA in comparison to PCA or other second-order decorrelation methods. Although some previous studies have concluded that the amount of higher-order correlation in natural images is generally insignificant, other studies reported an extra gain for ICA of more than 100%. A consistent conclusion about the role of higher-order correlations in natural images can be reached only by the development of reliable quantitative evaluation methods. Here, we present a careful and comprehensive analysis using three evaluation criteria related to redundancy reduction: in addition to the multi-information and the average log-loss, we compute complete rate–distortion curves for ICA in comparison with PCA. Without exception, we find that the advantage of the ICA filters is small. At the same time, we show that a simple spherically symmetric distribution with only two parameters can fit the data significantly better than the probabilistic model underlying ICA. This finding suggests that, although the amount of higher-order correlation in natural images can in fact be significant, the feature of orientation selectivity does not yield a large contribution to redundancy reduction within the linear filter bank models of V1 simple cells.
Since the Nobel Prize-winning work of Hubel and Wiesel, it has been known that orientation selectivity is an important feature of simple cells in the primary visual cortex. The standard description of this stage of visual processing is that of a linear filter bank where each neuron responds to an oriented edge at a certain location within the visual field. From a vision scientist's point of view, we would like to understand why an orientation selective filter bank provides a useful image representation. Several previous studies have shown that orientation selectivity arises when the individual filter shapes are optimized according to the statistics of natural images. Here, we investigate quantitatively how critical the feature of orientation selectivity is for this optimization. We find that there is a large range of non-oriented filter shapes that perform nearly as well as the optimal orientation selective filters. We conclude that the standard filter bank model is not suitable to reveal a strong link between orientation selectivity and the statistics of natural images. Thus, to understand the role of orientation selectivity in the primary visual cortex, we will have to develop more sophisticated, nonlinear models of natural images.
The timing of action potentials in spiking neurons depends on the temporal dynamics of their inputs and contains information about temporal fluctuations in the stimulus. Leaky integrate-and-fire neurons constitute a popular class of encoding models, in which spike times depend directly on the temporal structure of the inputs. However, optimal decoding rules for these models have only been studied explicitly in the noiseless case. Here, we study decoding rules for probabilistic inference of a continuous stimulus from the spike times of a population of leaky integrate-and-fire neurons with threshold noise. We derive three algorithms for approximating the posterior distribution over stimuli as a function of the observed spike trains. In addition to a reconstruction of the stimulus we thus obtain an estimate of the uncertainty as well. Furthermore, we derive a ‘spike-by-spike’ online decoding scheme that recursively updates the posterior with the arrival of each new spike. We use these decoding rules to reconstruct time-varying stimuli represented by a Gaussian process from spike trains of single neurons as well as neural populations.
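The encoding model underlying these decoders can be sketched in the noiseless, constant-stimulus case, where the interspike interval determines the input exactly. The parameters below (membrane time constant, threshold, input level) are illustrative:

```python
import numpy as np

def lif_spike_times(stim, dt=0.001, tau=0.02, theta=1.0):
    """Simulate a leaky integrate-and-fire neuron: integrate the stimulus
    with leak, emit a spike and reset when the threshold is crossed."""
    v, spikes = 0.0, []
    for i, s in enumerate(stim):
        v += dt * (-v / tau + s)
        if v >= theta:
            spikes.append(i * dt)
            v = 0.0
    return spikes

def decode_constant(isi, tau=0.02, theta=1.0):
    # invert the noiseless ISI equation T = -tau * log(1 - theta / (I * tau))
    return theta / (tau * (1.0 - np.exp(-isi / tau)))

stim = np.full(2000, 80.0)                 # constant input current
spikes = lif_spike_times(stim)
isi = spikes[2] - spikes[1]
print(decode_constant(isi))                # recovers roughly 80 (up to dt error)
```

With threshold noise, each observed interval constrains the stimulus only probabilistically, which is why the full decoders above maintain a posterior distribution over stimuli, and why the spike-by-spike scheme can update that posterior recursively as each new spike arrives.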
Bayesian decoding; population coding; spiking neurons; approximate inference