Along the auditory pathway from auditory nerve to midbrain to cortex, individual neurons adapt progressively to sound statistics, enabling the discernment of foreground sounds, such as speech, over background noise.
Identifying behaviorally relevant sounds in the presence of background noise is one of the most important and poorly understood challenges faced by the auditory system. An elegant solution to this problem would be for the auditory system to represent sounds in a noise-invariant fashion. Since a major effect of background noise is to alter the statistics of the sounds reaching the ear, noise-invariant representations could be promoted by neurons adapting to stimulus statistics. Here we investigated the extent of neuronal adaptation to the mean and contrast of auditory stimulation as one ascends the auditory pathway. We measured these forms of adaptation by presenting complex synthetic and natural sounds, recording neuronal responses in the inferior colliculus and primary fields of the auditory cortex of anaesthetized ferrets, and comparing these responses with a sophisticated model of the auditory nerve. We find that the strength of both forms of adaptation increases as one ascends the auditory pathway. To investigate whether this adaptation to stimulus statistics contributes to the construction of noise-invariant sound representations, we also presented complex, natural sounds embedded in stationary noise, and used a decoding approach to assess the noise tolerance of the neuronal population code. We find that the code for complex sounds in the periphery is affected more by the addition of noise than the cortical code. We also find that noise tolerance is correlated with adaptation to stimulus statistics, so that populations that show the strongest adaptation to stimulus statistics are also the most noise-tolerant. This suggests that the increase in adaptation to sound statistics from auditory nerve to midbrain to cortex is an important stage in the construction of noise-invariant sound representations in the higher auditory brain.
We rarely hear sounds (such as someone talking) in isolation, but rather against a background of noise. When mixtures of sounds and background noise reach the ears, peripheral auditory neurons represent the whole sound mixture. Previous evidence suggests, however, that the higher auditory brain represents just the sounds of interest, and is less affected by the presence of background noise. The neural mechanisms underlying this transformation are poorly understood. Here, we investigate these mechanisms by studying the representation of sound by populations of neurons at three stages along the auditory pathway; we simulate the auditory nerve and record from neurons in the midbrain and primary auditory cortex of anesthetized ferrets. We find that the transformation from noise-sensitive representations of sound to noise-tolerant processing takes place gradually along the pathway from auditory nerve to midbrain to cortex. Our results suggest that this transformation arises from neurons adapting to the statistics of heard sounds.
The neural processing of sensory stimuli involves a transformation of physical stimulus parameters into perceptual features, and elucidating where and how this transformation occurs is one of the ultimate aims of sensory neurophysiology. Recent studies have shown that the firing of neurons in early sensory cortex can be modulated by multisensory interactions [1–5], motor behavior [1, 3, 6, 7], and reward feedback [1, 8, 9], but it remains unclear whether neural activity is more closely tied to perception, as indicated by behavioral choice, or to the physical properties of the stimulus. We investigated which of these properties are predominantly represented in auditory cortex by recording local field potentials (LFPs) and multiunit spiking activity in ferrets while they discriminated the pitch of artificial vowels. We found that auditory cortical activity is informative both about the fundamental frequency (F0) of a target sound and also about the pitch that the animals appear to perceive given their behavioral responses. Surprisingly, although the stimulus F0 was well represented at the onset of the target sound, neural activity throughout auditory cortex frequently predicted the reported pitch better than the target F0.
► Auditory cortical responses were recorded while ferrets discriminated pitch shifts ► LFP and multiunit activity are sensitive to the sound’s fundamental frequency (F0) ► Neural activity related to animals’ reported pitch increases throughout the trial ► Cortical responses were more informative about behavioral choices than the sound F0
Auditory neurons are often described in terms of their spectrotemporal receptive fields (STRFs). These map the relationship between features of the sound spectrogram and neurons’ firing rates. Recently we showed that neurons in the primary fields of the ferret auditory cortex are also subject to gain control: when sounds undergo smaller fluctuations in their level over time, the neurons become more sensitive to small level changes (Rabinowitz et al., 2011). Just as STRFs measure the spectrotemporal features of a sound that lead to changes in neurons’ firing rates, in this study we sought to estimate the spectrotemporal regions in which sound statistics lead to changes in neurons’ gain. We designed a set of stimuli with complex contrast profiles to characterize these regions. This allowed us to estimate cortical neurons’ STRFs alongside a set of spectrotemporal contrast kernels. We find that these two sets of integration windows match up: the extent to which a stimulus feature causes a neuron’s firing rate to change is strongly correlated with the extent to which that feature’s contrast modulates the neuron’s gain. Adding contrast kernels to STRF models also yields considerable improvements in the ability to capture and predict how auditory cortical neurons respond to statistically complex sounds.
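The STRF idea described above, a map from recent spectrogram features to firing rate, can be sketched as a weighted sum over frequency bands and time lags. This is a minimal illustration, not the model fitted in the study: the function name `strf_predict`, the list-of-lists layout, and the final rectification step are assumptions made for the example, and the contrast kernels discussed in the abstract are not modelled here.

```python
def strf_predict(spectrogram, strf):
    """Predict a firing-rate trace from a spectrogram with a linear STRF.

    spectrogram[f][t] is the sound level in frequency band f at time bin t.
    strf[f][lag] is the weight given to band f at `lag` bins in the past
    (lag 0 is the current bin). Both layouts are assumptions for this sketch.
    """
    n_freq = len(spectrogram)
    n_time = len(spectrogram[0])
    n_lag = len(strf[0])
    rates = []
    for t in range(n_time):
        drive = 0.0
        for f in range(n_freq):
            for lag in range(n_lag):
                if t - lag >= 0:  # ignore history before the recording starts
                    drive += strf[f][lag] * spectrogram[f][t - lag]
        # Assumed output stage: firing rates cannot be negative, so rectify.
        rates.append(max(drive, 0.0))
    return rates


# Hypothetical two-band example: band 0 is excitatory, band 1 inhibitory.
toy_spectrogram = [[1.0, 0.0, 2.0],
                   [0.0, 1.0, 0.0]]
toy_strf = [[1.0, 0.5],
            [-1.0, 0.0]]
print(strf_predict(toy_spectrogram, toy_strf))
```

A gain-control extension would scale or reshape the output stage according to recent stimulus contrast; how that is parameterized is specific to the study and is left out of this sketch.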
The specificity of auditory perceptual learning has been taken as an indicator of the likely locus within the brain at which underlying neuronal changes occur. This study examined interaural level difference (ILD) discrimination learning with sinusoidally amplitude modulated (SAM) tones and whether training-induced threshold improvements generalize from one side of auditory space to the other and to an untrained carrier frequency. A novel, dual-staircase adaptive method was adopted that was designed to prevent participants from identifying the nature of the adaptive track. ILD thresholds obtained with this method were compared with a constant-stimulus technique using otherwise identical stimuli. Adaptive thresholds derived from psychometric functions were found to be biased compared to those obtained from reversals. Although adaptive and constant-stimulus procedures appeared to yield different temporal patterns of learning, no global differences were found between them in terms of training outcomes. These data show that ILD discrimination learning with SAM tones does generalize to an untrained carrier frequency, but does not generalize across the midline. This implies that the neural substrate for binaural plasticity is found at a relatively high level of the auditory pathway where information is combined across frequency and where each side of auditory space is represented separately.
Although ears capable of detecting airborne sound have arisen repeatedly and independently in different species, most animals that are capable of hearing have a pair of ears. We review the advantages that arise from having two ears and discuss recent research on the similarities and differences in the binaural processing strategies adopted by birds and mammals. We also ask how these different adaptations for binaural and spatial hearing might inform and inspire the development of techniques for future auditory prosthetic devices.
We have previously shown that neurons in primary auditory cortex (A1) of anaesthetized (ketamine/medetomidine) ferrets respond more strongly and reliably to dynamic stimuli whose statistics follow “natural” 1/f dynamics than to stimuli exhibiting pitch and amplitude modulations that are faster (1/f^0.5) or slower (1/f^2) than 1/f. To investigate where along the central auditory pathway this 1/f-modulation tuning arises, we have now characterized responses of neurons in the central nucleus of the inferior colliculus (ICC) and the ventral division of the medial geniculate nucleus of the thalamus (MGV) to 1/f^γ-distributed stimuli with γ varying between 0.5 and 2.8. We found that, while the great majority of neurons recorded from the ICC showed a strong preference for the most rapidly varying (1/f^0.5-distributed) stimuli, responses from MGV neurons did not exhibit marked or systematic preferences for any particular γ exponent. Only in A1 did a majority of neurons respond with higher firing rates to stimuli in which γ takes values near 1. These results indicate that 1/f tuning emerges at forebrain levels of the ascending auditory pathway.
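To make the role of the exponent γ concrete, the sketch below builds a modulation trace whose power at modulation frequency k falls off as 1/k^γ, by summing cosines with amplitude k^(-γ/2) and random phases. It is an illustrative construction only, not the stimulus-generation procedure used in the study; the function names, the number of components, and the `roughness` measure are all assumptions introduced here.

```python
import math
import random


def one_over_f_trace(n_samples, gamma, n_components=32, seed=0):
    """Return a trace whose power spectrum falls off as 1/f**gamma.

    Power proportional to k**(-gamma) means amplitude proportional to
    k**(-gamma / 2). Phases are random; everything here is a toy assumption.
    """
    if n_components >= n_samples / 2:
        raise ValueError("n_components must stay below the Nyquist limit")
    rng = random.Random(seed)
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_components)]
    trace = []
    for t in range(n_samples):
        value = 0.0
        for k, phase in enumerate(phases, start=1):
            amplitude = k ** (-gamma / 2.0)
            value += amplitude * math.cos(2.0 * math.pi * k * t / n_samples + phase)
        trace.append(value)
    return trace


def roughness(trace):
    """Mean squared step between neighbouring samples (wrapping at the end),
    divided by mean power. Larger values mean faster fluctuations."""
    n = len(trace)
    step_power = sum((trace[(i + 1) % n] - trace[i]) ** 2 for i in range(n)) / n
    power = sum(x * x for x in trace) / n
    return step_power / power


for gamma in (0.5, 1.0, 2.0):
    print(gamma, roughness(one_over_f_trace(256, gamma)))
```

Smaller γ leaves more energy at high modulation frequencies, so the trace varies faster; larger γ concentrates energy at low frequencies and gives slower drifts, which matches the "faster" and "slower" descriptions in the abstract.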
The auditory system must represent sounds with a wide range of statistical properties. One important property is the spectrotemporal contrast in the acoustic environment: the variation in sound pressure in each frequency band, relative to the mean pressure. We show that neurons in ferret auditory cortex rescale their gain to partially compensate for the spectrotemporal contrast of recent stimulation. When contrast is low, neurons increase their gain, becoming more sensitive to small changes in the stimulus, although the effectiveness of contrast gain control is reduced at low mean levels. Gain is primarily determined by contrast near each neuron's preferred frequency, but there is also a contribution from contrast in more distant frequency bands. Neural responses are modulated by contrast over timescales of ∼100 ms. By using contrast gain control to expand or compress the representation of its inputs, the auditory system may be seeking an efficient coding of natural sounds.
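The definition of contrast given above (variation in sound pressure within a band, relative to the mean pressure) can be written as a standard deviation divided by a mean. The sketch below computes that ratio and pairs it with a toy gain rule in which gain rises as contrast falls. The rule, its `compensation` exponent, and the `reference_contrast` parameter are hypothetical and chosen only to illustrate partial compensation; they are not the fitted model from the study.

```python
from statistics import mean, pstdev


def band_contrast(pressures):
    """Contrast of one frequency band: standard deviation of the pressure
    values divided by their mean (an assumed formalization of the text)."""
    m = mean(pressures)
    if m <= 0:
        raise ValueError("mean pressure must be positive")
    return pstdev(pressures) / m


def toy_gain(contrast, compensation=0.5, reference_contrast=1.0):
    """Hypothetical gain rule: gain = (reference / contrast) ** compensation.

    compensation = 1 would cancel contrast changes completely,
    compensation = 0 would ignore contrast; values in between give the
    partial compensation described in the text.
    """
    if contrast <= 0:
        raise ValueError("contrast must be positive")
    return (reference_contrast / contrast) ** compensation


low = band_contrast([1.8, 2.0, 2.2, 2.0])   # small fluctuations around the mean
high = band_contrast([0.5, 3.5, 1.0, 3.0])  # large fluctuations around the mean
print(low, toy_gain(low))
print(high, toy_gain(high))
```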
► We find evidence for spectrotemporal contrast gain control in auditory cortex ► Gain is determined by a combination of spectrally local and global contrast ► Within a limited range, mean stimulus level also affects neural gain ► Contrast gain control is fast (∼100 ms); gain decreases are faster than increases
Neurons in the auditory cortex of anesthetized animals are generally considered to generate phasic responses to simple stimuli such as tones or noise bursts. In this paper, we show that under ketamine/medetomidine anesthesia, neurons in ferret auditory cortex usually exhibit complex sustained responses. We presented 100-ms broad-band noise bursts at a range of interaural level differences (ILDs) and average binaural levels (ABLs), and used extracellular electrodes to monitor evoked activity over 700 ms poststimulus onset. We estimated the degree of randomness (noise) in the response functions of individual neurons over poststimulus time; we found that neural activity was significantly modulated by sound for up to ∼500 ms following stimulus offset. Pooling data from all neurons, we found that spiking activity carries significant information about stimulus identity over this same time period. However, information about ILD decayed much more quickly over time compared with information about ABL. In addition, ILD and ABL are coded independently by the neural population even though this is not the case at individual neurons. Though most neurons responded more strongly to ILDs corresponding to the opposite side of space, as a population, they were equally informative about both contra- and ipsilateral stimuli.
It is widely appreciated that the key predictor of the pitch of a sound is its periodicity. Neural structures which support pitch perception must therefore be able to reflect the repetition rate of a sound, but this alone is not sufficient. Since pitch is a psychoacoustic property, a putative cortical code for pitch must also be able to account for the relationship between the degree to which a sound is periodic (i.e. its temporal regularity) and the perceived pitch salience, as well as limits in our ability to detect pitch changes or to discriminate rising from falling pitch. Pitch codes must also be robust in the presence of nuisance variables such as loudness or timbre. Here, we review a large body of work on the cortical basis of pitch perception, which illustrates that the distribution of cortical processes that give rise to pitch perception is likely to depend on both the acoustical features and functional relevance of a sound. While previous studies have greatly advanced our understanding, we highlight several open questions regarding the neural basis of pitch perception. These questions can begin to be addressed through a cooperation of investigative efforts across species and experimental techniques, and, critically, by examining the responses of single neurons in behaving animals.
A1, primary auditory cortex; F0, fundamental frequency; fMRI, functional magnetic resonance imaging; HG, Heschl’s gyrus; IRN, iterated rippled noise; MEG, magnetoencephalography; SAM, sinusoidally amplitude modulated
We measured the responses of neurons in auditory cortex of male and female ferrets to artificial vowels of varying fundamental frequency (f0), or periodicity, and compared these to the performance of animals trained to discriminate the periodicity of these sounds. Sensitivity to f0 was found in all five auditory cortical fields examined, with most of those neurons exhibiting either low-pass or high-pass response functions. Only rarely was the stimulus dependence of individual neuron discharges sufficient to account for the discrimination performance of the ferrets. In contrast, when analyzed with a simple classifier, responses of small ensembles, comprising 3-61 simultaneously recorded neurons, often discriminated periodicity changes as well as the animals did. We examined four potential strategies for decoding ensemble responses: spike counts, relative first-spike latencies, a binary “spike or no-spike” code and a spike-order code. All four codes represented stimulus periodicity effectively, and, surprisingly, the spike count and relative latency codes enabled an equally rapid readout, within 75 ms of stimulus onset. Thus, relative latency codes do not necessarily facilitate faster discrimination judgments. A joint spike count plus relative latency code was more informative than either code alone, indicating that the information captured by each measure was not wholly redundant. The responses of neural ensembles, but not of single neurons, reliably encoded f0 changes even when stimulus intensity was varied randomly over a 20 dB range. Because trained animals can discriminate stimulus periodicity across different sound levels, this implies that ensemble codes are better suited to account for behavioral performance.
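The four candidate codes named above can be illustrated by extracting each one from the same set of spike trains. The sketch below is a hedged example rather than the analysis used in the study: the function name `ensemble_codes`, the choice to express relative latencies against the earliest first spike in the ensemble, and the default 75 ms window are assumptions made for illustration.

```python
def ensemble_codes(spike_trains, window=0.075):
    """Extract four candidate ensemble codes from one trial.

    spike_trains: one list of spike times (seconds after stimulus onset)
    per neuron. Only spikes inside [0, window) are used.
    """
    in_window = [[t for t in train if 0.0 <= t < window] for train in spike_trains]

    # 1. Spike-count code: number of spikes per neuron.
    counts = [len(train) for train in in_window]

    # 2. Binary code: did the neuron fire at all?
    binary = [1 if c > 0 else 0 for c in counts]

    # 3. Relative first-spike latency: each neuron's first spike measured
    #    against the earliest first spike in the ensemble (assumed reference).
    first = [min(train) if train else None for train in in_window]
    fired = [f for f in first if f is not None]
    earliest = min(fired) if fired else None
    relative_latency = [None if f is None else f - earliest for f in first]

    # 4. Spike-order code: neuron indices ranked by first-spike time.
    order = sorted((i for i, f in enumerate(first) if f is not None),
                   key=lambda i: first[i])

    return {"count": counts, "binary": binary,
            "relative_latency": relative_latency, "order": order}


# Hypothetical three-neuron trial; the spike at 0.2 s falls outside the window.
trial = [[0.010, 0.030], [], [0.005, 0.2]]
print(ensemble_codes(trial))
```

Each of these feature vectors could then be passed to a classifier of choice to ask how well stimulus periodicity can be read out from the ensemble.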
neurometric; distributed processing; periodicity; vowel; neural coding; spike timing
Although many studies have examined the performance of animals in detecting a frequency change in a sequence of tones, few have measured animals' discrimination of the fundamental frequency (F0) of complex, naturalistic stimuli. Additionally, it is not yet clear if animals perceive the pitch of complex sounds along a continuous, low-to-high scale. Here, four ferrets (Mustela putorius) were trained on a two-alternative forced choice task to discriminate sounds that were higher or lower in F0 than a reference sound, using pure tones and artificial vowels as stimuli. Average Weber fractions for ferrets on this task varied from ~20–80% across references (200–1200 Hz), and these fractions were similar for pure tones and vowels. These thresholds are approximately 10 times higher than those typically reported for other mammals on frequency change detection tasks that use go/no-go designs. Naive human listeners outperformed ferrets on the present task, but they showed similar effects of stimulus type and reference F0. These results suggest that while non-human animals can be trained to label complex sounds as high or low in pitch, this task may be much more difficult for animals than simply detecting a frequency change.
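For readers unfamiliar with the measure, a Weber fraction expresses a discrimination threshold as a proportion of the reference value. The short sketch below computes it for F0; the function name and the example frequencies are hypothetical and are not data from the study.

```python
def weber_fraction(reference_hz, threshold_hz):
    """Just-discriminable F0 change expressed as a fraction of the reference F0."""
    if reference_hz <= 0:
        raise ValueError("reference_hz must be positive")
    return abs(threshold_hz - reference_hz) / reference_hz


# Hypothetical example: a 200 Hz reference that can just be told apart from 260 Hz.
fraction = weber_fraction(200.0, 260.0)
print(f"Weber fraction: {fraction:.0%}")
```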
Because we can perceive the pitch, timbre and spatial location of a sound source independently, it seems natural to suppose that cortical processing of sounds might separate out spatial from non-spatial attributes. Indeed, recent studies support the existence of anatomically segregated ‘what’ and ‘where’ cortical processing streams. However, few attempts have been made to measure the responses of individual neurons in different cortical fields to sounds that vary simultaneously across spatial and non-spatial dimensions. We recorded responses to artificial vowels presented in virtual acoustic space to investigate the representations of pitch, timbre and sound source azimuth in both core and belt areas of ferret auditory cortex. A variance decomposition technique was used to quantify the way in which altering each parameter changed neural responses. Most units were sensitive to two or more of these stimulus attributes. Whilst indicating that neural encoding of pitch, location and timbre cues is distributed across auditory cortex, significant differences in average neuronal sensitivity were observed across cortical areas and depths, which could form the basis for the segregation of spatial and non-spatial cues at higher cortical levels. Some units exhibited significant non-linear interactions between particular combinations of pitch, timbre and azimuth. These interactions were most pronounced for pitch and timbre and were less commonly observed between spatial and non-spatial attributes. Such non-linearities were most prevalent in primary auditory cortex, although they tended to be small compared with stimulus main effects.
Auditory cortex; tuning; sound; spike trains; vocalization; localization; parallel; hearing
Auditory neurons in the superior colliculus (SC) respond preferentially to sounds from restricted directions to form a map of auditory space. The development of this representation is shaped by sensory experience, but little is known about the relative contribution of peripheral and central factors to the emergence of adult responses. By recording from the SC of anesthetized ferrets at different age points, we show that the map matures gradually after birth; the spatial receptive fields (SRFs) become more sharply tuned and topographic order emerges by the end of the second postnatal month. Principal components analysis of the head-related transfer function revealed that the time course of map development is mirrored by the maturation of the spatial cues generated by the growing head and external ears. However, using virtual acoustic space stimuli, we show that these acoustical changes are not by themselves responsible for the emergence of SC map topography. Presenting stimuli to infant ferrets through virtual adult ears did not improve the order in the representation of sound azimuth in the SC. But using linear discriminant analysis to compare different response properties across age, we found that the SRFs of infant neurons nevertheless became more adult-like when stimuli were delivered through virtual adult ears. Hence, although the emergence of auditory topography is likely to depend on refinements in neural circuitry, maturation of the structure of the SRFs (particularly their spatial extent) can be largely accounted for by changes in the acoustics associated with growth of the head and ears.
virtual acoustic space; receptive field; ferret; linear discriminant analysis; head-related transfer function; sound localization