The intensity of a noise-induced startle response can be reduced by the presentation of an otherwise neutral stimulus immediately before the noise (“prepulse inhibition” or PPI). We used a form of PPI to study the effects of damage to auditory cortex on the discrimination of speech sounds by rats. Subjects underwent control surgery or treatment of the auditory cortex with the vasoconstrictor endothelin-1. This treatment caused damage concentrated in primary auditory cortex (A1). Both before and after lesions, subjects were tested on 5 tasks, most presenting a pair of human speech sounds (consonant-vowel syllables) so that the capacity for discrimination would be evident in the extent of PPI. Group comparisons failed to reveal any consistent lesion effect. At the same time, the analysis of individual differences in performance by multiple regression suggests that some of the temporal processing required to discriminate speech sounds is concentrated anteroventrally in the right A1. These results also confirm that PPI can be adapted to studies of the brain mechanisms involved in the processing of speech and other complex sounds.
Auditory cortex; Brain lesions; Prepulse inhibition; Speech perception; Speech sounds; Voice onset time
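Prepulse inhibition of the kind used here (and in several of the studies below) is conventionally quantified as the percent reduction of the startle response caused by the prepulse. A minimal sketch of that computation; the function name and example amplitudes are illustrative, not taken from the study:

```python
# Hedged sketch: the standard percent-PPI formula implied by the abstract.
# Startle amplitudes are in arbitrary units; values below are made up.

def percent_ppi(startle_alone: float, startle_with_prepulse: float) -> float:
    """Percent prepulse inhibition: the fractional reduction of the startle
    response when a prepulse precedes the startling stimulus, times 100."""
    return 100.0 * (1.0 - startle_with_prepulse / startle_alone)

# Example: a prepulse that cuts startle amplitude from 800 to 500 arbitrary
# units yields 37.5% PPI; no reduction yields 0% PPI.
print(percent_ppi(800.0, 500.0))  # → 37.5
```

Greater PPI for a given stimulus pair is then read as evidence that the animal discriminated the prepulse from the background, which is what lets the paradigm index speech-sound discrimination.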
Accurate temporal processing of sound is essential for detecting word structures in speech. Maternal smoking affects speech processing in newborns and may influence child language development; however, it is unclear how neonatal exposure to nicotine, present in cigarettes, affects the normal development of temporal processing. The present study used gap-induced prepulse inhibition (gap-PPI) of the acoustic startle response, a behavioral measure of auditory temporal resolution, to investigate the effects of neonatal nicotine exposure on the normal development of gap detection. Neonatal rats were injected twice per day with saline (control), 1 mg/kg nicotine (N-1mg) or 5 mg/kg nicotine (N-5mg) from postnatal day 8 to 12 (P8–P12). During the first month after birth, rats in all three groups showed poor gap-PPI. At P45 and P60, gap-PPI in control rats improved significantly, whereas rats exposed to nicotine exhibited less improvement. At P60, the gap-detection threshold in the N-5mg group was significantly higher than in the control group, suggesting that neonatal nicotine exposure affects the normal development of gap detection acuity. Additionally, 1 hour after receiving an acute nicotine injection (1 mg/kg), adult rats from the N-5mg group showed a temporary but significant improvement in gap-PPI. These results suggest that neonatal nicotine exposure reduces gap-PPI, implying an impairment of the normal development of auditory temporal processing, possibly through induced changes in cholinergic systems.
Nicotine; Startle reflex; Gap detection; Development; Auditory cortex
In a natural environment, contextual noise frequently occurs in a close temporal relationship with a signal sound that must be detected or discriminated. However, the representation of sound frequency by auditory cortical neurons in a noisy environment is not fully understood. The purpose of this study was therefore to explore the impact of contextual noise on cortical tuning to signal sound frequency, in order to better understand the mechanism of cortical frequency coding in a complex acoustic environment.
We compared the excitatory frequency-level receptive fields (FLRFs) of neurons in the rat primary auditory cortex determined under quiet and preceding-noise conditions. Based on changes in the minimum threshold and the extent of the FLRFs of auditory cortical neurons, we found that the FLRF of a cortical neuron was dynamically modulated by varying preceding noise. When the interstimulus interval between the noise and the probe tone was held constant, the modulation of the FLRF increased as the noise level increased. When the preceding noise level was held constant, the modulation decreased as the interstimulus interval increased. Preceding noise sharpened the bandwidth of the FLRFs of 47.6% of tested neurons. Moreover, preceding noise shifted the characteristic frequencies (CFs) of 47.6% of neurons by more than 0.25 octaves, while the CFs of the remaining neurons stayed relatively unchanged.
The results indicate that the cortical representation of sound frequency is dynamically modulated by the contextual acoustic environment, and that some cortical neurons have characteristic frequencies that are resistant to interference from contextual noise.
Noise; Auditory cortex; Forward masking; Frequency tuning; Receptive field
It is well known that damage to the peripheral auditory system causes deficits in tone detection as well as pitch and loudness perception across a wide range of frequencies. However, the extent to which the auditory cortex plays a critical role in these basic aspects of spectral processing, especially with regard to speech, music, and environmental sound perception, remains unclear. Recent experiments indicate that primary auditory cortex is necessary for the normally high perceptual acuity exhibited by humans in pure-tone frequency discrimination. The present study assessed whether the auditory cortex plays a similar role in the intensity domain and contrasted its contribution to sensory versus discriminative aspects of intensity processing. We measured intensity thresholds for pure-tone detection and pure-tone loudness discrimination in a population of healthy adults and in a middle-aged man with complete or near-complete bilateral lesions of the auditory cortex. Detection thresholds in his left and right ears were 16 and 7 dB HL, respectively, within clinically defined normal limits. In contrast, the intensity threshold for monaural loudness discrimination at 1 kHz was 6.5±2.1 dB in the left ear and 6.5±1.9 dB in the right ear at 40 dB sensation level, well above the means of the control population (left ear: 1.6±0.22 dB; right ear: 1.7±0.19 dB). The results indicate that the auditory cortex lowers just-noticeable differences for loudness discrimination by approximately 5 dB but is not necessary for tone detection in quiet. Previous human and Old World monkey experiments employing lesion-effect, neurophysiological, and neuroimaging methods to investigate the role of the auditory cortex in intensity processing are reviewed.
Speech perception is based on a variety of spectral and temporal acoustic features available in the acoustic signal. Voice onset time (VOT) is considered a cardinal cue for phonetic perception.
In the present study, we recorded and compared scalp auditory evoked potentials (AEP) in response to consonant-vowel-syllables (CV) with varying voice-onset-times (VOT) and non-speech analogues with varying noise-onset-time (NOT). In particular, we aimed to investigate the spatio-temporal pattern of acoustic feature processing underlying elemental speech perception and relate this temporal processing mechanism to specific activations of the auditory cortex.
Results show that the characteristic AEP waveform in response to consonant-vowel syllables parallels that of non-speech sounds with analogous temporal characteristics. The amplitudes of the N1a and N1b components of the auditory evoked potentials correlated significantly with the duration of the VOT in CV syllables and, likewise, with the duration of the NOT in non-speech sounds.
Furthermore, current density maps indicate overlapping supratemporal networks involved in the perception of both speech and non-speech sounds with a bilateral activation pattern during the N1a time window and leftward asymmetry during the N1b time window. Elaborate regional statistical analysis of the activation over the middle and posterior portion of the supratemporal plane (STP) revealed strong left lateralized responses over the middle STP for both the N1a and N1b component, and a functional leftward asymmetry over the posterior STP for the N1b component.
The present data demonstrate overlapping spatio-temporal brain responses during the perception of temporal acoustic cues in both speech and non-speech sounds. Source estimation indicates a preponderant role of the left middle and posterior auditory cortex in speech and non-speech discrimination based on temporal features. Therefore, in congruence with recent fMRI studies, we suggest that similar mechanisms underlie the perception of linguistically different but acoustically equivalent auditory events at the level of basic auditory analysis.
Early experience of structured inputs and complex sound features generates lasting changes in the tonotopy and receptive field properties of primary auditory cortex (A1). In this study we tested whether these changes are severe enough to alter neural representations and behavioral discrimination of speech. We exposed two groups of rat pups to pulsed noise or speech during the critical period of auditory development. Both groups of rats were trained to discriminate speech sounds as young adults, and neural responses were then recorded from A1 under anesthesia. The representation of speech in A1 and behavioral discrimination of speech remained robust to the altered spectral and temporal characteristics of A1 neurons after pulsed-noise exposure. Exposure to passive speech during early development provided no added advantage in speech sound processing. Speech training increased A1 neuronal firing rates for speech stimuli in naïve rats, but did not increase responses in rats that had experienced early exposure to pulsed noise or speech. Our results suggest that speech sound processing is resistant to changes in simple neural response properties caused by manipulating the early acoustic environment.
pulsed-noise; early acoustic experience; primary auditory cortex; speech discrimination; plasticity; critical period
The early onset of peripheral deafness profoundly alters the functional maturation of the central auditory system. A prolonged exposure to an artificial acoustic environment has a similar disruptive influence. These observations establish the importance of normal patterns of sound-driven activity during the initial stages of auditory development. The present study was designed to address the role of cochlear gain control during these activity-dependent developmental processes. It was hypothesized that the regulation of auditory nerve activity by the medial olivocochlear system (MOCS) would preserve normal development when the immature auditory system was challenged by continuous background noise. To test this hypothesis, knock-out mice lacking MOCS feedback were reared in noisy or quiet environments and then evaluated with behavioral paradigms for auditory processing deficits. Relative to wild-type controls, noise-reared knock-out mice showed a decreased ability to process rapid acoustic events. Additional anatomical and physiological assessments linked these perceptual deficits to synaptic defects in the auditory brainstem that shared important features with human auditory neuropathy. Our findings offer a new perspective on the potentially damaging effects of environmental noise and how these risks are ameliorated by the protective role of MOCS feedback.
activity-dependent development; olivocochlear feedback; environmental noise; nicotinic acetylcholine receptor; temporal processing deficit; auditory neuropathy
Auditory temporal processing in quiet is impaired in Auditory Neuropathy (AN), resembling that of normal-hearing subjects tested in noise. N100 latencies were measured from AN subjects at several tone intensities in quiet and in noise for comparison with a group of normal-hearing individuals.
Subjects were tested with brief 100 ms tones (1.0 kHz 100 dB to 40 dB SPL) in quiet and in continuous noise (90 dB SPL). N100 latency and amplitude were analyzed as a function of signal intensity and audibility.
N100 latency in AN in quiet was delayed and amplitude was reduced compared to the normal group; the extent of latency delay was related to psychoacoustic measures of gap detection threshold and speech recognition scores, but not to audibility. Noise in normal hearing subjects was accompanied by N100 latency delays and amplitude reductions paralleling those found in AN tested in quiet. Additional N100 latency delays and amplitude reductions occurred in AN with noise.
N100 latency to tones and performance on auditory temporal tasks were related in AN subjects. Noise masking in normal hearing subjects affected N100 latency to resemble AN in quiet.
N100 latency to tones may serve as an objective measure of the efficiency of auditory temporal processes.
Temporal processes; Dys-synchrony; Deafferentation; Noise masking; Psychoacoustics; Hearing impairment
Cochlear implants provide good speech discrimination ability despite the highly limited amount of information they transmit compared with the normal cochlea. Studies of noise-vocoded speech, which simulates cochlear implant processing for normal-hearing listeners, have demonstrated that spectrally and temporally degraded speech contains sufficient cues to support accurate speech discrimination. We hypothesized that the neural activity patterns generated in the primary auditory cortex by spectrally and temporally degraded speech sounds would account for the robust behavioral discrimination of speech. We examined the behavioral discrimination of noise-vocoded consonants and vowels by rats and recorded neural activity patterns from rat primary auditory cortex (A1) for the same sounds. We report the first evidence of behavioral discrimination of degraded speech sounds by an animal model. Our results show that rats are able to accurately discriminate both consonant and vowel sounds even after significant spectral and temporal degradation. The degree of degradation that rats can tolerate is comparable to that tolerated by human listeners. We observed that neural discrimination based on spatiotemporal patterns (spike timing) of A1 neurons is highly correlated with behavioral discrimination of consonants, and that neural discrimination based on spatial activity patterns (spike count) of A1 neurons is highly correlated with behavioral discrimination of vowels. The results of the current study indicate that speech discrimination is resistant to degradation as long as the degraded sounds generate distinct patterns of neural activity.
speech processing; neural code; noise vocoded speech; primary auditory cortex; cochlear implants; spatiotemporal patterns; spatial patterns
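The noise vocoding described above can be illustrated with a minimal sketch: band-pass the speech into channels, extract each channel's temporal envelope, and use it to modulate band-limited noise. The brick-wall FFT filtering, the channel edges, and the 50 Hz envelope cutoff below are simplifying assumptions for illustration, not the study's actual processing parameters:

```python
# Hedged sketch of channel vocoding with noise carriers. All parameters
# (band edges, envelope cutoff, filter type) are illustrative assumptions.
import numpy as np

def bandpass_fft(x, fs, lo, hi):
    """Crude brick-wall band-pass via FFT bin masking (illustration only)."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
    spec[(freqs < lo) | (freqs >= hi)] = 0.0
    return np.fft.irfft(spec, n=len(x))

def noise_vocode(signal, fs, band_edges, seed=0):
    """Replace each speech band with noise carrying that band's envelope."""
    rng = np.random.default_rng(seed)
    out = np.zeros(len(signal))
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        band = bandpass_fft(signal, fs, lo, hi)
        # Envelope: rectify, then smooth with an assumed 50 Hz low-pass.
        env = np.clip(bandpass_fft(np.abs(band), fs, 0.0, 50.0), 0.0, None)
        carrier = bandpass_fft(rng.standard_normal(len(signal)), fs, lo, hi)
        out += env * carrier  # envelope-modulated, band-limited noise
    return out

# Example: 4-channel vocoding of a 1 s amplitude-modulated tone at 16 kHz.
fs = 16000
t = np.arange(fs) / fs
speech = np.sin(2 * np.pi * 300 * t) * (1.0 + 0.8 * np.sin(2 * np.pi * 4 * t))
vocoded = noise_vocode(speech, fs, [100.0, 500.0, 1000.0, 2000.0, 4000.0])
```

Reducing the number of channels or lowering the envelope cutoff degrades the spectral and temporal detail, respectively, which is how parametric degradation of the kind the abstract describes is typically produced.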
The speech signal consists of a continuous stream of consonants and vowels, which must be decoded and encoded in human auditory cortex to ensure the robust recognition and categorization of speech sounds. We used small-voxel functional magnetic resonance imaging to study the information encoded in local brain activation patterns elicited by consonant-vowel syllables and by a control set of noise bursts. First, activation of anterior–lateral superior temporal cortex was seen when controlling for unspecific acoustic processing (syllables versus band-passed noises, in a “classic” subtraction-based design). Second, a classifier algorithm, trained and tested iteratively on data from all subjects to discriminate local brain activation patterns, yielded separations of cortical patches discriminative of vowel category versus patches discriminative of stop-consonant category across the entire superior temporal cortex, yet with regional differences in average classification accuracy. Overlap (voxels correctly classifying both speech sound categories) was surprisingly sparse. Third, lending further plausibility to the results, classification of speech–noise differences was generally superior to speech–speech classifications, with the notable exception of a left anterior region, where speech–speech classification accuracies were significantly better. These data demonstrate that acoustic–phonetic features are encoded in complex yet sparsely overlapping local patterns of neural activity distributed hierarchically across different regions of the auditory cortex. The redundancy apparent in these multiple patterns may partly explain the robustness of phonemic representations.
auditory cortex; speech; multivariate pattern classification; fMRI; syllables; vowels; consonants
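The multivariate pattern-classification logic described above can be caricatured with a minimal nearest-centroid sketch over synthetic "voxel" activation patterns. The study's actual classifier and cross-subject training scheme are not specified here; all names, data, and parameters below are illustrative:

```python
# Hedged sketch: nearest-centroid classification of simulated activation
# patterns for two stimulus categories. Data are synthetic, not from fMRI.
import numpy as np

def train_centroids(patterns, labels):
    """Mean activation pattern (centroid) per stimulus category."""
    return {c: patterns[labels == c].mean(axis=0) for c in np.unique(labels)}

def classify(centroids, pattern):
    """Assign the category whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda c: np.linalg.norm(pattern - centroids[c]))

rng = np.random.default_rng(1)
n_voxels = 50
# Two synthetic categories (0 = "vowel", 1 = "consonant") with offset means,
# mimicking category-discriminative local activation patterns.
X = np.vstack([rng.normal(0.0, 1.0, (40, n_voxels)),
               rng.normal(1.0, 1.0, (40, n_voxels))])
y = np.repeat([0, 1], 40)
centroids = train_centroids(X, y)
accuracy = np.mean([classify(centroids, x) == c for x, c in zip(X, y)])
```

In an actual MVPA analysis the classifier would be evaluated with held-out data (cross-validation) rather than on the training patterns as in this toy example, and accuracy maps would be computed per cortical patch.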
Auditory spatial acuity was measured in mice using prepulse inhibition (PPI) of the acoustic startle reflex (ASR) as the indicator response for stimulus detection. The prepulse was a “speaker swap” (SSwap), shifting a noise between two speakers located along the azimuth. Their angular separation, and the spectral composition and sound level of the noise were varied, as was the interstimulus interval (ISI) between SSwap and ASR elicitation. In Experiment 1 a 180° SSwap of wide band noise (WBN) was compared with WBN Onset and Offset. SSwap and WBN Onset had near equal effects, but less than Offset. In Experiment 2 WBN SSwap was measured with speaker separations of 15°, 22.5°, 45°, and 90°. Asymptotic level and the growth rate of PPI increased with increased separation from 15° to 90°, but even the 15° SSwap provided significant PPI for the mean performance of the group. SSwap in Experiment 3 used octave band noise (2–4, 4–8, 8–16, or 16–32 kHz) and separations of 7.5° to 180°. SSwap was most effective for the highest frequencies, with no significant PPI for SSwap below 8–16 kHz, or for separations of 7.5°. In Experiment 4 SSwap had WBN sound levels from 40 to 78 dB SPL, and separations of 22.5°, 45°, 90° and 180°: PPI increased with level, this effect varying with ISI and angular separation. These experiments extend the prior findings on sound localization in mice, and the dependence of PPI on ISI adds a reaction-time-like dimension to this behavioral analysis.
Spatial processing; processing time; minimum audible angle; startle inhibition
Hearing aid technology has improved dramatically in the last decade, especially in the ability to adaptively respond to dynamic aspects of background noise. Despite these advancements, however, hearing aid users continue to report difficulty hearing in background noise and having trouble adjusting to amplified sound quality. These difficulties may arise in part from current approaches to hearing aid fittings, which largely focus on increased audibility and management of environmental noise. These approaches do not take into account the fact that sound is processed all along the auditory system from the cochlea to the auditory cortex. Older adults represent the largest group of hearing aid wearers; yet older adults are known to have deficits in temporal resolution in the central auditory system. Here we review evidence that supports the use of the auditory brainstem response to complex sounds (cABR) in the assessment of hearing-in-noise difficulties and auditory training efficacy in older adults.
Auditory sustained responses have recently been suggested to reflect neural processing of speech sounds in the auditory cortex. As periodic fluctuations below the pitch range are important for speech perception, it is necessary to investigate how low-frequency periodic sounds are processed in the human auditory cortex. Auditory sustained responses have been shown to be sensitive to temporal regularity, but the relationship between the amplitudes of auditory evoked sustained responses and the repetition rates of auditory inputs remains elusive. As the temporal and spectral features of sounds enhance different components of sustained responses, previous studies with click trains and vowel stimuli have presented diverging results. In order to investigate the effect of repetition rate on cortical responses, we analyzed the auditory sustained fields evoked by periodic and aperiodic noises using magnetoencephalography.
Sustained fields were elicited by white noise and by repeating frozen noise stimuli with repetition rates of 5, 10, 50, 200, and 500 Hz. The sustained field amplitudes were significantly larger for all the periodic stimuli than for white noise. Although the sustained field amplitudes showed a rising and falling pattern across the repetition rate range, the response amplitudes at the 5 Hz repetition rate were significantly larger than at 500 Hz.
The enhanced sustained field responses to periodic noises show that cortical sensitivity to periodic sounds is maintained for a wide range of repetition rates. Persistence of periodicity sensitivity below the pitch range suggests that in addition to processing the fundamental frequency of voice, sustained field generators can also resolve low frequency temporal modulations in speech envelope.
The Kv1.1 potassium channel subunit, encoded by the Kcna1 gene, is heavily expressed in the auditory brainstem and is thought to have a critical role in producing the high temporal precision of action potentials characteristic of the auditory system. Our intent was to determine whether temporal acuity was reduced in Kcna1 null-mutant (−/−) mice, compared to wild-type (+/+) and heterozygous (+/−) mice, as measured by the encoding of gaps in the inferior colliculus by near-field auditory evoked potentials (NFAEP) or by behavioral gap detection (BGD) using a prepulse inhibition paradigm. NFAEPs were collected at 40, 60 and 80 dB SPL with gap durations from 0.5 to 64 ms. BGD data were collected using silent gaps of 1 to 15 ms in duration in 70 dB noise. There were no systematic effects of Kcna1 genotype on NFAEP recovery functions, NFAEP latencies, or the time constant for BGD, but there was a small reduction in asymptotic prepulse inhibition for the longest gap stimuli in −/− mice. Gap thresholds were approximately 1–2 ms across genotypes, stimulus conditions, and paradigms. These data suggest that the neural pathways encoding behaviorally relevant, rapid auditory temporal fluctuations are not limited by the absence of Kv1.1 expression.
voltage-gated potassium channel; Kv1.1; inferior colliculus; prepulse inhibition
For patients with pharmaco-resistant temporal lobe epilepsy, unilateral anterior temporal lobectomy (ATL) – i.e. the surgical resection of the hippocampus, the amygdala, the temporal pole and the most anterior part of the temporal gyri – is an efficient treatment. There is growing evidence that anterior regions of the temporal lobe are involved in the integration and short-term memorization of object-related sound properties. However, non-verbal auditory processing in patients with temporal lobe epilepsy (TLE) has received little attention. To assess non-verbal auditory cognition in patients with temporal lobe epilepsy both before and after unilateral ATL, we developed a set of non-verbal auditory tests, including environmental sounds, that evaluated auditory semantic identification, acoustic and object-related short-term memory, and sound extraction from a sound mixture. The performances of 26 TLE patients before and/or after ATL were compared to those of 18 healthy subjects. Patients before and after ATL presented with similar deficits in pitch retention and in the identification and short-term memorization of environmental sounds, while showing no impairment in basic acoustic processing compared to healthy subjects. It is most likely that the deficits observed before and after ATL are related to epileptic neuropathological processes. Therefore, in patients with drug-resistant TLE, ATL seems to significantly improve seizure control without producing additional auditory deficits.
epilepsy; audition; short-term memory; identification; environmental sounds; temporal lobe; resection
Auditory temporal resolution and auditory temporal ordering are two major components of the auditory temporal processing abilities that contribute to speech perception and language development. They can be evaluated by the gap-in-noise (GIN) and pitch-pattern-sequence (PPS) tests, respectively. In this study, the effect of bilingualism as a potential confounding factor on auditory temporal processing abilities was investigated in early Azari-Persian bilinguals.
Materials and Methods:
In this cross-sectional, non-interventional study, the GIN and PPS tests were administered to 24 early Azari-Persian bilingual subjects (12 men and 12 women) and 24 Persian monolingual subjects (12 men and 12 women) aged 18–30 years, with a mean age of 24.57 years in the bilingual group and 24.68 years in the monolingual group. Data were analyzed with t-tests using SPSS software version 16.
There were no statistically significant differences between early Azari-Persian bilinguals and Persian monolinguals in the mean gap threshold or mean percentage of correct responses on the GIN test, or in the mean percentage of correct responses on the PPS test (P≥0.05).
According to the findings of this study, bilingualism did not have a notable effect on auditory temporal processing abilities.
Auditory perception; Multilingualism; Pitch perception
We investigated a neural basis of speech-in-noise perception in older adults. Hearing loss, the third most common chronic condition in older adults, is most often manifested by difficulty with understanding speech in background noise. This trouble with understanding speech in noise, which occurs even in individuals who have normal hearing thresholds, may arise, in part, from age-related declines in central auditory processing of the temporal and spectral components of speech. We hypothesized that older adults with poorer speech-in-noise (SIN) perception demonstrate impairments in the subcortical representation of speech.
In all participants (28 adults, ages 60 to 73 years), average hearing thresholds calculated from 500 to 4000 Hz were ≤ 25 dB HL. The participants were evaluated behaviorally with the Hearing in Noise Test (HINT) and neurophysiologically using speech-evoked auditory brainstem responses recorded in quiet and in background noise. The participants were divided based on their HINT scores into top and bottom performing groups that were matched for audiometric thresholds and IQ. We compared brainstem responses in the two groups, specifically, the average spectral magnitudes of the neural response and the degree to which background noise affected response morphology.
In the quiet condition, the bottom SIN group had reduced neural representation of the fundamental frequency of the speech stimulus and an overall reduction in response magnitude. In the noise condition, the bottom SIN group demonstrated greater effects of noise, which may reflect reduction in neural synchrony. All physiologic measures correlated with SIN perception.
Adults in the bottom SIN group differed from the audiometrically-matched top SIN group in how speech was neurally encoded. The strength of subcortical encoding of the fundamental frequency appears to be a factor in successful speech-in-noise perception in older adults. Given the limitations of amplification for improving central auditory processing, our results indicate the need for inclusion of auditory training in intervention plans for older adults with SIN perception difficulties.
Aging; Central; Brainstem; Speech-in-Noise Perception
Tinnitus is a perception of sound without an external source. For a complete assessment of tinnitus, central auditory processing abilities should be considered in addition to the routine psychological evaluation of tinnitus characteristics. Temporal processing is one of the important auditory skills necessary for complex, higher-level auditory processing.
Materials and Methods:
Twenty tinnitus patients and 20 healthy volunteers without tinnitus, all with normal auditory thresholds (≤ 20 dB nHL), were enrolled in the present study. Pure tone audiometry (PTA), tinnitus evaluation, the Gap in Noise (GIN) test and the Duration Pattern Test (DPT) were administered to all participants.
Analysis of the GIN test revealed statistically significant increases in the approximate gap detection threshold in the patient group, in both the right and left ears (P=0.007 and P=0.011, respectively). The comparison of the percentage of correct responses between the two groups was also statistically significant in the right and left ears (P=0.019 and P=0.026, respectively). Comparison of DPT parameters between the two study groups revealed no significant difference in the percentage of correct responses (P>0.05).
GIN test results identified auditory temporal resolution difficulties in patients with tinnitus, indicating that, despite normal auditory thresholds, there may be abnormalities in central auditory processing functions.
Duration pattern test; Gap in noise test; Temporal processing; Tinnitus
Auditory-nerve fibers demonstrate dynamic response properties in that they adapt to rapid changes in sound level, both at the onset and offset of a sound. These dynamic response properties affect temporal coding of stimulus modulations that are perceptually relevant for many sounds such as speech and music. Temporal dynamics have been well characterized in auditory-nerve fibers from normal-hearing animals, but little is known about the effects of sensorineural hearing loss on these dynamics. This study examined the effects of noise-induced hearing loss on the temporal dynamics in auditory-nerve fiber responses from anesthetized chinchillas. Post-stimulus time histograms were computed from responses to 50-ms tones presented at characteristic frequency and 30 dB above fiber threshold. Several response metrics related to temporal dynamics were computed from post-stimulus-time histograms and were compared between normal-hearing and noise-exposed animals. Results indicate that noise-exposed auditory-nerve fibers show significantly reduced response latency, increased onset response and percent adaptation, faster adaptation after onset, and slower recovery after offset. The decrease in response latency only occurred in noise-exposed fibers with significantly reduced frequency selectivity. These changes in temporal dynamics have important implications for temporal envelope coding in hearing-impaired ears, as well as for the design of dynamic compression algorithms for hearing aids.
Auditory nerve; adaptation; recovery; latency; acoustic trauma; chinchilla
Timbre is the attribute that distinguishes sounds of equal pitch, loudness and duration. It contributes to our perception and discrimination of different vowels and consonants in speech, instruments in music and environmental sounds. Here we begin by reviewing human timbre perception and the spectral and temporal acoustic features that give rise to timbre in speech, musical and environmental sounds. We also consider the perception of timbre by animals, both in the case of human vowels and non-human vocalizations. We then explore the neural representation of timbre, first within the peripheral auditory system and later at the level of the auditory cortex. We examine the neural networks that are implicated in timbre perception and the computations that may be performed in auditory cortex to enable listeners to extract information about timbre. We consider whether single neurons in auditory cortex are capable of representing spectral timbre independently of changes in other perceptual attributes and the mechanisms that may shape neural sensitivity to timbre. Finally, we conclude by outlining some of the questions that remain about the role of neural mechanisms in behavior and consider some potentially fruitful avenues for future research.
auditory cortex; vowels; ferret; speech; neural coding
To restore hearing sensation, cochlear implants deliver electrical pulses to the auditory nerve by relying on sophisticated signal processing algorithms that convert acoustic inputs to electrical stimuli. Although individuals fitted with cochlear implants perform well in quiet, in the presence of background noise, the speech intelligibility of cochlear implant listeners is more susceptible to background noise than that of normal hearing listeners. Traditionally, to increase performance in noise, single-microphone noise reduction strategies have been used. More recently, a number of approaches have suggested that speech intelligibility in noise can be improved further by making use of two or more microphones, instead. Processing strategies based on multiple microphones can better exploit the spatial diversity of speech and noise because such strategies rely mostly on spatial information about the relative position of competing sound sources. In this article, we identify and elucidate the most significant theoretical aspects that underpin single- and multi-microphone noise reduction strategies for cochlear implants. More analytically, we focus on strategies of both types that have been shown to be promising for use in current-generation implant devices. We present data from past and more recent studies, and furthermore we outline the direction that future research in the area of noise reduction for cochlear implants could follow.
cochlear implants; single-microphone noise reduction; multi-microphone noise reduction
This study examined whether rapid temporal auditory processing, verbal working memory capacity, non-verbal intelligence, executive functioning, musical ability and prior foreign language experience predicted how well native English speakers (N = 120) discriminated Norwegian tonal and vowel contrasts as well as a non-speech analogue of the tonal contrast and a native vowel contrast presented over noise. Results confirmed a male advantage for temporal and tonal processing, and also revealed that temporal processing was associated with both non-verbal intelligence and speech processing. In contrast, effects of musical ability on non-native speech-sound processing and of inhibitory control on vowel discrimination were not mediated by temporal processing. These results suggest that individual differences in non-native speech-sound processing are to some extent determined by temporal auditory processing ability, in which males perform better, but are also determined by a host of other abilities that are deployed flexibly depending on the characteristics of the target sounds.
Temporal acuity in the auditory brainstem is correlated with left-dominant patterns of cortical asymmetry for processing rapid speech-sound stimuli. Here we investigate whether a similar relationship exists between brainstem processing of rapid speech components and cortical processing of syllable patterns in speech.
We measured brainstem and cortical evoked potentials in response to speech tokens in 23 children. We used established measures of auditory brainstem and cortical activity to examine functional relationships between these structures.
We found no relationship between brainstem responses to fast acoustic elements of speech and right-dominant cortical processing of syllable patterns.
Brainstem processing of rapid elements in speech is not functionally related to the rightward cortical asymmetry associated with the processing of syllable-rate features in speech. Viewed together with previous evidence linking brainstem timing with leftward cortical asymmetry for faster acoustic features, these findings support the existence of distinct mechanisms for encoding rapid vs. slow elements of speech.
Results provide a fundamental advance in our knowledge of the segregation of subcortical input associated with cortical asymmetries for acoustic rate processing in the human auditory system. Implications of these findings for auditory perception, reading ability and development are discussed.
auditory cortex; auditory brainstem; children; cerebral asymmetry; speech
Speech recognition is remarkably robust to the listening background, even when the energy of background sounds strongly overlaps with that of speech. How the brain transforms the corrupted acoustic signal into a reliable neural representation suitable for speech recognition, however, remains elusive. Here, we hypothesize that this transformation is performed at the level of auditory cortex through adaptive neural encoding, and we test the hypothesis by recording, using magnetoencephalography (MEG), the neural responses of human subjects listening to a narrated story. Spectrally matched stationary noise, which has maximal acoustic overlap with the speech, is mixed in at various intensity levels. Despite the severe acoustic interference caused by this noise, we demonstrate that low-frequency auditory cortical activity is reliably synchronized to the slow temporal modulations of speech, even when the noise is twice as strong as the speech. This reliable neural representation is maintained by intensity contrast gain control and by adaptive processing of temporal modulations at different time scales, corresponding to the neural delta and theta bands. Critically, the precision of this neural synchronization predicts how well a listener can recognize speech in noise, indicating that the precision of the auditory cortical representation limits the performance of speech recognition in noise. Taken together, these results suggest that, in a complex listening environment, auditory cortex can selectively encode a speech stream in a background-insensitive manner, and this stable neural representation of speech provides a plausible basis for background-invariant recognition of speech.
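The synchronization measure at the heart of this kind of analysis can be sketched in simplified form: band-limit both the speech envelope and the neural signal to the delta-theta range and compute their correlation as a tracking index. The code below is a toy illustration with simulated signals, not the study's MEG pipeline; the 4 Hz "envelope", the noise level (twice the signal, echoing the abstract), and the crude FFT band-pass are all assumptions for the example.

```python
import numpy as np

def envelope_tracking_index(speech_env, neural, fs, lo=1.0, hi=8.0):
    """Correlate the delta-theta (1-8 Hz) components of an acoustic envelope
    and a neural trace as a simple synchronization index."""
    def bandpass(x):
        # crude FFT-domain band-pass: zero bins outside [lo, hi] Hz
        X = np.fft.rfft(x)
        freqs = np.fft.rfftfreq(len(x), 1 / fs)
        X[(freqs < lo) | (freqs > hi)] = 0
        return np.fft.irfft(X, len(x))
    a, b = bandpass(speech_env), bandpass(neural)
    return np.corrcoef(a, b)[0, 1]

# Simulated example: a 4 Hz "speech envelope" and a neural trace that follows
# it through additive noise twice as strong as the envelope modulation.
fs = 100
t = np.arange(0, 10, 1 / fs)
env = 1 + np.sin(2 * np.pi * 4 * t)
rng = np.random.default_rng(1)
neural = env + 2.0 * rng.standard_normal(len(t))
sync = envelope_tracking_index(env, neural, fs)
```

Restricting the comparison to the 1-8 Hz band discards most of the broadband noise power, so the index stays well above zero even at this unfavorable signal-to-noise ratio, mirroring the reliability of delta/theta-band tracking described above.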
Speech processing engages multiple cortical regions in the temporal, parietal, and frontal lobes. Isolating speech-sensitive cortex in individual participants is of major clinical and scientific importance. This task is complicated by the fact that responses to sensory and linguistic aspects of speech are tightly packed within the posterior superior temporal cortex. In functional magnetic resonance imaging (fMRI), various baseline conditions are typically used in order to isolate speech-specific from basic auditory responses. Using a short, continuous sampling paradigm, we show that reversed ("backward") speech, a commonly used auditory baseline for speech processing, removes much of the speech response in frontal and temporal language regions of adult individuals. On the other hand, signal correlated noise (SCN) serves as an effective baseline for removing primary auditory responses while maintaining strong signals in the same language regions. We show that the response to reversed speech in the left inferior frontal gyrus decays significantly faster than the response to speech, suggesting that this response reflects bottom-up activation of speech analysis followed by top-down attenuation once the signal is classified as nonspeech. The results overall favor SCN as an auditory baseline for speech processing.
fMRI; functional localizer; reversed speech; signal correlated noise; speech perception