Although there has been a great deal of recent empirical work and new theoretical interest in audiovisual speech perception in both normal-hearing and hearing-impaired adults, relatively little is known about the development of these abilities and skills in deaf children with cochlear implants. This study examined how prelingually deafened children combine visual information available in the talker’s face with auditory speech cues provided by their cochlear implants to enhance spoken language comprehension.
Twenty-seven hearing-impaired children who use cochlear implants identified spoken sentences presented under auditory-alone and audiovisual conditions. Five additional measures of spoken word recognition performance were used to assess auditory-alone speech perception skills. A measure of speech intelligibility was also obtained to assess the speech production abilities of these children.
A measure of audiovisual gain, “Ra,” was computed using sentence recognition scores in auditory-alone and audiovisual conditions. Another measure of audiovisual gain, “Rv,” was computed using scores in visual-alone and audiovisual conditions. The results indicated that children who were better at recognizing isolated spoken words through listening alone were also better at combining the complementary sensory information about speech articulation available under audiovisual stimulation. In addition, we found that children who received more benefit from audiovisual presentation also produced more intelligible speech, suggesting a close link between speech perception and production and a common underlying linguistic basis for audiovisual enhancement effects. Finally, an examination of the distribution of children enrolled in Oral Communication (OC) and Total Communication (TC) indicated that OC children tended to score higher on measures of audiovisual gain, spoken word recognition, and speech intelligibility.
The relationships observed between auditory-alone speech perception, audiovisual benefit, and speech intelligibility indicate that these abilities are not based on independent language skills, but instead reflect a common source of linguistic knowledge, used in both perception and production, that is based on the dynamic, articulatory motions of the vocal tract. The effects of communication mode demonstrate the important contribution of early sensory experience to perceptual development, specifically, language acquisition and the use of phonological processing skills. Intervention and treatment programs that aim to increase receptive and productive spoken language skills, therefore, may wish to emphasize the inherent cross-correlations that exist between auditory and visual sources of information in speech perception.
The present study examined how postlingually deafened adults with cochlear implants combine visual information from lipreading with auditory cues in an open-set word recognition task. Adults with normal hearing served as a comparison group. Word recognition performance was assessed using lexically controlled word lists presented under auditory-only, visual-only, and combined audiovisual presentation formats. Effects of talker variability were studied by manipulating the number of talkers producing the stimulus tokens. Lexical competition was investigated using sets of lexically easy and lexically hard test words. To assess the degree of audiovisual integration, a measure of visual enhancement, Ra, was used to assess the gain in performance provided in the audiovisual presentation format relative to the maximum possible performance obtainable in the auditory-only format. Results showed that word recognition performance was highest for audiovisual presentation followed by auditory-only and then visual-only stimulus presentation. Performance was better for single-talker lists than for multiple-talker lists, particularly under the audiovisual presentation format. Word recognition performance was better for the lexically easy than for the lexically hard words regardless of presentation format. Visual enhancement scores were higher for single-talker conditions compared to multiple-talker conditions and tended to be somewhat better for lexically easy words than for lexically hard words. The pattern of results suggests that information from the auditory and visual modalities is used to access common, multimodal lexical representations in memory. The findings are discussed in terms of the complementary nature of auditory and visual sources of information that specify the same underlying gestures and articulatory events in speech.
cochlear implants; hearing impairment; speech perception; audiovisual
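The visual enhancement measure Ra described above expresses the audiovisual gain as a fraction of the room left for improvement over the auditory-only score. A minimal sketch follows, assuming scores are percentages and that Ra takes the commonly used form (AV − A) / (100 − A). The function name and the example scores are hypothetical and are not data from the study.

```python
def visual_enhancement(auditory_pct: float, audiovisual_pct: float) -> float:
    """Audiovisual gain relative to the headroom above the auditory-only score.

    Assumed form: Ra = (AV - A) / (100 - A), with both scores in percent correct.
    """
    if not (0.0 <= auditory_pct <= 100.0 and 0.0 <= audiovisual_pct <= 100.0):
        raise ValueError("scores must be percentages between 0 and 100")
    headroom = 100.0 - auditory_pct
    if headroom == 0.0:
        # No room for improvement, so the ratio is undefined
        raise ValueError("Ra is undefined when the auditory-only score is 100%")
    return (audiovisual_pct - auditory_pct) / headroom


# Hypothetical listener: 40% correct auditory-only, 70% correct audiovisual
print(visual_enhancement(40.0, 70.0))  # 0.5
```

Normalizing by the available headroom keeps listeners with high auditory-only scores from appearing to benefit less simply because they have fewer percentage points left to gain.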
The acceptance of cochlear implantation as an effective and safe treatment for deafness has increased steadily over the past quarter century. The earliest devices were the first implanted prostheses found to be successful in compensating partially for lost sensory function by direct electrical stimulation of nerves. Initially, the main intention was to provide limited auditory sensations to people with profound or total sensorineural hearing impairment in both ears. Although the first cochlear implants aimed to provide patients with little more than awareness of environmental sounds and some cues to assist visual speech-reading, the technology has advanced rapidly. Currently, most people with modern cochlear implant systems can understand speech using the device alone, at least in favorable listening conditions. In recent years, an increasing research effort has been directed towards implant users’ perception of nonspeech sounds, especially music. This paper reviews that research, discusses the published experimental results in terms of both psychophysical observations and device function, and concludes with some practical suggestions about how perception of music might be enhanced for implant recipients in the future. 
The most significant findings of past research are: (1) On average, implant users perceive rhythm about as well as listeners with normal hearing; (2) Even with technically sophisticated multiple-channel sound processors, recognition of melodies, especially without rhythmic or verbal cues, is poor, with performance at little better than chance levels for many implant users; (3) Perception of timbre, which is usually evaluated by experimental procedures that require subjects to identify musical instrument sounds, is generally unsatisfactory; (4) Implant users tend to rate the quality of musical sounds as less pleasant than listeners with normal hearing; (5) Auditory training programs that have been devised specifically to provide implant users with structured musical listening experience may improve the subjective acceptability of music that is heard through a prosthesis; (6) Pitch perception might be improved by designing innovative sound processors that use both temporal and spatial patterns of electric stimulation more effectively and precisely to overcome the inherent limitations of signal coding in existing implant systems; (7) For the growing population of implant recipients who have usable acoustic hearing, at least for low-frequency sounds, perception of music is likely to be much better with combined acoustic and electric stimulation than is typical for deaf people who rely solely on the hearing provided by their prostheses.
The present study investigated the development of audiovisual speech perception skills in children who are prelingually deaf and received cochlear implants. We analyzed results from the Pediatric Speech Intelligibility (Jerger, Lewis, Hawkins, & Jerger, 1980) test of audiovisual spoken word and sentence recognition skills obtained from a large group of young children with cochlear implants enrolled in a longitudinal study, from pre-implantation to 3 years post-implantation. The results revealed better performance under the audiovisual presentation condition compared with auditory-alone and visual-alone conditions. Performance in all three conditions improved over time following implantation. The results also revealed differential effects of early sensory and linguistic experience. Children from oral communication (OC) education backgrounds performed better overall than children from total communication (TC) backgrounds. Finally, children in the early-implanted group performed better than children in the late-implanted group in the auditory-alone presentation condition after 2 years of cochlear implant use, whereas children in the late-implanted group performed better than children in the early-implanted group in the visual-alone condition. The results of the present study suggest that measures of audiovisual speech perception may provide new methods to assess hearing, speech, and language development in young children with cochlear implants.
This study examined how prelingually deafened children with cochlear implants combine visual information from lipreading with auditory cues in an open-set speech perception task. A secondary aim was to examine lexical effects on the recognition of words in isolation and in sentences. Fifteen children with cochlear implants served as participants in this study. Participants were administered two tests of spoken word recognition. The LNT assessed isolated word recognition in an auditory-only format. The AV-LNST assessed recognition of key words in sentences in a visual-only, auditory-only and audiovisual presentation format. On each test, lexical characteristics of the stimulus items were controlled to assess the effects of lexical competition. The children also were administered a test of receptive vocabulary knowledge. The results revealed that recognition of key words was significantly influenced by presentation format. Audiovisual speech perception was best, followed by auditory-only and visual-only presentation, respectively. Lexical effects on spoken word recognition were evident for isolated words, but not when words were presented in sentences. Finally, there was a significant relationship between auditory-only and audiovisual word recognition and language knowledge. The results demonstrate that children with cochlear implants obtain significant benefit from audiovisual speech integration, and suggest such tests should be included in test batteries intended to evaluate cochlear implant outcomes.
hearing impairment; speech perception; assessment; cochlear implant
This study examined the speech perception skills of a younger and an older group of cochlear implant recipients to determine the benefit that auditory and visual information provides for speech understanding.
Pre- and postimplantation speech perception scores from the Consonant-Nucleus-Consonant (CNC), the Hearing In Noise sentence Test (HINT), and the City University of New York (CUNY) tests were analyzed for 34 postlingually deafened adult cochlear implant recipients. Half were elderly (i.e., >65 y old) and the other half were middle aged (i.e., 39–53 y old). The CNC and HINT tests were administered using auditory-only presentation; the CUNY test was administered using auditory-only, vision-only, and audiovisual presentation conditions.
No differences were observed between the two age groups on the CNC and HINT tests. For a subset of individuals tested with the CUNY sentences, we found that the preimplantation speechreading scores of the younger group correlated negatively with auditory-only postimplant performance. Additionally, older individuals demonstrated a greater reliance on the integration of auditory and visual information to understand sentences than did the younger group.
On average, the auditory-only speech perception performance of older cochlear implant recipients was similar to the performance of younger adults. However, variability in speech perception abilities was observed within and between both age groups. Differences in speechreading skills between the younger and older individuals suggest that visual speech information is processed in a different manner for elderly individuals than it is for younger adult cochlear implant recipients.
Cochlear implant; speech perception; aging
The use of bilateral amplification is now common clinical practice for hearing aid users but not for cochlear implant recipients. In the past, most cochlear implant recipients were implanted in one ear and wore only a monaural cochlear implant processor. There has been recent interest in benefits arising from bilateral stimulation that may be present for cochlear implant recipients. One option for bilateral stimulation is the use of a cochlear implant in one ear and a hearing aid in the opposite nonimplanted ear (bimodal hearing).
This study evaluated the effect of wearing a cochlear implant in one ear and a digital hearing aid in the opposite ear on speech recognition and localization.
A repeated-measures correlational study was completed.
Nineteen adult Cochlear Nucleus 24 implant recipients participated in the study.
The participants were fit with a Widex Senso Vita 38 hearing aid to achieve maximum audibility and comfort within their dynamic range.
Data Collection and Analysis
Soundfield thresholds, loudness growth, speech recognition, localization, and subjective questionnaires were obtained six to eight weeks after the hearing aid fitting. Testing was completed in three conditions: hearing aid only, cochlear implant only, and cochlear implant and hearing aid (bimodal). All tests were repeated four weeks after the first test session. Repeated-measures analysis of variance was used to analyze the data. Significant effects were further examined using pairwise comparison of means or, in the case of continuous moderators, regression analyses. The speech-recognition and localization tasks were unique in that a speech stimulus presented from a variety of roaming azimuths (140-degree loudspeaker array) was used.
Performance in the bimodal condition was significantly better for speech recognition and localization compared to the cochlear implant–only and hearing aid–only conditions. Performance was also different between these conditions when the location (i.e., side of the loudspeaker array that presented the word) was analyzed. In the bimodal condition, performance on the speech-recognition and localization tasks was equal regardless of which side of the loudspeaker array presented the word, while performance was significantly poorer for the monaural conditions (hearing aid only and cochlear implant only) when the words were presented on the side with no stimulation. Binaural loudness summation of 1–3 dB was seen in soundfield thresholds and loudness growth in the bimodal condition. Measures of the audibility of sound with the hearing aid, including unaided thresholds, soundfield thresholds, and the Speech Intelligibility Index, were significant moderators of speech recognition and localization. Based on the questionnaire responses, participants showed a strong preference for bimodal stimulation.
These findings suggest that a well-fit digital hearing aid worn in conjunction with a cochlear implant is beneficial to speech recognition and localization. The dynamic test procedures used in this study illustrate the importance of bilateral hearing for locating, identifying, and switching attention between multiple speakers. It is recommended that unilateral cochlear implant recipients, with measurable unaided hearing thresholds, be fit with a hearing aid.
Bimodal hearing; cochlear implant; hearing aid; localization; speech recognition
The purpose of this investigation was to compare the ability of young adults and older adults to integrate auditory and visual sentence materials under conditions of good and poor signal clarity. The Principle of Inverse Effectiveness (PoIE), which characterizes many neuronal and behavioral phenomena related to multisensory integration, asserts that as unimodal performance declines, integration is enhanced. Thus, the PoIE predicts that both young and older adults will show enhanced integration of auditory and visual speech stimuli when these stimuli are degraded. More importantly, because older adults' unimodal speech recognition skills decline in both the auditory and visual domains, the PoIE predicts that older adults will show enhanced integration during audiovisual speech recognition relative to young adults. The present study provides a test of these predictions.
Fifty-three young and 53 older adults with normal hearing completed the closed-set Build-A-Sentence (BAS) Test and the CUNY Sentence Test in a total of eight conditions, four unimodal and four audiovisual. In the unimodal conditions, stimuli were either auditory or visual and either easier or harder to perceive; the audiovisual conditions were formed from all the combinations of the unimodal signals. The hard visual signals were created by degrading video contrast; the hard auditory signals were created by decreasing the signal-to-noise ratio. Scores from the unimodal and bimodal conditions were used to compute auditory enhancement and integration enhancement measures.
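The eight-condition design described above can be enumerated mechanically, which makes the 2 × 2 structure of both the unimodal and the audiovisual sets explicit. This is only an illustrative sketch; the labels and variable names are assumptions and do not come from the study materials.

```python
from itertools import product

MODALITIES = ("auditory", "visual")
CLARITY = ("easy", "hard")

# Four unimodal conditions: each modality at each clarity level
unimodal = [
    f"{clarity} {modality}"
    for modality, clarity in product(MODALITIES, CLARITY)
]

# Four audiovisual conditions: every pairing of an auditory and a visual signal
audiovisual = [
    f"{a_clarity} auditory + {v_clarity} visual"
    for a_clarity, v_clarity in product(CLARITY, CLARITY)
]

conditions = unimodal + audiovisual
for name in conditions:
    print(name)
```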
Contrary to the PoIE, neither the auditory enhancement nor integration enhancement measures increased when signal clarity in the auditory or visual channel of audiovisual speech stimuli was decreased, nor was either measure higher for older adults than for young adults. In audiovisual conditions with easy visual stimuli, the integration enhancement measure for older adults was equivalent to that for young adults. In conditions with hard visual stimuli, however, integration enhancement for older adults was significantly lower than for young adults.
The present findings do not support extension of the PoIE to audiovisual speech recognition. Our results are not consistent with either the prediction that integration would be enhanced under conditions of poor signal clarity or the prediction that older adults would show enhanced integration, relative to young adults. Although there is considerable controversy with regard to the best way to measure audiovisual integration, the fact that two of the most prominent measures, auditory enhancement and integration enhancement, both yielded results inconsistent with the PoIE, strongly suggests that the integration of audiovisual speech stimuli differs in some fundamental way from the integration of other bimodal stimuli. The results also suggest aging does not impair integration enhancement when the visual speech signal has good clarity, but may affect it when the visual speech signal has poor clarity.
Nearly 300 million people worldwide have moderate to profound hearing loss. Hearing impairment, if not adequately managed, has a strong socioeconomic and affective impact on individuals. Cochlear implants have become the most effective vehicle for helping profoundly deaf children and adults to understand spoken language, to be sensitive to environmental sounds, and, to some extent, to listen to music. The auditory information delivered by the cochlear implant remains non-optimal for speech perception because it delivers a spectrally degraded signal and lacks some of the fine temporal acoustic structure. In this article, we discuss research revealing the multimodal nature of speech perception in normal-hearing individuals, with important inter-subject variability in the weighting of auditory or visual information. We also discuss how audio-visual training, via Cued Speech, can improve speech perception in cochlear implantees, particularly in noisy contexts. Cued Speech is a system that makes use of visual information from speechreading combined with hand shapes positioned in different places around the face in order to deliver completely unambiguous information about the syllables and the phonemes of spoken language. We support our view that exposure to Cued Speech before or after the implantation could be important in the aural rehabilitation process of cochlear implantees. We describe five lines of research that are converging to support the view that Cued Speech can enhance speech perception in individuals with cochlear implants.
cued speech; cochlear implants; brain plasticity; phonological processing; audiovisual integration
The latest-generation cochlear implant devices provide many deaf patients with good speech recognition in quiet listening conditions. However, speech recognition deteriorates rapidly as the level of background noise increases. Previous studies have shown that, for cochlear implant users, the absence of fine spectro-temporal cues may contribute to poorer performance in noise, especially when the noise is dynamic (e.g., competing speaker or modulated noise). Here we report on sentence recognition by cochlear implant users and by normal-hearing subjects listening to an acoustic simulation of a cochlear implant, in the presence of steady or square-wave modulated speech-shaped noise. Implant users were tested using their everyday, clinically assigned speech processors. In the acoustic simulation, normal-hearing listeners were tested for different degrees of spectral resolution (16, 8, or 4 channels) and spectral smearing (carrier filter slopes of −24 or −6 dB/octave). For modulated noise, normal-hearing listeners experienced significant release from masking when the original, unprocessed speech was presented (which preserved the spectro-temporal fine structure), while cochlear implant users experienced no release from masking. As the spectral resolution was reduced, normal-hearing listeners’ release from masking gradually diminished. Release from masking was further reduced as the degree of spectral smearing increased. Interestingly, the mean speech recognition thresholds of implant users were very close to those of normal-hearing subjects listening to 4-channel spectrally smeared noise-band speech. Also, the best cochlear implant listeners performed like normal-hearing subjects listening to 8- to 16-channel spectrally smeared noise-band speech. These findings suggest that implant users’ susceptibility to noise may be caused by the reduced spectral resolution and the high degree of spectral smearing associated with channel interaction.
Efforts to improve the effective number of spectral channels as well as reduce channel interactions may improve implant performance in noise, especially for temporally modulated noise.
noise susceptibility; cochlear implants; spectral resolution; spectral smearing; gated noise
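As a rough illustration of the square-wave modulated (gated) masker mentioned above, the sketch below switches a noise signal fully on and off with a 50% duty cycle. Several things here are assumptions rather than details from the study: the noise is plain white Gaussian noise instead of speech-shaped noise (which would need an additional spectral shaping filter), and the gate rate, sample rate, and function name are hypothetical.

```python
import random


def square_wave_gated_noise(duration_s, sample_rate_hz, gate_rate_hz, seed=0):
    """Return white Gaussian noise gated on and off by a 50% duty-cycle square wave."""
    rng = random.Random(seed)
    n_samples = int(duration_s * sample_rate_hz)
    samples = []
    for i in range(n_samples):
        # Position within the current gate cycle, in the range [0, 1)
        phase = (i * gate_rate_hz / sample_rate_hz) % 1.0
        gate = 1.0 if phase < 0.5 else 0.0  # on for the first half of each cycle
        samples.append(gate * rng.gauss(0.0, 1.0))
    return samples


# Hypothetical parameters: 1 s of noise at 16 kHz, gated at 8 Hz
noise = square_wave_gated_noise(duration_s=1.0, sample_rate_hz=16000, gate_rate_hz=8)
```

The silent half-cycles are the gaps in which a listener with good spectro-temporal resolution can glimpse the target speech, which is the basis of the release from masking discussed above.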
To evaluate sound localization acuity in a group of children who received bilateral (BI) cochlear implants in sequential procedures and to determine the extent to which BI auditory experience affects sound localization acuity. In addition, to investigate the extent to which a hearing aid in the nonimplanted ear can also provide benefits on this task.
Two groups of children participated, 13 with BI cochlear implants (cochlear implant + cochlear implant), ranging in age from 3 to 16 yrs, and six with a hearing aid in the nonimplanted ear (cochlear implant + hearing aid), ages 4 to 14 yrs. Testing was conducted in large sound-treated booths with loudspeakers positioned on a horizontal arc with a radius of 1.5 m. Stimuli were spondaic words recorded with a male voice. Stimulus levels typically averaged 60 dB SPL and were randomly roved between 56 and 64 dB SPL (±4 dB rove); in a few instances, levels were held fixed (60 dB SPL). Testing was conducted by using a “listening game” platform via computerized interactive software, and the ability of each child to discriminate sounds presented to the right or left was measured for loudspeakers subtending various angular separations. Minimum audible angle thresholds were measured in the BI (cochlear implant + cochlear implant or cochlear implant + hearing aid) listening mode and under monaural conditions.
Approximately 70% (9/13) of children in the cochlear implant + cochlear implant group discriminated left/right for source separations of ≤20° and, of those, 77% (7/9) performed better when listening bilaterally than with either cochlear implant alone. Several children were also able to perform the task when using a single cochlear implant, under some conditions. Minimum audible angle thresholds were better in the first cochlear implant than the second cochlear implant listening mode for nearly all (8/9) subjects. Repeated testing of a few individual subjects over a 2-yr period suggests that robust improvements in performance occurred with increased auditory experience. Children who wore hearing aids in the nonimplanted ear were at times also able to perform the task. Average group performance was worse than that of the children with BI cochlear implants when both ears were activated (cochlear implant + hearing aid versus cochlear implant + cochlear implant) but not significantly different when listening with a single cochlear implant.
Children with sequential BI cochlear implants represent a unique population of individuals who have undergone variable amounts of auditory deprivation in each ear. Our findings suggest that many but not all of these children perform better on measures of localization acuity with two cochlear implants compared with one and are better at the task than children using the cochlear implant + hearing aid. These results must be interpreted with caution, because benefits on other tasks as well as the long-term benefits of BI cochlear implants are yet to be fully understood. The factors that might contribute to such benefits must be carefully evaluated in large populations of children using a variety of measures.
Training with audiovisual (AV) speech has been shown to promote auditory perceptual learning of vocoded acoustic speech by adults with normal hearing. In Experiment 1, we investigated whether AV speech promotes auditory-only (AO) perceptual learning in prelingually deafened adults with late-acquired cochlear implants. Participants were assigned to learn associations between spoken disyllabic C(=consonant)V(=vowel)CVC nonsense words and nonsense pictures (fribbles), under AV and then AO (AV-AO; or counter-balanced AO then AV, AO-AV, during Periods 1 then 2) training conditions. After training on each list of paired-associates (PA), testing was carried out AO. Across all training, AO PA test scores improved (7.2 percentage points) as did identification of consonants in new untrained CVCVC stimuli (3.5 percentage points). However, there was evidence that AV training impeded immediate AO perceptual learning: During Period-1, training scores across AV and AO conditions were not different, but AO test scores were dramatically lower in the AV-trained participants. During Period-2 AO training, the AV-AO participants obtained significantly higher AO test scores, demonstrating their ability to learn the auditory speech. Across both orders of training, whenever training was AV, AO test scores were significantly lower than training scores. Experiment 2 repeated the procedures with vocoded speech and 43 normal-hearing adults. Following AV training, their AO test scores were as high as or higher than following AO training. Also, their CVCVC identification scores patterned differently from those of the cochlear implant users. In Experiment 1, initial consonants were most accurate, and in Experiment 2, medial consonants were most accurate. We suggest that our results are consistent with a multisensory reverse hierarchy theory, which predicts that, whenever possible, perceivers carry out perceptual tasks immediately based on the experience and biases they bring to the task.
We point out that while AV training could be an impediment to immediate unisensory perceptual learning in cochlear implant patients, it was also associated with higher scores during training.
cochlear implants; perceptual learning; multisensory processing; speech perception; plasticity training
While the cochlear implant provides many deaf patients with good speech understanding in quiet, music perception and appreciation with the cochlear implant remain a major challenge for most cochlear implant users. The present study investigated whether a closed-set melodic contour identification (MCI) task could be used to quantify cochlear implant users’ ability to recognize musical melodies and whether MCI performance could be improved with moderate auditory training. The present study also compared MCI performance with familiar melody identification (FMI) performance, with and without MCI training.
For the MCI task, test stimuli were melodic contours composed of 5 notes of equal duration whose frequencies corresponded to musical intervals. The interval between successive notes in each contour was varied between 1 and 5 semitones; the “root note” of the contours was also varied (A3, A4, and A5). Nine distinct musical patterns were generated for each interval and root note condition, resulting in a total of 135 musical contours. The identification of these melodic contours was measured in 11 cochlear implant users. FMI was also evaluated in the same subjects; recognition of 12 familiar melodies was tested with and without rhythm cues. MCI was also trained in 6 subjects, using custom software and melodic contours presented in a different frequency range from that used for testing.
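The stimulus set described above (9 patterns × 5 interval sizes × 3 root notes = 135 contours) can be sketched as follows. The abstract does not name the nine patterns, so the shapes listed here are illustrative assumptions, as is the choice to treat the root note as the lowest note of each contour. Note frequencies assume equal temperament with A4 = 440 Hz.

```python
# Step directions between the 5 notes of each contour (+1 up, -1 down, 0 flat).
# These nine shapes are assumptions; the source states only that nine were used.
PATTERNS = {
    "rising": (1, 1, 1, 1),
    "falling": (-1, -1, -1, -1),
    "flat": (0, 0, 0, 0),
    "rising-falling": (1, 1, -1, -1),
    "falling-rising": (-1, -1, 1, 1),
    "rising-flat": (1, 1, 0, 0),
    "falling-flat": (-1, -1, 0, 0),
    "flat-rising": (0, 0, 1, 1),
    "flat-falling": (0, 0, -1, -1),
}
ROOTS_HZ = {"A3": 220.0, "A4": 440.0, "A5": 880.0}  # equal temperament, A4 = 440 Hz
INTERVALS = range(1, 6)  # 1 to 5 semitones between successive notes


def contour_frequencies(steps, interval, root_hz):
    """Return the 5 note frequencies of one contour, with the root as its lowest note."""
    offsets = [0]
    for step in steps:
        offsets.append(offsets[-1] + step * interval)
    lowest = min(offsets)
    # Each semitone multiplies frequency by 2 ** (1 / 12)
    return [root_hz * 2 ** ((offset - lowest) / 12) for offset in offsets]


contours = {
    (name, interval, root): contour_frequencies(steps, interval, hz)
    for name, steps in PATTERNS.items()
    for interval in INTERVALS
    for root, hz in ROOTS_HZ.items()
}
print(len(contours))  # 135
```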
Results showed that MCI recognition performance was highly variable among cochlear implant users, ranging from 14% to 91% correct. For most subjects, MCI performance improved as the number of semitones between successive notes was increased; performance was slightly lower for the A3 root note condition. Mean FMI performance was 58% correct when rhythm cues were preserved and 29% correct when rhythm cues were removed. Statistical analyses revealed no significant correlation between MCI performance and FMI performance (with or without rhythmic cues). However, MCI performance was significantly correlated with vowel recognition performance; FMI performance was not correlated with cochlear implant subjects’ phoneme recognition performance. Preliminary results also showed that the MCI training improved all subjects’ MCI performance; the improved MCI performance also generalized to improved FMI performance.
Preliminary data indicate that the closed-set MCI task is a viable approach toward quantifying an important component of cochlear implant users’ music perception. The improvement in MCI performance and generalization to FMI performance with training suggests that MCI training may be useful for improving cochlear implant users’ music perception and appreciation; such training may be necessary to properly evaluate patient performance, as acute measures may underestimate the amount of musical information transmitted by the cochlear implant device and received by cochlear implant listeners.
In a previous paper we reported the frequency selectivity, temporal resolution, nonlinear cochlear processing, and speech recognition in quiet and in noise for 5 listeners with normal hearing (mean age 24.2 years) and 17 older listeners (mean age 68.5 years) with bilateral, mild sloping to profound sensory hearing loss (Gifford et al., 2007). Since that report, 2 additional participants with hearing loss completed experimentation for a total of 19 listeners. Of the 19 with hearing loss, 16 ultimately received a cochlear implant. The purpose of the current study was to provide information on the pre-operative psychophysical characteristics of low-frequency hearing and speech recognition abilities, and on the resultant postoperative speech recognition and associated benefit from cochlear implantation. The current preoperative data for the 16 listeners receiving cochlear implants demonstrate: 1) reduced or absent nonlinear cochlear processing at 500 Hz, 2) impaired frequency selectivity at 500 Hz, 3) normal temporal resolution at low modulation rates for a 500-Hz carrier, 4) poor speech recognition in a modulated background, and 5) highly variable speech recognition (from 0 to over 60% correct) for monosyllables in the bilaterally aided condition. As reported previously, measures of auditory function were not significantly correlated with pre- or post-operative speech recognition – with the exception of nonlinear cochlear processing and preoperative sentence recognition in quiet (p=0.008) and at +10 dB SNR (p=0.007). These correlations, however, were driven by the data obtained from two listeners who had the highest degree of nonlinearity and preoperative sentence recognition. All estimates of postoperative speech recognition performance were significantly higher than preoperative estimates for both the ear that was implanted (p<0.001) as well as for the best-aided condition (p<0.001). 
It can be concluded that older individuals with mild sloping to profound sensory hearing loss have very little to no residual nonlinear cochlear function, resulting in impaired frequency selectivity as well as poor speech recognition in modulated noise. These same individuals exhibit highly significant improvement in speech recognition in both quiet and noise following cochlear implantation. For older individuals with mild to profound sensorineural hearing loss who have difficulty in speech recognition with appropriately fitted hearing aids, there is little to lose in terms of psychoacoustic processing in the low-frequency region and much to gain with respect to speech recognition and overall communication benefit. These data further support the need to consider factors beyond the audiogram in determining cochlear implant candidacy, as older individuals with relatively good low-frequency hearing may exhibit vastly different speech perception abilities – illustrating the point that signal audibility is not a reliable predictor of performance on supra-threshold tasks such as speech recognition.
cochlear implant; older; aging; psychoacoustic function; low-frequency hearing; bimodal; frequency resolution; temporal resolution; speech recognition
The article aims to test the hypothesis that audiovisual integration can improve spatial hearing in monaural conditions when interaural difference cues are not available. We trained one group of subjects with an audiovisual task, where a flash was presented in parallel with the sound, and another group with an auditory task, where only sound from different spatial locations was presented. To check whether the observed audiovisual effect was similar to feedback, a third group was trained using the visual feedback paradigm. Training sessions were administered once per day for 5 days. The performance level in each group was compared for auditory-only stimulation on the first and the last day of practice. Improvement after audiovisual training was several times higher than after auditory practice. The group trained with visual feedback demonstrated a different effect of training, with smaller improvement than the group with audiovisual training. We conclude that cross-modal facilitation is highly important to improve spatial hearing in monaural conditions and may be applied to the rehabilitation of patients with unilateral deafness and after unilateral cochlear implantation.
This study documented the ability of experienced pediatric cochlear implant (CI) users to perceive linguistic properties (what is said) and indexical attributes (emotional intent and talker identity) of speech, and examined the extent to which linguistic (LSP) and indexical (ISP) perception skills are related. Pre-implant aided hearing, age at implantation, speech processor technology, CI-aided thresholds, sequential bilateral cochlear implantation, and academic integration with hearing age-mates were examined for their possible relationships to both LSP and ISP skills.
Sixty 9–12 year olds, first implanted at an early age (12–38 months), participated in a comprehensive test battery that included the following LSP skills: 1) recognition of monosyllabic words at loud and soft levels, 2) repetition of phonemes and suprasegmental features from non-words, and 3) recognition of keywords from sentences presented within a noise background, and the following ISP skills: 1) discrimination of male from female and female from female talkers and 2) identification and discrimination of emotional content from spoken sentences. A group of 30 age-matched children without hearing loss completed the non-word repetition, and talker- and emotion-perception tasks for comparison.
Word recognition scores decreased with signal level from a mean of 77% correct at 70 dB SPL to 52% at 50 dB SPL. On average, CI users recognized 50% of keywords presented in sentences that were 9.8 dB above background noise. Phonetic properties were repeated from non-word stimuli at about the same level of accuracy as suprasegmental attributes (70% and 75%, respectively). The majority of CI users identified emotional content and differentiated talkers significantly above chance levels. Scores on LSP and ISP measures were combined into separate principal component scores and these components were highly correlated (r = .76). Both LSP and ISP component scores were higher for children who received a CI at the youngest ages, upgraded to more recent CI technology and had lower CI-aided thresholds. Higher scores, for both LSP and ISP components, were also associated with higher language levels and mainstreaming at younger ages. Higher ISP scores were associated with better social skills.
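As an illustration of the component analysis described above, the following is a minimal sketch of reducing each set of measures to a first principal component score and correlating the two. It is not the authors' analysis: the data are synthetic, the number of tests per set (three and two) is an assumption made for the example, and `first_component_scores` is a hypothetical helper name.

```python
import numpy as np


def first_component_scores(measures: np.ndarray) -> np.ndarray:
    """Return each child's score on the first principal component.

    `measures` has one row per child and one column per test score.
    """
    # Standardize each column so tests on different scales contribute equally.
    z = (measures - measures.mean(axis=0)) / measures.std(axis=0, ddof=1)
    # SVD of the standardized data: rows of vt are the principal axes.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    axis = vt[0]
    # The sign of a principal axis is arbitrary; orient it so that
    # higher test scores give higher component scores.
    if axis.sum() < 0:
        axis = -axis
    return z @ axis


# Synthetic illustration only (assumed data, not from the study):
# a shared "ability" factor plus independent noise per test.
rng = np.random.default_rng(0)
n_children = 60
ability = rng.normal(size=n_children)
lsp = ability[:, None] + 0.5 * rng.normal(size=(n_children, 3))  # hypothetical LSP tests
isp = ability[:, None] + 0.8 * rng.normal(size=(n_children, 2))  # hypothetical ISP tests

lsp_pc = first_component_scores(lsp)
isp_pc = first_component_scores(isp)

# Pearson correlation between the two component scores.
r = np.corrcoef(lsp_pc, isp_pc)[0, 1]
print(f"r = {r:.2f}")
```

Standardizing before the decomposition is a design choice of this sketch; it keeps a test scored in percent correct from dominating one scored on a shorter scale.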
Results strongly support a link between indexical and linguistic properties in perceptual analysis of speech. These two channels of information appear to be processed together in parallel by the auditory system and are inseparable in perception. Better speech performance, for both linguistic and indexical perception, is associated with younger age at implantation and use of more recent speech processor technology. Children with better speech perception demonstrated better spoken language, earlier academic mainstreaming, and placement in more typically-sized classrooms (i.e., >20 students). Well-developed social skills were more highly associated with the ability to discriminate the nuances of talker identity and emotion than with the ability to recognize words and sentences through listening. The extent to which early cochlear implantation enabled these early-implanted children to make use of both linguistic and indexical properties of speech influenced not only their development of spoken language, but also their ability to function successfully in a hearing world.
Cochlear Implant; pediatric hearing loss; speech perception; indexical
Bilateral severe-to-profound sensorineural hearing loss is a standard criterion for cochlear implantation. Increasingly, patients are implanted in one ear and continue to use a hearing aid in the non-implanted ear to improve abilities such as sound localization and speech understanding in noise. Patients with severe-to-profound hearing loss in one ear and a more moderate hearing loss in the other ear (i.e., asymmetric hearing) are not typically considered candidates for cochlear implantation. Amplification in the poorer ear is often unsuccessful due to limited benefit, restricting the patient to unilateral listening from the better ear alone. The purpose of this study was to determine if patients with asymmetric hearing loss could benefit from cochlear implantation in the poorer ear with continued use of a hearing aid in the better ear.
Ten adults with asymmetric hearing between ears participated. In the poorer ear, all participants met cochlear implant candidacy guidelines; seven had postlingual onset and three had pre/perilingual onset of severe-to-profound hearing loss. All had open-set speech recognition in the better hearing ear. Assessment measures included word and sentence recognition in quiet, sentence recognition in fixed noise (four-talker babble) and in diffuse restaurant noise using an adaptive procedure, localization of word stimuli, and a hearing handicap scale. Participants were evaluated pre-implant with hearing aids and post-implant with the implant alone, the hearing aid alone in the better ear, and bimodally (the implant and hearing aid in combination). Postlingual participants were evaluated at six months post-implant and pre/perilingual participants were evaluated at six and 12 months post-implant. Data analysis compared results 1) of the poorer hearing ear pre-implant (with hearing aid) and post-implant (with cochlear implant), 2) with the device(s) used for everyday listening pre- and post-implant, and 3) between the hearing aid-alone and bimodal listening conditions post-implant.
The postlingual participants showed significant improvements in speech recognition after six months of cochlear implant use in the poorer ear. Five postlingual participants had a bimodal advantage over the hearing aid-alone condition on at least one test measure. On average, the postlingual participants had significantly improved localization with bimodal input compared to the hearing aid alone. Only one pre/perilingual participant had open-set speech recognition with the cochlear implant. This participant had better hearing than the other two pre/perilingual participants in both the poorer and better ear. Localization abilities were not significantly different between the bimodal and hearing aid-alone conditions for the pre/perilingual participants. Mean hearing handicap ratings improved post-implant for all participants, indicating perceived benefit in everyday life with the addition of the cochlear implant.
Patients with asymmetric hearing loss who are not typical cochlear implant candidates can benefit from using a cochlear implant in the poorer ear with continued use of a hearing aid in the better ear. For this group of ten, the seven postlingually deafened participants showed greater benefits with the cochlear implant than the pre/perilingual participants; however, further study is needed to determine maximum benefit for those with early onset of hearing loss.
Asymmetric hearing loss; Bilateral; Bimodal; Cochlear implant; Speech recognition
Many studies have documented the effect of reducing spectral information for speech perception in listeners with normal hearing and hearing impairment. While it is understood that more spectral bands are needed for unilateral cochlear implant listeners to perform well on more challenging listening tasks such as speech perception in noise, it is unclear how reducing the number of spectral bands or electrodes in cochlear implants influences the ability to localize sound or understand speech with spatially separate noise sources.
The purpose of this study was to measure the effect of reducing the number of electrodes for patients with bilateral cochlear implants on spatial hearing tasks.
Performance on spatial hearing tasks was examined as electrodes in the speech processor were deactivated equally across ears and the full frequency spectrum was reallocated to the reduced number of active electrodes. Program parameters (i.e., pulse width, stimulation rate) were held constant among the programs and set identically between the right and left cochlear implants so that only the number of electrodes varied.
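The reallocation of the full frequency spectrum to fewer active electrodes can be pictured with a minimal sketch. The analysis range (200 to 8000 Hz), the logarithmic band spacing, and the function name `allocate_bands` are all assumptions made for illustration; clinical fitting software uses manufacturer-specific frequency tables that are not described here.

```python
import numpy as np


def allocate_bands(n_electrodes: int, f_low: float = 200.0, f_high: float = 8000.0):
    """Split [f_low, f_high] Hz into contiguous, log-spaced analysis bands.

    Returns a list of (lower_edge, upper_edge) tuples, one per active
    electrode. The range and the log spacing are illustrative assumptions.
    """
    if n_electrodes < 1:
        raise ValueError("need at least one active electrode")
    # n bands need n + 1 edges; geomspace spaces them evenly on a log axis.
    edges = np.geomspace(f_low, f_high, n_electrodes + 1)
    return [(float(lo), float(hi)) for lo, hi in zip(edges[:-1], edges[1:])]


# The same overall range is covered however many electrodes remain active;
# only the width of each band changes.
for n in (12, 6, 3):
    bands = allocate_bands(n)
    widest = max(hi - lo for lo, hi in bands)
    print(f"{n:2d} electrodes: {bands[0][0]:.0f}-{bands[-1][1]:.0f} Hz, "
          f"widest band {widest:.0f} Hz")
```

With fewer electrodes each band spans more of the spectrum, so spectral detail is reduced while overall audibility across the range is kept.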
Nine subjects had used bilateral Nucleus or Advanced Bionics cochlear implants for at least 12 mo prior to beginning the study. Only those subjects with full insertion of the electrode arrays with all electrodes active in both ears were eligible to participate.
Data Collection and Analysis
Two test measures were utilized to evaluate the effect of reducing the number of electrodes, including a speech-perception-in-noise test with spatially separated sources and a sound source localization test.
Reducing the number of electrodes had different effects across individuals. Three patterns emerged: (1) no effect on localization (two of nine subjects), (2) at least two to four bilateral electrodes were required for maximal performance (five of nine subjects), and (3) performance gradually decreased across conditions as electrode number was reduced (two of nine subjects). For the test of speech perception in spatially separated noise, performance was affected as the number of electrodes was reduced for all subjects. Two categories of performance were found: (1) at least three or four bilateral electrodes were needed for maximum performance (five of seven subjects) and (2) as the number of electrodes was reduced, performance gradually decreased across conditions (two of seven subjects).
Large individual differences exist in determining maximum performance using bilateral electrodes for localization and speech perception in noise. For some bilateral cochlear implant users, as few as three to four electrodes can be used to obtain maximal performance on localization and speech-in-noise tests. However, other listeners show a gradual decrement in performance on both tasks when the number of electrodes is reduced.
Bilateral cochlear implants; electrode number; localization; speech recognition
This research studied whether the mode of input (auditory vs audiovisual) influenced semantic access by speech in children with sensorineural hearing impairment (HI).
Participants, 31 children with HI and 62 children with normal hearing (NH), were tested with our new multi-modal picture word task. Children were instructed to name pictures displayed on a monitor and ignore auditory or audiovisual speech distractors. The semantic content of the distractors was varied to be related vs unrelated to the pictures (e.g., picture-distractor of dog-bear vs dog-cheese respectively). In children with NH, picture naming times were slower in the presence of semantically-related distractors. This slowing, called semantic interference, is attributed to the meaning-related picture-distractor entries competing for selection and control of the response [the lexical selection by competition (LSbyC) hypothesis]. Recently, a modification of the LSbyC hypothesis, called the competition threshold (CT) hypothesis, proposed that 1) the competition between the picture-distractor entries is determined by a threshold, and 2) distractors with experimentally reduced fidelity cannot reach the competition threshold. Thus, semantically-related distractors with reduced fidelity do not produce the normal interference effect, but instead no effect or semantic facilitation (faster picture naming times for semantically-related vs -unrelated distractors). Facilitation occurs because the activation level of the semantically-related distractor with reduced fidelity 1) is not sufficient to exceed the competition threshold and produce interference but 2) is sufficient to activate its concept which then strengthens the activation of the picture and facilitates naming. This research investigated whether the proposals of the CT hypothesis generalize to the auditory domain, to the natural degradation of speech due to HI, and to participants who are children.
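The distinction between semantic interference and semantic facilitation comes down to the sign of a difference in naming times. A minimal sketch follows, assuming naming times in milliseconds; the function names and the example values are hypothetical, and in practice the direction of the effect would be established with a statistical test rather than by the raw sign of a mean difference.

```python
from statistics import mean


def semantic_effect(related_ms, unrelated_ms):
    """Mean naming time with related distractors minus mean with unrelated ones (ms)."""
    return mean(related_ms) - mean(unrelated_ms)


def label_effect(effect_ms, tolerance_ms=0.0):
    """Name the direction of the effect.

    Positive values mean slower naming with related distractors
    (interference); negative values mean faster naming (facilitation).
    `tolerance_ms` is an illustrative dead zone, not a statistical criterion.
    """
    if effect_ms > tolerance_ms:
        return "interference"
    if effect_ms < -tolerance_ms:
        return "facilitation"
    return "no effect"


# Hypothetical naming times for one child (illustrative values only).
related = [1510, 1480, 1535]    # e.g., picture "dog" with distractor "bear"
unrelated = [1450, 1440, 1470]  # e.g., picture "dog" with distractor "cheese"

effect = semantic_effect(related, unrelated)
print(f"{effect:.0f} ms: {label_effect(effect)}")
```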
Our multi-modal picture word task allowed us to 1) quantify picture naming results in the presence of auditory speech distractors and 2) probe whether the addition of visual speech enriched the fidelity of the auditory input sufficiently to influence results.
In the HI group, the auditory distractors produced no effect or a facilitative effect, in agreement with proposals of the CT hypothesis. In contrast, the audiovisual distractors produced the normal semantic interference effect. Results in the HI vs NH groups differed significantly for the auditory mode, but not for the audiovisual mode.
This research indicates that the lower fidelity auditory speech associated with HI affects the normalcy of semantic access by children. Further, adding visual speech enriches the lower fidelity auditory input sufficiently to produce the semantic interference effect typical of children with NH.
Despite excellent performance in speech recognition in quiet, most cochlear implant users have great difficulty with speech recognition in noise, music perception, identifying tone of voice, and discriminating different talkers. This may be partly due to the pitch coding in cochlear implant speech processing. Most current speech processing strategies use only the envelope information; the temporal fine structure is discarded. One way to improve electric pitch perception is to utilize residual acoustic hearing via a hearing aid on the non-implanted ear (bimodal hearing). This study aimed to test the hypothesis that bimodal users would perform better than bilateral cochlear implant users on tasks requiring good pitch perception.
Four pitch-related tasks were used:
1) Hearing in Noise Test (HINT) sentences spoken by a male talker with a competing female, male, or child talker.
2) Montreal Battery of Evaluation of Amusia. This is a music test with six subtests examining pitch, rhythm and timing perception, and musical memory.
3) Aprosodia Battery. This has five subtests evaluating aspects of affective prosody and recognition of sarcasm.
4) Talker identification using vowels spoken by ten different talkers (three male, three female, two boys, and two girls).
Bilateral cochlear implant users were chosen as the comparison group. Thirteen bimodal and thirteen bilateral adult cochlear implant users were recruited; all had good speech perception in quiet.
There were no significant differences between the mean scores of the bimodal and bilateral groups on any of the tests, although the bimodal group did perform better than the bilateral group on almost all tests. Performance on the different pitch-related tasks was not correlated, meaning that if a subject performed one task well they would not necessarily perform well on another. The correlation between the bimodal users' hearing threshold levels in the aided ear and their performance on these tasks was weak.
Although the bimodal cochlear implant group performed better than the bilateral group on most parts of the four pitch-related tests, the differences were not statistically significant. The lack of correlation between test results shows that the tasks used are not simply providing a measure of pitch ability. Even if the bimodal users have better pitch perception, the real-world tasks used are reflecting more diverse skills than pitch. This research adds to the existing speech perception, language, and localization studies that show no significant difference between bimodal and bilateral cochlear implant users.
cochlear implants; bimodal; bilateral
Attending to a conversation in a crowded scene requires selection of relevant information, while ignoring other distracting sensory input, such as speech signals from surrounding people. The neural mechanisms of how distracting stimuli influence the processing of attended speech are not well understood. In this high-density electroencephalography (EEG) study, we investigated how different types of speech and non-speech stimuli influence the processing of attended audiovisual speech. Participants were presented with three horizontally aligned speakers who produced syllables. The faces of the three speakers flickered at specific frequencies (19 Hz for flanking speakers and 25 Hz for the center speaker), which induced steady-state visual evoked potentials (SSVEP) in the EEG that served as a measure of visual attention. The participants' task was to detect an occasional audiovisual target syllable produced by the center speaker, while ignoring distracting signals originating from the two flanking speakers. In all experimental conditions the center speaker produced a bimodal audiovisual syllable. In three distraction conditions, which were contrasted with a no-distraction control condition, the flanking speakers either produced audiovisual speech, moved their lips and produced acoustic noise, or moved their lips without producing an auditory signal. We observed behavioral interference in the reaction times (RTs), in particular when the flanking speakers produced naturalistic audiovisual speech. These effects were paralleled by enhanced 19 Hz SSVEP, indicative of a stimulus-driven capture of attention toward the interfering speakers. Our study provides evidence that non-relevant audiovisual speech signals serve as highly salient distractors, which capture attention in a stimulus-driven fashion.
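The frequency-tagging logic described above, in which each flicker rate leaves its own spectral peak, can be illustrated with a minimal sketch on a synthetic signal. This is not the study's analysis pipeline: the sampling rate, epoch length, component amplitudes, and noise level are assumptions, and `amplitude_at` is a hypothetical helper.

```python
import numpy as np


def amplitude_at(signal: np.ndarray, fs: float, freq: float) -> float:
    """Single-sided amplitude of `signal` at the FFT bin nearest `freq` (Hz)."""
    n = len(signal)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    k = int(np.argmin(np.abs(freqs - freq)))
    # Scale by 2/n to convert the FFT magnitude to a sinusoid amplitude.
    return float(2.0 * np.abs(spectrum[k]) / n)


fs = 500.0       # assumed sampling rate in Hz
duration = 4.0   # assumed epoch length in s; gives 0.25 Hz frequency resolution
t = np.arange(int(fs * duration)) / fs

# Synthetic "EEG": a 19 Hz component (flanking speakers), a 25 Hz component
# (center speaker), and broadband noise. Amplitudes are arbitrary.
rng = np.random.default_rng(1)
eeg = (1.0 * np.sin(2 * np.pi * 19 * t)
       + 2.0 * np.sin(2 * np.pi * 25 * t)
       + 0.5 * rng.normal(size=t.size))

a19 = amplitude_at(eeg, fs, 19.0)
a25 = amplitude_at(eeg, fs, 25.0)
print(f"19 Hz amplitude: {a19:.2f}, 25 Hz amplitude: {a25:.2f}")
```

Choosing an epoch length that holds a whole number of cycles of both tagging frequencies places each one exactly on an FFT bin, which avoids spectral leakage between the two peaks.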
crossmodal; EEG; bimodal; SSVEP; oscillatory
This research assessed the influence of visual speech on phonological processing by children with hearing loss (HL).
Children with HL and children with normal hearing (NH) named pictures while attempting to ignore auditory or audiovisual speech distractors whose onsets relative to the pictures were either congruent, conflicting in place of articulation, or conflicting in voicing—for example, the picture “pizza” coupled with the distractors “peach,” “teacher,” or “beast,” respectively. Speed of picture naming was measured.
The conflicting conditions slowed naming, and phonological processing by children with HL displayed the age-related shift in sensitivity to visual speech seen in children with NH, although with developmental delay. Younger children with HL exhibited a disproportionately large influence of visual speech and a negligible influence of auditory speech, whereas older children with HL showed a robust influence of auditory speech with no benefit to performance from adding visual speech. The congruent conditions did not speed naming in children with HL, nor did the addition of visual speech influence performance. Unexpectedly, the /ʌ/-vowel congruent distractors slowed naming in children with HL and decreased articulatory proficiency.
Results for the conflicting conditions are consistent with the hypothesis that speech representations in children with HL (a) are initially disproportionally structured in terms of visual speech and (b) become better specified with age in terms of auditorily encoded information.
phonological processing; lipreading; picture–word task; multimodal speech perception
An error analysis of the word recognition responses of cochlear implant users and listeners with normal hearing was conducted to determine the types of partial information used by these two populations when they identified spoken words under auditory-alone and audiovisual conditions. The results revealed that the two groups used different types of partial information in identifying spoken words under auditory-alone or audiovisual presentation. Different types of partial information were also used in identifying words with different lexical properties. In our study, however, there were no significant interactions with hearing status, indicating that cochlear implant users and listeners with normal hearing identify spoken words in a similar manner. The information available to users with cochlear implants preserves much of the partial information necessary for accurate spoken word recognition.
The present study investigated the development of audiovisual comprehension skills in prelingually deaf children who received cochlear implants.
We analyzed results obtained with the Common Phrases (Robbins et al., 1995) test of sentence comprehension from 80 prelingually deaf children with cochlear implants who were enrolled in a longitudinal study, from pre-implantation to 5 years after implantation.
The results revealed that prelingually deaf children with cochlear implants performed better under audiovisual (AV) presentation compared with auditory-alone (A-alone) or visual-alone (V-alone) conditions. AV sentence comprehension skills were found to be strongly correlated with several clinical outcome measures of speech perception, speech intelligibility, and language. Finally, pre-implantation V-alone performance on the Common Phrases test was strongly correlated with 3-year postimplantation performance on clinical outcome measures of speech perception, speech intelligibility, and language skills.
The results suggest that lipreading skills and AV speech perception reflect a common source of variance associated with the development of phonological processing skills that is shared among a wide range of speech and language outcome measures.
(1) To evaluate the recognition of words, phonemes and lexical tones in audiovisual (AV) and auditory-only (AO) modes in Mandarin-speaking adults with cochlear implants (CIs); (2) to understand the effect of presentation levels on AV speech perception; (3) to learn the effect of hearing experience on AV speech perception.
Thirteen deaf adults (age = 29.1±13.5 years; 8 male, 5 female) who had used CIs for >6 months and 10 normal-hearing (NH) adults participated in this study. Seven of the CI users were prelingually deaf, and 6 were postlingually deaf. The Mandarin Monosyllabic Word Recognition Test was used to assess recognition of words, phonemes and lexical tones in AV and AO conditions at 3 presentation levels: speech detection threshold (SDT), speech recognition threshold (SRT) and 10 dB SL (re:SRT).
The prelingual group had better phoneme recognition in the AV mode than in the AO mode at SDT and SRT (both p = 0.016), and so did the NH group at SDT (p = 0.004). Mode difference was not noted in the postlingual group. None of the groups had significantly different tone recognition in the 2 modes. The prelingual and postlingual groups had significantly better phoneme and tone recognition than the NH one at SDT in the AO mode (p = 0.016 and p = 0.002 for phonemes; p = 0.001 and p<0.001 for tones) but were outperformed by the NH group at 10 dB SL (re:SRT) in both modes (both p<0.001 for phonemes; p<0.001 and p = 0.002 for tones). The recognition scores had a significant correlation with group with age and sex controlled (p<0.001).
Visual input may help prelingually deaf implantees to recognize phonemes but may not augment Mandarin tone recognition. The effect of presentation level seems minimal on CI users' AV perception. This indicates special considerations in developing audiological assessment protocols and rehabilitation strategies for implantees who speak tonal languages.