The phonological systems of children with cochlear implants may include segment inventories that contain both target and non-target speech sounds. These children may not consistently follow phonological rules of the target language. These issues present a challenge for the clinical speech-language pathologist who uses phonetic transcriptions to evaluate speech production skills and to develop a plan of care. The purposes of this tutorial are (1) to identify issues associated with phonetic transcriptions of the speech of children with cochlear implants and (2) to discuss implications for assessment.
Narrow transcription data from an ongoing, longitudinal research study were catalogued and reviewed. Study participants had at least 5 years of cochlear implant experience and used spoken American English as a primary means of communication. In this tutorial, selected phonetic symbols and phonetic phenomena are reviewed.
A set of principles for phonetic transcriptions is proposed. Narrow phonetic transcriptions that include all segment possibilities in the International Phonetic Alphabet and extensions for disordered speech are needed to capture the subtleties of the speech of children with cochlear implants. Narrow transcriptions also may play a key role in planning treatment.
Cochlear implantation has gained increasing acceptance in the past two decades as an effective treatment for children and adults with severe-to-profound sensorineural hearing loss. Because hearing loss is the most common congenital anomaly, with profound sensorineural hearing loss affecting 1-3 in every 1000 births in the United States (Morton, 1991; Mason & Herrmann, 1998), it is likely that a speech-language pathologist working with children will encounter a child with a cochlear implant. Children with cochlear implants routinely require ongoing, intensive speech-language therapy to improve communication skills, including speech perception, receptive and expressive language skills, and speech production abilities. In many cases, the child with the cochlear implant will produce errors that are not typical in the population of children with normal hearing. Many clinicians are poorly equipped to identify the articulatory characteristics of the sounds in error and are unsure of how to represent those sounds in a phonetic transcription or how to address them in treatment. This tutorial draws attention to phonetic phenomena common among children with cochlear implants in English-speaking environments, reviews phonetic symbols and diacritics necessary for describing phonetic details of their speech, and illustrates the clinical application of narrow phonetic transcription in planning treatment.
A cochlear implant is a surgically-implanted device that bypasses the impaired cochlear function in patients with severe-to-profound sensorineural hearing loss by directly stimulating the auditory nerve. An external microphone, typically worn at ear level, picks up sound from the environment and sends it to a sound processor, which digitally encodes the acoustic signal. The encoded signal is then sent to the transmitter, which transmits the signal intensity, frequency, and timing to electrodes placed surgically inside the cochlea. The electrodes stimulate the auditory nerve, which carries the signal to the auditory cortex. Although the electrical hearing provides significant auditory benefit in most cases, it does not restore normal hearing; this is particularly significant for children who are prelingually deaf, learning to listen and speak for the first time.
Early research regarding outcome measures for individuals with cochlear implants focused on improvements in auditory perception of postlingually deafened adults. In the last 20 years, attention has shifted to quantifying the benefits of cochlear implantation for children, with recent emphasis on the potential for very young children to develop age-appropriate speech perception, language, and speech production. The tests used to evaluate speech perception and language abilities are typically objective and quantitative; however, the assessment of speech production can be much more subjective. It relies more heavily on the skill level and experience of the evaluator, the type of assessment, and the type of recording and coding systems.
Speech production research for children with cochlear implants often cites outcomes related to articulation tests and estimates of intelligibility (Chin & Kaiser, 2000; Dawson et al., 1995; Geers, 2002; Tobey & Hasenstab, 1991; Tye-Murray, Spencer, & Woodworth, 1995). Standardized articulation tests, such as the Goldman-Fristoe 2 Test of Articulation (Goldman & Fristoe, 2000), typically rely on a correct/incorrect judgment by the evaluator, where specific consonants are evaluated in limited phonetic contexts. Intelligibility refers to an estimate of the degree to which a listener will understand a given message produced by the child with a cochlear implant (Boothroyd, 1985). These measures provide insight into general trends in speech production outcomes (e.g., age-equivalencies, rates of achievement post-surgery), but they do not describe the nature of the speech errors.
In the past 10 years, investigative efforts have analyzed phonetic inventories to offer a more detailed account of the phonological systems of children with cochlear implants (Chin, 2002, 2003; Law & So, 2006; Blamey, Barry, & Jacq, 2001; Serry & Blamey, 1999). Serry and Blamey (1999) described the developmental course of phoneme acquisition among children with cochlear implants over a 4-year period. The report focused on the acquisition of consonants and vowels in Australian English, but it did not detail the nature of speech errors that were undoubtedly used by the children in the course of their development. The authors excluded non-English sounds in their reports of these children's phonetic inventories. Studies have shown that children with cochlear implants, across languages, have unique phonetic inventories that may lack speech sounds of the ambient language but also include nonambient ones (Chin, 2002, 2003; Law & So, 2006). Chin's (2002) results demonstrate that for English-speaking children the variability is not only based on the presence or absence of standard English sounds, but may depend on the presence and nature of non-English sounds. The clinical implication of these findings is that “intervention may need to address only refining articulation, rather than reorganizing the phonological system” (p. 43).
With recent advances in cochlear implant technology and with greater emphasis on early intervention, methods for obtaining more specific information regarding improvements in the speech production of children following cochlear implantation are an important topic of discussion. Even some of the best cochlear implant users who are not perceived as having a speech disorder are still perceived as having a foreign accent (e.g., Gulati, 2003, p. 38). In some cases, these children achieve high degrees of intelligibility; however, the perception of a foreign accent can have psychosocial implications. As children grow, there is an increasing desire to identify with peers. Factors such as foreign accents and regional dialects can affect one's ability to identify with a particular social group. Chin (2003) reviewed the phonological systems of children with cochlear implants whose target language is American English. The children in that study had poorly established velar stops; lacked ambient fricatives, affricates, and velar nasals; and added nonambient stops, fricatives, and rhotacized labial glides.
In order to address the idiosyncrasies of the phonological systems of children with cochlear implants in therapy, the clinician must be able to describe the articulatory features of the sounds produced in error (i.e., the starting point) so that he or she can train the child to adjust his or her productions. For example, in the production of /s/, if the child's place of articulation is too far forward, intervention would include instructing the child to move the tongue back to some degree. The “starting point” for treatment is determined by the phonetic transcription.
Phonetic transcription is the first step in the process of phonological analysis, and the type of transcription depends on the goal of the analysis (Stoel-Gammon, 2001). In the case of children with cochlear implants, the goal of the phonological analysis should be the development of a treatment plan that can be implemented to improve overall speech intelligibility and facilitate full access to mainstream society. Phonetic transcriptions are clinically valuable in that they describe the nature of phoneme productions and degree of severity, and they are useful in evaluating change over time (Powell, 2001).
Ball (1991) described two levels of abstraction within impressionistic transcription: “narrow” and “broad.” A narrow transcription “aims to capture in symbol form the maximum amount of phonetic information that the transcriber is able to perceive and for which there are symbols” (pp. 59-60). Broad transcriptions are less redundant because phonological rules are assumed and not transcribed; however, a pitfall of broad transcriptions is the potential for ambiguity. Ball and Rahilly (2002) reported that broad or phonemic transcriptions may over- or underestimate the phonological abilities of the speaker due to a collapse of phonetic contrasts. For example, if a child substitutes [t] for /k/ in the English word cap, the realization would be broadly transcribed [tæp], with the assumption that the initial plosive [t] is aspirated according to the phonological rules of English. In this case, it appears that the child makes only an error in place of articulation. Without knowing how broad or narrow the transcription is, however, it is not clear whether or not there was also an aspiration error. A narrow transcription would be unambiguous: [tʰæp] indicates that the aspiration of the initial voiceless plosive follows the phonological rules of the language, whereas [t˭æp] indicates lack of aspiration and thus an aspiration error.
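The extra information a narrow transcription carries can be made concrete in code. The following Python sketch (the function name and symbol set are illustrative assumptions, not part of any standard clinical tool) flags word-initial voiceless plosives that lack the aspiration mark:

```python
# Sketch: flag aspiration errors in narrow IPA transcriptions.
# Assumes the superscript-h aspiration mark (ʰ, U+02B0) and the
# extIPA unaspirated mark (˭, U+02ED); helper names are illustrative.

VOICELESS_PLOSIVES = {"p", "t", "k"}
ASPIRATED = "\u02b0"  # ʰ

def aspiration_error(narrow: str) -> bool:
    """True if a word-initial voiceless plosive lacks the aspiration mark."""
    if narrow and narrow[0] in VOICELESS_PLOSIVES:
        # English allophonic rule: word-initial voiceless plosives aspirate.
        return len(narrow) < 2 or narrow[1] != ASPIRATED
    return False

print(aspiration_error("t\u02b0\u00e6p"))  # narrow "tʰæp" → False
print(aspiration_error("t\u02ed\u00e6p"))  # narrow "t˭æp" → True
```

Note that a broad transcription such as [tæp] would also be flagged by this check even if the production was in fact aspirated, which is exactly the ambiguity the narrow transcription removes.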
Narrow phonetic transcriptions address the issue of symbol ambiguity and provide a system for describing all phoneme possibilities and distortions (Powell, 2001); however, it has been reported in the literature that most clinicians have little experience with narrow transcription and may find the approach intimidating, time-consuming, and cumbersome (Howard & Heselwood, 2002; Powell, 2001). With increasing caseload demands, many clinicians do not have time for what may be perceived as the extra work involved in completing narrow transcriptions. In their paper discussing phonetic transcriptions, Louko and Edwards (2001) reported that many practicing speech-language pathologists do not use narrow transcriptions regularly and may not be aware of the benefits of using diacritics and symbols that extend beyond the basic broad transcription; however, they recommended using a narrow transcription anytime the speaker is a young child or an individual with a foreign accent, as it makes no assumptions about the underlying system of phonological rules. The use of narrow transcription is appropriate for any unstudied language (Kelly & Local, 1989), in this case, the impaired speech of a child with a cochlear implant, because we cannot know with certainty which aspects of the speech may have phonological significance. For children with cochlear implants, as with other special populations, the time and effort involved in narrow transcriptions may be a profitable investment. Detailed information regarding the type and nature of speech errors can help to focus the clinician's efforts, resulting in more effective and efficient management (Perkins & Howard, 1995; Shriberg & Kent, 1982).
Narrow transcriptions encompass the use of all International Phonetic Alphabet (IPA; International Phonetic Association, 1999) symbols, diacritics, and extensions to the IPA developed for disordered speech (Duckworth, Allen, Hardcastle, & Ball, 1990). The problem with transcribing disordered speech is that the transcriber will encounter non-normal speech sounds within the confines of a system developed to address speech sounds of a natural language (Duckworth et al., 1990). At the 1989 International Phonetic Association congress in Kiel, West Germany, a committee of phoneticians revised the alphabet system, and in 1994, the International Clinical Phonetics and Linguistics Association officially adopted a set of specialized symbols for transcribing disordered speech called extensions to the IPA, or extIPA (International Phonetic Association, 1999). These extensions provided a comprehensive and viable system for coding patterns found in disordered speech, including atypical place of articulation (e.g., dentolabials), manner of articulation (e.g., denasalization and nasal escape), airstream features (e.g., ingressive airflow), vocal fold activity (e.g., prevoicing, lack of aspiration), degree of certainty by the transcriber, and other aspects of connected speech, such as pause length.
When using narrow phonetic transcriptions, there are several issues that should be carefully considered. Louko and Edwards (2001) observe that certain sounds are prone to transcription errors, including voiceless unaspirated stops, unreleased stops and glottal stops, final velarized /l/, and “sound distortions.” In the following sections, we will attempt to demystify the speech production errors common among children with cochlear implants, review selected IPA symbols and diacritics and offer suggestions for enhancing phonetic transcription.
Problematic areas for phonetic transcription of children with cochlear implants are present in every aspect of articulation. Based on the authors’ experience transcribing the speech of children with cochlear implants, there are a number of recurring issues affecting accurate descriptions of articulatory phenomena.1 The following selected illustrations identify issues that may arise when transcribing the speech of children with cochlear implants and offer support for the use of narrow transcriptions in most situations.
Among children with cochlear implants whose target language is American English, it is often the case that standard American English consonants are used in nonstandard or unexpected conditions. IPA symbols for standard American English consonants should be familiar to most speech-language pathologists. For the full IPA chart, refer to the International Phonetic Association's website, http://www.arts.gla.ac.uk/IPA/ipa.html.
In standard American English, there are two allophones of /l/, differing in place of articulation. The allophonic rule requires velarized (“dark”) place of articulation ([ɫ]) at the end of a word, as in all, and alveolar (“clear”) place of articulation elsewhere (Ladefoged, 1993). Some children with cochlear implants use the velarized allophone in word-initial position (e.g., [ɫif] leaf). When a dark /l/ appears at the beginning of a word, it may sound like it has a [ɡ] before it (e.g., love may sound like glove) due to the velarized articulation.
A glottal stop ([ʔ]) is a plosive with the place of articulation at the level of the glottis, characterized by an abrupt closure of the vocal folds, as in “uh-oh” [ʌʔoʊ]. The glottal stop is not considered to be a phoneme in English, but it is often used by very young children in babbling and early words. It is also acceptable in many dialects of English (including General American English and African American Vernacular English) as an allophone of /t/ in medial and final position, as in [bʌʔn̩z] “buttons.” The replacement of a target stop by a glottal stop is called “glottal substitution” (Khan, 1982) or “glottal replacement” (Hodson & Paden, 1991). Some children with cochlear implants use glottal stops to replace other phonemes in ways that are not consistent with acceptable adult patterns. Examples of this include [bɛʔi] “beddy,” [dʌʔji] “ducky,” and [kʰʌʔ] “cup.” In the last example, the abrupt end to the vowel, as opposed to a more gradual fading of the vowel, indicates a stop, and in the absence of supraglottal articulation, specifically a glottal stop (Louko & Edwards, 2001). Speech characterized by excessive or unusual glottal substitutions may sound choppy and negatively affect overall intelligibility.
In standard American English, a voicing contrast exists between the voiced and voiceless alveolar stops /d/ and /t/; however, in certain phonetic environments, that contrast is often neutralized to a flap [ɾ]. As an example, most speakers of American English produce the medial consonant of latter and ladder the same way in normal conversation unless they are trying to be very precise. Articulatory characteristics include partial voicing and rapid or ballistic contact between the articulators, and acoustic characteristics include extremely short closure durations (de Jong, 1998). Previous research by Chin and Krug (2004) showed that children with cochlear implants, like children with normal hearing, produced greater distinctions between voiced and voiceless alveolar stops (/d/ and /t/) than adults and did not reliably neutralize in expected conditions. The data in (1) illustrate a variety of speech sounds substituted for the intervocalic target flap; the order of forms in the list shows progressively greater distance from the target. In the first example, the child failed to reduce the medial stop. The second example represents a slight change in place of articulation for the target flap. In conversational speech, these variations could contribute to the perception that the child with a cochlear implant has a foreign accent. In the third example, both the voicing feature and the duration of the medial stop were altered. In the final example, the target flap was replaced by a glottal stop [ʔ]. These two latter changes could affect both fluency and general intelligibility and therefore would be a priority in speech therapy.
Vowels can be difficult to transcribe due to dialectal variations and the diversity among systems used to describe vowels (Powell, 2001). Vowels are also considered more difficult to transcribe because they are less discrete, more variable, and often not addressed in speech therapy; however, for children with hearing loss, vowel errors can have a significant effect on intelligibility (Pollock & Berni, 2001). The following examples show atypical use of standard American English vowels.
A monophthong is a vowel in which the characteristics of the tongue body in the oral cavity remain constant during a syllable. A diphthong is a sequence of two vocalic elements in a single syllabic nucleus (e.g., [aɪ] as in bite; [oʊ] as in boat; Ladefoged, 1993). Monophthongization is the omission of one of the vocalic elements in a diphthong. Children with cochlear implants often omit the standard off-glide from the diphthong [oʊ], for example, using the monophthong [o] in the word comb (e.g., [kʰom]), resulting in an incomplete production of the target vowel.
Nonstandard diphthongs can occur when a vocalic element is added to a monophthong (e.g., [bʉutʰ] “boot”), a process known as diphthongization, or when there is a change to one vowel in a diphthong pair (e.g., [ɡot] “goat” and [kʰõmb] “comb”). One source of diphthongization in children with cochlear implants may involve a combination of vocalic prolongation and unstable vowel quality production. Although diphthong variation is not uniform across individuals with cochlear implants, it is a recurring issue that should be considered in the phonetic transcriptions.
Standard American English IPA symbols are useful in describing disordered speech characterized by restricted phonetic inventories, within-set substitutions, and jargon aphasia (Powell, 2001); however, the set of standard American English IPA symbols is often inadequate in describing the speech production of children with cochlear implants. Standard English consonants represent a relatively small percentage of the pulmonic IPA consonant symbols (International Phonetic Association, 1999). Speech sounds traditionally referred to as distortions can often be transcribed using non-English segments (Louko & Edwards, 2001). The following sections describe selected non-English segments.
Coronal fricatives, or fricatives made with the blade of the tongue raised from the neutral position (Lowe, 1994), occurring in standard American English include interdental /θ, ð/, alveolar /s, z/, and postalveolar /ʃ, ʒ/ places of articulation. Substitution of non-English coronal fricatives for standard American English coronal fricatives is common among children with cochlear implants. Examples 2 and 3 show one child's set of voiceless coronal fricatives including six possibilities for two target phonemes.
Dental /s/ ([s̪]) is produced with the tip of the tongue touching the back of the upper incisors. The voiceless retroflex fricative ([ʂ]) is produced with the tip of the tongue curled back. The voiceless alveolo-palatal fricative, [ɕ], is farther back than [ʃ], but not as far back as a true palatal fricative ([ç]). To hear the differences between these sounds, produce a prolonged /s/ sound while sliding the tongue body posteriorly along the roof of the mouth for [s], [ʃ], [ʂ], and [ɕ], respectively. Notice that the more posterior the place of articulation for sibilants, the lower the frequency.
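The front-to-back frequency trend described above can be checked acoustically. The sketch below uses band-passed noise as a synthetic stand-in for real fricatives (the band edges are illustrative, not measured values) and computes the spectral centroid, a rough acoustic correlate of how "high" a sibilant sounds:

```python
import numpy as np
from scipy import signal

def spectral_centroid(x, fs):
    """Amplitude-weighted mean frequency of a signal (Hz)."""
    freqs, psd = signal.welch(x, fs, nperseg=1024)
    return float(np.sum(freqs * psd) / np.sum(psd))

fs = 16000
rng = np.random.default_rng(0)
noise = rng.standard_normal(fs)  # 1 s of white noise

def band_noise(lo, hi):
    # Band-pass the noise to simulate fricative energy in a given region.
    sos = signal.butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    return signal.sosfilt(sos, noise)

s_like = band_noise(5000, 7500)   # anterior, [s]-like energy region
sh_like = band_noise(2500, 5000)  # more posterior, [ʃ]-like energy region

# More posterior place → lower spectral centroid.
print(spectral_centroid(s_like, fs) > spectral_centroid(sh_like, fs))
```

A clinician could apply the same centroid measure to excised fricative noise from a recording to quantify where along the [s]–[ɕ] continuum a child's sibilant falls.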
Similar to the greater variability in the set of coronal fricatives used by the children with cochlear implants, greater variability also exists in the place of articulation of affricates. Affricates are single phonemes that combine an initial stop portion with a fricative release (Ladefoged, 1993). Standard American English affricates include the voiced and voiceless pair, /dʒ/ (as in gin) and /tʃ/ (as in chin), at the postalveolar place of articulation. One child with a cochlear implant produced two non-English variations of /tʃ/: [tʂ] in peach, with retroflex place of articulation, and [tɕʰ] in witch, with alveolo-palatal place of articulation.
Spirantization is the substitution of a fricative for a stop. In some sources, spirantization is considered an “idiosyncratic” process because it occurs seldom if ever in normal child linguistic development (Stoel-Gammon & Dunn, 1985); other sources include spirantization as a natural process (Edwards & Shriberg, 1983). It is frequently noted in the speech of children with cochlear implants, and non-English IPA symbols for the voiceless bilabial fricative [ɸ], the voiced bilabial fricative [β], the voiceless velar fricative [x], and the voiced velar fricative [ɣ] are needed for phonetic transcription. This process is due to a lack of full closure at the place of articulation.
Ejectives are produced by a non-pulmonic stream of air originating between the closed glottis and a supraglottal place of articulation that is then released abruptly (Ladefoged, 1993); they sound like very short and forcefully produced voiceless stops. Ejectives are transcribed by a diacritic apostrophe (names of diacritics in this paper are taken from Pullum & Ladusaw, 1996) added to the voiceless consonant symbol (International Phonetic Association, 1999). For example, the ejective consonant with alveolar place of articulation is indicated by [tʼ]. Example 3 shows one child's use of ejectives in the onset and coda positions.
Diacritics are small signs added to an IPA segment to mark adjustments and variations in speech productions (Powell, 2001). The use of diacritics is associated with narrow phonetic transcription and is often needed to describe the phonetic details of the speech nuances of children with cochlear implants.
Substitutions of stops for homorganic (i.e., at the same place of articulation) fricatives are common among children with cochlear implants, as they are in children with typical phonological development. This substitution process is called “stopping” and is exemplified by [tʌn] for “sun,” in which both alveolar place and voicelessness are preserved. Among children with cochlear implants, stopping of other fricatives besides alveolar ones is also common, including labiodental /f, v/ and (inter)dental /θ, ð/. In such cases, place of articulation may also be strictly retained to produce labiodental and dental stops. To transcribe such stops, a diacritic must be added to the IPA symbols for bilabial stops /p, b/ and alveolar stops /t, d/; this “dental” diacritic is a small “bridge” placed under the relevant letter: [p̪, b̪, t̪, d̪]. Dental stops are permitted in standard American English in certain assimilatory contexts such as before an interdental target, as in the word eighth, transcribed [eɪt̪θ] (Ladefoged, 1993). Examples 5 and 6 show dental stops commonly used by children with cochlear implants.
Gliding is the process of substituting a glide, typically [w], for a liquid, /l/ or /ɹ/ (Bauman-Waengler, 2000; Lowe, 1994). This type of error is considered a typical developmental error and is an extremely common process in phonological acquisition and phonological disorders (Edwards & Shriberg, 1983). Realizations of /ɹ/ by children with cochlear implants are not an [ɹ] vs. [w] dichotomy, but rather a continuum with the liquid and the glide as endpoints. Intermediate realizations include a labialized r, transcribed [ɹʷ], and a w with r-coloring (rhoticization), transcribed [w˞]. Labialization involves the addition of a lip rounding gesture and, in the majority of cases, raising of the back of the tongue (“velarization”; Ladefoged & Maddieson, 1996). Rhoticization is a process typically applied to vowels. Rhoticity can be characterized by a retroflexed tongue position (Ladefoged, 1993) or by retraction of the tongue tip, bunching of the back of the tongue, or retraction of the tongue base into the pharynx (Mackay, 1978). To determine whether a sound is [ɹʷ] or [w˞], the clinician should focus on the prevailing speech sound, that is, whether the sound in question is more like an [ɹ] or a [w]. Example 7 shows the continuum of possible productions for the phoneme /ɹ/.
Full nasal resonance for nasal consonants /m, n, ŋ/ may not be consistently achieved by children with cochlear implants. Partial nasal resonance is indicated by diacritics. The diacritic tilde [~] represents nasalization of an oral segment. Conversely, denasalized nasal consonants are indicated by the diacritic superscript slashed tilde [~/] (note typographical compromise), on a nasal consonant (International Phonetic Association, 1999, p.17, 189). In Example 8, one child produced a denasalized nasal consonant in the onset for the word moon, and another child produced a nasalized oral consonant in the onset of the word moony.
In standard English, vowels adjacent to nasal consonants are partially nasalized, and listeners often use acoustic cues from nasal vowels to make judgments about the surrounding consonants (Kent & Read, 1992). The nasal quality of a non-nasal sound may be related to coarticulation, velopharyngeal dysfunction, dysarthria, or nasal vowels that compensate for the omission of a nasal consonant (Powell, 2001). In the case of children with cochlear implants, nasal vowels are often produced in the expected context preceding a nasal consonant in the target, but the nasal consonant itself is omitted. As an example, one child omitted the nasal consonant following a nasal vowel (e.g., [kʰõ] “comb”), where nasality is indicated by the diacritic [~]. Children with cochlear implants may also produce nasalized vowels even when the surrounding consonants are not nasals (e.g., a nasalized vowel in cup).
Obstruents are consonants with close or full obstruction at the point of articulation, including plosives, fricatives, and affricates (Ladefoged, 1993). In standard American English, there is a phonemic distinction between voiced and voiceless obstruents; however, errors of total or near-total devoicing of voiced consonants may be encountered in the speech of children with cochlear implants. Voicing errors are typically related to the timing of vocal fold adduction and abduction during phonation (Stevens, 1998). Devoicing of underlying voiced segments is indicated by the ring diacritic [ ̥ ], which is typically placed below the base symbol but may be placed above symbols with a descender (e.g., [ɡ̊]; International Phonetic Association, 1999). Final obstruent devoicing is the process of reducing the amount of voicing in a final voiced obstruent. It has been reported that devoicing of final consonants may be an assimilation to the silence following the end of a word (Ingram, 1976). Some degree of final consonant devoicing is considered a normal developmental process in English-speaking children until age 3 years (Khan, 1982) or 4 years (Hodson & Paden, 1981) and is also characteristic of several adult varieties of English, such as African American Vernacular English (Bailey & Thomas, 1998).
Aspiration is the audible release of air following a voiceless stop (Ladefoged, 1993), typically occurring at the beginning of a word and denoted by the superscript diacritic [ʰ] (International Phonetic Association, 1999). Aspiration provides a salient cue regarding the voicelessness of initial stop consonants. Voiced stops are not aspirated in standard English; however, in the speech samples of children with cochlear implants, aspirated voiced stops may be observed. In other languages, such as Hindi, aspiration with voiced plosives is represented by the diacritic [ʱ], a superscript version of the symbol for a voiced glottal fricative. Information provided in a spectrogram can be helpful in showing whether aspiration is voiced or voiceless.
In standard English, vowels are lengthened before voiced consonants (Klatt, 1976). For example, notice the difference in vowel length as you say the following pairs of words: fade/fate, seed/seat. Although this is predictable in standard English, for children with cochlear implants, vowel length is not always a reliable indicator of the phonetic voicing of the following consonant. In one child's production of the word tub [tʰʌːp], the appropriate phonemic contrast was preserved even though the final consonant in tub was not appropriately voiced. The child correctly lengthened the vowel, as indicated by the length mark [ː], but voicing was not realized in the final consonant. In this example, vowel duration was 181 ms. By contrast, the same child produced the vowel [ʌ] with shorter duration (112 ms) when followed by a target voiceless consonant in the word cup [kʰʌp]. Consonants can also be lengthened in the speech of children with cochlear implants, and the length mark applies to consonants as well (e.g., a long medial or final consonant in [ɡotːʰ] “goat”). It should be noted that most instances of orthographic doubled consonants in English (e.g., hammer, ladder) do not indicate long consonants.
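Where time-aligned segment labels are available (e.g., exported from a Praat TextGrid; the tuple format below is an assumed, simplified stand-in), the vowel-duration comparison above can be computed directly. The boundary times here are constructed so that the vowels match the 181 ms and 112 ms values reported for tub and cup:

```python
# Sketch: vowel durations from hypothetical (label, start_s, end_s)
# annotations of one child's productions of "tub" and "cup".

tub = [("t\u02b0", 0.000, 0.085), ("\u028c\u02d0", 0.085, 0.266), ("p", 0.266, 0.340)]
cup = [("k\u02b0", 0.000, 0.090), ("\u028c", 0.090, 0.202), ("p", 0.202, 0.270)]

def vowel_duration_ms(segments, vowel_index=1):
    """Duration of the vowel segment in milliseconds."""
    _, start, end = segments[vowel_index]
    return (end - start) * 1000.0

print(round(vowel_duration_ms(tub)))  # 181
print(round(vowel_duration_ms(cup)))  # 112
```

Comparing such measured durations against the voicing of the following consonant in the target word makes the mismatch described above (a lengthened vowel before a devoiced final consonant) explicit and trackable over time.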
Shriberg and Lof (1991) stated that “the limits of perceptual transcription underlie the need for an acoustic-aided technology” (p. 273), and the current authors have found the use of spectrograms to be a viable option. A spectrogram is a visual representation of speech acoustics. Spectrograms can be useful in validating perceptual analysis (Howard & Heselwood, 2002) and clearly show certain distinctions in manner, voicing, and vowel formants. Today, there are free spectrogram programs available on the internet, including Praat (Boersma & Weenink, 2005), that allow clinicians to look at speech sounds. The following examples illustrate the clinical value of the spectrogram in phonetic transcription.
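A spectrogram is simply a grid of short-time power spectra, and it can be computed outside Praat with standard scientific tooling. The sketch below uses a synthetic vowel-like signal as an illustrative stand-in for a recorded token:

```python
import numpy as np
from scipy import signal

fs = 16000
t = np.arange(int(0.5 * fs)) / fs
# Synthetic vowel-like signal: a 120 Hz fundamental approximated by
# summed harmonics (illustrative only, not a real recording).
x = sum(np.sin(2 * np.pi * 120 * k * t) / k for k in range(1, 30))

# Short-time spectra: 512-sample windows with 75% overlap.
f, times, sxx = signal.spectrogram(x, fs, nperseg=512, noverlap=384)
log_sxx = 10 * np.log10(sxx + 1e-12)  # dB scale for display

print(log_sxx.shape)  # (frequency bins, time frames)
```

Plotting `log_sxx` with time on the x-axis and frequency on the y-axis yields the familiar spectrogram display used in the figures discussed below.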
Robust contrasts between stop consonants and continuants are often visible on a spectrogram. In Figure 1, spectrographic evidence shows a stop segment, that is, a period of low energy, between the two vowels in the target word mother.
In Figure 2A, the child produced a bilabial fricative for the final stop in the word sleep. The spectrogram shows frication following the vowel formants. By contrast, Figure 2B shows an aspirated stop in the final position.
Figure 3A shows the spectrogram along with the waveform for the target word rose. The waveform provides a visual account of variations in air pressure that occur during vocal fold vibration. The vocal folds vibrate at relatively regular intervals or pulses for voiced sounds. Voiceless sounds are evidenced by small, irregular fluctuations in air pressure. The spectrogram and waveform in Figure 3A show partial voicing of /z/ in the word rose represented by the vertical pulsing bars that stop prior to the end of the frication. Figure 3B is an example from the same child showing full voicing of /z/ in the intervocalic position in the word rosey.
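The voiced/voiceless distinction visible in these waveforms can be approximated computationally by checking a short frame for periodicity. The sketch below uses simple autocorrelation; Praat's pitch tracker is far more sophisticated, and the function name, frame size, and 0.5 threshold are illustrative assumptions.

```python
# Sketch: deciding whether a stretch of waveform is voiced by looking for
# a strong periodic component via autocorrelation. Thresholds illustrative.
import numpy as np

def is_voiced(frame: np.ndarray, fs: int, threshold: float = 0.5) -> bool:
    """True if the frame shows strong periodicity at lags corresponding
    to a 60-400 Hz voice pitch range."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    if ac[0] == 0:
        return False
    ac = ac / ac[0]                        # normalize so lag 0 == 1
    lo, hi = int(fs / 400), int(fs / 60)   # plausible pitch-period lags
    return bool(ac[lo:hi].max() > threshold)

fs = 16000
t = np.arange(480) / fs                    # one 30 ms frame
voiced = np.sin(2 * np.pi * 150 * t)       # periodic, like glottal pulsing
rng = np.random.default_rng(0)
voiceless = rng.standard_normal(480)       # aperiodic, like frication
print(is_voiced(voiced, fs), is_voiced(voiceless, fs))  # True False
```

Applied frame by frame, this kind of test marks where glottal pulsing stops, which is exactly the partial-voicing pattern described for /z/ in rose.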
In Figure 4A, the speaker substituted an aspirated voiced dental stop for an interdental fricative in the word father. The low energy in region A indicates the stop closure, and the noise marked as B following the stop release is perceived as aspiration. Figure 4B shows the correct production of the word father produced by another child with a cochlear implant.
The presence or absence of aspiration is another distinction that can be seen on a spectrogram. When a stop consonant is released, vocal fold vibration for the following vowel may begin immediately or after a delay. This period between the plosive release and the onset of vocal fold vibration is the voice onset time (VOT). For initial voiceless stops in English, the VOT generally ranges from 25 to 100 ms (Kent & Read, 1992); for voiced stops, the VOT is closer to 0 ms. Longer VOTs are generally accompanied by weak frication noise called aspiration. Because the acoustic characteristics of aspiration are similar to those for [h] (low amplitude random noise), aspiration is transcribed as a superscript h after the symbol for the stop consonant. In English, then, voiceless stops are associated with aspiration and long VOTs, whereas voiced stops are associated with a lack of aspiration and short VOTs. Aspiration and VOT are highly salient cues to the lexical voicing of a stop in English.
Children with cochlear implants may be observed producing initial voiceless stops without aspiration, so that allophonic aspiration of voiceless stops cannot be assumed in phonetic transcriptions. The absence of aspiration is often misperceived as the cognate voiced phoneme. For example, an unaspirated [p] may sound like a [b]; however, unlike the segment [b], which is voiced into closure, an unaspirated [p] does not have glottal pulsing. In Figure 5A, the child with a cochlear implant produced an unaspirated voiceless bilabial stop, [p˭], in the word pig, with a relatively short VOT of 10 ms. By contrast, in Figure 5B, a different child produced an aspirated [pʰ] in the word pig, with a VOT of 51 ms.
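Once burst release and voicing onset have been marked on a waveform, the VOT comparison above reduces to simple arithmetic. A minimal sketch, using the 25 ms lower bound for English voiceless stops cited earlier as an assumed category boundary:

```python
# Sketch: classifying an initial stop as aspirated vs. unaspirated from
# its voice onset time. The 25 ms boundary follows the English range cited
# in the text; real tokens require burst and voicing onsets to be marked.

def classify_vot(burst_ms: float, voicing_onset_ms: float,
                 boundary_ms: float = 25.0) -> str:
    vot = voicing_onset_ms - burst_ms
    return "aspirated" if vot >= boundary_ms else "unaspirated"

# VOTs reported for the two "pig" tokens in Figures 5A and 5B
print(classify_vot(0, 10))   # unaspirated, [p˭]-like token
print(classify_vot(0, 51))   # aspirated, [pʰ]-like token
```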
Vowels are made up of concentrations of acoustic energy known as formants, which appear on a spectrogram as dark horizontal bars. Each vowel has a predictable set of formant values; the most important formants for vowel identification are the first formant (F1) and the second formant (F2). Children with cochlear implants may be observed using the fronted vowel [ʉ] (barred-u), a central vowel, in place of /u/, a back vowel, with height and rounding held constant.3 Forward movement of the tongue body results in higher second formant values (Stevens, 1998). The spectrogram provides evidence for this distinction. In Figures 6A and 6B, a child with a cochlear implant produced two variations of the target vowel /u/. The second formant values for the vowel were 1158 Hz in [but] “boot” (Figure 6A), and 1972 Hz in [dʒʉɕ] “juice” (Figure 6B). The higher F2 of the vowel in juice indicates a more forward tongue position than in boot.
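The fronting judgment can be framed as a comparison of measured F2 against a reference value. In the sketch below, the 1500 Hz cutoff is purely illustrative; clinical use would require speaker-specific or age-normed formant references, since children's formants are substantially higher than adults'.

```python
# Sketch: flagging fronting of /u/ from a measured second formant.
# The 1500 Hz cutoff is an illustrative assumption, not a norm.

def is_fronted_u(f2_hz: float, cutoff_hz: float = 1500.0) -> bool:
    """High F2 indicates a forward tongue body, i.e. [ʉ] rather than [u]."""
    return f2_hz > cutoff_hz

# F2 values reported for the two tokens of target /u/
print(is_fronted_u(1158))  # "boot"  -> False: back [u]
print(is_fronted_u(1972))  # "juice" -> True: fronted [ʉ]
```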
Access to formant values may also provide evidence of another type of vowel error occurring among children with cochlear implants: vowel reduction. An example of reduction would be the substitution of a lax vowel (e.g., [ɪ]) for a tense vowel (e.g., [i]). Stevens (1998) outlined acoustic correlates for the tense-lax distinction, including formant values, duration, diphthongization, glottal source characteristics, and bandwidth changes. With respect to the formant shift, moving from the tense vowel [i] to the lax vowel [ɪ] results in a higher F1 value and a lower F2 value. That trend is seen in Example 9 from a child with a cochlear implant, where the final vowel in badgey was lax and transcribed as [ɪ]. In this case, formant values supported the perceived lenition pattern.
Advances in cochlear implant technology and early intervention have resulted in more favorable outcomes for children following cochlear implantation; however, addressing both the overt and subtle sound substitutions is necessary for describing the evolution of the speech of children in this population. As presented here, narrow transcriptions may provide valuable insight to the fields of research and habilitation by capturing critical phonetic details in individual phonological systems. Powell (2001) reminds clinicians not to fear transcription. By taking the time to develop transcription skills, clinicians may begin to notice speech production details they might have previously overlooked. Building upon the suggestions outlined in his paper, we propose a series of strategies to be considered when transcribing the speech of children with cochlear implants.
The ability to complete a detailed phonetic transcription is a tremendous clinical asset. The transcription itself is simply a record of speech behaviors; however, it is the knowledge and experience of the clinician completing the assessment that is truly valuable. According to Müller and Damico (2002), “if a transcript is to have any real representational value, it can only be as a record of a transcriber's interpretation of a set of data” (p. 308). This tutorial has highlighted particular transcription issues that may arise when working with children with cochlear implants; however, using narrow phonetic transcription to describe a phonological system that differs from standard American English is not unique to children with cochlear implants. The transcription issues reviewed and the strategies presented in this paper have implications for other populations of children with speech disorders, including children with cleft palates (Howard, 1993), Down syndrome (Roberts et al., 2005), and dysarthria (Harris & Cottam, 1985). A good narrow phonetic transcription informs patient care. Once the nature of the speech disorder is known, clinicians can begin to implement an appropriate treatment plan with confidence. For some children, the goal of speech therapy is intelligible speech, and for others, the goal extends beyond intelligible speech to fluent, natural-sounding speech. In both cases, narrow phonetic transcriptions can influence clinical practice. The narrow phonetic transcription might not change the ultimate goals of therapy; however, the underlying knowledge of specific articulatory features could dramatically change how a therapist would approach therapy, the way a therapist measures progress, and the way a therapist gives instruction and feedback (e.g., the tongue is a little too far back, the voice should be on/off, etc.). Many times, the small, subtle speech errors are the most challenging to address in therapy. 
Instead of teaching target speech sounds from the ground up, clinicians should tap into sounds already in the child's inventory and teach him/her to make small adjustments.
This work was supported by research grant R01DC005594 from the National Institutes of Health to Indiana University.
We are grateful to the children and parents who participated in this research study. We would also like to acknowledge Elizabeth Ying, Shirley Henning, and Bethany Gehrlein in the DeVault Otologic Research Laboratory for assistance with data collection, and Su Wooi Teoh and Katie Vaden for comments and suggestions.
1Examples provided in this paper are from an ongoing, longitudinal research project at the Indiana University School of Medicine. Study participants had at least 5 years of cochlear implant experience at the time of testing. The target language for all participants was standard American English. Speech samples were elicited in picture-naming tasks designed to test all standard American English consonants in all possible word positions. Responses were audio recorded, and edited responses were transcribed by the authors using the IPA with diacritics and extensions for disordered speech. Auditory monitoring was supported by visual examination of waveforms and spectrograms generated using Praat (Boersma & Weenink, 2005). Phoneme-by-phoneme consensus between the transcribers was achieved by simultaneous monitoring and discussion. Issues related to phonetic transcription were catalogued.
2The symbol [b̪] represents a voiced labiodental stop, involving articulation of the lower lip against the upper teeth.
3A diachronic change whereby the central vowel [ʉ] (barred-u) is replacing the back vowel /u/ appears to be underway, particularly among young speakers, in a wide variety of dialects in both the United States and the United Kingdom (see Fridland, 2008; Harrington, 2007).
Amy Teoh, Indiana University School of Medicine, Dept. of Otolaryngology-HNS.
Steven Chin, Indiana University School of Medicine, Dept. of Otolaryngology-HNS.