HHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal
Brain Lang. Author manuscript; available in PMC 2008 May 2.
Published in final edited form as:
PMCID: PMC2364719

Speech Perception and Short Term Memory Deficits in Persistent Developmental Speech Disorder


Children with developmental speech disorders may have additional deficits in speech perception and/or short-term memory. To determine whether these are only transient developmental delays that can accompany the disorder in childhood or persist as part of the speech disorder, adults with a persistent familial speech disorder were tested on speech perception and short-term memory. Nine adults with a persistent familial developmental speech disorder without language impairment were compared with 20 controls on tasks requiring the discrimination of fine acoustic cues for word identification and on measures of verbal and nonverbal short-term memory. Significant group differences were found in the slopes of the discrimination curves for first formant transitions for word identification with stop gaps of 40 and 20 ms with effect sizes of 1.60 and 1.56. Significant group differences also occurred on tests of nonverbal rhythm and tonal memory, and verbal short-term memory with effect sizes of 2.38, 1.56 and 1.73. No group differences occurred in the use of stop gap durations for word identification. Because frequency-based speech perception and short-term verbal and nonverbal memory deficits both persisted into adulthood in the speech-impaired adults, these deficits may be involved in the persistence of speech disorders without language impairment.

1. Introduction

Developmental speech disorders of unknown origin, that is when other contributing factors such as mental retardation, hearing loss or structural abnormalities cannot account for the speech disorder, can be divided into apraxia of speech (speech motor programming disorders), and developmental speech articulation disorders of unknown etiology (Shriberg, Austin, Lewis, McSweeny, & Wilson, 1997). Developmental speech articulation disorders, which differ from apraxia of speech (Shriberg, Aram, & Kwiatkowski, 1997), are often referred to as developmental phonological disorders and include three subtypes: speech delay (with multiple sound deletions and substitutions); questionable residual disorders (delayed by less than one year in speech sound development); and residual speech errors (persisting past 9 years of age). Approximately 11% of boys and 15% of girls with speech delay have concomitant language development disorders without cognitive impairments, while only a small percentage of children with specific language impairment have speech delay (7.6% of boys and 4.8% of girls) (Shriberg, Tomblin, & McSweeny, 1999). Therefore, although phonological disorders and specific language impairment can co-occur, they are more likely to occur independent of each other (Shriberg et al., 1999).

Children with developmental phonological disorders involving omissions and substitutions of several sounds in childhood, but without concomitant language disorders, are less likely to have difficulties in reading, writing, and spelling or to require special education resources, suggesting that this is a disorder confined to speech articulation and not involving other aspects of language development (Shriberg & Kwiatkowski, 1988). A cross-sectional study confirmed that the outcome of a phonological disorder alone is better than when both phonological and language disorders co-occur in childhood (Lewis & Freebairn, 1992). When children with phonological disorders but without language disorders are grouped according to whether or not other members of their nuclear family are affected with speech disorders, those with a family history of speech disorders perform more poorly than those without a family history (Lewis & Freebairn, 1997). This suggests that for phonological disorders, as with many other disorders, the familial form is more severe than the sporadic form of the disorder and may represent a more homogeneous form of the trait.

Investigators have been interested in the role of speech perception difficulties in developmental phonological disorders with the aim of determining whether perceptual difficulties may contribute to production difficulties. Speech discrimination functions for identification of the words “way” versus “ray” in children who misarticulated [r] as [w] showed less sharp discrimination curves compared with normally articulating children (Hoffman, Daniloff, Bengoa, & Schuckers, 1985). Because the stimuli differed in their first, second and third formant frequencies, the results might suggest that some misarticulating children have difficulties with using frequency spectra differences for categorizing speech sounds. In a similar study (Rvachew & Jamieson, 1989), two experiments compared groups of children who had multiple speech articulation errors involving fricatives with normally speaking children and adults. The discrimination curves were relatively flat in the speech disordered children on word identification tasks contrasting the words “seat” with “sheet” and “sick” with “thick” which differed in their frequency spectra for the initial fricative. Some children in this study, however, had mild language impairments which may have contributed to their perceptual difficulties. In fact, the authors questioned whether the speech perception difficulties were deviant or merely delayed and a consequence of the speech and language delay in many of these children rather than part of the developmental phonological disorder (Rvachew & Jamieson, 1989).

Nonverbal auditory perception and speech perception difficulties are frequently associated with specific language delay (Tallal & Piercy, 1973, 1974, 1975; Tallal & Stark, 1981). The difficulty with some studies of children with phonological speech disorders has been the possible confounding effect of concomitant delays in language development on their speech perception skills (Bird & Bishop, 1992; Frumkin & Rapin, 1980; Sussman, 1993). In these studies, children had both delayed language development and speech impairment (Frumkin & Rapin, 1980; Sussman, 1993). Stark examined this issue across several studies in language-impaired children with and without speech articulation errors (Stark & Heinz, 1996; Stark & Tallal, 1979, 1988). In one study, language-impaired children with speech production errors were impaired on stop consonant perception while those language-impaired children who were without speech errors did not have speech perception deficits (Stark & Tallal, 1979). In a later study, however, a group of children without language impairment but with speech articulation errors performed normally on stop consonant identification tasks (Stark & Tallal, 1988). Later, Stark and Heinz (1996) found that whether or not language-impaired children could perform the identification task for [ba] and [da] was related to whether or not they could accurately produce the sounds. That is, those language-impaired children without articulation errors performed normally on the identification task (Stark & Heinz, 1996). By studying adults with residual speech disorders but with no concomitant language deficits, we planned to examine the role of speech perception in phonological disorders independent of developmental language disorders.

Although several studies indicate that children with developmental phonological disorders have speech perception difficulties (Bird & Bishop, 1992; Broen, Strange, Doyle, & Heller, 1983; Hoffman et al., 1985; Hoffman, Stager, & Daniloff, 1983; Rvachew & Jamieson, 1989; Sherman & Geith, 1967), the role of the speech perception deficits in developmental phonological disorders in children is unclear. When studied in children with developmental phonological disorders, the speech perception difficulties may be either a consequence of the phonological disorder, a concomitant developmental delay, or an essential part of the disorder. The same issue has confronted the literature on the role of speech perception difficulties in specific language impairment. Bernstein and Stark (1985) studied language-impaired children twice four years apart. They initially found such children had poor discrimination functions for /ba/ and /da/ (which differed in second formant frequency changes) but when studied four years later, the children had developed normal perceptual skills (Bernstein & Stark, 1985). The authors questioned whether the speech perception deficits found in language-impaired children play a significant role in developmental language delays or might be a concomitant developmental factor or secondary to the language impairment. By examining speech perception skills in adults with a residual speech articulation disorder, we hope to address the issue of whether speech perception difficulties are an essential component of developmental phonological disorders.

Most of the speech perception skills examined in developmental phonological disorders have involved word or syllable identification when the stimuli differ in formant frequencies (Hoffman et al., 1985; Rvachew & Jamieson, 1989). Identification of stops in consonant-vowel (CV) syllables such as [ba] and [da] can depend on perception of rapid (40 ms) formant transitions which seem to present particular difficulties in language-impaired children with phonological impairments suggesting impaired processing of rapid frequency transitions in these children (Bird & Bishop, 1992; Frumkin & Rapin, 1980).

Both temporal and frequency-based acoustic cues can be used to identify words differing in the presence or absence of a stop plosive [t] for the identification of the words “say” and “stay” (Morrongiello, Robson, Best, & Clifton, 1984; Nittrouer, 1992, 1999). Redundant cues such as the change in the first formant frequency (a spectral cue) and the stop gap duration (a temporal cue) can be used for perception of a [t] between [s] and [eI] for identification of “stay”. A recent study by Steinschneider et al. (2005) found that auditory evoked potentials recorded from the anterior region of Heschl’s gyrus, within the primary auditory cortex in a patient, were modulated both by voice onset time and differences in the first formant frequency suggesting that both temporal and spectral cues are responded to within the primary auditory cortex in humans (Steinschneider et al., 2005). Whether the same or different neuronal systems are involved in the use of temporal or spectral cues for categorical speech perception is unknown.

Nittrouer (1992) studied the use of these different cues for word identification, when presented to normal children and adults. Separate continua varied the change in the first formant transition while keeping the stop gap constant (either with a 40- or a 20-ms stop gap) and another continuum was developed that varied only the stop gap duration while the first formant remained unchanged. She reported that children were more sensitive to frequency transitions and, therefore, required less of a stop gap cue than older children and adults. It seems that with development, English speakers tend to use temporal rather than spectral cues to identify voiceless stops (Nittrouer, 1992). By using two different continua with the same stimuli (i.e., “say-stay”), one varying formant frequency changes and the other varying the duration of the stop gap, one can determine if participants can use either of these cues for word identification. We used these tasks in this study to determine whether the adults with residual speech disorders had normally shifted from the predominant use of frequency cues to using temporal cues for stop gap identification in words.

Baddeley and his colleagues have proposed that phonological working memory has a significant role in a variety of complex cognitive tasks (Baddeley & Hitch, 1974). They suggested that some individuals fail to develop a memory trace of speech input long enough to abstract acoustic features for phonological encoding. Several reports have linked deficits in phonological short-term memory with language or reading impairment (Brady, 1997; Gathercole & Baddeley, 1990; Kahmi, Catts, Mauer, Apel, & Gentry, 1988; Kahmi & Catts, 1986; Kirchner & Klatzky, 1985). Short-term verbal memory deficits have been found for verbal material such as digits, word and sentence strings in speech-impaired children (Saxman & Miller, 1973; Smith, 1967). In another study using the Seashore tests of auditory perception and auditory memory (Seashore, Lewis, & Saetveit, 1939), speech-impaired children were deficient relative to controls on two tests of nonverbal auditory memory, the Tonal Memory Test and the Rhythm Test (Bergendal & Talo, 1969). Therefore, results in speech-impaired children have indicated that both nonverbal auditory memory as well as verbal memory impairments may be associated with developmental speech disorders.

Our purpose was to examine speech perception and short-term memory functioning for verbal and non-verbal auditory stimuli in adults with residual speech impairments resulting from a developmental familial speech disorder. We chose not to study non-speech auditory perception because deficits in non-speech auditory perception are not related to impaired speech perception (Nittrouer, 1999; Rosen, 2003; Rosen & Manganari, 2001). By studying adults with a history of a familial speech disorder, speech perception and short-term verbal and nonverbal memory deficits can be assessed independent of developmental delays. Each of these adults had developed normally in other areas; all were gainfully employed and leading productive lives. The aim was to examine whether speech perception and short-term memory difficulties are part of a phenotype of a persistent familial speech disorder. Based on previous research, we hypothesized that adults with a persistent speech disorder would have related deficits in both speech perception and short-term memory for verbal and nonverbal auditory stimuli.

2. Methods

2.1. Participants

Nine adults (mean age = 49 years; range 22 to 66) with both an individual and family history of a developmental speech disorder and 20 adult controls (mean age = 39 years; range 18 to 56) participated in this study. Participants were recruited for the research by advertising in local newspapers and through the NIH Recruitment Office for families with several members having speech disorders. Participants in the control group were recruited by advertising the study through the Normal Volunteer Office at the NIH.

All participants were interviewed over the telephone and excluded if they were not a native English speaker or described other problems in addition to a speech disorder. After passing the telephone screening, participants underwent informed consent at the NIH to participate in a study approved by the Institutional Review Board of the National Institute of Neurological Disorders and Stroke. Each adult received a history and physical examination from a physician prior to testing. Personal and family speech/language histories as well as general health and developmental histories were obtained by a speech-language pathologist who asked each participant about difficulties with language learning and excluded those persons reporting language problems. Following this, formal diagnostic testing was used to further exclude families and individuals with hearing, cognitive or language disorders (see below).

Participants in both groups were native monolingual speakers of American English without exposure to another language in the home during childhood. Participants in the control group did not have a family history of speech disorders. All had normal hearing on audiometric screening and none reported use of psychoactive drugs or any history of neurological or psychiatric disorders. Several of the speech-impaired adults were related to each other (Table 1); P1 and P2 were brothers, P3 and P4 were sisters, and P5 and P6 were father and son. The speech-impaired participants had received different amounts of therapy for their disorders in childhood (Table 1).

Table 1
Characteristics of adults with a persistent developmental speech disorder.

2.2. Characterization of Speech and Language

Diagnostic testing was administered by a speech-language pathologist to eliminate participants with language and nonverbal cognitive deficits and assign participants to groups. Because extensive developmental speech and language histories were acquired before admitting persons or families to the study, few were excluded because they demonstrated language or cognitive deficits on formal testing. The diagnostic battery included: Peabody Picture Vocabulary Test III (PPVT-III) (Dunn, 1959), Expressive Vocabulary Test (EVT) (Williams, 1997), Test of Auditory Comprehension of Language-Grammatical Morpheme Subtest (TACL-3) (Carrow-Woolfolk, 1999), Oral Speech Mechanism Screening Examination (St. Louis & Riscello, 1981), Revised Token Test (RTT) (McNeil & Prescott, 1978), Test of Nonverbal Intelligence (TONI-2) (Brown, Sherbenou, & Johnsen, 1990), and the Goldman-Fristoe Test of Articulation (Goldman & Fristoe, 1986).

Participants with a persistent familial speech disorder had at least one consistent articulation error, a history of multiple speech articulation errors as a child, and at least one immediate family member with a history of a multiple-sound speech disorder. Each of the nine speech-impaired adults had only expressive speech errors and was within normal limits on the measures of syntactic and morphological comprehension and receptive vocabulary (Table 2). The groups were comparable in cognitive abilities, expressive vocabulary, grammatical and morphological comprehension. The speech-impaired participants scored within the impaired range of articulation as defined by the Goldman-Fristoe test norms (Table 2).

Table 2
Scores of control and speech-impaired participants on the diagnostic test battery used for group assignment.

2.3. Synthesized Speech Stimuli

We tested whether or not the participants could use temporal and frequency-based acoustic cues to identify the words “say” and “stay.” Both changes in the first formant frequency (a spectral-based cue) and the duration of the stop gap (a temporal cue) can lead to the perception of a stop consonant between [s] and [eI] for the identification of “stay”. At one end of a continuum, the stimuli are typically labeled as “say” and at the other end the stimuli are identified as “stay.” To assess an individual’s ability to use spectral cues for word identification, we varied the change in frequency of the first formant between the initial consonant (i.e., [s]) and following vocalic portion ([eI]). The first formant transition signals the presence or absence of a stop consonant based on its frequency change. A formant transition with a high onset frequency (e.g., ~611 Hz) between [s] and [eI] has no frequency change and tends to be perceived as “say” while one with a low onset frequency (e.g., ~211 Hz) has a large frequency change and usually is perceived as “stay.” The lower the onset frequency of the first formant, therefore, the greater the frequency change relative to the steady state of the final vowel. To determine the amount of frequency change in the first formant required for an individual to identify “stay,” two sets of synthesized speech stimuli varied the onset frequency of the first formant. One set had a stop gap duration of 20 ms while the other had a stop gap of 40 ms. If an individual had difficulty detecting frequency change, they required a lower onset frequency to identify the word “stay.”

Pilot testing in normal adults on the first formant continuum with the 20-ms stop gap indicated that the 50% boundary frequency was between 311 and 361 Hz, at the lower end of the continuum, and that the task was difficult for many adults. To adjust for the decreased sensitivity of normal adults, we included a continuum with a longer stop gap duration (i.e., 40 ms) that placed the 50% boundary in the middle of the continuum closer to 411 Hz. In the current study, adults received both the 20-ms and 40-ms gap continua to determine whether there were discrepancies in sensitivity to frequency transitions between controls and speech-impaired adults.

To assess an individual’s ability to use temporal cues for word identification, the duration of the silent gap between the initial [s] consonant and vocalic portion [eI] was varied to determine the length required for an individual to identify “stay.” Here, the duration of the stop gap signals the presence or absence of a stop consonant. A synthesized speech token with a short stop gap duration (≤ 20 ms) between [s] and [eI] sounds most like “say” while longer stop gap durations tend to be perceived as “stay.” An individual who has difficulty detecting stop gaps may require a longer stop gap in order to perceive the word “stay.”

The use of two types of continua, one varying frequency and the other varying time, makes it possible to individually manipulate two types of cues using the same basic stimuli (i.e., “say-stay”). The measures derived from participants’ identification of individual tokens were the boundary frequency (50% crossover frequency) for the identification of “stay” instead of “say” and the slope of the identification functions. These measures determined if participants could use acoustic variations in frequency and time for word identification. The speech continua used in the current study were modeled after previous publications (Nittrouer, 1992, 1999).

2.4. Frequency-based cues

The two “say-stay” continua varying only the first formant onset frequency contained a natural sample of [s] frication noise (120 ms), a stop gap of either 40 (Figure 1a–b) or 20 ms, followed by one of nine different first formant transitions during the first 40 ms of the vowel following the stop gap. The first formant onset frequency varied from the lowest starting frequency of 211 Hz (Figure 1a) in successive increments of 50 Hz up to 611 Hz (Figure 1b). After the 40 ms transition, the first formant reached 611 Hz, where it remained for 120 ms and then fell to 304 Hz over 90 ms, where it stayed for the final 50 ms. During the vocalic portion of all stimuli, the fundamental frequency fell from 120 to 100 Hz and the third formant fell through the first 40 ms, from 3196 Hz to 2694 Hz, remained there for 120 ms, then rose to 2929 Hz over 90 ms, and remained there for the next 50 ms. The second formant remained constant at 1840 Hz during the first 160 ms, and then rose to 2240 Hz over the next 90 ms, where it remained for the final 50 ms.
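The first formant description above can be summarized in a short sketch. The onset frequencies, targets, and segment durations are taken from the text; the function names and the use of straight-line interpolation between targets are assumptions made for illustration, since the synthesis procedure itself is not described here.

```python
# Sketch of the first-formant (F1) continuum described in the text.
# Function names are illustrative; linear interpolation is an assumption.

def f1_onsets(lowest_hz=211, step_hz=50, n_steps=9):
    """Return the nine F1 onset frequencies, from 211 Hz up to 611 Hz."""
    return [lowest_hz + i * step_hz for i in range(n_steps)]


def f1_track(onset_hz, t_ms):
    """F1 value (Hz) at time t_ms into the 300-ms vocalic portion."""
    # Each segment: (start_ms, end_ms, start_hz, end_hz)
    segments = [
        (0, 40, onset_hz, 611),   # 40-ms transition up to 611 Hz
        (40, 160, 611, 611),      # steady at 611 Hz for 120 ms
        (160, 250, 611, 304),     # falls to 304 Hz over 90 ms
        (250, 300, 304, 304),     # holds at 304 Hz for the final 50 ms
    ]
    for start, end, f_start, f_end in segments:
        if start <= t_ms <= end:
            frac = (t_ms - start) / (end - start)
            return f_start + frac * (f_end - f_start)
    raise ValueError("t_ms must lie within the 300-ms vocalic portion")


if __name__ == "__main__":
    for onset in f1_onsets():
        # Size of the F1 change across the 40-ms transition
        print(onset, 611 - onset)
```

Under these assumptions the 611 Hz endpoint has no frequency change during the transition, while the 211 Hz endpoint rises by 400 Hz, which is the contrast the continuum is built to vary.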

Figure 1
Spectrograms of endpoint stimuli from the “say-stay” continua including: the 211 Hz (a) and the 611 Hz (b) onset frequencies from the 40-ms continuum varying in first formant onset frequency and the 0 ms (c) and 104-ms (d) stop gap durations ...

2.5. Temporal Cues

A second set of “say-stay” stimuli (Figure 1c–d) varied the stop gap duration (i.e., the silent interval following the [s] frication and preceding the first formant transition) to determine if participants had difficulties with identification based on a temporal cue (stop gap). The initial [s] frication noise (120 ms) was followed by a silent (stop) gap varying in duration between 0 and 104 ms in 8-ms intervals, yielding 14 different stimuli. Longer stop gap durations provide a larger cue for a stop plosive consonant like [t] than shorter stop gap durations. The stop gap was followed by the same 300-ms vocalic portion used for the continua varying in the first formant onset frequency. A constant onset frequency of 411 Hz was used to provide a cue that was perceptually unbiased toward either the “say” or “stay” end of the continuum. The second and third formants were as described above.

2.6. Speech Perception Testing

On each trial for both the stop gap and formant transition, the participant was asked to point to the orthographic spelling of the word they perceived (“say” versus “stay”). Before presentation of the experimental stimuli, participants were given 20 practice items, 10 each of the two endpoint stimuli, to familiarize them with the endpoints of the stimuli. Because the participants were adults, no effort was made to teach the correct identification and no feedback was provided regarding response accuracy.

Experimental presentation of stimuli was controlled by the computer and occurred at an interstimulus interval of seven seconds. Stimuli were presented via David Clark H10-00 headphones at a comfortable loudness level using a Dell OptiPlex GXI computer. Stimulus presentation was controlled by ECos/Win (AVAAZ Innovations, 1994) software designed to implement listening experiments based on a set of specified parameters (e.g., number, order, and timing of acoustic stimulus presentations). The experimenter registered responses directly to the computer using a mouse click on the appropriate screen. Black on white orthographic spellings (72-point font) of the words “say” and “stay” were printed in capital letters on a sheet of paper separated in the middle by a heavy black line.

The presentation order of the three continua was randomized across participants. For each of the two continua varying in the first formant onset frequency, five blocks of 18 randomly ordered stimuli were presented (total of 90 stimuli) and for the stop gap continuum, seven blocks of 20 randomly ordered stimuli (total of 140 stimuli) were presented. Each stimulus within a continuum was presented ten times. Participants were given an opportunity to rest between any of the blocks.
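The block arithmetic above (9 stimuli × 10 presentations = 90 trials in 5 blocks of 18; 14 stimuli × 10 presentations = 140 trials in 7 blocks of 20) can be checked with a small sketch. How the presentation software actually randomized trials within blocks is not stated, so shuffling the full trial list and then cutting it into blocks is an assumption, and `make_blocks` is a hypothetical helper.

```python
import random


def make_blocks(stimuli, repetitions, block_size, seed=None):
    """Repeat each stimulus, shuffle all trials, and split into blocks.

    Shuffling across the whole list (rather than within each block)
    is an assumption; the text says only that stimuli were randomly ordered.
    """
    rng = random.Random(seed)
    trials = [s for s in stimuli for _ in range(repetitions)]
    rng.shuffle(trials)
    return [trials[i:i + block_size] for i in range(0, len(trials), block_size)]


# Continua as described in the text.
f1_continuum = list(range(211, 612, 50))   # F1 onsets in Hz: 211 ... 611
gap_continuum = list(range(0, 105, 8))     # stop gaps in ms: 0 ... 104

f1_blocks = make_blocks(f1_continuum, repetitions=10, block_size=18, seed=0)
gap_blocks = make_blocks(gap_continuum, repetitions=10, block_size=20, seed=0)

print(len(f1_continuum), len(f1_blocks))    # 9 stimuli, 5 blocks
print(len(gap_continuum), len(gap_blocks))  # 14 stimuli, 7 blocks
```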

Participants labeled all tokens of each stimulus on a continuum so that the probability of a given response at each step on the continuum (e.g., “stay”) could be determined and the identification functions graphed. Hypothetically, a flat slope on an identification function with all points close to the 50% level would be expected if a participant could not process the cue and was guessing the stimulus identity. If a participant could use the acoustic cue normally for word identification, the identification function would have a steep slope reflecting a categorical perception between “say” and “stay”. For each participant, a probit analysis (Systat11, 2004) was performed for each of the three synthesized speech continua to calculate the slope of the identification functions and the 50% crossover point. The participants’ slope values were used for group comparisons. Each participant received all three continua except for one speech-impaired participant who was unavailable for the first formant continuum with the 20-ms gap.
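To illustrate what a probit analysis of one identification function yields, the following sketch fits P(response) = Φ(a + b·x) to response counts by Fisher scoring and returns the slope b and the 50% crossover −a/b. This is not the Systat procedure used in the study; the function name, the fitting method, and the example counts are all hypothetical, and the sign of the slope depends on which response ("say" or "stay") is being modelled.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: cdf is the probit link inverse


def fit_probit(levels, n_response, n_trials, iterations=50):
    """Fit P(response) = Phi(a + b * level); return (slope, crossover).

    Uses Fisher scoring (iteratively reweighted least squares) with the
    stimulus levels centred for numerical stability.
    """
    mean_x = sum(levels) / len(levels)
    xs = [x - mean_x for x in levels]
    a, b = 0.0, 0.0
    for _ in range(iterations):
        sw = swx = swxx = swz = swxz = 0.0
        for x, k, n in zip(xs, n_response, n_trials):
            eta = a + b * x
            # Clip to avoid division by zero at extreme predictions.
            p = min(max(_STD_NORMAL.cdf(eta), 1e-9), 1 - 1e-9)
            dens = max(_STD_NORMAL.pdf(eta), 1e-9)
            w = n * dens * dens / (p * (1 - p))   # Fisher weight
            z = eta + (k / n - p) / dens          # working response
            sw += w
            swx += w * x
            swxx += w * x * x
            swz += w * z
            swxz += w * x * z
        det = sw * swxx - swx * swx
        a = (swxx * swz - swx * swxz) / det
        b = (sw * swxz - swx * swz) / det
    intercept = a - b * mean_x        # undo the centring
    return b, -intercept / b          # slope and 50% crossover point


if __name__ == "__main__":
    # Hypothetical "stay" counts out of 10 at each F1 onset (not study data).
    onsets = [211, 261, 311, 361, 411, 461, 511, 561, 611]
    stay_counts = [10, 10, 9, 7, 5, 2, 1, 0, 0]
    slope, crossover = fit_probit(onsets, stay_counts, [10] * 9)
    print(slope, crossover)
```

A participant who is guessing would produce counts near 5 of 10 at every step, which drives the fitted slope toward zero; a sharply categorical listener produces a slope of larger magnitude.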

2.7. Non-Verbal Auditory Short-term Memory Testing

The nonverbal auditory short-term memory tests were the Rhythm Test for temporal patterns and the Tonal Memory Test for frequency information. The Rhythm Test (Reitan Neuropsychology Laboratory) was a tape-recorded copy of the Rhythm Discrimination Subtest of the Seashore Tests of Musical Abilities (Saetveit, Lewis, & Seashore, 1940; Seashore et al., 1939) containing pairs of 5-, 6-, and 7-note rhythmic patterns requiring “same” or “different” judgments. The tones were 50 ms beeps of constant frequency of 500 Hz. The tempo, meter, accent placement and number of notes were constant for each pair with different intervals between notes. The tones in a five-note rhythmic pattern were separated by intervals of 130, 265, and 400 ms while the six- and seven-note patterns also contained intervals of 525 ms. The pattern of intervals between the notes in a pair of rhythmic patterns was either the same or different. The experimenter first described the stimuli and the need to make a same-different judgment to participants. Three sample items preceded presentation of the test stimuli.

The Tonal Memory Subtest of the Seashore Tests of Musical Abilities was used to examine frequency-based nonverbal short-term memory. Series A of The Tonal Memory Subtest was reconstructed based on published descriptions and musical specifications (Saetveit et al., 1940; Seashore et al., 1939) using the Practica Musica software (Evans, 1989). Thirty pairs of stimuli, 10 each of three, four and five tones (of 250 ms each) in sequences varied only in frequency, requiring the participant to remember the melody. The second sequence in a pair always contained one note whose frequency differed from the first. Participants were asked to give the tone number (i.e., first, second, third, fourth, or fifth in the sequence) that differed between the two patterns. Three sample items preceded formal testing.

2.8. Verbal Short-Term Memory

Because performance on the Wechsler Digit Span Subtest was previously shown by Baddeley and colleagues (Baddeley, 1998) to correlate with performance on a non-word repetition test, the Digit Span Subtest from the Wechsler Adult Intelligence Scale (Wechsler, 1955) was administered as a test of short-term verbal memory and phonological working memory. By using digits rather than non-words, scores were more likely to reflect memory rather than speech production errors in the speech impaired participants.

2.9. Statistical Analyses

Analysis of variance/covariance techniques were used to assess differences between the speech-impaired and control groups. For some measures it was necessary to transform the original data to better meet the assumptions of normality and homogeneity of variance across groups. A logarithmic transformation (base e) was employed for the Tonal Memory scores and for the slopes of the three “say-stay” continua. For the 20- and 40-ms slopes, the original data were first altered by adding .004 (the smallest non-zero value for each variable) to all values before taking the natural log. For the stop gap slopes, because all the values were negative, they were multiplied by −1 before taking logarithms and then multiplied by −1 again after the transformation. For tonal memory, one observation from the control population was removed because this score was more than 5.5 standard deviations (computed after the log transformation and without this aberrant case) away from the mean for the controls and more than 3.5 standard deviations below any of the other control or patient measurements. The results were qualitatively similar whether or not this individual was included. This individual’s results were omitted because they were unusual enough to raise doubt about whether the individual truly belonged to the population of normally hearing individuals.
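The two slope transformations described above can be written out explicitly. The offset of .004 and the double sign change come from the text; the function names are illustrative only.

```python
import math


def log_formant_slope(slope, offset=0.004):
    """Natural log of a 20- or 40-ms continuum slope after adding the offset.

    The offset (.004 in the text) keeps zero slopes inside the log's domain.
    """
    return math.log(slope + offset)


def log_gap_slope(slope):
    """Signed log transform for the (all negative) stop gap slopes.

    Multiply by -1, take the natural log, then multiply by -1 again,
    i.e. -ln(-slope).
    """
    if slope >= 0:
        raise ValueError("this transform is defined only for negative slopes")
    return -math.log(-slope)


if __name__ == "__main__":
    print(log_formant_slope(0.0))   # equals ln(0.004)
    print(log_gap_slope(-0.5))      # equals -ln(0.5) = ln(2)
```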

To determine whether the two groups differed on speech perception testing, the slopes for each of the three “say-stay” continua were compared using an analysis of covariance with disordered/control group as a factor and age as covariate. To determine if the groups differed in auditory and verbal short-term memory skills, group comparisons were conducted on scores for the Rhythm Test, the Tonal Memory Test, and the Digit Span Subtest using the same model. Because there was some evidence of an age difference between groups (p = 0.04 based upon a two-group t-test), age was included as a covariate in all analyses. Effect sizes were computed as the absolute value of the difference in the group averages (after correcting for any age effect) divided by the estimated standard deviation of error in the analysis of covariance model. The alpha level required for statistical significance was Bonferroni corrected to 0.008 (0.05/6 = 0.0083) because of the six group comparisons. Spearman rank correlations were calculated to determine the relationship between articulation impairment and perceptual and short-term memory test scores within the disordered group; a Bonferroni corrected alpha value of .0083 was used for statistical significance. Finally, Spearman rank correlations were computed to determine the relationship between performance on the memory and speech perception tests, and statistical significance was determined using a Bonferroni corrected alpha level of 0.0056 (0.05/9 = 0.0056).
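The effect size definition and the Bonferroni thresholds above can be made concrete with a minimal sketch. It assumes the analysis of covariance is the usual parallel-slopes linear model, score = b0 + b1·group + b2·age, so that b1 is the age-corrected group difference and the error standard deviation is the square root of the residual mean square. The function names are hypothetical, and this is not the statistical software used in the study.

```python
import math


def bonferroni_alpha(family_alpha, n_comparisons):
    """Per-comparison alpha under a Bonferroni correction."""
    return family_alpha / n_comparisons


def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ancova_effect_size(scores, groups, ages):
    """Least-squares fit of score = b0 + b1*group + b2*age.

    groups: 0 = control, 1 = speech-impaired (coding is an assumption).
    Returns (age-adjusted group difference, residual SD, effect size).
    """
    rows = [[1.0, float(g), float(a)] for g, a in zip(groups, ages)]
    k = 3  # number of fitted coefficients
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, scores)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    residuals = [y - sum(b * v for b, v in zip(beta, r))
                 for r, y in zip(rows, scores)]
    sse = sum(e * e for e in residuals)
    resid_sd = math.sqrt(sse / (len(scores) - k))
    return beta[1], resid_sd, abs(beta[1]) / resid_sd


if __name__ == "__main__":
    print(round(bonferroni_alpha(0.05, 6), 4))  # six group comparisons
    print(round(bonferroni_alpha(0.05, 9), 4))  # nine correlations
```

Dividing by the residual standard deviation rather than a pooled raw standard deviation means the effect size is expressed relative to the variability left over after age has been accounted for.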

3. Results

3.1. First formant starting frequency variation

The mean phoneme boundary on the “say-stay” continuum with the 40-ms gap was between 311 and 361 Hz for speech-impaired adults and between 361 and 411 Hz for controls, indicating that the impaired participants required a larger frequency change in order to identify “stay.” The impaired adults made fewer “stay” responses at the 211 Hz onset frequency and more “stay” responses at the 611 Hz onset frequency compared to controls (Figure 2a). For the “say-stay” continuum with the 40-ms gap, the analysis of covariance indicated a significant group difference (Bonferroni corrected alpha level = 0.008) in the slopes of the participants’ identification functions (p = 0.001, effect size = 1.60). The group variability in the speech-impaired adults on the 40-ms gap continuum was patterned differently from the controls, particularly at the endpoints of the continuum. The control participants exhibited low within-group variability at the endpoints of the continuum, while their variability increased at the midpoints. However, across the impaired participants, variability was relatively high at all onset frequencies (Figure 2a).

Figure 2
Percentage “stay” responses of speech-impaired (solid black line) and control (dotted gray line) participants as a function of first formant onset frequency on the 40-ms continuum (a), first formant onset frequency on the 20 ms continuum ...
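As an illustration of how a phoneme boundary interval such as “between 361 and 411 Hz” can be read off an identification function, the following Python sketch finds the adjacent pair of onset frequencies between which the percentage of “stay” responses drops below 50%. The function name, the 50% criterion, the 50-Hz step size, and the response percentages are all assumptions made for this example; they are not taken from the study data.

```python
def boundary_interval(onset_hz, pct_stay, criterion=50.0):
    """Return the adjacent pair of onset frequencies between which the
    percentage of "stay" responses falls from >= criterion to < criterion,
    or None if the function never crosses the criterion."""
    for i in range(len(onset_hz) - 1):
        if pct_stay[i] >= criterion > pct_stay[i + 1]:
            return onset_hz[i], onset_hz[i + 1]
    return None


# Hypothetical continuum from 211 to 611 Hz in 50-Hz steps.
onsets = [211, 261, 311, 361, 411, 461, 511, 561, 611]

# Hypothetical identification functions ("stay" is heard at low onset frequencies).
steep_function = [98, 96, 90, 65, 30, 10, 5, 2, 1]
shallow_function = [80, 70, 55, 45, 40, 35, 30, 25, 20]

print(boundary_interval(onsets, steep_function))
print(boundary_interval(onsets, shallow_function))
```

A shallow function, like the second hypothetical example, can cross the criterion at a different point and also never reach the floor or ceiling at the endpoints, which is the pattern that a slope comparison is designed to capture.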

For the 20-ms gap continuum, the average phoneme boundary for both groups was between 311 and 361 Hz. The impaired participants made fewer “stay” responses at the 211 Hz onset frequency (Figure 2b) and more “stay” responses to the 611 Hz onset frequency compared to controls. The analysis of covariance model indicated a significant group difference in the slopes of the participants’ identification functions (p = 0.002, effect size = 1.56). As noted for the 40-ms gap continuum, the speech-impaired participants’ responses were more variable relative to controls, particularly at the lowest starting frequencies.

3.2. Stop gap variation

Both groups performed similarly on the stop gap duration continuum (Figure 2c), although there was a slight tendency for the speech-impaired participants to make more “stay” responses at the shortest stop gap durations. Analysis of covariance demonstrated no significant group difference in the slopes of the participants’ identification functions (p = 0.15, effect size = 0.65). The average phoneme boundary was between 32 and 40 ms for the speech-impaired participants and between 40 and 48 ms for the controls and the variability was similar for both groups.

3.3. Individual Participant’s Task Performance

Two speech-impaired participants had significant difficulty detecting contrasts on the continua varying first formant transitions. Participant 3 was unable to perceive any stimuli as “stay” on either the 20-ms or 40-ms continuum while Participant 5 had the same problem only on the 20-ms continuum. None of the control participants reported similar difficulties perceiving stimuli with the lowest first formant onset frequencies as “stay” or the highest onset frequencies as “say.” However, both speech-impaired participants exhibited responses that were indistinguishable from the controls on the “say-stay” continuum varying stop gap duration. Because these speech-impaired participants demonstrated understanding of the task requirements on the stop gap continuum, their responses on the formant frequency transitions were included in the data.

3.4. Non-verbal Auditory Short-Term Memory

The speech-impaired participants were less accurate in remembering rhythmic patterns (Figure 3). A group comparison using a combined score for each participant across the 5-, 6-, and 7-tone sequences on the Rhythm Test was significant (p < .001, effect size = 2.38). The 5-tone sequence, which was the easiest of the three sets, showed the most difference between the groups, in part because controls had relatively more difficulty on longer sequences (Figure 3).

Figure 3
Box and whisker plot of percent correct responses at three levels of difficulty (5-, 6-, and 7-tone sequences) on the Rhythm Test administered to speech-impaired and control participants. The center line within the box indicates the median of the sample. ...

The speech-impaired participants also had greater difficulty on the Tonal Memory Test (Figure 4). Significant group differences occurred when participants’ mean values across the three lengths of tone sequences were compared (p = .002, effect size = 1.54). Greater group differences were noted as task difficulty increased.

Figure 4
Box and whisker plot of percent correct responses at three levels of difficulty (3-, 4-, and 5-tone sequences) on the Tonal Memory Test administered to speech-impaired and control participants. The center line within the box indicates the median of the ...

3.5. Verbal Short-Term Memory

The average percentile rank on the WAIS Digit Span subtest for the speech-impaired participants was 38, compared to the controls’ average of 70. An analysis of covariance model indicated a significant difference between the groups (p < .001, effect size = 1.73). When the raw scores for digits forwards and digits backwards were examined as repeated factors in an ANOVA, no interaction was found between the group effect and whether scores were for digits forwards or backwards, indicating that similar group effects occurred on both digits forwards and digits backwards (Figure 5).

Figure 5
Box and whisker plots of the raw scores for digit forwards (in black) and backwards (in grey) on the Digit Span Subtest of the Wechsler Adult Intelligence Scale administered to speech-impaired and control participants. The center line within the box indicates ...

3.6. Correlations with Speech Impairment

Spearman rank correlations were computed between speech articulation test scores and speech perception and memory test scores within the speech-impaired adults (Table 3). None of the rs values were significant at the Bonferroni corrected alpha level of 0.008, indicating no reliable relationship between articulation scores and perception or memory performance. A non-significant trend was found between speech impairment and the Digit Span Subtest (rs = −0.724, p < 0.05), indicating that poorer articulation might be associated with an increased short-term memory span. A scatterplot of percentile scores on the Goldman-Fristoe Test of Articulation with Digit Span test scores indicated that those with reduction errors tended to have lower digit span scores and higher articulation test scores than those with distortion errors (Figure 6).

Figure 6
Plots of the relationship between Digit Span scores and percentile scores on the Goldman-Fristoe test of Articulation for speech–impaired adults who had either distortion errors (circles) or reduction errors (X’s) in their speech.
Table 3
Measures of correlation between degree of speech impairment (percentile rank score on the Goldman-Fristoe test of Articulation) and perceptual and memory test scores within the speech-impaired group. P-values for the two-sided hypothesis of zero correlation ...
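For readers unfamiliar with the statistic, a Spearman rank correlation is the Pearson correlation of the ranks of the two variables, with tied values given the average of the ranks they span. The Python sketch below shows that computation under those standard definitions; the function names and example values are hypothetical and do not reproduce the study data.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical articulation percentiles and digit span scores.
articulation = [1, 2, 3, 4, 5]
digit_span = [2, 1, 4, 3, 5]
print(spearman_rs(articulation, digit_span))
```

Because only the ordering of scores matters, the statistic is unaffected by monotonic transformations such as the logarithms applied to the slope data, which makes it a reasonable choice for a group of nine participants.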

3.7. Types of Speech Errors

As mentioned above, both participants 3 and 5 had difficulties discriminating “say” and “stay”. Both of these participants had speech distortion errors rather than speech reduction errors. To determine whether the type of speech error was related to performance on the speech perception and nonverbal and verbal short-term memory tasks, we compared these subgroups of speech-impaired adults on each of the tasks using an ANOVA with the log transformed data. None of the contrasts had a resulting p value ≤ 0.008. However, there was a non-significant trend for short-term memory performance to be lower in those with reduction speech errors (mean score = 24.0) than in those with distortion errors (mean score = 49.8), with an effect size of 1.85.

3.8. Correlations with Short-term Memory

Spearman rank correlations were computed on short-term memory scores and the slope values from the speech perception testing within the speech-impaired adults (Table 4). None of the correlations met the Bonferroni corrected significance level of 0.0056. The highest correlations (rs > 0.83) were obtained between the slope for the 40-ms continuum for first formant transitions and the Tonal Memory scores, indicating that good discrimination between “say” and “stay” based on first formant transitions may be related to good performance on Tonal Memory. Both of the frequency-based speech perception continua had positive correlations with the Tonal Memory Test. On the other hand, the stop gap duration continuum, on which the speech-impaired participants did not differ from the controls, showed a negative correlation with Tonal Memory scores.

Table 4
Rank correlation coefficients between performance on the speech perception tests and performance on the tests of auditory and verbal short term memory within the speech-impaired group. P-values for the hypothesis that the correlation is 0 are shown in ...

4. Discussion

The adults with speech sound disorders had difficulty identifying tokens of the words “say” and “stay” when the contrast was cued solely by variations in first formant transition onset frequency. For both the 20 ms and 40 ms continua, the speech-impaired participants had significantly shallower slopes on their identification functions, suggesting greater difficulty with using spectral cues to develop categorical boundaries between the two words. Furthermore, two speech-impaired adults were unable to perceive “stay” for any of the stimuli when the first formant onset frequency was varied, while none of the control participants had similar difficulties. The speech-impaired participants needed a greater frequency change to identify a stop consonant such as “t”; even when the size of the frequency change increased at onset frequencies around 211 Hz, the speech-impaired adults still had a lower percentage of “stay” responses than the controls.

There were methodological shortcomings of this study: only nine speech-impaired adults were studied, and some of the adults came from the same families. We selected the adults very carefully to assure that they represented a homogeneous sample. Because three pairs were related, it could be argued that the participants were not independent; however, we found no relationship between the results of pairs from the same family. For example, the most severely affected participants on the speech perception tasks were participants 3 and 5, who came from different families. Moreover, we only studied speech perception using one particular set of words, “stay” and “say”. We do not know whether the same results would have been found on other types of speech discrimination tasks such as differentiating “ba” from “da”.

The results presented here are the first to indicate that adults with persistent familial speech disorders with normal language functioning also have difficulty with word identification using spectral cues such as formant transitions. Similar to Nittrouer (1992), we found fewer perceptions of “stay” when the stop gap was reduced from 40 to 20 ms in the adult controls. Nittrouer (1992) found that children were more sensitive to the formant transition cue than the stop gap duration cue. Furthermore the children’s identification functions were shifted to the right of the adult functions, towards higher onset frequencies. Therefore, our speech-impaired adults did not demonstrate patterns similar to unimpaired children and the concept of a developmental delay does not fit the pattern of perceptual deficits displayed by the speech-impaired participants.

The control and speech-impaired participants performed similarly on identification of “say-stay” when using a temporally-based cue, i.e., stop gap duration. This may suggest that the speech-impaired adults needed stop gap cues in addition to spectral cues to perceive the categories correctly. Stop gap detection in speech stimuli has been less well studied than place of articulation as a speech perception cue in language and speech-impaired children. Only a few studies have manipulated this temporal cue independently of spectral changes. One study found that adults with familial dyslexia required longer silence durations than controls to identify stimuli as [sta] (Steffens, Eilers, Gross-Glenn, & Jallad, 1992). Tallal and Stark (1981) reported that while significant differences were not observed between controls and language-impaired children on discriminating [sa]-[sta], three times as many language-impaired children as controls failed to reach training criteria using these stimuli.

Some differences between our findings and those of previous investigators may reflect differences in the underlying populations and/or methodologies. For example, others used synthesized [sa]-[sta] stimuli, which are non-meaningful, compared to the meaningful “say-stay” stimuli used in this study (Steffens et al., 1992; Tallal & Stark, 1981). However, our speech-impaired participants did not benefit from the meaningful stimuli when the perception involved brief formant transitions, as in the “say-stay” continua varying first formant transitions. Thus, these results suggest that small spectral cues may be difficult for these participants. The ability of the speech-impaired adults to categorize the stimuli similarly to controls using the stop gap durations indicates that their difficulty was not with categorical perception per se, although they may need redundant cues to identify speech sounds. Further, because the task demands for all three continua were exactly the same, the stop gap results indicate that both speech-impaired and control participants understood the task equally well and could perform the task.

The differences between the speech-impaired and control participants were not due to differences in language or nonverbal intelligence, which were controlled through participant selection. However, the results of the Rhythm Test and Tonal Memory Test indicated that the speech-impaired adults had poor short-term auditory memory skills compared to controls. The Tonal Memory Test (Seashore et al., 1939) is based on memory for tones that are sufficiently long in duration (250 ms) to provide an adequate window for encoding (Tallal & Piercy, 1973), but differ in frequency.

Our finding of impairments on the Digit Span Subtest in speech-impaired adults is consistent with a previous study that found deficits in short-term verbal memory in speech-impaired children (Smith, 1967). However, the short-term memory deficits of these speech-impaired adults were not limited to phonological stimuli, given the deficits found with non-speech stimuli (i.e., the Rhythm Test and Tonal Memory Test results). The phonological working memory model does not separate phonological memory from the perceptual processing skills required for accurately identifying the phonological representations, thus confounding measurement of memory and perceptual abilities (Bowey, 1997). It would appear that both verbal and nonverbal auditory memory and speech perception are affected in these speech-impaired adults.

These results in speech-impaired adults are similar to some reports of phonologically-impaired children having speech perception difficulties (Bird & Bishop, 1992; Broen et al., 1983; Hoffman et al., 1985; Rvachew & Jamieson, 1989). In each of these studies of speech-impaired children, however, the authors noted differences among the speech-impaired children, with some having perceptual difficulties and others performing normally. None of these studies determined whether children had family histories of speech disorders, and we do not know whether some or all of these children eventually recovered from their speech disorders later in childhood. Neither verbal nor nonverbal memory was assessed in the children with speech perception difficulties, so it is unknown whether short-term memory deficits co-existed with speech perception difficulties in these children. On the other hand, Stark and Tallal (1988) compared children with speech articulation disorders with normal children and language-impaired children on a variety of speech, language, auditory processing, memory, speech perception, motor and sensorimotor skills. Many of their speech-impaired children previously had delays in language development, and 21/36 came from families with histories of speech and/or language disorders. The speech-impaired participants did not differ from the normal control group on speech discrimination testing on the /ba/-/da/ stimulus pair with formant transitions of 40 ms. However, the speech-impaired participants were significantly impaired relative to the control group on the Serial Memory subtest for cross-modal stimuli incorporating three elements (Stark & Tallal, 1988). Although the authors profiled the speech-impaired participants as primarily having deficits in motor performance, the children were not impaired on the diadochokinetic speech test, suggesting that the children did not have speech apraxia.
The description of the children’s speech articulation errors was limited to the Templin-Darley test scores, with no description of the types of speech errors. This was the only study incorporating both speech discrimination and short-term memory in children with speech errors, and it found deficits only on short-term serial memory. Having found both speech perception and verbal and nonverbal memory deficits in adults with residual speech errors, we hypothesize that our participants may have been those most impaired on both speech perception and verbal and nonverbal memory as children.

Other studies have examined short-term memory for verbal material in speech-impaired children (Saxman & Miller, 1973; Smith, 1967), although Saxman and Miller (1973) concluded that diminished linguistic ability could account for differences between the speech-impaired and control groups better than short-term memory could. Bergendal and Talo (1969) examined short-term memory skills using nonverbal stimuli in the Seashore tests, similar to our study. They studied severely speech delayed children with multiple speech errors, normal vocabularies and significant grammatical errors in their sentences, suggesting that they were language-impaired. The estimated effect sizes of differences found between the normal and speech-impaired children were 2.2 on the Tonal Memory Test and 1.45 on the Rhythm Test (Bergendal & Talo, 1969), similar to those found here in our speech-impaired adults.

Thus, speech perception difficulties and verbal and nonverbal memory deficits have been reported separately in studies of speech-impaired children, some of whom may have had language delays. However, both sets of skills (speech perception and short-term memory) have not been assessed together in most studies and have not been found to co-exist in speech-impaired children as a group. Furthermore, none of the studies in children selected those with family histories or followed such children to determine if they continued to have residual speech disorders into adulthood. Therefore, because children with familial speech disorders that persist into adulthood have not been studied on both speech perception and short-term memory skills, we can only hypothesize based on our results in adults that the co-existence of both speech perception and verbal and nonverbal memory deficits may be associated with residual speech errors continuing into adulthood.

Our finding that speech-impaired adults with residual deficits have difficulties in both speech perception and short-term memory suggests that these deficits may be part of this syndrome. Of course, this does not determine whether the presence of deficits in speech perception and short-term memory for spectral cues played a role in the persistence of the speech impairment in this syndrome. Prospective studies of children with familial speech impairments are needed to determine whether associated deficits in speech perception and short-term memory are predictive of the degree to which these children’s deficits can be remediated with therapy.


  • AVAAZ Innovations, Inc. Speech Assessment and Interactive Learning System (Version 1.2). London, Ontario, Canada: AVAAZ Innovations, Inc; 1994.
  • Baddeley A. Recent developments in working memory. Current Opinion in Neurobiology. 1998;8(2):234–238. [PubMed]
  • Baddeley AD, Hitch GJ. Working memory. In: Bower G, editor. The Psychology of Learning and Motivation. Vol. 8. New York: Academic Press; 1974. pp. 47–90.
  • Bergendal BI, Talo ES. The responses of children with reduced phonemic systems to the Seashore Measures of Musical Talents. A preliminary study of the ability to discriminate between differences in pitch, loudness, rhythm, time, timbre and tonal memory. Folia Phoniatrica (Basel) 1969;21(1):20–38. [PubMed]
  • Bernstein LE, Stark RE. Speech perception development in language-impaired children: a 4-year follow-up study. Journal of Speech and Hearing Disorders. 1985;50(1):21–30. [PubMed]
  • Bird J, Bishop D. Perception and awareness of phonemes in phonologically impaired children. European Journal of Disorders of Communication. 1992;27:289–311. [PubMed]
  • Bowey JA. What Does Nonword Repetition Measure? A Reply to Gathercole and Baddeley. Journal of Experimental Child Psychology. 1997;67(2):295–301. [PubMed]
  • Brady S. Ability to encode phonological representations: An underlying difficulty of poor readers. In: Blachman B, editor. Foundations of Reading Acquisition and Dyslexia. Mahwah, New Jersey: Lawrence Erlbaum Associates; 1997. pp. 21–47.
  • Broen PA, Strange W, Doyle SS, Heller JH. Perception and production of approximant consonants by normal and articulation-delayed preschool children. Journal of Speech and Hearing Research. 1983;26(4):601–608. [PubMed]
  • Brown L, Sherbenou RJ, Johnsen SK. Test of Nonverbal Intelligence. 2. Austin: Pro-Ed; 1990.
  • Carrow-Woolfolk E. Test for Auditory Comprehension of Language-3. 3. Austin, Texas: PRO-ED, Inc; 1999.
  • Dunn L. Peabody Picture Vocabulary Test. Circle Pines, MN: American Guidance Service; 1959.
  • Evans J. Practica Musica: Ars Nova Software; 1989.
  • Frumkin B, Rapin I. Perception of vowels and consonant-vowels of varying duration in language-impaired children. Neuropsychologia. 1980;18:443–454. [PubMed]
  • Gathercole S, Baddeley A. Phonological memory in language-disordered children: Is there a causal connection? Journal of Memory and Language. 1990;29:336–360.
  • Goldman R, Fristoe M. Goldman-Fristoe Test of Articulation. Circle Pines: American Guidance Service, Inc; 1986.
  • Hoffman PR, Daniloff RG, Bengoa D, Schuckers GH. Misarticulating and normally articulating children’s identification and discrimination of synthetic [r] and [w]. Journal of Speech and Hearing Disorders. 1985;50(1):46–53. [PubMed]
  • Hoffman PR, Stager S, Daniloff RG. Perception and production of misarticulated (r). Journal of Speech and Hearing Disorders. 1983;48(2):210–215. [PubMed]
  • Kamhi AG, Catts HW, Mauer D, Apel K, Gentry BF. Phonological and spatial processing abilities in language- and reading-impaired children. Journal of Speech and Hearing Disorders. 1988;53:316–327. [PubMed]
  • Kamhi AG, Catts HW. Toward an understanding of developmental language and reading disorders. Journal of Speech and Hearing Disorders. 1986;51:337–347. [PubMed]
  • Kirchner DM, Klatzky RL. Verbal rehearsal and memory in language-disordered children. Journal of Speech and Hearing Research. 1985;28:556–565. [PubMed]
  • Lewis BA, Freebairn L. Residual effects of preschool phonology disorders in grade school, adolescence, and adulthood. J Speech Hear Res. 1992;35(4):819–831. [PubMed]
  • Lewis BA, Freebairn L. Subgrouping children with familial phonologic disorders. J Commun Disord. 1997;30(5):385–401. [PubMed]
  • McNeil M, Prescott T. Revised Token Test. Baltimore: University Park Press; 1978.
  • Morrongiello BA, Robson RC, Best CT, Clifton RK. Trading relations in the perception of speech by 5-year-old children. Journal of Experimental Child Psychology. 1984;37:231–250. [PubMed]
  • Nittrouer S. Age-related differences in perceptual effects of formant transitions within syllables and across syllable boundaries. Journal of Phonetics. 1992;20:351–382.
  • Nittrouer S. Do temporal processing deficits cause phonological processing problems? Journal of Speech-Language-Hearing Research. 1999;42(4):925–942. [PubMed]
  • Rosen S. Auditory processing in dyslexia and specific language impairment: is there a deficit? What is its nature? Does it explain anything? Journal of Phonetics. 2003;31(3–4):509–527.
  • Rosen S, Manganari E. Is there a relationship between speech and nonspeech auditory processing in children with dyslexia? Journal of Speech-Language-Hearing Research. 2001;44(4):720–736. [PubMed]
  • Rvachew S, Jamieson DG. Perception of voiceless fricatives by children with a functional articulation disorder. Journal of Speech and Hearing Disorders. 1989;54(2):193–208. [PubMed]
  • Saetveit J, Lewis D, Seashore C. Revision of the Seashore Measures of Musical Talents. Iowa City: The University of Iowa Press; 1940.
  • Saxman JH, Miller JF. Short-term memory and language skills in articulation-deficient children. Journal of Speech and Hearing Research. 1973;16(4):721–730. [PubMed]
  • Seashore C, Lewis D, Saetveit J. Manual of Instructions and Interpretations for the Seashore Measures of Musical Talents. Camden, New Jersey: Radio Corporation of America; 1939.
  • Sherman D, Geith A. Speech sound discrimination and articulation skill. Journal of Speech and Hearing Research. 1967;10(2):277–280. [PubMed]
  • Shriberg LD, Aram DM, Kwiatkowski J. Developmental apraxia of speech: II. Toward a diagnostic marker. Journal of Speech-Language-Hearing Research. 1997;40(2):286–312. [PubMed]
  • Shriberg LD, Austin D, Lewis BA, McSweeny JL, Wilson DL. The speech disorders classification system (SDCS): extensions and lifespan reference data. Journal of Speech-Language-Hearing Research. 1997;40(4):723–740. [PubMed]
  • Shriberg LD, Kwiatkowski J. A follow-up study of children with phonologic disorders of unknown origin. J Speech Hear Disord. 1988;53(2):144–155. [PubMed]
  • Shriberg LD, Tomblin JB, McSweeny JL. Prevalence of speech delay in 6-year-old children and comorbidity with language impairment. Journal of Speech, Language, and Hearing Research. 1999;42:1461–1481. [PubMed]
  • Smith CR. Articulation problems and ability to store and process stimuli. Journal of Speech and Hearing Research. 1967;10(2):348–353. [PubMed]
  • St Louis KO, Riscello DM. Oral Speech Mechanism Screening Test. Baltimore: University Park Press; 1981.
  • Stark RE, Heinz JM. Perception of stop consonants in children with expressive and receptive-expressive language impairments. Journal of Speech and Hearing Research. 1996;39(4):676–686. [PubMed]
  • Stark RE, Tallal P. Analysis of stop consonant production errors in developmentally dysphasic children. Journal of the Acoustical Society of America. 1979;66(6):1703–1712. [PubMed]
  • Stark RE, Tallal P. Language, speech and reading disorders in children: Neuropsychological studies. San Diego, CA: College-Hill Press; 1988.
  • Steffens ML, Eilers RE, Gross-Glenn K, Jallad B. Speech perception in adult subjects with familial dyslexia. Journal of Speech and Hearing Research. 1992;35(1):192–200. [PubMed]
  • Steinschneider M, Volkov IO, Fishman YI, Oya H, Arezzo JC, Howard MA 3rd. Intracortical responses in human and monkey primary auditory cortex support a temporal processing mechanism for encoding of the voice onset time phonetic parameter. Cerebral Cortex. 2005;15(2):170–186. [PubMed]
  • Sussman J. Perception of formant transition cues to place of articulation in children with language impairments. Journal of Speech and Hearing Research. 1993;36:1286–1299. [PubMed]
  • Systat11. SYSTAT 11 Statistics 1. Richmond, CA: SYSTAT Software, Inc; 2004.
  • Tallal P, Piercy M. Developmental aphasia: Impaired rate of non-verbal processing as a function of sensory modality. Neuropsychologia. 1973;11:389–398. [PubMed]
  • Tallal P, Piercy M. Developmental aphasia: Rate of auditory processing and selective impairment of consonant perception. Neuropsychologia. 1974;12:83–93. [PubMed]
  • Tallal P, Piercy M. Developmental aphasia: The perception of brief vowels and extended stop consonants. Neuropsychologia. 1975;13:69–74. [PubMed]
  • Tallal P, Stark RE. Speech acoustic-cue discrimination abilities of normally developing and language-impaired children. Journal of the Acoustical Society of America. 1981;69:568–574. [PubMed]
  • Wechsler D. Manual for the Wechsler Adult Intelligence Scale. New York, N.Y: Psychological Corporation; 1955.
  • Williams KT. Expressive Vocabulary Test. Circle Pines, Minnesota: American Guidance Service, Inc; 1997.