Jaw movement patterns were examined longitudinally in a 3-year-old male with childhood apraxia of speech (CAS) and compared with a typically developing control group. The child with CAS was followed for 8 months, until he began accurately and consistently producing the bilabial phonemes /p/, /b/, and /m/. A movement tracking system was used to study jaw duration, displacement, velocity, and stability. A transcription analysis determined the percentage of phoneme errors and consistency. Results showed phoneme-specific changes which included increases in jaw velocity and stability over time, as well as decreases in duration. Kinematic parameters became more similar to patterns seen in the controls during final sessions where tokens were produced most accurately and consistently. Closing velocity and stability, however, were the only measures to fall within a 95% confidence interval established for the controls across all three target phonemes. These findings suggest that motor processes may differ between children with CAS and their typically developing peers.
Childhood apraxia of speech (CAS) has been described as a disorder of praxis which results in significantly impaired communication skills. Many children with CAS produce unintelligible speech, making verbal communication extremely challenging. It has been hypothesized that the speech production difficulties seen in children with CAS relate to problems with motor processing (Crary, 1984; Grunwell and Yavas, 1988; Nijland, Maassen, and van der Meulen, 2003a; Nijland, Maassen, van der Meulen, Gabreels, Kraaimaat, and Schreuder, 2003b), suggesting that speech motor control may be altered in CAS. The purpose of the present investigation was to examine longitudinal changes in articulatory control in a child with CAS. Examining movement of the articulators will identify constraints underlying speech production in CAS which will fill gaps in our understanding of this complex disorder.
There is a great deal of controversy among researchers and clinicians as to whether CAS is a distinct speech disorder, which speech and non-speech characteristics are associated with it, and which assessment and treatment procedures are appropriate (ASHA, 2007). The differing terms historically used to identify this disorder (e.g. childhood apraxia of speech, developmental apraxia of speech, developmental verbal dyspraxia) add to the challenges in discussing and understanding this speech impairment. A recent technical report on CAS by the American Speech-Language-Hearing Association acknowledges that there is no definitive list of diagnostic criteria to differentiate CAS from other speech impairments (ASHA, 2007). Speech characteristics commonly associated with CAS include inconsistent errors, reduced sound inventories, vowel errors, difficulties producing sound sequences, difficulties imitating speech, groping, as well as impaired prosody (cf. Hall, Jordan, and Robin, 1993; Velleman and Strand, 1994; Shriberg, Aram, and Kwiatkowski, 1997a; b; c; Davis, Jakielski, and Marquardt, 1998; Odell and Shriberg, 2001). Although there is no consensus on a particular feature or set of features that must be present for a diagnosis of CAS, most accounts of the disorder highlight the inconsistency of speech errors. A sound can be produced inconsistently in certain word positions (e.g. at the end of a word), in particular words (e.g. lexically more challenging words) and across multiple productions of the same word (Betz and Stoel-Gammon, 2005). CAS is suspected when a child's speech errors are inconsistent or highly variable, particularly when misarticulations in a given word differ each time the word is produced.
Thus, many view inconsistent error patterns as an identifying feature of CAS (Morley, 1957; Rosenbek and Wertz, 1972; Ferry, Hall, and Hicks, 1975; Williams, Ingham, and Rosenthal, 1981; Dodd and McCormack, 1995; Thoonen, Maassen, Gabreels, Schreuder, and de Swart, 1997; Davis et al., 1998; Forrest, 2003; Marquardt, Jacks, and Davis, 2004).
As an alternative to identifying a list of features in CAS, Shriberg, Campbell, Karlsson, Brown, McSweeney, and Nader (2003a) proposed a tiered approach to differentiating this disorder from other speech impairments. Within this framework, descriptive features (e.g. articulatory groping) are represented on the highest tier, complex contexts (e.g. multisyllabic words) on the middle tier, and linguistic outcomes (e.g. reduced speech intelligibility) on the lowest tier. The tiers interact to represent the relationships between these different entities. In the descriptive feature tier, Shriberg et al. (2003a) identified segmental and suprasegmental features which are noted to be core indicators of apraxia of speech. The segmental features have face validity as markers for CAS and include articulatory groping, metathetic errors, inconsistent productions of the same token, sound and syllable deletions, and vowel errors. These features are rank-ordered based upon how strongly they reflect a praxis problem, with the first, articulatory groping, being the strongest indicator of a deficit in praxis. Suprasegmental features include inconsistent stress, reduced temporal variation, and inconsistent oral-nasal gestures. Shriberg et al. (2003a) proposed that this tiered approach will aid in the differential diagnosis of CAS by highlighting the primary features of the disorder.
While a great deal of attention has been given to the characterization of CAS, many questions remain regarding the precise aetiology of the disorder. CAS is believed to involve a deficit in the planning, programming, and/or execution of speech movements (Bradford and Dodd, 1994; Crary, 1984; Grunwell and Yavas, 1988; Nijland, Maassen, van der Meulen, Gabreels, Kraaimaat, and Schreuder, 2002; Nijland et al., 2003a; b; van der Merwe, 2009). Van der Merwe (2009) incorporated these areas into a four-level framework to describe normal and disordered speech motor processing. During the first phase, linguistic-symbolic planning, the semantic content of a message is planned along with its syntactic, morphologic, and phonologic structures. The second phase, motor planning, involves formulating a motor strategy in which spatial and temporal goals for a given speech task are specified. Contextual factors can result in adaptations to the motor plan, which can be seen in shortening of the units in the plan, reducing speech rate, distorting sound production, reducing the targets in a word, and duplicating sounds. Motor programming is involved in the third phase, where movement parameters such as velocity, displacement, timing, direction, and force are specified. During the final phase, motor execution, motor plans and programs are translated into speech movements. In the present work, we are interested in exploring the motor planning and programming levels of speech motor processing through changes in articulator movement. There is a great deal of emphasis in the literature on the neural structures responsible for motor planning and programming (cf. Allen and Tsukahara, 1974; Eccles, 1977; Schmidt, 1978; Marsden, 1980; 1984; Evarts and Wise, 1984; Brooks, 1986; Miall, Weir, and Stein, 1987; Dreher and Grafman, 2002; Sakai, Ramnani, and Passingham, 2002), with much less known about deficits in these areas in CAS.
Although it has been implied that articulatory control is impaired in CAS, there are no published data on articulator movement in this population. Thus, it is not clear whether the spatial and temporal control of articulator movement differs between children with CAS and their typically developing peers.
Developmental studies have shown that children with age-appropriate speech and language skills modify spatial and temporal features of articulator movement throughout childhood and adolescence (Watkin and Fromm, 1984; Smith and Gartenberg, 1984; Sharkey and Folkins, 1985; Smith and McLean-Muse, 1986; Smith and Goffman, 1998; Goffman and Smith, 1999; Green, Moore, Higashikawa, and Steeve, 2000; Green, Moore, and Reilly, 2002; Walsh and Smith, 2002). As children mature, lip and jaw movements have been reported to decrease in duration and displacement, as well as to increase in velocity (Sharkey and Folkins, 1985; Smith and Goffman, 1998; Goffman and Smith, 1999; Green et al., 2000; Grigos, Saxman, and Gordon, 2005). Children's articulatory movements have also been described to be more variable than those of adults, becoming more stable with maturation (Watkin and Fromm, 1984; Sharkey and Folkins, 1985; Smith and McLean-Muse, 1986; Smith, Goffman, and Stark, 1995a; Grigos, Saxman, and Gordon, 2005; Grigos, 2009).
Studies examining early speech motor control have identified articulator specific patterns of change during development, with respect to lip and jaw movement. There is evidence to suggest that the jaw plays a dominant role in early speech movements in typically developing children (MacNeilage and Davis, 1993; 2000; Green et al., 2000; 2002). In their study of bilabial production in young children (1, 2, and 6-year-olds), Green et al. reported jaw movement to be distinct at all ages, with reduced spatial and temporal coupling between the jaw and lips in 1-year-olds compared to the 2- and 6-year-olds. Excessive jaw displacement was seen in 1-year-olds, while the excursion of jaw movements was smaller in the older children. Further, movement stability was more adult-like for the jaw than the lips, in 1- and 2-year-olds. Similarly, Grigos et al. (2005) reported significant jaw movement changes in 2-year-olds who were followed longitudinally as they acquired the voicing contrast. Production of the voiceless target was facilitated by jaw movement changes. Jaw displacement and velocity increased as the children began consistently producing voiceless bilabial plosives. These changes were observed during the production of the bilabial voiceless target /p/ but not for the voiced cognate /b/. Improved coordination between the jaw and lips was also seen, while coordination between the upper and lower lips did not improve. Taken together, the findings reported in Green et al. (2000; 2002) and in Grigos et al. (2005) indicate that the jaw and lips contribute differently to early speech production, with young children relying more on the jaw to achieve bilabial targets. While changes in jaw displacement and velocity, as well as improved jaw stability may lay the foundation to achieve accurate articulatory contacts in typically-developing children, the contribution of the jaw to speech production in children with CAS is not well understood.
In the current study, we explored the hypothesis that deficits in speech motor control influence articulatory performance in CAS. Articulatory control associated with both correct and incorrect target word productions was examined in a 3-year-old child with CAS. Specifically, jaw movement was tracked over time until the child began consistently producing bilabials and was compared with patterns displayed by typically developing 3-year-olds. In addition, we explored whether spatial and temporal control of movement differed between the child with CAS and the typically developing controls.
Four children participated in the study. The child with CAS was a 3 year 2 month old male. The control group consisted of three males with typically developing speech and language skills between the ages of 3 years and 3 years, 4 months (mean age = 3 years 2 months). The child with CAS was followed longitudinally over an 8-month period (approximately every 3 weeks) until he accurately and consistently produced the bilabial phonemes /b/, /p/, and /m/ across two consecutive sessions with at least 90% accuracy. The three control participants were seen for one data collection session in order to provide an index of the range of movement seen in typically developing 3-year-olds. All participants were native speakers of American English. The children passed an audiometric screening with pure-tone thresholds at or below 25 dB at .5, 1, 2, and 4 kHz.
The diagnosis of CAS was based upon analysis of a 100-word conversational speech sample collected during informal play between the child and the first author, as well as performance on the Goldman Fristoe Test of Articulation (GFTA-2, Goldman and Fristoe, 2000). Language skills were measured using the Test of Early Language Development (TELD-3, Hresko, Reid, and Hammill, 2007). Oral motor skills were examined through the Global Motor Control, Focal Oromotor Control, and Sequencing components of the Verbal Motor Production Assessment for Children (VMPAC, Hayden and Square, 1999). In addition, the Vineland Adaptive Behavior Scales (VABS, Sparrow, Balla, and Cicchetti, 1984) was used as a measure of how the children use their social and communicative skills to interact with the environment.
Speech samples were transcribed, using narrow transcription, by one of the investigators. The samples were used to examine segmental inventory, word shapes, error types, and suprasegmentals, as well as to calculate the percentage consonant error (# of consonants produced in error/total # of consonants × 100) and the percentage vowel error (# of vowels produced in error/total # of vowels × 100). The criteria for the diagnosis of CAS were based upon the eight features (five segmental and three suprasegmental) identified by Shriberg et al. (2003a) as being typical of the disorder. These include articulatory groping, metathetic errors, inconsistent productions of the same token, sound and syllable deletions, vowel errors, inconsistent stress placement, reduced temporal variation, and inconsistent oral-nasal gestures. Using the tiered framework outlined by Shriberg et al. (2003a), the speech sample was examined to determine the frequency of occurrence of each descriptive feature, the complex contexts in which features were seen, as well as the linguistic outcomes of articulation difficulties.
The child with CAS displayed all eight descriptive features described by Shriberg et al. (2003a), suggesting a praxis deficit. Articulatory groping was noted on 27% (27/100) of utterances. Metathetic errors (e.g. ‘food’ /fud/→/duf/) occurred with a frequency of 7% (7/100 utterances). Within the 100-utterance speech sample there were nine words that were produced more than twice. Inconsistent errors were seen in the repeated productions of 8/9 of these words. Both phoneme and syllable deletions were produced. Phoneme deletions included initial consonant (e.g. ‘camel’ /kæmᵊl/→/αm/) and final consonant (e.g. ‘fish’ /fɪʃ/→/bɪ/) deletions, which occurred with a frequency of 20% (18/90 words) and 31% (22/71 words), respectively. Syllable deletions primarily involved weak syllables (e.g. ‘elephant’ /εlᵊfᵊnt/→/ααt/). Idiosyncratic sound substitutions were also seen and included /ɾ/→/w/ (e.g. ‘potty’ /pαɾi/→/bwi/) and /f/→/j/ (e.g. ‘fight’ /fait/→/jε/). The child also produced vowel errors, such as diphthong reduction (e.g. /fait/→/jε/), diphthongization (e.g. ‘cut’ /kʌt/→/bai/), vowel lowering (e.g. ‘eat’ /it/→/ɪt/), as well as vowel distortions primarily involving diphthongs. Suprasegmental errors involving inconsistent stress placement were seen with a frequency of 78% (18/23 multisyllabic words) and primarily involved placing equal stress on syllables. Reduced temporal variation was perceived in that the rhythmic patterns in the child's speech were uniform, with reduced pitch and loudness changes. Inconsistent oral-nasal gestures primarily involved nasalization (e.g. ‘go’ /go/→/no/), which was seen with a frequency of 11% (11/100 words) in the speech sample.
The presence of features increased in specific contexts. Phoneme/syllable deletions and vowel errors were more prominent in two- and three-syllable words than in single-syllable words. Phoneme and word imitation were difficult for the child, and more articulatory groping was evident during imitation as compared to less volitional tasks. Phoneme sequence constraints were evident, as few clusters were produced by the child. As a result, the child primarily produced simple syllable shapes. Performance on diadochokinetic tasks was characterized by reduced rate and accuracy. In addition, the child had difficulty producing volitional oral movements (e.g. lip protrusion/retraction). The linguistic outcomes of the speech production difficulties outlined here include delayed speech development, significantly reduced speech intelligibility, as well as limited improvement in speech production skills with treatment.
Percentage consonant error and percentage vowel error scores were 84.1% and 43.4%, respectively. Examination of the segmental inventory revealed sounds in each manner class. Phonemes were productive in some word positions and marginally represented (fewer than two productions) in others. For example, /b/ and /m/ were productive in the word initial and medial positions, but not in the word final position. There was one accurate production of word initial /p/ (1/4 opportunities) and none in the word medial and final positions. Thus, /p/ was not productive in any word position. The alveolar stop /d/ was productive in the initial and final word positions, and /t/ was productive in the final but not initial position. Velar stops did not appear in the sample, but the glottal stop /ʔ/ was produced in all word positions. Other productive phonemes included /n/ and /w/. The use of fricatives and affricates was extremely limited, with /h/, /v/, /θ/, and /dʒ/ marginally represented.
Performance on the GFTA-2 (Goldman and Fristoe, 2000) was at the 1st percentile. The phonetic inventory and the errors produced were consistent with the speech sample results. In addition to the speech sample findings, there was one production each of the velar /g/ (i.e. ‘girl’ /gɝl/→/gɔ/) and the fricative /s/ (i.e. ‘scissors’ /sɪzɚz/→/sɪdᵊ/) in the word initial position. The Khan-Lewis Phonological Analysis (KLPA-2) (Khan and Lewis, 2002) was applied to analyse the use of phonological processes in the words elicited by the GFTA-2 (Goldman and Fristoe, 2000) and revealed a standard score of 86 and a percentile of 21. The phonological processes seen with an occurrence greater than 20% were Final Consonant Deletion, 54% (e.g. ‘house’ /haʊs/→/hα/), Cluster Simplification, 46% (e.g. ‘window’ /wɪndo/→/wɪno/), Syllable Reduction, 30% (e.g. ‘wagon’ /wægᵊn/→/wæ/), and Initial Voicing, 28% (e.g. ‘pencils’ /pεnsᵊlz/→/bεno/). There was also evidence of Initial Consonant Deletion (e.g. ‘watch’ /wαtʃ/→/αt/), Glottal Replacement (e.g. ‘monkey’ /mʌŋki/→/mʌʔi/), and Metathesis (e.g. ‘jumping’ /dʒʌmpɪŋ/→/mʌdʒi/).
On the TELD-3 (Hresko et al., 2007), the child with CAS scored one standard deviation below the mean on the Receptive Language Sub-test and two standard deviations below the mean on the Expressive Language Sub-test. Receptively, the child had difficulty with spatial concepts (e.g. in, on, around, behind), comparatives (e.g. biggest, smallest) and in completing three-step commands. During expressive language testing, the child had difficulty repeating words, labelling items, and using personal pronouns. There were several occasions where he did not respond, such as when asked ‘Tell me about your favourite game/thing to do’. Performance on the Communication, Daily Living Skills, and Socialization Domains of the VABS (Sparrow et al., 1984) was between one and two standard deviations below the mean. On the Receptive portion of the Communication Domain, the child accurately completed 25/26 items. During the Expressive portion, the child's performance was consistent with the TELD-3 and included difficulties using spatial concepts, asking ‘wh’ questions, and producing utterances longer than three words. On the Daily Living Skills Domain, the child did not receive a score for many personal care items (e.g. washing, dressing). The parent reported doing most of these tasks for the child but noted that ‘he probably could do them if he had to’. Performance on the Socialization Domain seemed to be influenced by the child's verbal abilities, as he had difficulty with items requiring imitation or lengthy responses.
It should be noted that the child with CAS was initially shy, had marked difficulty with imitation, and was aware of his speech production difficulties. As a result, he did not respond when verbal tasks were challenging. Together these factors appeared to compromise the child's performance on the TELD-3 (Hresko et al., 2007) and the VABS (Sparrow et al., 1984). An informal language assessment was also performed revealing age-appropriate pragmatic skills. Difficulty understanding and appropriately expressing spatial concepts was seen. The child with CAS demonstrated an understanding of various syntactic forms, including comparatives, superlatives, the pronouns ‘I’, ‘me’, ‘you’, ‘he/she’, as well as plural and possessive /s/. Syntactic markers, however, were not consistently used at the ends of words which may be more reflective of speech than language difficulties.
The control participants did not display any of the characteristics outlined by Shriberg et al. (2003a). Percentage consonant and vowel error scores were less than 15%. Segmental inventories included /p/, /b/, and /m/, as evidenced by accurate productions at least 20 times during the speech sample. The control participants displayed age-appropriate language, articulation, and oral motor skills. The characteristics of all participants, including performance on standardized measures, are shown in Table I.
A motion capture system (Vicon 460, Vicon Motion Systems, 2001) was used to track jaw movement in three dimensions. Three reflective markers (each 3 mm in diameter) were placed on the midline of the face. Jaw movement was tracked using a marker placed on the mandible. Head movement was accounted for using reference markers placed on the nose and nasion. To account for vertical head movement, jaw movement was calculated by subtracting y-coordinates from a stationary point on the forehead. Kinematic data analysis was conducted using Matlab, version 7.2 (Matlab, 2007). The system tracked reflective markers at a sampling rate of 120 frames per second. Audio recordings were made using a digital minidisc recorder (M-Audio, MicroTrack 2496). Participants wore a lapel microphone (Audio Technica, Model AT831W), which was placed on the shirt approximately 6 inches from the mouth. All recordings were made in a sound-attenuated audiometric booth at New York University.
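The head-movement correction described above can be illustrated with a minimal sketch. It assumes that the correction is a frame-by-frame difference between the vertical (y) coordinate of the jaw marker and that of a reference marker; the sign convention, the function names, and the sample values are illustrative assumptions and are not taken from the original Matlab analysis.

```python
SAMPLE_RATE_HZ = 120.0  # frames per second, as reported for the tracking system


def head_corrected_jaw(jaw_y, reference_y):
    """Return jaw y-position relative to a reference marker, frame by frame.

    Assumption: the correction is jaw minus reference, so vertical head
    movement common to both markers cancels out.
    """
    if len(jaw_y) != len(reference_y):
        raise ValueError("marker records must have the same number of frames")
    return [j - r for j, r in zip(jaw_y, reference_y)]


def frame_times(n_frames, rate=SAMPLE_RATE_HZ):
    """Time stamp (in seconds) of each frame at the given sampling rate."""
    return [i / rate for i in range(n_frames)]


# Hypothetical marker coordinates in millimetres (not data from the study).
jaw = [10.0, 9.0, 8.5]
forehead = [50.0, 50.5, 50.5]
relative = head_corrected_jaw(jaw, forehead)
```

In this made-up example the head rises by 0.5 mm after the first frame, and the relative trace removes that shared offset from the jaw record.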
The target speech utterances, ‘Bob’, ‘Pop’, and ‘Mom’, were elicited from the children during a play activity. These utterances were selected because they included bilabial phonemes in a CVC structure, which allowed for visualization of jaw movement in word-initial and word-final contexts. These are also phonemes which the child with CAS was capable of producing, although they were not consistently produced accurately.
Between 10 and 12 productions of each target were elicited. The children were first introduced to three dolls: ‘Bob’, a ‘Bob the Builder’ doll, ‘Pop’, an elderly male doll, and ‘Mom’, a female doll. The investigators engaged in play with the children to familiarize them with the characters. The children were encouraged to produce the target words in response to probes presented by the examiner. For example, the children were presented with several toy food items. The investigator would say ‘It is lunch time and everyone is hungry. Who will eat the apple?’ There were instances where additional verbal cues and/or models were required to elicit the appropriate response. Data collection began when the participants were familiar with the target words and understood the task. The targets were elicited in a randomized order. To monitor production of the target phonemes, the child with CAS was asked to name nine probe pictures at the end of each session. The probe words all included word-initial and/or word-final /p/, /b/, or /m/ in a CVC structure (‘mop, hop, cap, bib, cab, tub, home, come, game’).
The child with CAS produced the phonemes /p/, /b/, and /m/ inconsistently at the onset of the study. Errors included omissions (e.g. ‘Bob’ /bαb/→/bα/), distortions (e.g. ‘Pop’ /pαp/→/pαç/), and substitutions (e.g. ‘Mom’ /mαm/→/bαm/). Data were collected every 3 weeks until the child with CAS produced /p/, /b/, and /m/ in the target words and in the set of probe utterances with 90% accuracy for two consecutive sessions. There were 10 data collection sessions in total over an 8-month period. During this period the child received speech and language therapy three times a week in 45-minute sessions. Treatment was not provided by the investigators. PROMPT therapy was provided by a PROMPT certified clinician. Treatment focused on improving jaw movement for the production of the targets /p/, /b/, and /m/ in VC, CV, and CVC combinations. The child first began receiving speech and language therapy at the age of 2 years, 4 months.
Two listeners (one of the investigators and one naïve listener, experienced with narrow phonetic transcription) separately transcribed each utterance using the International Phonetic Alphabet (IPA). The transcription analysis was based on audio recordings. Reliability was assessed on the tokens from three sessions, T1, T5, and T10, and from one randomly selected control participant. Rather than randomly selecting sessions to examine for reliability in the child with CAS, T1, T5, and T10 were chosen to reflect a range of performance in terms of accuracy. The percentage agreement between the listeners was 84.2% for the child with CAS and 91% for the control. The first eight trials that did not involve any extraneous movements, atypical vocalizations (e.g. laughing, yawning, or excessive loudness during production of target words), or missing reflective markers were included in the analysis. From these trials, we were interested in examining errors at the phoneme level, the overall accuracy of the target word, as well as the consistency of the error pattern. Four analyses were computed separately for each token (‘Mom’, ‘Pop’, ‘Bob’) at each session: (1) PCE = percentage of consonant errors (# of consonants produced in error/total # of consonants × 100); (2) PVE = percentage of vowel errors (# of vowels produced in error/total # of vowels × 100); (3) PTE = percentage of token error (# of tokens produced in error/total # of tokens × 100); and (4) PEC = percentage of error consistency ((1 − # of error types/# of errors of a given token) × 100). This last calculation was made by subtracting the error ratio from 1 in order to represent error consistency, rather than the consistency of accurate productions (Betz and Stoel-Gammon, 2005). In this analysis, a production was considered in error if any consonant or vowel was produced incorrectly.
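The four perceptual measures can be made concrete with a short sketch. It is a minimal illustration of the ratios as defined above, not the authors' scoring procedure: the function names, the error labels, and the counts are hypothetical, and the treatment of a session with no errors (where the PEC ratio is undefined) is an assumption, since that case is not described here.

```python
def percent(numerator, denominator):
    """Return numerator/denominator as a percentage (0.0 for an empty denominator)."""
    return 100.0 * numerator / denominator if denominator else 0.0


def error_consistency(error_labels):
    """PEC = (1 - number of error types / number of errors) * 100.

    error_labels holds one label per erroneous production of a token,
    e.g. ["omission", "omission", "distortion"] (hypothetical labels).
    Assumption: returns None when there are no errors, because the
    ratio is undefined in that case.
    """
    if not error_labels:
        return None
    ratio = len(set(error_labels)) / len(error_labels)
    return (1.0 - ratio) * 100.0


# Hypothetical session: 8 productions of a CVC target, so 16 consonants
# and 8 vowels in total. These counts are invented for illustration.
pce = percent(4, 16)   # 4 consonants in error
pve = percent(1, 8)    # 1 vowel in error
pte = percent(4, 8)    # 4 tokens contained at least one error
pec = error_consistency(["omission", "omission", "distortion", "omission"])
```

With these invented counts the sketch gives a PCE of 25%, a PVE of 12.5%, a PTE of 50%, and a PEC of 50% (two error types across four errors).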
A variety of methods for quantifying error inconsistency have been described in the literature. Researchers have examined the error consistency associated with the individual phonemes in words (Tyler, Lewis, and Welch, 2003), as well as across whole word productions (Dodd, 1995; Shriberg et al., 1997c; Ingram, 2002; Betz and Stoel-Gammon, 2005). The method selected for examining error inconsistency in the present work (analysis 4) was described in Betz and Stoel-Gammon (2005) as the ‘overall consistency of error types’. The rationale for using this measure was that it examines error consistency at the word level, as we were interested in changes in consistency across word positions.
Duration, displacement, and peak velocity of jaw movements were measured from kinematic tracings (Figure 1). All trials were examined to determine thresholds for the onset and offset of jaw movement for the target utterances. The onset and offset of movement were based upon peak velocity points in the jaw kinematic trace. The onset of movement was selected as the peak velocity into closing for the word initial bilabials. The peak velocity into closing for the word final bilabial was chosen as the movement offset. Measures of duration were calculated based on the onset and offset of movement in the jaw velocity trajectory.
The onset and offset of movement identified in the velocity trace were used to segment displacement data. Measures of opening and closing displacement were calculated. Opening displacement was calculated as the peak to trough displacement into the vowel /a/ and closing displacement was calculated as the trough to peak displacement into the final plosive (see Figure 1). Peak opening and closing velocities were used as measures of movement velocity.
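As a rough illustration of these measures, the sketch below computes duration, opening and closing displacement, and peak opening and closing velocity from one already-segmented displacement trace. It assumes that larger values mean a more closed jaw, that velocity is estimated by a simple first difference, and that the trace has a single trough for the vowel; these choices, the function names, and the sample trace are assumptions and not the original Matlab procedure.

```python
SAMPLE_RATE_HZ = 120.0  # frames per second, as reported for the tracking system


def velocity(displacement, rate=SAMPLE_RATE_HZ):
    """First-difference velocity estimate, in displacement units per second."""
    return [(b - a) * rate for a, b in zip(displacement, displacement[1:])]


def kinematic_measures(segment, rate=SAMPLE_RATE_HZ):
    """Summarize one segmented jaw trace (onset to offset of movement).

    Assumption: higher values = jaw more closed, so the vowel is the trough.
    """
    if len(segment) < 2:
        raise ValueError("need at least two samples")
    trough = min(range(len(segment)), key=segment.__getitem__)
    vel = velocity(segment, rate)
    return {
        "duration_s": (len(segment) - 1) / rate,
        # peak-to-trough excursion into the vowel
        "opening_displacement": max(segment[: trough + 1]) - segment[trough],
        # trough-to-peak excursion into the final consonant
        "closing_displacement": max(segment[trough:]) - segment[trough],
        # opening velocity is negative under this convention; report magnitude
        "peak_opening_velocity": -min(vel),
        "peak_closing_velocity": max(vel),
    }


# Hypothetical trace in millimetres (not data from the study).
measures = kinematic_measures([0.0, -2.0, -5.0, -6.0, -4.0, -1.0, 0.0])
```

For this invented seven-sample trace, the opening and closing displacements are both 6 mm and the duration is six frame intervals at 120 frames per second.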
Segmented displacement traces were normalized for amplitude and time in a method described by Smith, Goffman, Zelaznik, Ying, and McGillem (1995b). For each displacement trace, amplitude normalization was achieved by subtracting the mean of the displacement record and dividing by its standard deviation. Time normalization was achieved by using a cubic spline procedure to interpolate each waveform onto a time base of 1000 points. The spatiotemporal index (STI) (Smith et al., 1995b) was then calculated to examine the stability in movement trajectories across repeated productions of target utterances. The STI was computed by calculating standard deviations at 2% intervals across the repetitions of the time- and amplitude-normalized displacement traces. The STI is the cumulative sum of these 50 standard deviations. The STI indicates the degree to which the set of trajectories converge onto one fundamental movement pattern (Smith, Johnson, McGillem, and Goffman, 2000). Using this method of examining movement trajectory stability enabled us to explore whether there are changes in the stability of the underlying movement pattern in the child with CAS over time. Figure 2 shows normalized jaw displacement traces for the child with CAS (Time 1 and 9) and one control participant.
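The STI computation can be sketched as follows. This is a simplified illustration under stated assumptions: linear interpolation is swapped in for the cubic spline procedure to keep the example dependency-free, the sample standard deviation is used throughout, and the 2% intervals are taken as every 20th point of the 1000-point time base. The function names and test traces are hypothetical.

```python
import statistics


def z_normalize(trace):
    """Amplitude normalization: subtract the mean, divide by the standard deviation."""
    mean = statistics.fmean(trace)
    sd = statistics.stdev(trace)  # assumption: sample SD
    if sd == 0:
        raise ValueError("flat trace cannot be amplitude-normalized")
    return [(x - mean) / sd for x in trace]


def resample_linear(trace, n_points=1000):
    """Time normalization onto n_points samples.

    Assumption: linear interpolation stands in for the cubic spline.
    """
    last = len(trace) - 1
    resampled = []
    for i in range(n_points):
        pos = i * last / (n_points - 1)
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        resampled.append(trace[lo] * (1 - frac) + trace[hi] * frac)
    return resampled


def spatiotemporal_index(traces, n_points=1000, step=20):
    """Sum of across-repetition SDs at 2% intervals (50 values for 1000 points)."""
    if len(traces) < 2:
        raise ValueError("need at least two repetitions")
    normalized = [resample_linear(z_normalize(t), n_points) for t in traces]
    return sum(
        statistics.stdev(rep[i] for rep in normalized)
        for i in range(0, n_points, step)
    )


# Hypothetical displacement traces (not data from the study).
a = [0.0, 1.0, 3.0, 2.0, 0.5]
b = [0.0, 2.0, 1.0, 3.0, 0.5]
sti_identical = spatiotemporal_index([a, a, a])
sti_different = spatiotemporal_index([a, b])
```

Because each trace is normalized before comparison, repetitions that differ only in overall size or duration contribute little to the index, while repetitions with different movement shapes raise it.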
Paired samples t-tests were performed to test within-participant comparisons between Time 1 (T1; the first data collection session) and Time 10 (T10; the second consecutive session in which bilabial phonemes were produced with greater than 90% accuracy). T10 was chosen for this comparison as it represents the latest point during the data collection period in which the participant's productions were consistently and accurately produced. Separate analyses were performed for each target word. One comparison was made for each kinematic parameter, T1 vs T10. This yielded five comparisons per target word (five parameters (total duration, opening displacement, closing displacement, opening velocity, closing velocity) × 1 interval (T1 vs T10)), and the alpha level was accordingly adjusted to .01 (.05/5). T-tests were not performed for STI values. Interval estimation was used to compare the child with CAS to the control group. This method uses an interval as an estimator rather than a single value (McCall, 1994). A 95% confidence interval based on between-subject variance was calculated from the control group for each kinematic parameter, establishing an upper and lower limit. Comparisons were then made between the mean performance of the child with CAS and the control group's confidence interval.
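The alpha adjustment and the interval comparison can be sketched as below. The exact interval formula is not given here, so the sketch assumes a conventional t-based interval around the control mean (mean ± t × SD/√n), with the two-tailed critical value for 2 degrees of freedom entered as a constant. The function names and all numeric inputs are hypothetical.

```python
import math
import statistics

# Two-tailed 95% critical value of Student's t for df = 2 (three controls).
T_CRIT_DF2_95 = 4.303


def adjusted_alpha(alpha, n_comparisons):
    """Divide the family-wise alpha by the number of comparisons."""
    return alpha / n_comparisons


def confidence_interval(control_means, t_crit=T_CRIT_DF2_95):
    """Assumed form: mean +/- t * SD / sqrt(n), using between-subject SD."""
    n = len(control_means)
    mean = statistics.fmean(control_means)
    half_width = t_crit * statistics.stdev(control_means) / math.sqrt(n)
    return mean - half_width, mean + half_width


def classify(value, interval):
    """Locate a single participant's mean relative to the control interval."""
    lower, upper = interval
    if value < lower:
        return "below"
    if value > upper:
        return "above"
    return "within"


# Hypothetical control means for one parameter, in seconds (invented values).
interval = confidence_interval([0.70, 0.75, 0.80])
position = classify(0.95, interval)
alpha = adjusted_alpha(0.05, 5)
```

With only three control participants the critical value is large, so an interval built this way is wide relative to the spread of the control means.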
Lastly, we examined the relationship between frequency of errors (PCE and PVE), consistency of errors (PEC), and jaw movement stability (STI). The percentage consonant error, percentage vowel error, percentage error consistency, and STI values at each session were normalized by subtracting the mean (Time 1 to Time 10) and dividing by the standard deviation. Pearson correlations were then computed between the normalized PCE, PVE, PEC, and STI values, across sessions. Three pairwise comparisons were made (PCE and STI, PVE and STI, PEC and STI), and the alpha level was accordingly adjusted to .017 (.05/3).
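A minimal sketch of this step is given below, assuming the normalization uses the sample standard deviation across sessions. The function names and the session values are hypothetical and serve only to show the arithmetic.

```python
import math
import statistics


def z_scores(values):
    """Normalize across sessions: subtract the mean, divide by the SD."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # assumption: sample SD
    return [(v - mean) / sd for v in values]


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length series."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical per-session values (invented, not data from the study).
pce_by_session = [60.0, 55.0, 65.0, 40.0, 30.0]
sti_by_session = [30.0, 28.0, 31.0, 22.0, 18.0]
r = pearson_r(z_scores(pce_by_session), z_scores(sti_by_session))
alpha = 0.05 / 3  # three pairwise comparisons
```

Pearson's r is unchanged by subtracting a constant or dividing by a positive constant, so the normalization places the measures on a common scale without altering the correlation itself.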
A total of 240 utterances were included in the analysis from the child with CAS (3 targets × 8 productions × 10 sessions) and 72 utterances from the control participants (3 targets × 8 productions × 3 controls). Perceptual (PCE, PVE, PEC, PTE) and kinematic parameters (duration, displacement, peak velocity, and STI) were examined for the child with CAS from T1 to T10, as well as for the control group.
Consonant and vowel errors decreased across sessions, although not always uniformly. The point at which the child with CAS consistently and accurately produced each bilabial phoneme with at least 90% accuracy for two consecutive sessions was identified. This point was achieved at T7 for ‘Mom’, at T7 for ‘Bob’, and at T10 for ‘Pop’ (Table II). The mean PCE ranged from 68.8% at T3 to 0% at T10. Vowel errors were much less prominent, although variable, ranging from 25% at T7 to 0% at T2, T5, and T9. Error consistency was examined using the PEC, where 0% indicates a different error for each token produced in error and 100% indicates that one error type occurs per token (Betz and Stoel-Gammon, 2005). Mean PEC scores varied across sessions, ranging from 36.7% at T3 to 100% at T10. Although fewer consonant and vowel errors were produced across sessions, error patterns remained inconsistent until T9. Error types included substitutions, distortions, and additions.
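The accuracy criterion (at least 90% accuracy in two consecutive sessions) can be expressed as a short Python sketch. The function name and the accuracy values are hypothetical. Returning the second session of the first qualifying pair is an assumption, chosen to match the earlier description of T10 as the second consecutive session.

```python
def criterion_session(accuracy_by_session, threshold=90.0, run_length=2):
    """Return the 1-based session that completes the first run of
    `run_length` consecutive sessions at or above `threshold`,
    or None if the criterion is never met."""
    run = 0
    for session, accuracy in enumerate(accuracy_by_session, start=1):
        # Extend the run on a qualifying session, otherwise restart it.
        run = run + 1 if accuracy >= threshold else 0
        if run == run_length:
            return session
    return None

# Hypothetical percentage-accuracy values for six sessions (not study data).
example_accuracy = [30.0, 95.0, 50.0, 92.0, 100.0, 100.0]
reached_at = criterion_session(example_accuracy)
```

In this example the isolated qualifying session (session 2) does not satisfy the criterion, because the next session falls below the threshold and restarts the run.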
As expected, the typically developing controls had much lower percentages of error than the child with CAS (mean PCE of 2.8%, mean PVE of 1.4%, and mean PTE of 2.8%). In addition, the mean total PEC score for the controls was 100%, indicating a high degree of error consistency.
In the controls, movement duration was longest in ‘Mom’ and shortest in ‘Pop’. Across sessions, the child with CAS produced the targets with successively shorter durations. By T10, durations were similar across targets in the child with CAS (Table III). Decreases in mean duration between T1 and T10 reached significance in ‘Pop’ (p = .010), but not in ‘Mom’ (p = .038) or ‘Bob’ (p = .041).
At the onset of the study (T1), mean duration across targets was greater for the child with CAS (M = .91 s) than for the controls (M = .73 s). Changes in movement duration varied according to the target word. The durations of ‘Mom’ and ‘Bob’ fell within the range of the control group's confidence interval at T1, decreased over time, and were less than the range at T10. In contrast, the duration of ‘Pop’ was initially greater than the control group's confidence interval at T1, decreased over time, and fell within the range by T10.
Mean maximum jaw displacement into oral opening decreased from T1 to T10 in ‘Bob’ (9.95 mm to 8.18 mm) and ‘Pop’ (11.89 mm to 9.97 mm) and increased in ‘Mom’ (6.90 mm to 10.67 mm). Differences between T1 and T10 were not statistically significant. Mean opening displacement was smaller in the child with CAS than in the control group. The child with CAS's opening displacement fell below the range of the control group's confidence interval at T1 and T10 for ‘Mom’, ‘Bob’, and ‘Pop’ (Figure 3).
Fluctuations in oral closing displacement were seen across sessions. Although there was an overall increase in mean closing displacement from T1 to T10 across target words, these differences did not reach statistical significance. Mean closing displacement from T1 to T10 increased from 6.62 mm to 10.19 mm in ‘Mom’, 8.22 mm to 12.39 mm in ‘Bob’, and 10.45 mm to 12.27 mm in ‘Pop’. The child with CAS produced smaller mean closing displacements than the control group at T1. By T10, the child with CAS's closing displacement fell within the 95% confidence interval of the control group for ‘Pop’ and outside of this range for ‘Mom’ and ‘Bob’.
Different patterns of change in peak velocity into oral opening were seen across target words. Mean opening velocity significantly increased in the production of ‘Mom’ from T1 (M = 74.74 mm/s) to T10 (M = 148.43 mm/s) (p = .006). Although mean opening velocity also increased in ‘Bob’ and ‘Pop’ from T1 to T10, these changes did not reach significance. Peak opening velocity was slower for the child with CAS than the controls. Mean opening velocity associated with all three targets fell below the range of the control group's confidence interval at T1. While productions of ‘Bob’ were still below the control group's range at T10, the means for ‘Mom’ and ‘Pop’ were within the 95% confidence interval at this point (Figure 4).
Increases in closing peak velocity were observed in the child with CAS across sessions in all three targets. Peak velocity into oral closing increased across sessions from T1 to T10 in ‘Mom’ (40.21 to 107.37 mm/s), ‘Bob’ (47.27 to 148.65 mm/s), and ‘Pop’ (59.67 to 163.05 mm/s). Increases in closing velocity from T1 to T10 were significant in all three targets (p < .001). Peak closing velocity for all targets fell below the control group's 95% confidence interval at T1 and became more similar to the control group over time. Productions of ‘Mom’, ‘Bob’, and ‘Pop’ were all within the range of the control group's confidence interval at T10.
Comparisons of spatiotemporal stability were performed by examining changes in the STI across sessions. High STIs indicate greater spatiotemporal variability, and low STIs represent more stability across movement trajectories. The stability of the child with CAS's jaw movement patterns increased over time as his STIs decreased, more closely resembling patterns seen in the controls by T10. The child with CAS's mean STI decreased between T1 and T10 from 40.79 to 25.7 in ‘Mom’, 41.45 to 23.22 in ‘Bob’, and 38.20 to 28.05 in ‘Pop’. The control group's STI was 23.49 for ‘Mom’, 24.70 for ‘Bob’, and 21.43 for ‘Pop’. STIs for all target words in the child with CAS did not fall within the control group's 95% confidence interval at T1, but did so at T10 (Figure 5).
PCE, PVE, PEC, and STI values at each session (for the child with CAS) were normalized. Pearson correlations were then computed between the normalized PCE, PVE, PEC, and STI values, across sessions. Correlations closer to +1 represent a strong positive relationship between parameters and correlations closer to −1 indicate a strong negative relationship between parameters. Across sessions, decreases were seen in PCE, PVE, PEC, and STI. A significant positive correlation was found between PCE and STI values across sessions (r = .653; p = .01). There were no significant correlations between PVE and STI (r = .289; p = .417) or between PEC and STI (r = .518; p = .125).
Developmental changes in articulator movement are seen throughout childhood and adolescence. In the present work, jaw movement patterns were examined longitudinally in a child with CAS as he produced bilabial phonemes in CVC structures. This child was followed for 8 months, during which time he began producing /p/, /b/, and /m/ with a high degree of accuracy and consistency in word-initial and word-final positions. Jaw movement changes across this period varied by phoneme and primarily included increases in jaw velocity and decreases in duration. Movement pattern stability also increased and was greatest during the final sessions where tokens were produced with the greatest accuracy and consistency. These findings indicate that articulatory control was refined in the child with CAS and that improved accuracy and consistency may have been facilitated by changes in jaw movement. Some degree of maturation was likely to have occurred over this 8-month time period. In addition, the child with CAS was receiving PROMPT therapy during this time. It is plausible that our findings were to some extent influenced by these factors.
At the onset of the study, the 3-year-old child with CAS did not display the well-established jaw movement patterns that characterize articulatory control in young typically-developing children (MacNeilage and Davis, 1993; 2000; Green et al., 2000; 2002; Grigos et al., 2005). At T1, movement duration in ‘Pop’ was longer than in the controls, but within the control group's confidence interval in ‘Mom’ and ‘Bob’. Displacement and velocity, particularly into oral closure, were smaller and slower than in the controls. STIs at T1 were higher for the child with CAS than for the controls, indicating reduced movement stability. Interestingly, jaw STIs in the child with CAS at T1 (mean STI = 40.79 (‘Mom’), 41.45 (‘Bob’), and 38.20 (‘Pop’)) were greater than those previously found for 18-month-old typically-developing children producing bilabial phonemes (mean STI = 28.7) (Grigos, 2009). Thus, articulatory control in CAS at T1 does not appear to mirror patterns seen in age-matched and younger children with typical speech and language development. It is important to highlight that a high incidence of consonant and vowel errors was also seen at T1. The kinematic findings from T1 are, therefore, associated with tokens in which the child with CAS made numerous articulatory errors. In this context, longer movement time, smaller displacement, slower velocity, and reduced movement stability reflect poorly organized articulatory control that is associated with the production of speech errors. In order to fully understand speech motor processing in CAS, it is also essential to compare articulatory control in accurate productions between groups. This comparison is discussed in the next section.
Notwithstanding differences between the child with CAS and the control group at T1, our longitudinal analysis revealed that the child with CAS may have gradually begun to use movement strategies by T10 that are similar to those seen in younger typically-developing children to achieve the standard production of phonemes. One- and 2-year-olds followed longitudinally as they acquired the voicing contrast were found to increase jaw displacement and velocity as they began producing voiceless bilabial plosives (Grigos et al., 2005). This was taken to suggest that phonemic acquisition may be facilitated by increasing the excursion and speed of jaw movements. In the present work, similar kinematic changes were seen in the child with CAS as his productions became more accurate and consistent. Changes in movement stability were also observed over time. STIs decreased across sessions and were lowest at T10, indicating an increase in jaw movement stability. There was not a uniform increase in stability, however. Fluctuations in STI across sessions suggest that the child was exploring the capabilities and/or limitations of production (Sharkey and Folkins, 1985). The significant correlation between STIs and the PCE supports a relationship between increased jaw stability and improved consonant accuracy. It is important to consider that the child with CAS was receiving PROMPT therapy throughout the study, with an emphasis on improving jaw control. The establishment of more mature jaw movement patterns over time through this therapy approach may have contributed to increased accuracy in the production of bilabial phonemes.
Changes in duration also distinguish the child with CAS from the controls. By T10, durations for ‘Mom’, ‘Bob’, and ‘Pop’ were similar in the child with CAS. Thus, the child with CAS did not differentiate durations based on coda consonant type. In contrast, duration was longest in ‘Mom’ and shortest in ‘Pop’ in the control participants, which is consistent with predictions of vowel durations based on consonant manner and voicing classes (Peterson and Lehiste, 1960). The duration pattern seen in the child with CAS may reflect constrained temporal control to achieve these bilabial targets. Other possible explanations include a lack of phonological knowledge, as testing revealed phonological deficits in the child with CAS, or an unintended treatment effect.
One motivation for examining articulatory control in the present study was to identify possible constraints underlying speech production difficulties in CAS. Specifically, it was hypothesized that deficits in speech motor control influence articulatory performance in children with CAS. Our results support this hypothesis. Although the child with CAS accurately produced tokens at T10, some jaw movement parameters continued to fall outside of the 95% confidence interval established for the control group. Thus, articulatory control improved across a time period during which accuracy and consistency of bilabial production improved, yet remained different from the control group, even at T10. For example, closing displacement in ‘Mom’, ‘Bob’, and ‘Pop’ increased over time. While displacement associated with ‘Pop’ fell within the range of the controls by the last session (T10), ‘Bob’ and ‘Mom’ continued to fall below the confidence interval. These results suggest that motor processes differ between children with CAS and those developing speech normally.
It has been proposed that deficits in linguistic planning, motor planning, motor programming, and/or motor execution contribute to speech production difficulties in CAS (Bradford and Dodd, 1994; Crary, 1984; Grunwell and Yavas, 1988; Nijland et al., 2002; 2003a; b; van der Merwe, 2009). In the present study, variations in duration, displacement, velocity, and movement stability appear to reflect motor programming deficits in the child with CAS. Specifically, movement duration sharply declined between T1 and T2 in ‘Mom’ and T1 and T3 in ‘Bob’ and ‘Pop’, with smaller changes seen across the remaining sessions. In contrast, movement displacement, velocity, and stability were modified until T10. These findings can be interpreted to suggest that the manner in which jaw movements were programmed for the experimental task may have changed over time. It is important to highlight that movement duration was either greater than or within the range of the controls at T1, decreasing over time. This pattern differs from the long movement durations reported in children with language impairment during fine motor tasks (Bishop and Edmundson, 1987; Dewey, Roy, Square-Storer, and Hayden, 1988; Preis, Schittler, and Lenard, 1997; Hill, 2001; Bishop, 2002; Corriveau and Goswami, 2009) as the child with CAS appears to be using these early timing changes to facilitate accurate production. Such temporal changes may reduce the degrees of freedom to control, which could lead to improved coordination between the jaw and other articulators, as well as other sub-systems to help the child achieve the target (Gracco, 1994). These findings suggest that temporal control may be established before spatial control, as speech production skills improve in children with CAS.
Highly inconsistent consonant and vowel errors can also be indicative of a motor programming deficit. Although fewer consonant and vowel errors were seen over time, changes in error number, error type, or in movement kinematics were not uniform. Such inconsistencies may reflect difficulty programming articulatory movements. Improved phoneme accuracy, along with changes in articulatory control by T10, suggests that the child with CAS may have gradually developed control over aspects of motor programming for the experimental task.
Many questions remain regarding the precise aetiology of CAS. The findings from this study support the hypothesis that motor deficits underlie speech difficulties in CAS and provide direction for future research. We are cautious, however, in generalizing these results from a case study. An additional limitation is that longitudinal measures for the control children were not obtained to track the typical speech motor changes seen across development. Future research should explore articulatory control in larger groups of children with CAS, as well as other speech sound disorders. To examine a linguistic component to CAS, these studies should explore the interaction between language and motor processes in tasks with varying linguistic and prosodic demands. Although the present work focused on jaw movement, future studies should also examine the contribution of the lips and tongue to articulatory performance in this population.
This research was supported in part by a research grant (1R03DC009079-01A1) from the National Institute on Deafness and Other Communication Disorders. The authors would like to acknowledge Cara Goldberg for assistance with data processing. We are grateful to all the children who participated in this study, but particularly to the experimental participant and his family for their dedication to the project.
Declaration of interest: The authors report no conflicts of interest. The authors alone are responsible for the content and writing of the paper.