In a sample of 46 children aged 4 to 7 years with Autism Spectrum Disorder (ASD) and intelligible speech, there was no statistical support for the hypothesis of concomitant Childhood Apraxia of Speech (CAS). Perceptual and acoustic measures of participants’ speech, prosody, and voice were compared with data from 40 typically-developing children, 13 preschool children with Speech Delay, and 15 participants aged 5 to 49 years with CAS in neurogenetic disorders. Speech Delay and Speech Errors, respectively, were modestly and substantially more prevalent in participants with ASD than reported population estimates. Double dissociations in speech, prosody, and voice impairments in ASD were interpreted as consistent with a speech attunement framework, rather than with the motor speech impairments that define CAS. Key Words: apraxia, dyspraxia, motor speech disorder, speech sound disorder
A continuing question about persons with Autism Spectrum Disorder (ASD) is whether reported diminished abilities in gross, fine, and oral motor control are causally associated with reported deficits in speech acquisition and performance. The classification term for the speech deficit in question, recently adopted by the American Speech-Language-Hearing Association (ASHA; 2007a, 2007b), is Childhood Apraxia of Speech (CAS). Medical literatures and speech literatures in other countries continue to prefer several other classificatory terms for this disorder, including dyspraxia and developmental verbal dyspraxia. “Childhood” apraxia of speech differentiates congenital and early acquired forms of apraxia of speech from adult acquired forms, but creates a nosological problem because childhood apraxia of speech generally persists into adulthood. We will use the ASHA (2007a) recommended term, CAS.
The strong form of the hypothesis in the title of this paper, hereafter, the ‘CAS-ASD’ hypothesis, is that CAS is a sufficient cause of lack of speech development in at least some children classified as nonverbal ASD. The weak form of the CAS-ASD hypothesis is that CAS contributes to the inappropriate speech, prosody, and/or voice features reported in some children and adults with verbal ASD. Although the present report addresses only the weak form of the hypothesis, the conceptual framework and implications for treatment apply to both forms of the hypothesis. Forthcoming research addresses the strong form of the hypothesis. The following sections provide (a) rationales for the CAS-ASD hypothesis, (b) an overview of idiopathic speech sound disorders, and (c) a summary of speech, prosody, and voice findings in verbal ASD.
The American Speech-Language-Hearing Association Position Statement recommends the following definition of CAS:
Childhood apraxia of speech (CAS) is a neurological childhood (pediatric) speech sound disorder in which the precision and consistency of movements underlying speech are impaired in the absence of neuromuscular deficits (e.g., abnormal reflexes, abnormal tone). CAS may occur as a result of known neurological impairment, in association with complex neurobehavioral disorders of known or unknown origin, or as an idiopathic neurogenic speech sound disorder. The core impairment in planning and/or programming spatiotemporal parameters of movement sequences results in errors in speech sound production and prosody. (p. 1)
Three conceptual and empirical perspectives motivate the hypothesis that CAS may be causal to the absence of speech development in some children with ASD or, in others, to perceptible differences in speech, prosody, or voice.
A primary rationale for the CAS-ASD hypothesis is findings indicating that persons with ASD have praxis deficits affecting imitative processes and impairing acquisition and performance of a range of motor skills. Reviews of this literature and the neural correlates of praxis findings in ASD are beyond the scope of the present report; for representative data and overviews of research during the past two decades see Dawson, Mottron, and Gernsbacher (in press); Dowell, Mahone, and Mostofsky (2009); Dziuk et al. (2007); Gernsbacher, Sauer, Geye, Schweigert, and Goldsmith (2008); Green et al. (2002); Goldman Gross and Grossman (2008); McDuffie et al. (2007); Mostofsky, Burgess, and Gidley Larson (2007); Mostofsky et al. (2006); Ozonoff et al. (2008); Page and Boucher (1998); Rogers (2009); Rogers, Bennetto, McEvoy, and Pennington (1996); Russo, Larson, and Kraus (2009); Smith and Bryson (1994); Vivanti, Nadig, Ozonoff, and Rogers (2008); and Zadikoff and Lang (2005). A parsimonious extension of the findings from studies in other motor domains is that a praxic deficit in speech may account for the failure of some children with ASD who have adequate cognitive ability and communicative intent to acquire articulate speech (the strong version of the CAS-ASD hypothesis), and for others with ASD to have atypical speech, prosody, and/or voice (the weak version of the CAS-ASD hypothesis). As reviewed in the following sections, CAS is the one subtype of speech sound disorder whose neurobehavioral substrates could account for the speech, prosody, and voice findings reported in ASD (Shriberg 2010a, 2010b). Unlike dysarthria, a class of neuromuscular speech disorders that constrains the precision of speech production, the transcoding (planning/programming) deficits that define CAS (van der Merwe, 2009) are functionally sufficient to disrupt the onset of speech and/or speech precision and stability.
A constraint on the CAS-ASD hypothesis is that many speech researchers have concluded from diverse conceptual and empirical considerations that speech is domain specific (e.g., Dewey, Roy, Square-Storer, & Hayden, 1988; Kent, 2000, 2004, 2010; McCauley, Strand, Lof, Schooling, & Frymark, 2009; Potter, Kent, & Lazarus, 2009; Smith, 2006; Weismer, 2006; Watkins, Dronkers, & Vargha-Khadem, 2002; Ziegler, 2002, 2008). The perspectives in these and other sources are that the neural substrates of apraxia of speech differ from the neural substrates posited for other motor systems and other types of apraxia (e.g., oromotor apraxia, limb apraxia, ideomotor apraxia).
A recent empirical constraint on the CAS-ASD hypothesis is the set of results discussed in Pickett, Pullara, O’Grady, and Gordon (2009), which summarizes findings from reports of 167 individuals with nonverbal ASD who acquired speech at age 5 or older. Records indicated that these individuals learned skills including “imitating sounds, words, and phrases,” “answering simple questions,” “requesting spontaneously,” “using complete sentences,” and “speaking in spontaneous complex sentences” (p. 13). Crucially for the strong version of the CAS-ASD hypothesis, however, the speech findings in Table 1 of Pickett et al. (2009) do not include information consistent with the signs of CAS described later in the present report.
A second rationale for the CAS-ASD hypothesis is based on the possibility of common genetic origins. Whereas numerous candidate genes and regions of interest for autism spectrum disorders have been reported, the widely-studied FOXP2 transcription gene is the only gene to date associated with CAS. The origins of both disorders are viewed as strongly heritable and both involve cognitive-linguistic impairments, suggesting the possibility of genes common to both disorders (e.g., Poot et al., 2010; Vernes et al., 2008).
A constraint on the likelihood of inherited or sporadic genetic comorbidity of CAS and ASD is the wide difference in their reported prevalences, with idiopathic CAS estimated at approximately 0.1% (Shriberg & Kwiatkowski, 1994) and ASD reportedly at approximately 1% (Rice, 2009), a 10-fold difference. Unless a more highly prevalent subtype of CAS than the idiopathic form is posited for either or both nonverbal and verbal ASD, comorbid ASD and CAS would be expected to be extremely rare (i.e., 1/100,000, multiplying the individual probabilities of each disorder).
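The arithmetic behind the 1/100,000 figure can be sketched as follows. The sketch assumes, as the estimate itself does, that the two disorders occur independently; the variable names are illustrative.

```python
from fractions import Fraction

# Prevalence figures quoted in the text. Treating them as independent
# probabilities is the simplifying assumption behind the 1/100,000 estimate.
p_cas = Fraction(1, 1000)   # idiopathic CAS, approximately 0.1%
p_asd = Fraction(1, 100)    # ASD, approximately 1%

# Expected comorbidity if the two disorders co-occur only by chance.
p_both = p_cas * p_asd

# Ratio of the two prevalences (the 10-fold difference).
ratio = p_asd / p_cas

print(f"P(CAS and ASD) = {p_both} (1 in {p_both.denominator:,})")
print(f"ASD is {ratio}x as prevalent as idiopathic CAS")
```

If the disorders shared genetic origins, the independence assumption would not hold and the observed comorbidity could exceed this chance estimate.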
A third rationale is based on findings reviewed presently indicating that the speech, prosody, and voice characteristics of some children with low and high verbal ASD reportedly are similar to those found in children and adults with apraxia of speech. The validity of this claim for the CAS-ASD hypothesis, the most testable of the three rationales reviewed, requires close examination of the ASD-speech literature, in particular, findings for prosody and voice characteristics. A constraint, however, is that literature findings to date are heterogeneous and lack the conceptual organization needed for comparative analyses. Peppé, McCann, Gibbon, O’Hare, and Rutherford (2007) provide a useful perspective on the precedent speech literature in ASD:
In the research literature, numerous adjectives are used to describe atypical expressive prosody in autism, for example, dull, wooden, singsong, robotic, stilted, overprecise, and bizarre (Baltaxe & Simmons, 1985; Fay & Schuler, 1980); terms that perhaps reflect perceived characteristics of autism more than acoustic features. The fact that adjectives with opposite meanings, such as monotonous and exaggerated (Baron-Cohen & Staunton, 1994), can be used to describe this atypicality suggests a wide variation in either the perception of atypical expressive prosody or in the prosody itself. (p. 1016)
The following overview of idiopathic speech sound disorders attempts to redress this situation. The goal of this tutorial is to introduce terms and concepts needed for efficient review of the ASD-speech literature. The system described in the next section is also used later to organize findings from the present study.
The cover term Speech Sound Disorders (SSD) was adopted by the American Speech-Language-Hearing Association in 2005 to replace both the early 20th century term functional articulation disorders, and the term used for the same clinical entity from approximately 1980 to 2005, phonology disorders of unknown origin. There is no current professional consensus, however, on nomenclature for subtypes of SSD (i.e., excluding disorders of known origin, such as those due to cleft palate, Down syndrome, deafness, traumatic brain injury, or other frank cognitive, structural, sensory, motor or affective disorder). The nosology in Table 1, from a system termed the Speech Disorders Classification System (SDCS: Shriberg et al., 2010a), has evolved for genomic and other descriptive-explanatory research in SSD of currently unknown origin. As indicated, the speech classification terms and concepts in Table 1 are needed to organize both the literature review in Table 2 and findings from the present study. Technical information on perceptual and acoustic procedures used to classify participants’ speech status using the SDCS in a software environment is available elsewhere (Shriberg et al., 2010b; see also http://www.waisman.wisc.edu/phonology/).
Speech Delay (SD) is the SDCS classification term for 3 to 9 year-old children with mildly to severely reduced intelligibility due to age-inappropriate speech sound deletions, substitutions, and distortions. As indicated in Table 1, children with SD generally do not have notable impairments in prosody or voice, an important differential diagnostic sign between SD and CAS discussed below. Relative to typically-developing children, however, children with SD have higher rates of language impairment, lowered intelligibility, and are at greater risk for reading impairment. Two American English population estimates of speech sound disorders using similar definitions and methods (Campbell et al., 2003; Shriberg, Tomblin, & McSweeny, 1999) report approximately similar point prevalence population estimates (15.6%, 15.2%, respectively) at 3 years of age (interpolated finding in Shriberg et al., 1999) and similar estimates (3.8%) at 6 years of age. A third large British English epidemiological study, also using the SDCS definition of SD, reported a population estimate at 8 years of age of 3.8% (Wren, Roulstone, Miller, Emond, & Peters, 2009).
Speech Errors (SE) is the SDCS term for 6 to 9 year-old children whose speech impairment is limited to distortions of one or two English sounds or sound classes: the sibilants /s/ and /z/ and the rhotic consonant /r/ and/or the stressed and unstressed rhotic vowels (as in “bird” and “sister,” respectively). Elementary-school American English children with SE are typically not provided speech services because, as shown in Table 1, SE is generally not associated with prosody-voice impairment, language disorder, or intelligibility deficits and children with SE are not at risk for reading impairment (Shriberg, 2010b; Wren et al., 2009). Using definitions and methods for SE classification adapted from the SDCS, the Wren et al. (2009) epidemiologic study reported a point prevalence of SE at 8 years of age of 7.9%.
Persistent Speech Disorder (PSD) is the SDCS term for speech disorders that persist past 9 years of age and for some speakers, for a lifetime. By 9 years of age, most children with histories of either SD or SE have normalized speech production, but a percentage of adolescents and adults continue to misarticulate. Children with prior SD may continue to have speech sound deletions, substitutions, and/or distortions, and children with prior SE may have persistent sibilant and/or rhotic distortions. As shown in Table 1, depending on whether such speakers have histories of SD or SE, they also may have persistent impairments in language, intelligibility, and/or reading. Flipsen’s (1999) review of survey and epidemiology studies, which also used the SD and SE classification constructs to organize the literature, yielded an estimated prevalence rate for PSD of 2.4%–3.9%.
The fourth classification entity for speakers with idiopathic SSD, Motor Speech Disorder (MSD), includes speakers of all ages whose significant intelligibility deficits are associated with motor speech impairment. As shown in Table 1, MSD subsumes three subclassifications. MSD-Apraxia of Speech (MSD-AOS) is the same clinical entity as Childhood Apraxia of Speech (CAS), a term that the American Speech-Language-Hearing Association adopted in 2007 to replace the prior terms Developmental Apraxia of Speech and Developmental Verbal Dyspraxia (the latter term continues to be used in medical contexts and in most other countries). As indicated previously, CAS will be the reference term for this classification in the present paper.
The core feature of both congenital and acquired apraxia of speech is a deficit in the planning/programming processes that transcode linguistic representations to the articulatory movements for speech. Motor Speech Disorder-Dysarthria (MSD-DYS), the second subclassification of MSD, is itself a cover term for several subtypes of neuromuscular deficits (e.g., spastic dysarthria, ataxic dysarthria, hyperkinetic dysarthria) in the production of speech sounds (Duffy, 2005). Motor Speech Disorder-Not Otherwise Specified (MSD-NOS) is a recently proposed classification entity (Shriberg et al., 2010a) for speech signs that are not specific for apraxia or dysarthria and for speakers who have signs of motor speech involvement, but do not meet inclusionary criteria for either CAS (i.e., MSD-AOS) or MSD-DYS.
As indicated in Table 1, each of the three MSD classifications is characterized by deletions, substitutions, and distortions of sounds. Unlike SD, SE, and PSD, however, each MSD classification is also characterized by significant and persistent deficits in prosody and voice features. Speakers with MSD likely have concomitant language disorder, typically have significant intelligibility deficits, and generally are at increased risk for reading impairment. As cited previously, based on clinical referrals to one University speech clinic in a moderate-sized city, a preliminary estimate placed the population prevalence of CAS at .1% (Shriberg & Kwiatkowski, 1994). Several published and unpublished sources internationally indicate false positive rates for CAS of 80 to 90%, reflecting the lack of consensus on the inclusionary and exclusionary criteria for this disorder, especially as suspected in toddlers, preschool, and early elementary age children. There are no available prevalence estimates for MSD-DYS or MSD-NOS, although many researchers suggest that subclinical dysarthria and delays in maturation of sensorimotor systems subserving speech (i.e., MSD-NOS) may account for a substantial proportion of idiopathic speech sound disorders.
The considerable body of research on the language characteristics of speakers with ASD (see Smith, 2007; Tager-Flusberg, 2009; Tager-Flusberg, Paul, & Lord, 2005 for reviews) has reported extensive heterogeneity of expressive ability among children with verbal ASD, ranging from children with only single word or simple word combinations to children with precocious levels of vocabulary and sentence structure. Tager-Flusberg and Joseph (2003) have proposed a system for classifying subtypes of language development within speakers with ASD, with other investigators raising validity issues about the system (e.g., Eigsti, Bennetto, & Dadlani, 2007; Whitehouse, Barry, & Bishop, 2007).
In contrast to the widespread intense interest in the language abilities of children with ASD, few studies have focused on the speech abilities of children, adolescents, and adults with ASD. Table 2 includes a summary of prevalence estimates for subtypes of speech sound disorders and speech, prosody, and voice impairment findings in speakers with verbal ASD. The entries in Table 2 do not include questionnaire data or single case study observations. Only information on productive speech, prosody, and voice behaviors is included, not studies of speech perception or comprehension in ASD; for reviews of the latter domains see Diehl, Bennetto, Watson, Gunlogson, and McDonough (2008); Diehl, Watson, Bennetto, McDonough, and Gunlogson (2009); McCann and Peppé (2003); and Paul, Augustyn, Klin, and Volkmar (2005). The format and content in Table 2 is the first to organize prevalence and descriptive findings using the SDCS classifications described in Table 1.
As shown in the top section of Table 2, three studies have estimated the prevalence of subtypes of speech impairment in ASD, each using definitions of one or more subtypes of speech impairment consistent with the subtypes in Table 1. Impairment consistent with Speech Delay (SD) occurred in 12% of the 3 to 9 year-old children with ASD studied by Cleland, Gibbon, Peppé, O’Hare, and Rutherford (2010). Rapin, Dunn, Allen, Stevens, and Fein (2009) reported SD and Speech Errors (SE) in 24% of participants with ASD during this age period. Cleland et al. also reported that 33% of the children with ASD studied had either SE or Persistent Speech Disorder (PSD). Shriberg, Paul, et al. (2001) reported that 33% of adolescent and adult study participants had PSD. Thus, although each of the subtypes of speech impairment in Table 1 has been reported in ASD, few between-study comparisons are possible due to differences in the age groups studied. Among the 11 studies in Table 2 in which the data could be interpreted as absence of support or support for SD in ASD (indicated by “X”), four have reported absence of support for SD and seven have reported support for SD in ASD. Velleman et al. (2010) is the only study series to date supporting speech impairment consistent with the SDCS term Motor Speech Disorder-Not Otherwise Specified. Although frank CAS was not observed in their studies, the findings Velleman and colleagues report using an array of perceptual and acoustic indices are consistent with MSD-NOS.
The remaining entries in Table 2 organize findings in the ASD literature using the Prosody (Phrasing, Rate, Stress) and Voice (Pitch, Loudness, Laryngeal Quality, Resonance) domains in the SDCS. The most well-studied prosody domain is Stress, with the 16 studies in Table 2 reporting impairments in the ability of participants with ASD to produce correct contrastive, emphatic, sentential, syntactic, and syllable stress (for reviews, see McCann & Peppé, 2003; McCann, Peppé, Gibbon, O’Hare, & Rutherford, 2008; Paul, Augustyn, et al., 2005; Paul, Bianchi, Augustyn, Klin, & Volkmar, 2008; Peppé et al., 2007). As shown in Table 2, impairments have been reported in at least one published study of ASD for each of the other 6 prosody and voice domains.
The primary goal of the present study was to assess the weak version of the CAS-ASD hypothesis—the hypothesis that concomitant CAS may be a sufficient causal explanation for at least some of the speech, prosody, and voice impairments reported in ASD. A secondary goal of the study was to estimate in a sample of verbal young children with ASD the prevalence of the two primary forms of speech impairment of unknown origin: Speech Delay and Speech Errors.
Participants with ASD in the present report were involved in a larger study of prosody in young children with ASD. This study contained tasks that required both fluent language production and a mental age of at least 4 years for their completion, thus limiting the present sample to children who met these inclusionary criteria. Participants with ASD were recruited from three sources: (a) a database of children who had participated in a research study on early identification of autism spectrum disorders at the Yale Child Study Center during the four years prior to data collection for the present study, (b) families who had responded to requests for participants posted on websites describing research at the Yale Child Study Center, and (c) local community fairs at which representatives provided literature describing research studies at the Yale Child Study Center. Inclusionary criteria for participants who completed the assessment protocol were: (a) a previous diagnosis of autism, PDD-NOS, ASD, or Asperger syndrome from a qualified clinician; (b) full scale IQ ≥ 70; (c) mean length of utterance of at least 3.0, based on transcription of a 3–5 minute conversational sample; (d) >70% of words intelligible in the language sample; and (e) normal hearing and vision (or corrected with glasses) on standard screening. Exclusionary criteria included known craniofacial or neurological impairment or bilingual background.
Table 3 provides descriptive statistics for the 46 children aged 4 to 7 years (78% boys) who met the inclusionary and exclusionary criteria and for whom consent to participate in the present study was obtained. Diagnostic characterization was determined by findings from administration of the Autism Diagnostic Observation Schedule-Generic (ADOS-G: Lord, Rutter, DiLavore, & Risi, 2000), using diagnostic algorithms from Gotham, Risi, Pickles, and Lord (2007), combined with data from parent report forms and clinical observation by two experienced clinicians, using DSM-IV-TR (APA, 2000) diagnostic criteria. The ADOS-G was administered and scored by a clinician certified for ADOS administration. Thirty-eight of the participants were administered ADOS Module 3, and 8 Module 2. The examiner also obtained information from the Social Communication Questionnaire (SCQ; Rutter, Bailey, & Lord, 2003), a parent report measure that has shown a high degree of sensitivity and specificity for discriminating children classified as having ASD (Chandler et al., 2007). Diagnostic assignment was made by the clinician administering the ADOS, in consultation with a second and third clinician who viewed the ADOS videotape. One of these clinicians had also administered other measures in the assessment protocol shown in Table 3, and all had conferred with the child’s parent(s) regarding current and past diagnostic presentation. The clinician who administered the ADOS presented her observations to the others. All participating clinicians discussed observations derived from the ADOS interview and responses to the assessment battery, including findings from the parent reports and questionnaires.
Children who met clinicians’ consensus DSM-IV-TR (American Psychiatric Association, 2000) criteria for an ASD diagnosis based on history and clinical presentation, and who met ASD threshold criteria for both ADOS and SCQ, were assigned a diagnosis of ASD. Those who met clinical criteria based on history and current presentation, but failed to meet threshold criteria by 1–2 scale points on either the ADOS, using the Total Algorithm score (Gotham et al., 2007), or the SCQ were classified as “borderline ASD.” A total of 29 participants (63%) met criteria for ASD and the remaining 17 participants for borderline ASD. Statistical analyses completed on over 120 independent and dependent variables to be reported in this study indicated no statistically significant between-group differences or directional trends, allowing the two groups to be combined.
Table 3 includes descriptive statistics for the ADOS and five other measures of cognition and language. Cognitive measures included the two Wechsler intelligence tests shown in the footnote to Table 3, depending on chronological age. The Syllable Repetition Task (SRT; Shriberg et al., 2009) was designed to eliminate the productive speech constraint when assessing speech processing in speakers of any age who misarticulate speech sounds. The Peabody Picture Vocabulary Test-III (Dunn & Dunn, 1997) and the Clinical Evaluation of Language Fundamentals-4 (Semel, Wiig, & Secord, 2003) provide standardized information on receptive and expressive language in several linguistic domains. The Vineland Adaptive Behavior Scales-II (Sparrow, Cicchetti, & Balla, 2005) and the Socialization scale of the Autism Diagnostic Observation Schedule (Lord et al., 2000) provide information on communication and social function. The ADOS interviews also served as the continuous speech samples for the participants with ASD. Prior research (Shriberg & Kwiatkowski, 1985) has documented that continuous speech samples are robust to examiner topics, and detailed speech analyses based on ADOS samples, compared with conventional continuous speech samples, have been used successfully in prior ASD research (Shriberg, Paul, et al., 2001).
Raw scores on all standard measures except the ADOS Socialization scale were converted to standard scores (mean 100; standard deviation 15), with higher scores indicating better ability. Higher scores on the ADOS Socialization scale indicate greater autistic symptomatology. As shown in Table 3, most of the participants with ASD scored within the normal range on the cognitive and language measures, with averaged scores indicating impairment, as expected, on the Vineland and ADOS socialization scales.
To address the hypothesis of motor speech disorder in ASD, comparison data were assembled from three study groups.
Raw scores on the speech, prosody, and voice variables to be reported were standardized to z-scores using 10 typically-speaking, age- and gender-matched children from a database that included 40 children from 4 through 7 years, 10 each at ages 4, 5, 6 and 7 (see Table 3). Demographic and psychometric information on this database used for studies in speech sound disorders is provided in Potter et al. (2010); Shriberg et al. (2010a, 2010b); and Shriberg, Potter, & Strand (in press). Because the only source of recorded data for the present study was participants’ continuous speech during ADOS assessment, only the continuous speech samples from this database were used for the statistical comparisons to be described. As described presently, raw scores on tasks were used for some analyses; elsewhere, means and standard deviations from this reference group were used to obtain age- and gender-standardized scores for the speech, prosody, and voice variables reported for each of the other groups in the present study.
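The standardization step described above can be illustrated with a minimal sketch. The function name, the use of the sample standard deviation, and all numeric values are assumptions for illustration; they are not the study's software or data.

```python
from statistics import mean, stdev

def z_scores(raw_scores, reference_scores):
    """Standardize raw scores against a reference group's mean and SD."""
    ref_mean = mean(reference_scores)
    # Sample SD (n - 1 denominator); which SD estimator the study used is an assumption.
    ref_sd = stdev(reference_scores)
    return [(x - ref_mean) / ref_sd for x in raw_scores]

# Hypothetical values for one speech variable (not data from the study).
# In the study, each reference set would hold 10 age- and gender-matched children.
reference_group = [88.0, 90.0, 92.0, 94.0, 96.0]
participants = [92.0, 86.0, 98.0]

print(z_scores(participants, reference_group))
```

A z-score of 0 means the participant's raw score equals the reference-group mean; negative and positive values give the distance below or above it in reference-group standard deviations.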
The second comparison group was a sample of 13 children from 4 to 6 years of age with Speech Delay (SD; Shriberg et al., 2010a). These children were selected from intake referrals to a speech clinic to sample the range of moderate to severe SD warranting early intensive speech services. As with the ASD and typically-developing (TD) groups, only the data from continuous speech samples were used for the present analyses.
The third comparison sample was a group of 15 children and adults with CAS in the context of neurogenetic disorders (Shriberg, 2010c). The 15 speakers with CAS included 8 participants with classic galactosemia (Shriberg, Potter, & Strand, in press), 3 participants with disruptions in the FOXP2 gene (Shriberg et al., 2006; Shriberg, Jakielski, et al., 2010), 3 siblings with an unbalanced chromosome 4q;16q translocation (Shriberg, Jakielski, & El Shanti, 2008), and 1 participant with Joubert syndrome (Shriberg, 2010c). Classification of each participant as CAS had been completed on the basis of their quantitative performance on an extensive research protocol. The unique feature of this group of speakers with CAS is that all had documented neurogenetic rather than idiopathic backgrounds. Their heterogeneity in gender, age, and etiologic backgrounds maximized their value as a comparison group with the core signs (i.e., those common and persistent across gender, age, and etiologic background) of CAS. As with the other two comparison groups, only the data from the continuous speech samples were used for the present analyses.
Perceptual and acoustic reduction of the digitally-recorded continuous speech samples from the four datasets was completed following procedures for research in pediatric speech sound disorders (Shriberg et al., in press-b). Transcription and prosody-voice coding were obtained from continuous speech during the ADOS for the ASD participants and from a conventional continuous speech sample for each of the comparison groups. Early research on conversational speech sampling for speech analyses documented that continuous speech samples are robust to examiner topics; prior speech analyses based on continuous speech during the ADOS (Shriberg, Paul, et al., 2001) supported the validity of ADOS-based samples for detailed speech analyses.
Three research transcribers transcribed and prosody-voice coded subsets of the speech samples using well-developed narrow phonetic transcription (Shriberg & Kent, 2005) and prosody-voice coding (Shriberg, Kwiatkowski, & Rasmussen, 1990) protocols. The Prosody-Voice Screening Profile (Shriberg, Kwiatkowski, & Rasmussen, 1990) provides perceptually-based coding information on participants’ appropriate production divided into seven orthogonal domains of prosody and voice termed Phrasing, Rate, Stress, Loudness, Pitch, Voice Quality, and Resonance. Among other standardization procedures, transcribers/prosody-voice coders were blinded to participants’ diagnostic status and followed a set of conventions to accommodate speech, prosody, and voice differences associated with regional dialects. Two acoustic analysts used a standardized protocol to obtain acoustic information from the recordings. As described in detail in Shriberg et al. (2010b), the protocol yielded information on 87 indices of participants’ competence, precision, and stability in each of the speech, prosody, and voice domains. All transcription, prosody-voice coding, and acoustic analyses were accomplished in a software environment that provided acoustic-aided transcription and transcription-aided acoustics (Shriberg, Allen, McSweeny, & Wilson, 2001). The software environment uses a set of decision rules to classify subtypes of speech impairment, including SD, SE, and PSD as described in Table 1. Extensive reliability data for the transcribers and acoustic analyses on challenging data sets have been reported (Shriberg, Potter, & Strand, in press). Due to the lack of discipline consensus on the pathognomonic signs of CAS (i.e., MSD-AOS) and MSD-NOS, support for these classifications has to be induced by appeal to the percentage of motor speech indices on which a speaker is positive (i.e., exceeds a standard deviation from the mean of the standardization sample).
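The percentage-of-positive-indices criterion can be sketched as below. This is an illustration only: the function, the index names, and the z-scores are hypothetical, and treating a deviation of more than one standard deviation in either direction as positive is an assumption, since the direction of impairment may differ by index.

```python
def percent_positive(z_by_index, threshold=1.0):
    """Percentage of indices whose z-score lies more than `threshold` SDs from the reference mean."""
    if not z_by_index:
        raise ValueError("at least one index is required")
    # Assumption: deviation in either direction counts as positive.
    positives = [name for name, z in z_by_index.items() if abs(z) > threshold]
    return 100.0 * len(positives) / len(z_by_index), positives

# Hypothetical index names and z-scores for one speaker (illustrative only).
speaker = {
    "index_a": -1.6,
    "index_b": 0.3,
    "index_c": 1.2,
    "index_d": -0.4,
}
pct, flagged = percent_positive(speaker)
print(f"{pct:.0f}% positive: {flagged}")
```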
As will be described, a series of speech, prosody, and voice indices of CAS and MSD-NOS were used to address the CAS-ASD hypothesis. Additional technical information on data collection, data reduction, and data analyses in the software environment is available elsewhere (Shriberg et al., 2010a; see also http://www.waisman.wisc.edu/phonology/).
Table 4 includes prevalence findings for three of the four types of speech impairment described previously in Table 1: Speech Delay (SD), Speech Errors (SE), and CAS. All of the ASD participants were younger than 9 years of age, and therefore ineligible for Persistent Speech Disorder (PSD). Additionally, based on the precedent literature and the sampling procedures for this study, MSD-DYS (dysarthria) was not expected in sample participants.
The left section of Table 4 is a summary of the percentage of participants with ASD who met criteria for SD, as defined in Table 1 and derived by software algorithms. Point-prevalence data from the three epidemiological studies of speech sound disorders reviewed previously are included in Table 4 for ease in comparing the present prevalence findings for ASD with population estimates. The only same-age data for such comparisons are for the 6-year-old participants with ASD: among the small number of participants at this age, one (10%) met criteria for SD, compared with the population estimate of 3.8% in two studies. A more robust comparison is the finding of 15.2% SD averaged over the 46 ASD participants aged 4 to 7 years, a finding that is essentially similar to the population prevalence of SD in 3-year-old children estimated in two large scale epidemiologic projects (Campbell et al., 2003; Shriberg, Tomblin, & McSweeny, 1999).
The prevalence findings for SD in Table 4 are interpreted as indicating modest increased risk for concomitant SD in young children with ASD. This conclusion is influenced by the weight of the findings in the precedent literature in ASD summarized in Table 2. As noted, SD for 3 to 9 year-old children with ASD was estimated at 12%, generally consistent with 15.2% for the children with ASD in the present sample. Moreover, 7 of 11 studies (64%) cited in Table 2 reported means or percentage data supporting increased prevalence of SD (primarily based on articulation test percentile data) in children with ASD.
The middle section of Table 4 is a summary of the percentage of participants with ASD who met criteria for SE at the minimum age of 6 years (see Table 1) and at 7 years of age (n = 22). Findings are subtotaled for speech errors on sibilants and rhotics, with summary percentages indicating children who had errors in one or both sound classes. The only available prevalence comparison data for SE in a large epidemiologic study is Wren and colleagues’ (2009) prevalence estimate of 7.9% at 8 years of age. Although the prevalence finding of 31.8% SE averaged over the 6- and 7-year-old participants with ASD cannot be directly compared with the Wren et al. estimate, note that there would need to be a 75% normalization rate by 9 years (i.e., from approximately 32% to approximately 8%) for the two values to be consonant. As such a one-year normalization rate is unlikely, the present findings are interpreted as support for substantially higher risk for concomitant SE in children with ASD. Although the ages of the present participants did not allow an estimate of the persistence of SD or SE past 9 years of age (i.e., PSD), it is reasonable to assume that prevalence rates past 9 years would also be higher than the 7.9% SE estimate from Wren et al. at 8 years of age. Finally, as indicated in Table 2 and Table 4, the present study is the third reporting that approximately one of every three children with ASD meets criteria for SE, particularly dentalized sibilants (Table 4).
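The 75% normalization figure follows directly from the two prevalence values. A short arithmetic check, with a hypothetical helper name, is sketched here:

```python
def required_normalization_rate(prevalence_now, prevalence_later):
    """Proportion of currently affected children who would need to
    normalize for prevalence to fall from `prevalence_now` to
    `prevalence_later` (both given in percent)."""
    return (prevalence_now - prevalence_later) / prevalence_now

# 31.8% SE in the present 6- and 7-year-old ASD participants versus
# the 7.9% population estimate from Wren et al. (2009)
rate = required_normalization_rate(31.8, 7.9)
print(f"{rate:.1%}")  # 75.2%
```

The calculation assumes no new cases of SE arise over the interval, so it is a lower bound on the normalization rate that would be needed.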
The primary question addressed in this study is whether some children with verbal ASD have concomitant CAS. Because the speech data were obtained only from continuous speech samples, rather than from speech assessment protocols designed for comprehensive assessment of motor speech processes (e.g., lexical stress tasks, challenging word tasks, repeated measures tasks for target and error stability), only a subset of the diagnostic signs of CAS could be assessed in this study. As demonstrated in prior studies, however, sensitive and specific signs of CAS are available in continuous speech samples (Shriberg, Potter, & Strand, in press). Specifically, core signs of CAS, including slow speaking and articulation rates, spatiotemporal vowel errors (reduced vowel space, lengthened vowels), and distorted consonants and consonant transitions, can be identified using standardized perceptual and acoustic indices of these behaviors in continuous speech. To provide the widest possible screen, the analyses to be described included signs of both CAS and MSD-NOS. As noted previously, due to a lack of consensus on the exact set of speech, prosody, and/or voice features that are sensitive to and specific for CAS, the only clinical-research approach to classification is to appeal to some minimal number of positive markers (i.e., the number of proposed indices of CAS or MSD-NOS on which a speaker suspected to have CAS is positive).
The 10 putative indices of CAS or MSD-NOS in the top panel in Figure 1, titled Sensitivity, were those among a list of 38 that have consensual and construct validity to identify motor speech disorders (Shriberg et al., 2010a). The indices sensitive to CAS or MSD-NOS are arranged by decreasing order of speakers with CAS who scored positive on the index in the continuous speech (cross-hatched bars). For example, 100% of the speakers with CAS were positive on the index termed Lengthened Vowels, as defined by having average vowel lengths (in milliseconds) longer than one standard deviation from typically developing, same- or younger-aged speakers. The dashed lines at 50% and 75% are visual aids indicating the relative sensitivity of each index. Seven of the 10 indices were obtained using acoustic methods (bolded), two of the 10 with narrow phonetic transcription, and one with prosody-voice coding. The values above each pairwise comparison are the one-tailed effect sizes, with an asterisk indicating statistical significance at the .05 level (StatXact, Cytel Software, 2001). To minimize Type II errors of interpretation associated with cell sizes, we treat each motor speech sign family-wise (without adjustment for multiple testing), with emphasis on the magnitudes of the significant effect sizes.
As shown in Figure 1, 100% of the participants with CAS were positive for the first 3 of the 10 indices of motor speech disorder, with 50% to over 75% of participants positive on the remaining 7 indices. In comparison, the 46 speakers with ASD (filled bars) were over 75% positive on only one index of motor speech disorder, with two other indices over 50% positive. Effect sizes for the 7 statistically-significant pairwise comparisons were large by conventional criteria (greater than .80). Space constraints prohibit discussion of the technical details of each of the indices. Essentially, each captures a different element of the precision and stability of spatiotemporal aspects of speech production. These findings are interpreted as counter-support for the hypothesis of motor speech disorder consistent with apraxia of speech or MSD-NOS in children with verbal ASD.
Findings in the bottom panel in Figure 1, titled Specificity, are also interpreted as counter-support for the CAS-ASD hypothesis. To assess the possibility that there might be a subgroup of the participants with ASD who have concomitant CAS, findings for the 7 participants with SD (Table 4) were compared to findings for the comparison group with CAS. For these analyses, the reference group used to standardize index raw scores was the participant group with Speech Delay (SD). Thus, whereas the data in the top panel in Figure 1 assess the sensitivity of the motor speech markers to identify speech impairment, the bottom panel findings in Figure 1 assess the specificity of the 38 indices, i.e., their ability to discriminate CAS or MSD-NOS from Speech Delay. The bottom panel includes the five indices of motor speech disorder on which the comparison group participants with CAS differed by more than one standard deviation from the comparison speakers with SD. Four of the five pairwise comparisons with the ASD participants’ performance on these indices yielded large significant effect sizes, with none of the ASD participants scoring below the SD participants.
Figure 2 provides summary graphic and statistical information focusing on prosody-voice characteristics of the participants with ASD. As continuous speech samples are the primary, and for some variables, the only source for these indices, it is appropriate to compare findings for the speakers with ASD to those for each of the other comparison groups. As shown in the key to the symbols, these include, in addition to the ASD target group (A-filled circles), the typically-developing speakers (T-open circles), the speakers with Speech Delay (S-open squares), and the speakers with CAS (C-open triangles). In each panel, the top section provides the numeric data on the three prosody and four voice domains for each of the four groups. Boxes around the numeric data in Panel A indicate significant one-way analyses of variance, with a key to the conventional symbols for significant p values at the bottom of each panel. For Panels B–D, the boxes in the numeric data indicate significant effect sizes for the pairwise comparisons. As indicated below each of the three panels, the conventional effect size (ES) adjectives from Cohen (1988; S: Small, M: Medium, L: Large), are extended for increased sensitivity to include V: Very Large and E: Extremely Large. Significant ES’s (i.e., confidence intervals not crossing 0) are underlined. These same values are plotted graphically below the numeric values. Scores of 80–90% on this measure (see dashed lines) are considered marginal impairment and scores below 80% as impairment.
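The two decision rules stated above (the 80% and 90% cut points for percentage of appropriate utterances, and significance of an effect size judged by whether its confidence interval crosses 0) can be sketched as follows. The function names and the treatment of scores falling exactly on 80% or 90% are assumptions; the source does not state how boundary values are handled.

```python
def pvsp_status(percent_appropriate):
    """Classify a prosody-voice domain score (percent of utterances
    coded appropriate). Boundary handling at exactly 80 and 90 is assumed."""
    if percent_appropriate >= 90.0:
        return "appropriate"
    if percent_appropriate >= 80.0:
        return "marginal"
    return "impaired"

def es_is_significant(ci_low, ci_high):
    """An effect size is treated as significant when its confidence
    interval does not cross 0."""
    return ci_low > 0 or ci_high < 0

# Group averages quoted in the Results text that follows
for score in (97.7, 87.3, 81.0, 70.9):
    print(score, pvsp_status(score))
```

Under these assumed boundaries, 97.7% is classified as appropriate, 87.3% and 81.0% as marginal, and 70.9% as impaired, which agrees with the descriptions given for those values in the text.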
As indicated in Figure 2, Panel A, significant analyses of variance results were obtained for the prosody variables of Rate and Stress and for the voice variables of Loudness, Pitch, and Laryngeal Quality. The findings in Panels B–D provide follow-up pairwise analyses to determine the source and effect sizes for all significant differences in the omnibus test. It is efficient for conceptual perspectives to review the findings by prosody-voice variable, rather than by comparison group.
As shown in Figure 2, Panel A (see both the numeric and graphic sections), an average of only 70.9% of the utterances in the continuous speech of the speakers with CAS had appropriate Rate. Inspection of subcodes not shown in Figure 2 indicated that the remaining approximately 29% were subcoded “Too Slow” for each speaker’s age (< 2 syllables/second, including pause time). In comparison, as shown best in Panel D, the ASD participants averaged 97.7% utterances that were appropriate in Rate for their age (2–4 syllables/sec). The significant ES for this difference was >1.0 (Very Large). As shown in Panel C, ASD participants had significantly more utterances with appropriate Rate than the group average for the participants with Speech Delay, although the latter average (87.3%) was within the marginal range for the PVSP. As shown in Panel B, appropriate Rate values for ASD and same-aged, typically-developing children were within 1 percentage point of one another.
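The Rate coding described above can be sketched as a simple threshold rule on syllables per second, including pause time. The cut points of 2 and 4 syllables/second are those quoted for the present participants' ages. The function names, the sample data, and the label for rates above 4 syllables/second are hypothetical, since the text names only the "Too Slow" subcode.

```python
def speaking_rate(n_syllables, duration_s):
    """Syllables per second over the whole utterance, pauses included."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return n_syllables / duration_s

def rate_code(rate, low=2.0, high=4.0):
    """Code an utterance's rate against age-appropriate bounds."""
    if rate < low:
        return "Too Slow"
    if rate <= high:
        return "Appropriate"
    return "Too Fast"  # assumed label; not named in the text

def percent_appropriate(utterances):
    """Percentage of (syllables, seconds) utterances coded Appropriate."""
    codes = [rate_code(speaking_rate(n, d)) for n, d in utterances]
    return 100.0 * codes.count("Appropriate") / len(codes)

# Hypothetical sample: (syllable count, duration in seconds)
sample = [(6, 2.0), (5, 3.0), (9, 3.0), (4, 2.5)]
print(percent_appropriate(sample))  # 50.0
```

In the perceptual protocol these codes are assigned by trained listeners, so the sketch shows only the numeric criterion behind the code, not the coding procedure itself.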
Similar to literature findings, the 67.9% average number of utterances with appropriate Stress of the CAS speakers was significantly lower than the Stress findings for the other groups (Panel A). The pairwise effect size indicated that whereas the averaged Stress value of 82.6% appropriate utterances for the participants with ASD was not significantly different from the average of the Typically-Developing speakers (Panel B) or the speakers with Speech Delay (Panel C), it was significantly higher than the speakers with CAS (ES= >.80; Large). Inspection of subcodes in the Stress analyses and of the continuous speech transcripts was completed to determine the locus of inappropriate Stress in ASD compared to CAS participants. It is efficient to defer comments on these findings to the Discussion.
Speakers with ASD had significantly fewer utterances with appropriate Loudness (81.0%), as indicated in Figure 2, Panel A, with this value at the lower boundary of marginally inappropriate. The subcodes data indicated that the source of this finding was that participants with ASD were coded as having excessive loudness (Too Loud) on most of these inappropriate utterances. On this voice variable the speakers with ASD were significantly different from the Typically-Developing speakers (Panel B: Moderate ES), and the speakers with Speech Delay (Panel C: Large ES) and CAS (Panel D: Large ES).
Speakers with ASD differed significantly from speakers in the other three groups on the omnibus comparison of Pitch (Panel A). The average Pitch value for the speakers with ASD was not significantly different from the average Pitch values for the Typically-Developing speakers (Panel B) or the speakers with Speech Delay (Panel C), but was significantly lower than that obtained for the participants with CAS (98.5%; Moderate ES). Inspection of the subcodes indicated that speakers with ASD had utterances coded as Too High or Variably High pitch, findings also obtained in the acoustic analyses.
For the voice domain of Laryngeal Quality, speakers with Speech Delay had a significantly lower percentage of utterances with appropriate laryngeal quality (61.9%) than the speakers with ASD (84.2%; Very Large ES). Inspection of the subcodes indicated that more of the speakers with Speech Delay had utterances coded as Rough.
Additional prosody-voice analyses addressed the possibility explored in the bottom panel of Figure 1 that the subgroup of 7 participants with ASD classified as SD might have prosody-voice findings more consistent with the values of the participants with CAS, or the less specific subtype of motor speech disorder, MSD-NOS. These subgroup analyses provided no statistical support or trends for CAS or MSD-NOS, with the prosody-voice profiles of the 7 ASD participants with SD wholly consistent with the averaged profile of the comparison speakers with SD.
The findings from this study do not support the hypothesis that CAS is a prevalent concomitant disorder in persons with verbal ASD. Moreover, they do not support the prevalence in the present sample of a less specific motor speech disorder termed Motor Speech Disorder-Not Otherwise Specified. With one exception, each of the pairwise statistical comparisons indicated significant dissimilarities in the speech, prosody, and voice characteristics of participants with ASD compared to the comparison group of participants with CAS in neurogenetic contexts. The one exception is the finding in Figure 1 that over 75% of the 46 speakers with ASD had Increased Repetitions and Revisions relative to same-age typically-developing speakers, a somewhat higher percentage on this fluency index than found for the comparison group of participants with CAS. Later discussion will address possible explanatory sources underlying this finding. Here, two sets of findings meeting criteria for a double dissociation are viewed as the primary counter-support for the CAS-ASD hypothesis.
The first dissociation indicated in the comparative analyses in Figures 1 and 2 is that participants with ASD did not have the significantly slow speech rate, lengthened vowels, and uncommon phoneme distortions that are core signs of motor speech disorders in adults (e.g., Duffy, 2005) and in contemporary research in CAS (ASHA, 2007b; Aziz, Shohdi, Osman, & Habib, 2010). With the exception of trends reported in Velleman et al. (2010), these spatiotemporal features of motor speech disorders have also not been reported in the speech studies of ASD of the past approximately five decades (Table 2).
The second dissociation is that participants with ASD had voice differences not reported in CAS. Unlike most of the pioneer studies in ASD, the present study used acoustics to quantify loudness (amplitude), pitch (frequency), duration (in milliseconds) and stress (intersyllabic ratios of amplitude, frequency, and duration). Using both perceptual and acoustic measures of these variables, the present study found that a statistically significant percentage of children with ASD had inappropriate loudness and inappropriate pitch (Figure 2). As shown in Table 2, there are 16 entries from studies reporting several types of inappropriate productive Stress (prosody) and 10 entries reporting inappropriate Loudness or Pitch (voice). Recently, for example, Grossman, Bemis, Skewerer, and Tager-Flusberg (2010) reported that participants with high functioning autism stressed the correct syllables in a sentence completion task, but both first-syllable stress and last-syllable stress were achieved with longer total word durations than obtained in typically-developing peer controls. Again, differences in an array of measures of Stress have been widely reported in both the adult and child literatures in autism and in apraxia of speech, but differences in vocal pitch and loudness increasingly reported in autism have not been reported in descriptions of apraxia of speech.
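As an illustration of what is meant by intersyllabic ratios of amplitude, frequency, and duration, the following sketch computes the three ratios for a pair of adjacent syllables. It is not the acoustic protocol used in the study: the `Syllable` type, its field names, the use of linear amplitude rather than decibels, and the example values are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class Syllable:
    amplitude: float    # linear amplitude in arbitrary units (assumed, not dB)
    f0_hz: float        # fundamental frequency in hertz
    duration_ms: float  # duration in milliseconds

def stress_ratios(first, second):
    """Ratios of the first syllable to the second on each acoustic
    correlate of stress. Values above 1 mean the first syllable is
    more prominent on that dimension."""
    return {
        "amplitude": first.amplitude / second.amplitude,
        "frequency": first.f0_hz / second.f0_hz,
        "duration": first.duration_ms / second.duration_ms,
    }

# Hypothetical two-syllable word with first-syllable stress
strong = Syllable(amplitude=0.8, f0_hz=300.0, duration_ms=240.0)
weak = Syllable(amplitude=0.4, f0_hz=250.0, duration_ms=160.0)
print(stress_ratios(strong, weak))
```

Keeping the three ratios separate, rather than combining them into one score, makes it possible to ask which correlate carries the prominence in a given utterance.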
A framework, termed the speech attunement framework, may offer a conceptually and clinically coherent explanatory perspective on the present findings. The construct of attunement (bringing into harmony) has been used in many disciplinary topics in child development as a descriptive-explanatory framework to model the attentional and intentional processes underlying stability and change (e.g., Drake, Jones, & Baruch, 2000; Legerstee, Markova, & Fisher, 2007; Szajnberg, Skrinjaric, & Moore, 1989). The speech attunement framework posits that the acquisition of articulate speech and appropriate prosody-voice requires a child to ‘tune in’ to the oral communications of the ambient community and to ‘tune up’ the phonological and phonetic behaviors subserving intelligible and socially appropriate speech, prosody, and voice production. Neurobehavioral correlates of the auditory-perceptual, imitative, and motor control processes associated with these two place-marker primitives —tune in and tune up—are studied in basic and applied research in the communicative and social deficits in ASD. A preliminary version of the speech attunement framework was proposed to investigate individual differences in the origin and persistence of Speech Errors (Shriberg, 1994; Shriberg, Kwiatkowski, Best, Hengst, & Terselic-Weber, 1986). A second adaptation of the attunement framework for speech disorders derived the construct of ‘focus’ for a two-factor treatment approach (Kwiatkowski & Shriberg, 1993, 1998). Closer to current concerns, the speech-attunement framework has been proposed as a descriptive-explanatory heuristic to account for and conduct research in the conceptually challenging array of speech, prosody, and voice findings in ASD (Shriberg, Paul, et al., 2001; Paul, Shriberg, et al., 2005; Paul et al., 2008).
The potential utility of the speech attunement framework for research in ASD rests on the validity of two assumptions. The first assumption, based on auditory perceptual findings from several investigator groups in ASD (e.g., Bonnel et al., 2010; Heaton, 2003; Järvinen-Pasley, Wallace, Ramus, Happe, & Heaton, 2008; Jones et al., 2009; Mottron, Dawson, Soulières, Hubert, & Burack, 2006; Samson, Mottron, Jemel, Belin, & Ciocca, 2006), is that some individuals with ASD have enhanced auditory perceptual ability underlying the propensity to ‘tune in’ to the acoustic features of speech. The second assumption is that due to challenges in communication intent and social reciprocity, persons with ASD do not experience the pragmatic press to ‘tune up’ the precision of their speech, prosody, and/or voice, both during development and on-line in discourse. The following discussions elaborate on how the primary findings of the present paper might be consistent with these two somewhat opposing tenets of an explanatory framework for speech and prosody-voice development and control in persons with ASD.
Present and prior findings indicate modest increased risk for Speech Delay (SD) in verbal ASD. Thus, for most persons with verbal ASD, the speech processes in typical acquisition and performance—encoding phonological representations, retrieving the representations, transcoding representations to gestures underlying speech production (i.e., planning/programming), and executing the neuromotor commands for speech—are largely intact (e.g., McCleery, Tully, Slevc, & Schreibman, 2006). As concluded over a decade ago by Kjelgaard and Tager-Flusberg (2001) and as shown by the estimates in Table 2, articulation seems to be “spared” in most children with ASD. Support for the importance of that perspective is evident in the present study’s findings indicating age-typical performance on a nonword repetition task, a task that is sensitive to genetically-inherited verbal trait disorders, but is not part of the ASD phenotype (Bishop et al., 2004). Many genetic and environmental risk factors other than those associated with a heritable verbal trait disorder could underlie the modest increase in SD in the present sample, and in the studies reviewed in Table 2.
What is notable for theory, although relatively negligible for handicap, is the high percentage of children 6 to 9 years of age with ASD who have Speech Errors (SE), with a high percentage of common residual errors in both SD and SE not resolved by 9 years (i.e., PSD: Cleland et al., 2010; Shriberg, Paul, et al., 2001). As noted previously, the speech attunement framework was originally developed to explain the origin and persistence of common speech errors (dentalized sibilants, derhotacized rhotics) in children with no other known neurodevelopmental disorders (Shriberg et al., 1986). The original formulation proposed that some children do not ‘tune in’ to speech sounds and/or appear to need more time to ‘tune up’ their distortion errors. Other children who are attuned to speech-language (i.e., are advanced in developmental milestones) may ‘tune in’ before they have the required neuromuscular ability to make difficult sounds, resulting in a distortion that, through habit strength, is difficult to correct despite significant attempts to ‘tune up.’
In the present context, and for similar findings reported in Cleland et al. (2010) and Shriberg, Paul, et al. (2001), the two tenets of speech attunement framework might provide a useful explanatory perspective on the finding of increased rates of SE in autism. Several causal pathways may be posited. For the origin of SE, tuning in to speech early due to enhanced auditory perceptual propensity and capability might lead to speech production distortions, especially if there are some motor constraints that widen the gap between intention and capability (i.e., motor skill). Alternatively, affective social reciprocity constraints might delay tuning in to the nuances in auditory events that must be imitated precisely and/or successfully tuned up to avoid PSD.
Recall the observation that nearly all the prosody and voice findings in the present study are not consistent with CAS or MSD-NOS. They are consistent, however, with the speech attunement framework for speech acquisition and performance in children and adults with verbal ASD. The following sections review findings from four of the seven domains of prosody-voice from the attunement perspective. For the remaining three domains (Rate, Laryngeal Quality, and Resonance), speakers with ASD had similar percentages of appropriate utterances as found in the TD and SD participants.
The finding of Increased Repetition and Revisions in ASD participants is consistent with some descriptions of ASD speech as disfluent. In the present context, repetitions and especially revisions can be interpreted as support for on-line monitoring, with high repetition-revision frequencies evidence of being closely attuned to speech output, i.e., tuned in to one’s own speech, but variably successful in attempting to tune up as needed. It would require instrumental analysis of each repetition and revision to attempt to differentiate signs of motor difficulty with the original production relative to second and additional repetitions and revisions.
As shown in Table 2, a number of different inappropriate productive Stress patterns have been reported in the ASD literature. In contrast, in both adult and child literatures in apraxia of speech, the type of inappropriate Stress typically reported is termed excessive-equal stress (Duffy, 2005; Shriberg et al., 2003). Subcode analysis for the samples indicated that the more frequent code for the inappropriate Stress percentages shown in Figure 2 was Misplaced Stress. As well, subcodes for utterances coded as inappropriate Loudness (Too Loud) and inappropriate Pitch (Too High) occurred on syllables, words, and phrases not typically stressed for emphasis. Thus, rather than the excessive/equal stress documented as a core feature of congenital and acquired apraxia of speech, the prominence due to misplaced stress likely underlies the percepts of ‘odd,’ ‘singsong,’ and the other adjectives in the earlier excerpt from Peppé et al. (2007) and in Table 2 that may be taken to be signs of motor impairment.
The speech-attunement explanatory perspective also differs substantially from explanatory perspectives such as proposed in Sharda et al. (2010). Sharda and colleagues propose that pitch and other prosodic differences in ASD, such as indicated in the examples above, might reflect a productive version of ‘motherese.’ In the speech-attunement framework, such misplaced, unconstrained vocal behaviors are viewed as lapses in on-line attunement to conversational partners in discourse. Studies in progress are examining these syllable stress ratios, with the possibility that the typical weighting of loudness, pitch, and duration is unstable relative to same-age speakers with typical development, speakers with SD, and speakers with CAS.
The principal observation reported here is that the Stress findings termed Misplaced Stress in the current sample of children with ASD are dissimilar to the well-documented excessive-equal stress patterns in adult and child forms of apraxia of speech. As well, the significant Pitch and Loudness findings for participants with ASD in the present study and likely those in the ASD literature were different in form and origin from prosody and voice signs in motor speech disorders.
In contrast to support for speech-motor deficits in ASD, the present findings are interpreted as consistent with a speech attunement framework. The framework posits that persons with ASD have normal to enhanced auditory-perceptual and auditory-monitoring skills (they tune in), but have affective, social reciprocity challenges that mediate the acquisition, performance, and monitoring of appropriate speech, prosody, and voice in discourse (they variably tune up). These place-holder terms are, of course, proxies for the neurocognitive and neurophysiological correlates of tuning in and tuning up studied in basic research in ASD. The modest increase in the prevalence of speech delay in children with verbal ASD could reflect a number of genetic and/or environmental risk factors that delay the acquisition of articulate speech.
Generalizations from the present study are limited by several methodological constraints. Future research should include cell sizes with adequate power to assess the influence of a number of possible moderators and mediators of prevalence and type of speech impairment, including age, cognitive levels, language impairment, and intelligibility status. A significant methodological constraint was the limitation of findings to those available from a continuous speech sample. Acoustic studies using controlled imitations of vowel stimuli at increasing rates and other challenging speech tasks (e.g., Shriberg, Potter, & Strand, in press) are needed to assess the reliability and extensiveness of these findings and their implications for motor speech impairment in ASD. Another constraint on generalization from the present study was the minimum inclusionary criterion of 70% intelligibility, with the possibility that findings might be different for speakers with lower levels of intelligibility. Future research should also use protocols that classify, in the same study sample, participants’ speech status and status on the many other oral motor and fine and gross motor domains, towards an integrated account of motor systems in persons with verbal ASD.
Last, the present study did not directly address the strong CAS-ASD hypothesis—that comorbid CAS may be a sufficient cause of nonverbal ASD. Research designs assessing speech, prosody, and voice in nonverbal ASD are challenging. A number of preliminary and emerging assessment methods requiring only minimal early vocal behaviors may be informative (e.g., Davis & Velleman, 2008; Nagamani et al., 2009; Peppé, McCann, Gibbon, O’Hare, & Rutherford, 2006; Russo et al., 2009; Sheinkopf, Mundy, Oller, & Steffens, 2000; Strand, McCauley, Weigand, Stoeckel, & Baas, 2010; van Santen, Prud’hommeaux, & Black, 2009; van Santen, Prud’hommeaux, Black, & Mitchell, 2010) and many promising assessment approaches not requiring any speech production are emerging for the significant needs of this clinical population (NIDCD, 2010).
We thank the following people in Madison for their assistance with this research: Jessica Hersh, Heather Karlsson, Heather Lohmeier, Jane McSweeny, Rebecca Rutkowski, Christie Tilkens, and David Wilson. In addition, we thank the following people in New Haven: Maysa Akbar, Kathleen Koenig, Moira Lewis, and Allison Lee for their data collection and clinical characterization, and Fred Volkmar, Director, Yale Child Study Center.
This research was supported by National Institute on Deafness and Other Communication Disorders Grants DC007129 and DC000496 and by a core grant to the Waisman Center from the National Institute of Child Health and Development (Grant HD03352).
The following authors contributed to this work: Lawrence D. Shriberg, Waisman Center, University of Wisconsin-Madison, Madison, WI; Rhea Paul, Child Study Center, Yale University School of Medicine, New Haven, CT; Lois M. Black, Center for Spoken Language Understanding, Oregon Health & Science University, Beaverton, OR; and Jan van Santen, Center for Spoken Language Understanding, Oregon Health & Science University, Beaverton, OR.