A series of three experiments examined children’s sensitivity to probabilistic phonotactic structure as reflected in the relative frequencies with which speech sounds occur and co-occur in American English. Children, ages 2½ and 3½ years, participated in a nonword repetition task that examined their sensitivity to the frequency of individual phonetic segments and to the frequency of combinations of segments. After partialling out ease of articulation and lexical variables, both groups of children repeated higher phonotactic frequency nonwords more accurately than they did low phonotactic frequency nonwords, suggesting sensitivity to phoneme frequency. In addition, sensitivity to individual phonetic segments increased with age. Finally, older children, but not younger children, were sensitive to the frequency of larger (diphone) units. These results suggest not only that young children are sensitive to fine-grained acoustic–phonetic information in the developing lexicon but also that sensitivity to all aspects of the sound structure increases over development. Implications for the acoustic nature of both developing and mature lexical representations are discussed.
Acquisition of a natural language requires the extraction of a variety of types of distributional information from the ambient linguistic environment. From this distributional information, infants and young children must induce sets of rules that characterize the underlying structure of the native language, thereby allowing the generation of novel utterances. Some of these rules are syntactic, operating on the order and morphological inflections of words, whereas other rules are phonological, operating on the order and patterning of sublexical sounds. Mastery of the phonological rules of the native language presumably facilitates the acquisition of the lexicon by constraining the sound patterns that are entertained as candidate words. The goal of the current series of studies was to determine whether young children are sensitive to the probability with which phonemes occur and to the constraints under which pairs of phonemes occur in succession within words.
Acquisition of the phonology of the native language begins well before infants utter their first words at approximately 12 months of age. By six months of age, infants are able to discriminate virtually any phonetic contrast that they will hear in any of the world’s languages (Eimas, Siqueland, Jusczyk, & Vigorito, 1971; Trehub, 1976; Werker & Tees, 1984). By this age, infants become less likely to respond differentially to phonetically related, functionally equivalent tokens in spite of random variations in pitch, speaker, speaking rate, or surrounding phonetic context (Hillenbrand, 1983, 1984; Holmberg, Morgan, & Kuhl, 1977; Kuhl, 1983). At approximately 7.5 months of age, they have begun to learn the stress patterns of words in their input language (Jusczyk, Cutler, & Redanz, 1993; Jusczyk, Houston, & Newsome, 1999). By 10 months of age, they have become sensitive to how the individual speech sounds are combined to form the words of their input language (Friederici & Wessels, 1993; Jusczyk, Friederici, Wessels, Svenkerud, & Jusczyk, 1993), the relative frequencies with which the various speech sounds occur and co-occur (Jusczyk, Luce, & Charles-Luce, 1994), how the speech sounds vary depending on their position within words (Jusczyk, Hohne, & Bauman, 1999), and how those differences signal word versus syllable boundaries (Mattys & Jusczyk, 2001; Mattys, Jusczyk, Luce, & Morgan, 1999). Finally, Chambers, Onishi, and Fisher (2003) showed that 16-month-olds can acquire sensitivity to changes in phonotactic constraints after brief exposures.
Infants’ sensitivity to the sound system of their input language contrasts with the fact that older infants and young children who have begun to speak often make perceptual errors in judging potential words in their native language. Shvachkin (1948/1973), Garnica (1973), Eilers and Oller (1976), and Stager and Werker (1997) all reported that children failed to discriminate minimal phonetic contrasts in meaningful contexts even though they showed sensitivity to these same minimal phonetic contrasts at an earlier point in development. Because young children do not exhibit the same sensitivity to phonetic detail that adults do, researchers have suggested that children store words holistically, perhaps in terms of prosodic structure and/or overall acoustic shape or perhaps in terms of some coarsely defined phonetic features (e.g., Ferguson & Farwell, 1975; Treiman & Baron, 1981). Some researchers have posited that children’s lexical representations are stored holistically only until the onset of the vocabulary spurt at approximately 19 months of age (Ferguson, 1986; Ferguson & Farwell, 1975; Locke, 1988; Menyuk & Menn, 1979; Studdert-Kennedy, 1986). Others have argued that holistic lexical representations continue throughout early childhood until the lexicon is restructured as either a precursor to or a consequence of learning to read (e.g., Fowler, 1991; Jusczyk, 1986; Metsala & Walley, 1998; Treiman & Baron, 1981; Walley, 1993).
Support for this claim comes from a body of research that has examined children’s speech perception abilities, speech production abilities, and patterns of lexical organization. Treiman and colleagues (Treiman & Baron, 1981; Treiman & Breaux, 1982) found that children age 4 years 4 months grouped spoken words based on overall word shape rather than on shared phonetic segments. Similarly, Nittrouer and colleagues (Nittrouer & Studdert-Kennedy, 1987; Nittrouer, Studdert-Kennedy, & McGowan, 1989) found that younger children’s speech perception and production were more influenced by the overall acoustic shape of fricative–vowel syllables than by the individual segments that make up the syllables. Finally, Charles-Luce and Luce (1990, 1995) and Logan (1992) reported that children’s lexicons contain fewer similar sounding words than do adult lexicons. These sparser phonological neighborhoods allow children to use global holistic perceptual strategies because children do not need to attend to fine-grained phonetic detail. Findings from these studies have converged to show that children are not sensitive to the fine-grained phonetic detail that presumably characterizes adult lexical entries. Therefore, researchers have suggested that words in the developing lexicon are holistically stored until the early school years (e.g., Walley, 1993).
There are problems with the suggestion that the acoustic forms of children’s lexical representations are holistic for a protracted period and, consequently, are different from adult lexical representations. First, without sufficient fine-grained acoustic–phonetic detail, we would expect children to show deficiencies in word recognition and to make a large number of articulatory errors. Although there is evidence that children’s earliest words may be stored in terms of holistic properties (e.g., Ferguson, 1978; Reich, 1986), it seems unlikely that lexical entries after the vocabulary spurt would be so underspecified as to contain no information about the temporal order of the individual units. Second, at an age when children fail to differentiate minimal phonetic pairs in newly taught words (Eilers & Oller, 1976; Garnica, 1973; Shvachkin, 1948/1973; Stager & Werker, 1997), they can successfully differentiate minimal phonetic pairs if both members of the pairs are spontaneously produced in their everyday speech (Barton, 1976, 1978). As described earlier, by that age (10–26 months) children have achieved perceptual constancy (Hillenbrand, 1983, 1984; Holmberg et al., 1977; Kuhl, 1983). Evidence that they do not respond to changes in phoneme identity is not necessarily an indication that they do not notice these differences. Rather, it could be that failure to differentiate minimal phonetic pairs reflects young children’s difficulty in forming novel acoustic representations along with a willingness to accept a high degree of acoustic variation so long as it does not signal a change in meaning in the children’s lexical system.
An alternative to the protracted holistic lexical organization hypothesis is that children’s lexical entries do indeed contain fine-grained phonetic detail. However, children have poorer memory, attention, and processing abilities than do adults, and all of these limitations can degrade performance on speech perception tasks independent of the quality of their lexical representations. Children’s difficulty with word recognition tasks can thus be attributed to immature general cognitive abilities rather than to qualitatively different lexical representations. Evidence is emerging from studies of children’s speech perception abilities, their speech production abilities, and their patterns of lexical development showing that children are indeed sensitive to the fine-grained acoustic–phonetic detail that is characteristic of the adult lexicon.
Gerken, Murphy, and Aslin (1995) found that 3- and 4-year-olds confused a real word target with a nonword that differed by two features in a single position more often than they did with a nonword that differed by two features in different word positions. If children were perceiving holistically, these two conditions should have been the same. Swingley and Aslin (2000, 2002) conducted eye-tracking experiments and found that infants as young as 14 months are sensitive to subtle mispronunciations (e.g., opple for apple) that have been cited as evidence of children’s holistic perceptual strategies. Jaeger (1992) examined children’s speech production and found evidence of anticipation errors in children as young as 17 months (e.g., dig dog for big dog). These types of errors suggest that children are representing the consonant and vowel separately, rather than as a larger diphone unit, even though their articulation of these consonant–vowel sequences may overlap more than that of adults (Nittrouer et al., 1989). Dollaghan (1994) examined the structural properties of the expressive lexicons of young children (ages 10 months to 1 year 9 months) and found that 84% of the words in these children’s expressive lexicons had at least one near phonological neighbor. She concluded that children must have considerable acoustic–phonetic skill to be able to differentiate words in their lexicons. Coady and Aslin (2003) found that phonological neighborhoods are actually denser in the developing lexicon than in the mature lexicon, relative to vocabulary size. They concluded that children are not maintaining maximal distinctiveness among their lexical entries but rather are building their lexicons with a preference for words that contain more frequent phonotactic patterns. 
By using more careful measurements and more sensitive experimental paradigms, these studies provided evidence that very young children are sensitive to segmental lexical information in referential speech perception tasks, although representations at the earliest stages of lexical acquisition are not fully adult-like.
The current set of studies examined the degree to which children are sensitive to the fine-grained acoustic–phonetic properties of their input language by having them repeat nonsense words varying in phonotactic frequency. Although nonwords were originally used to minimize potential lexical frequency or familiarity effects on repetition accuracy, researchers have consistently found that children more accurately repeat those nonwords that reflect the properties of their lexicons. Gathercole and Baddeley (1989) found that children repeat nonwords containing singleton consonants (e.g., woogalamik) more accurately than they do those containing consonant clusters (e.g., blonderstaping). In a subsequent study, Gathercole and colleagues found that children more accurately repeated those nonwords that adults had given higher wordlikeness ratings (Gathercole, Willis, Emslie, & Baddeley, 1991; see also Gathercole, 1995). Dollaghan, Biber, and Campbell (1993, 1995) showed that older children more accurately repeated nonwords in which the stressed syllable corresponded to an actual English word (e.g., BATHesis) as compared with nonwords that did not contain embedded real words (e.g., FATHesis). Finally, Beckman and Edwards (1999) and Munson (2001) found that children more accurately repeated consonant sequences spanning syllable boundaries within nonwords that are actually attested in their lexicons (e.g., /ft/ in after) than they did those that are not (e.g., /fk/), even though adults gave these nonwords similar wordlikeness ratings. As Bowey (2001) stated, “Any manipulation that increases phonological complexity decreases nonword repetition performance” (p. 443).
These factors—the presence of singleton consonants versus consonant clusters, subjective wordlikeness ratings, the presence of embedded real words, and the presence of attested phoneme sequences—all can be described in terms of phonotactics. Phonotactics refers to the rules governing the arrangement of allowable speech sounds within a given language and can be further divided into sequential phonotactics, or absolute constraints on how the sounds of the language can be combined to form the syllables and words of the language, and probabilistic phonotactics, or the relative frequencies with which the sounds occur and co-occur in the syllables and words of the language. With regard to these various lexical factors, Gathercole et al. (1991) found a negative correlation between rated wordlikeness and the presence of consonant clusters. Later, Frisch, Large, and Pisoni (2000) reported that adults rated nonwords with higher frequency phonotactic patterns as more wordlike than those with less frequent phonotactic patterns (see also Bailey & Hahn, 2001). Also, embedded real words have higher phonotactic probability in that those particular sequences of phonemes actually co-occur at rates proportional to word frequency. Finally, the presence of attested consonant sequences is a direct measure of sequential phonotactics. To the extent that repetition accuracy depends on the degree of overlap between a nonword and existing lexical entries, and to the extent that other factors (e.g., motor planning, ease of articulation) can be controlled, the nonword repetition task can be used to examine children’s sensitivity to phonotactic structure within the lexicon.
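The notion of probabilistic phonotactics defined above can be made concrete with a small sketch: a nonword's phonotactic score combines how often its individual segments occur with how often its adjacent segment pairs (diphones) co-occur. Summing log frequencies is one common convention for combining the two; it is illustrative only and not necessarily the metric used in these experiments.

```python
import math

def phonotactic_score(nonword, seg_freq, diphone_freq):
    """Combine segment and diphone relative frequencies into a single
    phonotactic-probability score for a nonword (a list of phoneme
    symbols). Higher (less negative) scores mean more probable
    sound patterns. Illustrative convention, not the authors' metric."""
    seg_score = sum(math.log(seg_freq[s]) for s in nonword)
    diphone_score = sum(math.log(diphone_freq[(a, b)])
                        for a, b in zip(nonword, nonword[1:]))
    return seg_score + diphone_score

# Toy frequencies for the nonword /bi/:
score = phonotactic_score(['b', 'i'],
                          {'b': 0.5, 'i': 0.5},
                          {('b', 'i'): 0.25})
```

Because logs are summed, the score equals the log of the product of the component frequencies, so high-frequency segments and high-frequency diphones both raise a nonword's score.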
Recent work on probabilistic phonotactics indicates that infants and adults are sensitive to this source of information. As reported previously, Jusczyk et al. (1994) found that 9-month-olds preferred to listen to lists of nonsense words with more frequent sounds and sound combinations (as compared with nonwords with lower frequency phonotactic patterns), whereas 6-month-olds showed no such preference. Vitevitch and colleagues (Vitevitch & Luce, 1998, 1999; Vitevitch, Luce, Charles-Luce, & Kemmerer, 1997; Vitevitch, Luce, Pisoni, & Auer, 1999) used these same nonsense words to show that adults respond more quickly to the nonsense words with higher frequency phonotactic patterns. However, what remains unclear is whether sensitivity to phonotactic information during the early stages of lexical development is affected by the words in the lexicon or whether it is determined solely by the first-order distributional information contained in the native language input independent of its sequential “packaging” in words.
The current investigation of children’s sensitivity to probabilistic phonotactic structure was motivated by three different factors. First, because infants become sensitive to probabilistic phonotactics during their first year (Jusczyk et al., 1994), it is important to determine whether this sensitivity is also reflected in their early speech productions. Is it the case that young children’s early productions are influenced more by what children can say than by the phonotactic structure of the lexicon? Second, because probabilistic phonotactic frequency and neighborhood density are highly correlated (words in denser neighborhoods contain more frequent sound patterns), examining children’s sensitivity to phonotactic frequency allows for an alternative method of assessing the structure of their expressive lexicons without being confounded by their lexicon’s smaller size. Third, previous studies of phonotactic sensitivity have simultaneously varied phoneme frequency, phoneme frequency by syllable position, and diphone frequency. Phonotactic sensitivity may have resulted from any or all of these factors. Because phonotactic frequency includes both the frequency of occurrence of individual segments and the frequency of co-occurrence of segments, children’s sensitivity to speech units of different sizes can be examined. If children store words holistically, in terms of some unit larger than the individual phoneme, they should show sensitivity to the frequency of larger speech units, such as the diphone, but not to the frequency of individual segments. However, if children are sensitive to individual segments, they should be sensitive to phoneme frequency when diphone frequency is controlled.
To this end, the current series of three experiments included children ages 2½ and 3½ years. The younger age was chosen because it falls approximately 1 year after the naming explosion. Children at this age have undergone roughly a year of rapid lexical development. During this period, they have established a sizable lexicon over which to draw phonetic and phonotactic regularities. Older children were also included to examine developmental trends in repetition accuracy. These older children have experienced an additional year of lexical growth and, therefore, have a larger corpus over which to draw generalizations. Furthermore, this is the age at which children begin to consistently use consonant clusters (Templin, 1953), a relevant milestone in the acquisition of phonotactic structure. Importantly, both ages are well below the onset of literacy, at which point holistic lexical representations are said to become segmental (e.g., Walley, 1993). Experiment 1 was designed to establish the utility of the nonword repetition task as a tool for assessing children’s sensitivity to overall phonotactic frequency. Experiments 2 and 3 were designed to assess children’s more fine-grained sensitivity to phoneme frequency and their sensitivity to phoneme pairs with phoneme frequency held constant.
This experiment was designed to maximize the differences between the high- and low-phonotactic frequency nonwords. The high-phonotactic frequency condition contained a set of nonwords constructed from sounds that were both very frequent overall and frequent in all syllable positions in a sample of English. The low-phonotactic frequency condition contained a set of nonwords constructed from the least frequent sounds that were also infrequent in all syllable positions in a sample of English. The working hypothesis was that children would respond more accurately to nonwords with higher frequency phonemes than to those with lower frequency phonemes; that is, that memory for the distribution of phonotactic patterns would facilitate more accurate repetition of nonwords with frequent phonotactic patterns as compared with nonwords with less frequent sound patterns. Furthermore, these phonotactic frequency effects were hypothesized to account for children’s accuracy even after potentially confounding variables, such as ease of articulation and the lexical status of embedded syllables, were statistically controlled.
The participants were 12 3½-year-old children and 12 2½-year-old children. The older children ranged in age from 3 years 5 months 24 days to 3 years 8 months 13 days, with a mean age of 3 years 7 months 16 days. Of these 12 older children, 8 were girls and 4 were boys. In addition, 1 other girl (age 3 years 6 months 15 days) participated but was not included in the final data set because she refused to repeat the nonsense words. The younger children ranged in age from 2 years 4 months 19 days to 2 years 7 months 6 days, with a mean age of 2 years 5 months 20 days. Half of these younger children were girls and half were boys. In addition, 7 other children (5 girls and 2 boys) participated but were not included in the final data set. These children, whose mean age was 2 years 5 months 24 days, were excluded because they refused to repeat the nonsense words (5 children) or had poor articulatory abilities (1 child) or due to experimenter error (1 child). None of the children had a history of hearing problems, and all came from homes in which English was the only language.
An estimate of the phonotactic structure of English relevant to young children was obtained from the Brown corpus in the CHILDES database (Brown, 1973; MacWhinney, 1991). This corpus consisted of Adam and Sarah’s interactions with their mothers and included all sessions until each child had reached the age of 3 years 6 months. This age was chosen because it is roughly the age at which children begin to consistently use consonant clusters (Templin, 1953). Eve’s sessions were excluded from the analysis because her family moved away when she was 2 years 3 months, leaving an incomplete data set for current purposes. All words in each corpus were transcribed as described in Coady and Aslin (2003). Relative phoneme frequencies and diphone frequencies for Adam and Sarah, as well as for their mothers, were calculated separately. Syllable breaks were marked by hand based on Kahn’s (1976) formula for the syllabification of English and, along with word boundaries, were marked as individual segments. Then, for every occurrence of a given phoneme, the phonemes that followed it were tallied. From these raw counts, forward and backward transitional probabilities were calculated. As an example, consider the word bee. The probability of /i/ given the preceding /b/ is the forward transitional probability, whereas the probability of /b/ given the following /i/ is the backward transitional probability.
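The counting procedure described above (tally each phoneme's successors, then convert raw counts to forward and backward transitional probabilities) can be sketched as follows. This is a minimal illustration with invented names, treating only word boundaries as extra segments (the syllable-break marking is omitted for brevity); the authors' actual scripts are not part of the article.

```python
from collections import Counter

def transitional_probabilities(words):
    """Tally phoneme and diphone counts over a phonemically
    transcribed corpus, treating word boundaries ('#') as segments,
    then convert counts to transitional probabilities."""
    phone_counts = Counter()
    diphone_counts = Counter()
    for word in words:
        segs = ['#'] + list(word) + ['#']
        phone_counts.update(segs)
        diphone_counts.update(zip(segs, segs[1:]))
    # Forward: P(b | preceding a); backward: P(a | following b)
    forward = {pair: n / phone_counts[pair[0]]
               for pair, n in diphone_counts.items()}
    backward = {pair: n / phone_counts[pair[1]]
                for pair, n in diphone_counts.items()}
    return forward, backward

# The word "bee" /bi/ in a toy corpus that also contains /ba/:
fwd, bwd = transitional_probabilities([['b', 'i'], ['b', 'a']])
```

Here `fwd[('b', 'i')]` is the probability of /i/ given a preceding /b/ (0.5 in this toy corpus, since /b/ is followed by /i/ in one of its two occurrences), and `bwd[('b', 'i')]` is the probability of /b/ given a following /i/ (1.0, since /i/ is always preceded by /b/).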
Based on these frequency counts, two lists of nonwords containing eight disyllabic and eight trisyllabic nonwords that differed in phoneme frequency alone were created. The phonemes chosen for the high-phoneme frequency nonwords were the most common phonemes with the highest frequency in their relative syllable positions. The opposite was true for the low-phoneme frequency nonwords. Voiced fricatives were excluded because children acquire them late. Nasals, laterals, and rhotics were avoided in syllable final position because they tend to color the vowel. The nonwords had the basic structure (CV)·CV·CVC, with stress always being placed on the penultimate syllable. Only tense vowels were used so as to preclude ambisyllabicity. The stimuli are listed in Appendix A. Although combining more frequent phonemes results in higher probability nonwords relative to combining lower frequency phonemes, there were no differences between the two groups of nonwords in terms of their diphone probabilities, F(1, 28) = 1.683, p > .10. For a description of how the stimuli for all three conditions were constructed, see Table 1.
A female native English speaker from the western New York area recorded the stimuli directly onto a Macintosh computer in a soundproof room. The nonwords were then transferred to a Sony minidisk. A speaker of the local dialect was used so that participants would not misunderstand the nonwords due to dialect differences. The nonwords were spoken at a normal speaking rate with a slightly exaggerated pitch, typical of child-directed speech. The low-phoneme frequency nonwords were of longer duration (high frequency: 834 ms; low frequency: 976 ms), F(1, 28) = 25.979, p < .05, possibly because the speaker hyperarticulated them to enhance children’s understanding. Because children were hypothesized to be less accurate when repeating longer items (e.g., Baddeley, 2003), the high- and low-frequency nonwords were normalized to be roughly equal in duration. However, pilot testing revealed that adults found the normalized waveforms to differ drastically in speaking rate. Therefore, the stimuli were left as is, with the duration difference present.
Children were brought to the laboratory by their parents and were tested individually, with parents sitting quietly in the room with their children and the experimenters. After a 10- to 15-min play session that allowed the children to become comfortable with the surroundings and the experimenters, the children were asked to play the sticker game. They then heard a series of four English words that they were asked to repeat for the benefit of either a kitty or a dragon puppet. These served as a warm-up to familiarize the children with the task. They were then told that they would be saying the silly kitty or dragon words. The nonwords were then presented in random order, and the children were asked to repeat them. If the children did not respond after coaxing, the nonwords were played again. If the children still did not respond, the experimenter said the word until they responded or refused a second time. The children received a sticker after every response attempt. The sessions were recorded and scored for accuracy.
Children’s responses were transcribed from the recording of the experimental sessions. Two independent scorers each did a first-pass transcription, and their results were compared. A third listener mediated all disagreements. Each phoneme produced was scored relative to the target phoneme. Certain deviations from the target were not considered as errors. When the children reduced the vowel in the first syllable of a three-syllable nonword, this was counted as accurate. Furthermore, any sound substitutions that a particular child consistently made were counted as correct. Common examples of this included children who did not differentiate /s/ from /ʃ/ and children who consistently replaced /r/ or /l/ with /w/ or /j/. In those cases where children did not respond after a single presentation of the target nonword and the experimenter either replayed or actually spoke the nonword, responses were counted as incorrect. As outlined in Dollaghan and Campbell (1998), only phoneme deletions and substitutions were considered as errors. Phoneme additions were not considered as errors because they do not represent the loss of information. The percentage of phonemes correctly repeated from target nonwords was calculated for each nonword. These percentages were then arcsine transformed and entered into stepwise multiple regression analyses.
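The arcsine transformation applied to the percentage-correct scores is a standard variance-stabilizing step for proportion data before regression. The article does not specify which variant was used; the sketch below shows the common 2·arcsin(√p) form as an illustration.

```python
import math

def arcsine_transform(p):
    """Arcsine square-root transform for a proportion p in [0, 1].
    Stabilizes the variance of proportion data near 0 and 1 before
    entering it into a linear model. Standard 2*arcsin(sqrt(p)) form;
    the article's exact variant is not specified."""
    return 2.0 * math.asin(math.sqrt(p))

# A nonword with 75% of its phonemes repeated correctly:
y = arcsine_transform(0.75)
```

The transform maps 0 to 0 and 1 to π, stretching the ends of the scale where raw percentages are compressed.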
Multiple regression analysis was used so that effects of potentially confounding variables, such as ease of articulation and lexicality of stressed syllables, could be statistically removed from the analysis. First, higher frequency sounds might be easier to articulate, or there might be innate preferences for these classes of sounds. Based on our own corpus analyses and those of Delattre (1965), higher frequency consonants are typically stop consonants, whereas lower frequency consonants are typically fricatives. The more frequent sounds in English typically show up as more frequent sounds in many of the world’s languages (Delattre, 1965). As Templin (1953) reported, fricatives are typically acquired later, suggesting that they are more difficult to articulate. Ladefoged and Maddieson (1996) pointed out that articulating a stop consonant requires a ballistic movement rather than the precise placement of the articulators required when articulating a fricative. Thus, there is evidence that more frequently occurring consonants might be easier to produce.
Unfortunately, there is no direct measure of ease of articulation that can be partialled out of the analysis. One possible way of measuring this is to examine the sounds that infants produce in their babbling. Locke (1980) argued that all infants, regardless of their linguistic environment, produce sounds from a universal babbling repertoire (cf. Goad & Ingram, 1987). He noted a positive correlation between how frequently individual sounds occur in infants’ babbling and the number of languages in which the individual sounds are attested in babbling. These two measures—relative frequency in babbling and the number of languages in which the particular sounds are attested—should provide a quantifiable measure of ease of articulation or of articulatory biases (see subsequent discussion).
The second factor that may affect participants’ repetition accuracy is the lexical status of the stressed syllables in the nonwords. As discussed previously, older children (ages 9 years 10 months to 12 years 0 months) more accurately repeated nonwords in which the stressed syllable corresponded to a real English word (Dollaghan et al., 1993, 1995). The nonsense words for the current experiment were created to maximize the phonotactic frequency differences between the two groups of nonwords. Because the higher frequency CV patterns correspond to actual English words, they were included in the nonsense words. Thus, 14 of 16 high-phonotactic frequency nonwords contained embedded English words (which appeared in the children’s corpora), whereas only 5 of 16 low-phonotactic frequency nonwords contained embedded English words. Embedded words are a common problem in spoken word recognition. McQueen, Cutler, Briscoe, and Norris (1995) reported that 84% of all English multisyllabic words contain shorter embedded words. Embedded words are quite salient in longer printed words to both reading-disabled children and normal readers (Zivian & Samuels, 1986), and embedded words in the first syllable of disyllabic words facilitate the recognition of spoken words by adults (Luce & Lyons, 1999). However, any potential facilitating effects that the presence of embedded words might have on the nonword repetition accuracy of younger children have not been explored. Although it was not the focus of the current investigation, lexicality of stressed syllables must be considered as a potential (and unavoidable) confounding variable in the current series of experiments (although we dealt with this potential confound using regression techniques).
The nonwords in the current experiment were created to exploit the naturally occurring phoneme and diphone frequency differences in English so as to examine speakers’ sensitivity to the phonotactic patterns of English. However, phonotactic frequency correlated with a cluster of related variables, including ease of articulation, nonword duration, and the lexical status of the embedded syllables. That is, those sounds that are easier to articulate are spoken more fluently, are used more frequently, and are more likely to appear in the words of the language. Attempts to control for these potentially confounding variables resulted in an insufficient set of speech sounds from which to create the nonsense words. Therefore, we decided to control for these potentially confounding variables by quantifying them and partialling out their respective effects. To partial out any potential articulatory biases, the relative frequencies of the sounds used in the nonsense words were taken from Locke’s (1980) analysis of cross-linguistic babbling repertoires. Similarly, the number of languages in which the individual sounds were attested was also taken from Locke’s analysis. Each of these variables was then separately partialled out of the results. First, the average relative frequency in babbling of the consonants in the nonwords (henceforth babbling frequency) and the average number of languages in which the sounds occurred in babbling cross-linguistically (henceforth number of languages) were determined for each nonsense word. Second, a variable (henceforth lexicality) was created whereby each of the nonwords was marked as to whether its stressed syllable appeared as a real word in any of the children’s corpora (Brown, 1973; MacWhinney, 1991), and this lexical variable was also partialled out of the accuracy results.
That is, these confounding variables were controlled by way of post hoc analyses, with the remaining effects directly attributable to the phonotactic frequency manipulation.
Raw accuracy scores for both groups of children are shown in Fig. 1. Transformed accuracy scores were entered into a stepwise multiple regression analysis, with phoneme frequency, nonword length, age, and all interactions as the relevant variables. To account for within-subjects variance, 11 dummy subject variables were created per group such that each participant’s results were represented by one variable; the 12th participant’s results were indicated by zeroes in all dummy variables. These subject variables were entered in the first step and accounted for a significant portion of the variance, R² = .2414, F(22, 336) = 10.60, p < .0001. In the next step, potentially confounding variables babbling frequency and lexicality were forced into the analysis. Babbling frequency accounted for a significant portion of the variance, ΔR² = .0340, F(1, 11) = 34.33, p < .0001, whereas lexicality did not, ΔR² = .0025, F(1, 11) = 2.55, p > .10. The experimental variables phonotactic frequency, nonword length, age, and all interaction terms were entered in the third step. Of these, nonword length accounted for the most variance and entered the analysis first, ΔR² = .0603, F(1, 11) = 66.50, p < .0001. Phonotactic frequency entered next, also accounting for a significant portion of the variance, ΔR² = .0195, F(1, 11) = 22.20, p < .001. Age entered the analysis next and accounted for a significant portion of the variance, ΔR² = .0073, F(1, 11) = 8.33, p < .05. None of the interaction terms was significant; consequently, they did not enter the analysis. Although the age interactions were not significant, we examined the two age groups separately so that any potential developmental differences in sensitivity to the phonological materials might be revealed.
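The hierarchical entry scheme described above (participant dummy variables first, confounding variables forced in next, experimental variables last, with ΔR² as each step's unique contribution) can be sketched in Python. The data, effect sizes, and variable names below are simulated and hypothetical; this is an illustration of the analytic logic, not the analysis actually performed:

```python
import numpy as np

rng = np.random.default_rng(0)
n_subj, n_items = 12, 16                 # hypothetical design: 12 children, 16 nonwords
n = n_subj * n_items

# One row per (participant, nonword) observation.
subj_ids = np.repeat(np.arange(n_subj), n_items)

# 11 dummy variables code 12 participants; the 12th child is all zeroes.
subj_dummies = np.zeros((n, n_subj - 1))
for j in range(n_subj - 1):
    subj_dummies[subj_ids == j, j] = 1.0

# Simulated predictors and outcome (stand-ins for babbling frequency,
# phonotactic frequency, and arcsine-transformed accuracy).
babbling_freq = rng.standard_normal(n)
phonotactic_freq = rng.standard_normal(n)
y = 0.5 * phonotactic_freq + rng.standard_normal(n)

def r_squared(X, y):
    """R-squared from an ordinary least-squares fit with an intercept."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    tss = (y - y.mean()) @ (y - y.mean())
    return 1.0 - (resid @ resid) / tss

# Step 1: subject dummies absorb within-subjects variance.
r2_step1 = r_squared(subj_dummies, y)
# Step 2: the confound is forced in before the experimental variable.
r2_step2 = r_squared(np.column_stack([subj_dummies, babbling_freq]), y)
# Step 3: the experimental variable; its delta R-squared is the unique
# variance it explains over and above subjects and the confound.
r2_step3 = r_squared(np.column_stack([subj_dummies, babbling_freq,
                                      phonotactic_freq]), y)
delta_r2 = r2_step3 - r2_step2
```

Because the models are nested, R² can only increase at each step; the ΔR² at the final step is the phonotactic effect after partialling out subjects and the confound, mirroring the statistics reported here.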
For the analysis of older children’s results, subject variables were entered in the first step, accounting for a significant portion of the variance, R² = .2112, F(11, 336) = 8.90, p < .0001. In the second step, the potentially confounding variables babbling frequency and lexicality were forced into the analysis, with neither accounting for a significant portion of the variance, ΔR² = .0035, F(1, 11) = 1.69, p > .10, and ΔR² = .0235, F(1, 11) = 3.20, p > .10, respectively. The experimental variables phonotactic frequency, nonword length, and the interaction term were entered in the third step. Of these, nonword length accounted for the most variance and entered the analysis first, ΔR² = .053, F(1, 11) = 28.38, p < .001. Phonotactic frequency entered next, also accounting for a significant portion of the variance, ΔR² = .029, F(1, 11) = 15.70, p < .01. The interaction term did not account for any variance, ΔR² = 0.00, F(1, 11) = 0.045, p > .10. The analysis in which number of languages was entered instead of babbling frequency provided a similar pattern of results.
The same analysis was performed on the transformed accuracy scores from the younger group of children. In the first step, subject variables were entered, accounting for a significant portion of the variance, R² = .2658, F(11, 324) = 12.86, p < .0001. Babbling frequency and lexicality were forced into the second step. Babbling frequency accounted for a significant portion of the variance, ΔR² = .027, F(1, 11) = 14.08, p < .001, whereas lexicality did not, ΔR² = .0004, F(1, 11) = 0.20, p > .10. The experimental variables were entered in the third step. As in the older children, nonword length accounted for the most variance and entered the analysis first, ΔR² = .053, F(1, 11) = 41.05, p < .0001. Also as in the older children, phonotactic frequency entered next, accounting for a significant portion of the variance, ΔR² = .0123, F(1, 11) = 7.26, p < .05. The interaction term again accounted for a nonsignificant portion of the variance, ΔR² = .0002, F(1, 11) = 0.106, p > .10. For the analysis in which number of languages was entered as the ease of articulation measure, the same pattern of results emerged except that number of languages failed to account for a significant portion of the variance, ΔR² = .008, F(1, 11) = 4.07, p > .05.
The results show that both groups of children were sensitive to phonotactic frequency, as evidenced by more accurate repetition of nonwords containing higher frequency phonemes. Furthermore, these accuracy differences were over and above those attributed to ease of articulation. By the age of years, children are sensitive to the relative frequencies of individual phonemes, at least when phoneme frequency differences are carried throughout the entire nonsense word. This sensitivity is a direct result of phoneme frequency, and is independent of possible innate perceptual or production biases and of the lexical status of the stressed syllables. On the one hand, this finding was expected given infants’ sensitivity to probabilistic phonotactic frequency by 10 months of age (Jusczyk et al., 1994). On the other hand, in terms of the debate concerning holistic versus segmental lexical representations, the results are more striking. Because the high- and low-frequency nonwords differed only in the frequency of the individual segments and not in the diphone frequencies, the results provide strong evidence of young children’s sensitivity to the relative frequencies of individual segments. As expected, there was a significant age effect in which overall accuracy improved with age. Also as expected, children were more accurate when repeating the shorter nonwords because longer nonwords tax phonological memory and other phonological processes to a greater extent than do shorter nonwords (e.g., Snowling, Chiat, & Hulme, 1991). Interestingly, ease of articulation affected accuracy in the younger group of children but not in the older group.
This experiment was designed to be a much subtler measure of children’s sensitivity to phoneme frequency. Because of the restrictions placed on syllable-final consonants (no nasals, laterals, or rhotics because they tend to color the vowel), the phonotactic frequency manipulation was accomplished by varying just syllable-initial consonants. The high-phonotactic frequency group contained nonwords with frequent sounds only in syllable-initial position, whereas the low-phonotactic frequency group contained nonwords with less frequent sounds in syllable-initial position. Both groups of nonsense words were created using the same set of vowels and coda consonants. Because the frequency differences were carried entirely by the syllable-initial consonants, frequency differences between groups of nonwords were much smaller than those in Experiment 1. As in the previous experiment, the working hypothesis was that participants would respond more accurately to the high-phonotactic frequency nonwords as a direct result of phoneme frequency. However, because of the reduced frequency differences, it is less clear how the younger children would perform. They may show no accuracy differences, suggesting that they are not sensitive to these more subtle frequency differences. Alternatively, because younger children have less experience with lower frequency phonemes, they may repeat them less accurately.
The participants were 12 -year-olds and 12 -year-olds. The older children ranged in age from 3 years 4 months 21 days to 3 years 7 months 22 days, with a mean age of 3 years 6 months 2 days. Of these older children, 3 were girls and 9 were boys. In addition, 2 other children (1 girl and 1 boy, mean age 3 years 6 months 24 days) participated but were not included in the final data set because they refused to repeat the nonsense words. The younger children ranged in age from 2 years 5 months 18 days to 2 years 7 months 24 days, with a mean age of 2 years 6 months 20 days. Of these younger children, 8 were girls and 4 were boys. In addition, 6 other children (4 girls and 2 boys, mean age 2 years 6 months 15 days) participated but were not included in the final data set. All 6 were excluded because they refused to repeat the nonsense words. None of the children had a history of hearing problems, and all came from homes in which English was the only language.
Two lists of nonwords varying in phonotactic frequency were constructed such that the phonotactic frequency differences were carried only by the syllable-initial consonants. A larger set of consonants was included to be more representative of English (Table 1). However, as in the previous experiment, voiced fricatives were not included because children typically acquire them late. The syllabic and prosodic characteristics of this set of nonsense words were identical to those of the nonwords used in the first experiment. The syllable-initial consonants in the high-frequency group had an average phoneme frequency of 7.85%, whereas those in the low-frequency group had an average phoneme frequency of 2.80%. The same subset of vowels and coda consonants was used in all of the nonwords; these sounds were chosen because they are all mid-frequency. Thus, the higher frequency group of nonwords in this experiment was less probable than the high-frequency group of nonwords in the previous experiment, whereas the lower frequency nonwords in this experiment were more probable than the low-frequency nonwords in the previous experiment. Stimuli are listed in Appendix B. Apart from the number of syllables, the two groups of nonwords did not differ in diphone frequency or duration (high frequency: 887 ms; low frequency: 918 ms). The stimuli were recorded exactly as described in Experiment 1.
As in the previous experiment, these stimuli were created by exploiting just the phoneme frequency differences drawn from a corpus analysis of American English. Other factors suspected of influencing repetition accuracy were not controlled. Therefore, as in the previous experiment, babbling frequency and number of languages were partialled out of the analysis to remove the effects of articulatory biases or articulatory ease. Also, in both the high- and low-phoneme frequency nonwords, 7 of the 16 nonwords had a real English word embedded as the stressed syllable. These lexicality effects should affect both groups of nonwords equally. Nonetheless, lexicality was partialled out of the analysis, as in the previous experiment, because this variable was not explicitly controlled.
The procedure was identical to that outlined in Experiment 1.
Accuracy results for - and -year-olds are presented in Fig. 2. As in Experiment 1, transformed accuracy scores were entered into a stepwise multiple regression analysis, with phoneme frequency, nonword length, age, and all interactions as the relevant variables. The dummy subject variables were entered in the first step and accounted for a significant portion of the variance, R² = .1957, F(22, 329) = 8.16, p < .0001. In the next step, the potentially confounding variables babbling frequency and lexicality were forced into the analysis, with neither accounting for a significant portion of the variance, ΔR² = .0004, F(1, 11) = 0.39, p > .10, and ΔR² = .0022, F(1, 11) = 2.05, p > .10, respectively. Children’s data revealed the same pattern of results regardless of whether babbling frequency or number of languages was partialled out. Therefore, only the results with babbling frequency are presented. The experimental variables phonotactic frequency, nonword length, age, and all interaction terms were entered in the third step. Of these, nonword length accounted for the most variance and entered the analysis first, ΔR² = .0424, F(1, 11) = 41.07, p < .0001. Age entered the analysis next and accounted for a significant portion of the variance, ΔR² = .0200, F(1, 11) = 19.78, p < .001. Phonotactic frequency entered next, also accounting for a significant portion of the variance, ΔR² = .0079, F(1, 11) = 7.97, p < .05. As in the previous experiment, none of the interaction terms was significant; consequently, they did not enter the analysis. However, we examined the two age groups separately so that any potential developmental differences in phonotactic sensitivity might be revealed.
For the -year-olds, subject variables were entered in the first step and accounted for a significant portion of the variance, R² = .0798, F(11, 329) = 2.53, p < .01. Babbling frequency and lexicality were forced into the analysis in the second step, with neither accounting for a significant portion of the variance, ΔR² = .0028, F(1, 11) = 1.14, p > .10, and ΔR² = .0001, F(1, 11) = 0.06, p > .10, respectively. In the third step, the experimental variables were entered into the analysis. Of these, nonword length accounted for the most variance and entered the equation first, ΔR² = .071, F(1, 11) = 31.39, p < .001. Phoneme frequency entered next, accounting for a significant portion of the variance, ΔR² = .016, F(1, 11) = 7.29, p < .05. The interaction term entered last and was marginally significant, ΔR² = .009, F(1, 11) = 4.03, p = .07. Analysis of this interaction revealed no accuracy differences due to phoneme frequency for the two-syllable nonwords (with subjects, babbling frequency, and lexicality partialled out), ΔR² = .0045, F(1, 11) = 0.94, p > .10, but it revealed a significant phoneme frequency effect for the three-syllable nonwords, ΔR² = .063, F(1, 11) = 14.11, p < .01.
Transformed accuracy scores from the younger children were entered into a separate stepwise multiple regression analysis. The subject variables were entered first, R² = .1163, F(11, 336) = 4.45, p < .0001. Confounding variables were then forced into the analysis, with neither babbling frequency nor lexicality accounting for a significant portion of the variance, ΔR² = .0001, F(1, 11) = 0.03, p > .10, and ΔR² = .0051, F(1, 11) = 2.13, p > .10, respectively. The experimental variables were entered in the third step. Only nonword length accounted for a significant portion of the variance in repetition accuracy, ΔR² = .031, F(1, 11) = 13.59, p < .01. Neither phoneme frequency, ΔR² = .0042, F(1, 11) = 1.80, p > .10, nor the interaction term, ΔR² = .001, F(1, 11) = 0.45, p > .10, accounted for a significant portion of the variance.
The overall analysis revealed that children repeated nonwords with higher frequency phonemes in syllable-initial position more accurately than they did nonwords with lower frequency phonemes in syllable-initial position. The older group of children ( -year-olds) showed this same effect, but the significant interaction revealed that the accuracy differences were limited to the longer nonwords, with no accuracy differences for shorter nonwords. The -year-old children, in contrast, did not show any sensitivity to the phoneme frequency information in this condition. Although the previous experiment showed that both groups of children are sensitive to phoneme frequency information when the phoneme frequency differences are carried throughout the entire word, this second experiment revealed that the older children, but not the younger children, are sensitive to the much more subtle measure of phoneme frequency limited to syllable onsets. The nonsignificant age interactions from the omnibus analysis did not pick up developmental differences in phonotactic sensitivity, but these post hoc results suggest a developmental trend toward increasing sensitivity to individual segments. These results are consistent with both holistic and segmental accounts of early lexical representation. To examine this question more thoroughly, children in the final experiment repeated nonwords differing along another frequency dimension.
As a final test of sensitivity to phonotactic frequency information, nonwords that vary in the frequency of combinations of sounds, rather than in the frequency of the individual sounds themselves, were presented to children. This third experiment, then, examined sensitivity to larger speech units. Results from Experiment 2 show that children become more sensitive to individual segments over development, consistent with both holistic and segmental accounts of early lexical representation. According to the holistic hypothesis, the acoustic forms of children’s lexical representations are stored in terms of some unit larger than the individual phoneme. The addition of new words into the lexicon, then, effects a restructuring such that lexical entries gradually incorporate increasing amounts of segmental information until representations are fully segmental (e.g., Walley, 1993). Walley, Smith, and Jusczyk (1986) did not find support for children's sensitivity to the syllable level. Therefore, for this experiment, a holistic representation was defined as being stored in terms of units smaller than syllables but larger than individual segments—namely, pairs of segments or diphones. This experimental design circumvented the effects of ease of articulation by using the same sounds in both lists of words, with the combinations of those sounds being either high or low frequency.
The participants were 12 -year-olds and 12 -year-olds. The older children ranged in age from 3 years 4 months 4 days to 3 years 7 months 25 days, with a mean age of 3 years 5 months 24 days. Of these older children, 8 were girls and 4 were boys. In addition, 3 other children (all boys, mean age 3 years 5 months 28 days) participated but were not included in the final data set because they refused to repeat the nonsense words. The younger children ranged in age from 2 years 4 months 13 days to 2 years 7 months 6 days, with a mean age of 2 years 5 months 13 days. Half of these younger children were girls and half were boys. In addition, 8 other children (4 girls and 4 boys, mean age 2 years 5 months 17 days) participated but were not included in the final data set. They were excluded because they refused to repeat the nonsense words (4 children), had poor articulatory abilities (3 children), or came from a bilingual home (1 child). None of the remaining children had a history of hearing problems, and all came from homes in which English was the only language.
A set of prevocalic consonants, vowels, and postvocalic consonants was chosen so that they were all mid-frequency. These sounds were then differentially combined so that one group had high diphone frequency and the other had low diphone frequency. Because these stimuli were constructed based on the actual phonotactic structure of English, it was not possible to perfectly counterbalance the stimuli so that there was an even number of each speech sound in both conditions (Table 1). Thus, certain sounds are underrepresented in one of the groups. However, every effort was made to ensure that the two groups did not differ in either phoneme frequency or phoneme frequency by syllable position. There were 8 disyllabic and 8 trisyllabic nonwords in each of the groups, so the 16 nonwords in each frequency group were made up of 40 syllables. As a result, in each group there were 40 prevocalic consonants, 40 vowels, and 16 postvocalic consonants. The two groups did not differ in the average phoneme frequency of the syllable-initial consonants, t(78) = −1.935, p > .05, the average frequency in syllable-initial position of the syllable-initial consonants, t(78) = 0.951, p > .10, the average frequency of the vowels, t(78) = −1.327, p > .10, the average phoneme frequency of the syllable-final consonants, t(30) = 0.784, p > .10, or the average frequency in syllable-final position of the syllable-final consonants, t(30) = 1.358, p > .10. However, the two groups did differ in diphone frequency (i.e., the independent variable), F(1, 14) = 4.646, p < .05, and in duration (high frequency: 898 ms; low frequency: 986 ms), F(1, 14) = 27.339, p < .05. Again, this latter effect is probably due to the speaker hyperarticulating the nonwords so as to enhance the syllable transitions. The stimuli are listed in Appendix C.
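Phoneme and diphone frequencies of the kind these stimuli were balanced on can be estimated by counting individual segments and adjacent segment pairs in a transcribed corpus. The following is a toy sketch using a made-up mini-corpus with one character per phoneme; the actual stimuli were based on corpus analyses of American English:

```python
from collections import Counter

# Hypothetical toy corpus of phonemically transcribed words.
corpus = ["kat", "kit", "tak", "pat", "tap", "kip", "pik", "tip"]

phoneme_counts = Counter()
diphone_counts = Counter()
for word in corpus:
    phoneme_counts.update(word)                                         # single segments
    diphone_counts.update(word[i:i + 2] for i in range(len(word) - 1))  # adjacent pairs

total_phonemes = sum(phoneme_counts.values())
total_diphones = sum(diphone_counts.values())

def phoneme_freq(p):
    """Relative token frequency of a single phoneme."""
    return phoneme_counts[p] / total_phonemes

def diphone_freq(d):
    """Relative token frequency of an adjacent phoneme pair (diphone)."""
    return diphone_counts[d] / total_diphones

def avg_diphone_freq(word):
    """Mean diphone frequency of a candidate nonword."""
    pairs = [word[i:i + 2] for i in range(len(word) - 1)]
    return sum(diphone_freq(d) for d in pairs) / len(pairs)
```

Positional statistics (e.g., the frequency of a consonant specifically in syllable-initial position, compared above with t tests) follow the same logic, with counts restricted to the relevant syllable position.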
Both groups of nonwords were created by differentially combining the same set of speech sounds. By definition, the co-occurrence of two individual segments is less probable than the occurrence of those individual segments. Therefore, the relevant phonotactic frequencies of nonwords in this experiment are an order of magnitude smaller than those in the first two experiments. Also, because both groups of nonwords were created from the same set of speech sounds, any potential articulatory biases operating at the phoneme level should affect both groups of nonwords equally. Therefore, babbling frequency and number of languages were not considered as potential confounding variables and were not partialled out of the analyses. Furthermore, lexicality could not be partialled out because 16 of the 16 high-phonotactic frequency nonwords had real English words embedded as the stressed syllables, whereas only 4 of the 16 low-phonotactic frequency nonwords did. This is discussed in the results section for this experiment.
The procedure was the same as in Experiment 1 except that the scoring method was modified. In Experiments 1 and 2, accuracy was calculated as the percentage of phonemes correctly produced. The hypothesis of these experiments was that repetition accuracy would be facilitated by more frequent sound patterns. In the nonwords used in the first two experiments, phonotactic frequency differences were carried by the individual phonemes. However, in this experiment, nonwords did not differ in phoneme frequency. Therefore, percentage of phonemes correctly produced was hypothesized to be the same for the two groups of nonwords. Because nonwords in this condition differed in diphone frequency, the percentage of diphones correctly produced was calculated for each nonword. Thus, both phonemes in a diphone pair had to be veridically repeated in the correct order to be considered correct. If extraneous phonemes were added between individual members of a diphone pair, an error was marked. For example, one child heard the nonword loo-bahg [lubag] and repeated bloo-blahg. In this case, the loo diphone was marked as correct because it was intact, but the bah diphone was marked as incorrect because of the intervening /l/ phoneme. The percentage of diphones correctly produced was arcsine transformed and submitted to statistical analyses.
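This scoring rule can be formalized as follows: a target diphone is credited only if its two phonemes occur adjacently and in the correct order somewhere in the response, so an intruding phoneme (such as the /l/ in the bloo-blahg example) breaks the diphone. The sketch below is one reasonable implementation of the rule as described, together with the arcsine transformation applied to the resulting percentages; the original protocol may have applied stricter alignment criteria:

```python
import math

def diphones(phonemes):
    """Ordered pairs of adjacent phonemes."""
    return [(phonemes[i], phonemes[i + 1]) for i in range(len(phonemes) - 1)]

def score_diphones(target, response):
    """Percentage of target diphones produced intact and in order."""
    produced = set(diphones(response))
    correct = sum(1 for d in diphones(target) if d in produced)
    return 100.0 * correct / len(diphones(target))

def arcsine_transform(pct):
    """Arcsine square-root transform of a percentage, for analysis."""
    return math.asin(math.sqrt(pct / 100.0))

# The article's example: target loo-bahg [l u b a g],
# response bloo-blahg [b l u b l a g].
target = ["l", "u", "b", "a", "g"]
response = ["b", "l", "u", "b", "l", "a", "g"]
# The (l, u) diphone survives intact, but (b, a) is broken by the
# intervening /l/, matching the scoring described above.
```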
Children’s accuracy results for this third experiment are shown in Fig. 3. Transformed accuracy scores for all children were entered into a stepwise multiple regression analysis, with diphone frequency, nonword length, age, and all interactions as the relevant variables. The dummy subject variables were entered in the first step and accounted for a significant portion of the variance, R² = .2475, F(22, 336) = 11.09, p < .0001. In the next step, the experimental variables phonotactic (diphone) frequency, nonword length, age, and all interaction terms were entered. Of these, only nonword length accounted for a significant portion of the variance and entered the analysis, ΔR² = .0114, F(1, 11) = 11.39, p < .01. As in the previous experiments, no interaction terms were significant; consequently, they did not enter the analysis. However, we examined the two age groups separately so that any potential developmental differences in phonotactic sensitivity might be revealed.
For the -year-olds, subject variables were entered into the analysis in the first step and accounted for a significant portion of the variance in repetition accuracy, R² = .2837, F(11, 335) = 13.36, p < .0001. Experimental variables were entered in the second step. Of these, nonword length accounted for the most variance, ΔR² = .0142, F(1, 11) = 7.54, p < .05. The interaction term entered the analysis next, accounting for a marginally significant portion of the variance, ΔR² = .0078, F(1, 11) = 4.17, p = .066. Diphone frequency entered last and did not account for a significant portion of the variance, ΔR² = .0047, F(1, 11) = 2.54, p > .10. Analysis of the interaction revealed no accuracy differences due to diphone frequency for the two-syllable nonwords (with subjects partialled out), ΔR² = .0003, F(1, 11) = 0.086, p > .10, but it revealed a significant diphone frequency effect for the three-syllable nonwords, ΔR² = .0279, F(1, 11) = 8.05, p < .05.
Transformed accuracy results for younger children were entered into a separate stepwise multiple regression analysis. The subject variables were entered first, R² = .0639, F(11, 334) = 2.30, p < .01, followed by the experimental variables. Nonword length accounted for the most variance, ΔR² = .0102, F(1, 11) = 4.05, p = .07. Neither diphone frequency, ΔR² = .0034, F(1, 11) = 1.37, p > .10, nor the interaction term, ΔR² = .0014, F(1, 11) = 0.53, p > .10, accounted for a significant portion of the variance in repetition accuracy.
Experiment 1 revealed that both groups of children are sensitive to the frequency with which individual speech sounds occur in the language children hear. This was evidenced by the fact that both groups of children— -year-olds and -year-olds— repeated nonwords containing higher frequency phonemes more accurately than they did nonwords containing lower frequency phonemes. Experiment 2 revealed that this sensitivity to phoneme frequency increases over development. This was evidenced by the fact that older children ( -year-olds) repeated nonwords with higher frequency phonemes in syllable-initial position more accurately than they did nonwords with lower frequency phonemes in syllable-initial position, whereas younger children ( -year-olds) showed no such sensitivity. Experiment 3 was designed to measure children’s sensitivity to larger speech units. If children’s lexical representations progress from larger to smaller speech units, a pattern opposite that reported in Experiment 2 should have been obtained, with younger children, but not older children, repeating nonwords with higher frequency combinations of sounds more accurately than they do nonwords with lower frequency combinations of sounds. Instead, the pattern of results matched the previous experiment, suggesting increasing sensitivity to larger speech chunks with age rather than decreasing sensitivity.
However, these results should be interpreted with caution. Younger children’s accuracy was at or below 40% of diphones correctly repeated, suggesting floor effects. In one sense, this was expected because the phonotactic frequency manipulation created nonwords with lower phonotactic probabilities than in the first two experiments. As phonotactic probability drops, so does repetition accuracy. However, this reduced accuracy was associated with reduced within-subjects variance (roughly 6% of the total variance), suggesting that all of the younger children were performing comparably. This finding supports developing sensitivity to larger speech units.
As explained previously, the lexical status of the stressed syllables could not be partialled out of this analysis because all of the high-diphone frequency nonwords had stressed syllables corresponding to real English words. This variable was expected to enhance the accuracy of children’s responses. This factor is known to affect spoken word recognition in adults (Luce & Lyons, 1999) and nonword repetition accuracy in older children (Dollaghan et al., 1993, 1995). However, in the first two experiments, this factor did not account for a significant portion of the variance in the accuracy of children’s responses. Based on the current results, the repetition of a multisyllabic nonsense word is not enhanced by having an embedded real word as the stressed syllable, at least for younger children.
The ultimate goal of the current series of experiments was to examine the segmental–holistic dichotomy in how words are represented in the developing lexicon. Specifically, the hypothesis was that this strict dichotomy is unwarranted because young children exhibit considerable sensitivity to detailed acoustic–phonetic information. Rather than being polar opposites on a segmental–holistic continuum, the acoustic representations of lexical entries in the developing lexicon are hypothesized to contain considerable acoustic–phonetic detail, whereas those in the mature lexicon are hypothesized to incorporate both segmental and holistic properties. In the current series of experiments, two groups of young children participated in a nonword repetition task in which different sources of phonotactic frequency were manipulated. This allowed an examination of children’s sensitivity to speech units of different sizes and any developmental changes in their sensitivity to these different-sized speech units. Children’s sensitivity to fine-grained acoustic–phonetic detail would demonstrate that children’s lexical representations are fundamentally adult-like early in development. As phonological development progresses, more acoustic–phonetic detail would be added to the lexical entries, although their fundamental nature would not change.
The results from the nonword repetition task in Experiment 1 revealed that young children are indeed sensitive to the probabilistic phonotactic structure of their input language. By years of age, children repeat high-phonotactic probability nonwords more accurately than they do low-phonotactic probability nonwords when the phoneme frequency differences are present across the entire words. Furthermore, this increased repetition accuracy is independent of any potential ease of articulation effects and of the lexical status of the stressed syllable. This sensitivity to overall phonotactic frequency is not surprising in light of evidence that infants become sensitive to this information by 10 months of age (Friederici & Wessels, 1993; Jusczyk, Cutler et al., 1993, 1994; Kuhl, Williams, Lacerda, Stevens, & Lindblom, 1992; Mattys & Jusczyk, 2001; Mattys et al., 1999; Werker & Tees, 1984). However, this finding brings into question the claims that children store words in their lexicons holistically. Treiman and colleagues (Treiman & Baron, 1981; Treiman & Breaux, 1982) and Walley and colleagues (Walley, 1993; Walley et al., 1986) offered evidence that children are not sensitive to fine-grained acoustic–phonetic detail. They found that children group syllables based on overall acoustic similarity rather than on any shared phonemes. Similarly, Nittrouer and colleagues (Nittrouer & Studdert-Kennedy, 1987; Nittrouer et al., 1989) reported that children’s speech perception and production are marked by sensitivity to units larger than the individual segment. However, in Experiment 1, the frequency of the individual segments varied, but the frequency of combinations of segments did not. Thus, our results show that young children are indeed sensitive to individual segments.
The current results also shed light on the development of children’s sensitivity to phonetic detail. The younger children in the current study ( -year-olds) were sensitive to the frequency of individual segments when the frequency differences were maintained throughout the entire nonsense word. However, when the frequency differences were limited to a single syllable position (Experiment 2), these younger children showed no difference in repetition accuracy. The older children ( -year-olds) were also sensitive to the frequency of individual segments for the global phoneme frequency differences. But unlike the younger children, they were sensitive to phoneme frequency even when the frequency differences were limited to syllable-initial consonants. Thus, children are sensitive to the frequency of individual segments early in development, and this sensitivity increases with age.
Finally, when frequency differences were signaled by combinations of segments rather than by individual segments (Experiment 3), older children, but not younger children, exhibited sensitivity, as evidenced by more accurate repetition. Sensitivity to larger (diphone) speech units appears to follow a trajectory similar to that for smaller (phoneme) speech units, with increasing sensitivity over development.
The current account of lexical development is consistent with a strong continuity position, whereas Walley’s (1993) account is consistent with a discontinuity position. The current account suggests that children’s lexical entries are basically adult-like at the earliest stages. This is not to say that children’s lexical representations are fully adult-like during early childhood. Templin (1953) reported that phonological development continues until approximately 9 years of age. Thus, children’s lexical representations cannot be fully adult-like early in development because the level of phonetic detail in the lexical representations at a particular age is limited by the children’s current state of phonological development. Walley, in contrast, argued that children’s lexical representations are holistically stored at the earliest stages but that segmental features are gradually added to the lexical entries, effecting a restructuring of the lexicon. The current series of studies cannot definitively address this issue of continuity versus discontinuity because the nonword repetition methodology has not been used with children younger than 2 years of age. Thus, we cannot address the issue of the level of phonetic detail in children’s earliest lexical representations. However, consistent with Stager and Werker (1997) and Swingley and Aslin (2000, 2002), both of these accounts agree that by the onset of the naming explosion at approximately 18 or 19 months of age, children begin incorporating acoustic–phonetic detail into their lexical entries. The results of the current study support this position.
Although the results of the current study replicate earlier findings of increased repetition accuracy for those nonwords that reflect properties of the lexicon (Beckman & Edwards, 1999; Dollaghan et al., 1993, 1995; Gathercole, 1995; Gathercole & Baddeley, 1989; Gathercole et al., 1991; Munson, 2001), they do not settle the question of what the nonword repetition task actually measures. Recent research suggests that accuracy differences for nonwords differing in subjective wordlikeness ratings reveal different sources of support (Gathercole, 1995). Repetition accuracy for those nonwords judged to be highly wordlike correlates most strongly with measures of phonological sensitivity, whereas accuracy for nonwords judged to be less wordlike correlates most strongly with measures of phonological memory. That is, because there is less lexical support for nonwords judged to be less wordlike, repeating them depends more on phonological memory. However, this still does not specify precisely what underlying mechanism accounts for these findings. Successful repetition of a nonsense word requires the listener to accurately perceive the phonological string, encode the acoustic–phonetic pattern as segments or syllables, and mark the temporal sequence of these units. It further requires sufficient phonological memory to store the novel phonological string and sufficient working memory to operate on the string. Finally, the transient representation constructed from the encoded speech units must be reassembled to guide the preparation and execution of a motor program to articulate the string. Facilitated repetition for nonwords with more frequent sounds and sound patterns could have its locus in any of these components. For example, children may have more facility for encoding more frequent sound patterns or, alternatively, may have more facility for articulating more frequent sound patterns. This question continues to be debated in the nonword repetition literature.
Gathercole and Baddeley (1989) argued that nonword repetition measures phonological memory but not phonological sensitivity. Alternatively, Metsala (1999) argued that it measures phonological sensitivity but not phonological memory. Between these two extremes, Bowey (1996, 2001) argued that nonword repetition measures a latent phonological processing factor that incorporates both phonological memory and phonological sensitivity. Whatever the nonword repetition task measures, accuracy differences driven by sound pattern frequency can only be interpreted as reflecting sensitivity to the frequencies of those sound patterns in the lexicon.
Although the current study examined children’s sensitivity to acoustic–phonetic information, the role that this information plays in early lexical development has not been determined. Schwartz (1988) claimed that “children may have a tendency to select words that are more discriminable, from their perspective, than words that are less discriminable relative to other words in their vocabulary” (p. 199). Similarly, Charles-Luce and Luce (1990, 1995) argued that words in the developing lexicon are maximally distinct from one another, affording children the opportunity to use global holistic perceptual strategies. Dobrich and Scarborough (1992) did indeed find some evidence of these selectional constraints, but only at the earliest stages of lexical acquisition. More recent analyses of the structure of the developing lexicon have shown that children’s lexicons actually contain many similar-sounding words (Coady & Aslin, 2003). Furthermore, the neighborhood structure of children’s early lexicons suggests that children might actually construct their lexicons by exploiting the more familiar and frequent sound patterns rather than by filling in the acoustic–phonetic gaps in the lexicon. Menn (1978) suggested that children’s word learning involved relying on subroutines for lexical generalizations. The addition of a new word into the lexicon is facilitated if it can be assimilated to an already existing subroutine. Similarly, Lindblom (1992) argued, “The probability that a new word will be added to the lexicon is inversely related to the amount of information that has to be committed to memory” (p. 150). So, the more similar a new word is to other words already in the lexicon, the more readily it will be learned. Presumably, then, learning will be facilitated for those words that contain the more frequent sounds and sound combinations.
Recently, researchers have begun to directly test these conflicting hypotheses by teaching children novel words varying in phonotactic frequency. A study by de Jong, Seveke, and van Veen (2000) showed that 5-year-olds with higher levels of phonological sensitivity learned novel names for novel objects better than did those with lower levels of phonological sensitivity. Their novel names, however, did not vary in phonotactic frequency. Storkel (2001) taught children, ranging in age from 3 years 2 months to 6 years 3 months, novel monosyllabic labels for novel objects. Like the Jusczyk et al. (1994) stimuli, the novel labels varied in phoneme frequency by syllable position and in diphone frequency. Storkel (2001) found that the children learned the high-phonotactic probability nonwords with fewer exposures than they did the low-phonotactic probability nonwords and concluded that sublexical factors influence early word learning. Interestingly, in other work, Storkel and Rogers (2000) found that older children (7-year-olds) did not show any phonotactic influences on the acquisition of disyllabic novel words, whereas 10- and 13-year-olds did. Given these contradictory results, the nature of probabilistic phonotactic influences on lexical development is currently unknown. Exploring the nature of these influences on lexical development could potentially elucidate the acoustic nature of lexical representations in the developing lexicon.
Finally, as Treiman and Baron (1981) originally formulated the hypothesis, segmental information is added to holistic lexical entries over the course of development until children have attained segmental adult lexical representations. However, this assumes that adults store words as sequences of individual phonemes, and this has not been shown definitively. Previous research has shown that adults are sensitive to multiple acoustic levels of analysis, including whole words (Luce & Pisoni, 1998; Luce, Pisoni, & Goldinger, 1990; Vitevitch & Luce, 1998, 1999), syllables (Mehler, Dommergues, Frauenfelder, & Seguí, 1981; Savin & Bever, 1970; Warren, 1971), diphones (Vitevitch & Luce, 1998, 1999), phonemes (Elman & McClelland, 1984; Pisoni & Luce, 1987; Pitt & Samuel, 1990), and subphonetic features (Andruski, Blumstein, & Burton, 1994; McMurray, Tanenhaus, & Aslin, 2002). Menyuk and Menn (1979) originally suggested that children are more sensitive to units larger than the individual segment based partly on the finding that adult perceivers are also more sensitive to units larger than the individual segment (e.g., Savin & Bever, 1970; Warren, 1971). Because both children and adults show evidence of sensitivity to multiple acoustic levels of analysis, a stark dichotomy between the acoustic forms of developing and mature lexical representations is unsupported.
This research was conducted as part of the first author’s dissertation at the University of Rochester. It was supported by a grant to the second author from the National Institutes of Health (HD-37082). We are grateful to the children and their parents for allowing us to test their lexical representations. We also thank John Mill, Daniel Swingley, and Daniel Urist for assistance in conducting the corpus analyses. Thanks are also due to Conni Augustine, Becky DaMore, Elizabeth Gramzow, Rachel Heafitz, Kelly Kinde, and Koleen McCrink for assisting in the data collection process and especially to Krista Kornylo for scoring the repetition responses. Finally, we thank an anonymous reviewer for providing helpful comments on an earlier version of the article.
1. Phonemes are simply a useful way in which to discuss the temporal sequences of sounds within words. A great deal of emphasis has been placed on the phonemic level of representation because phonemic sensitivity is correlated with the acquisition of literacy (e.g., Brady & Shankweiler, 1991). Children must have explicit knowledge of phonemic structure to read an alphabetic script. Children who do not have this sensitivity have difficulty in learning to read. Furthermore, adults who do not read an alphabetic script show low levels of phoneme awareness (Morais, Cary, Alegria, & Bertelson, 1979; Read, Zhang, Nie, & Ding, 1986). This is not to say, however, that their lexical representations do not contain a significant amount of acoustic–phonetic information. Rather, it is to suggest that fully specified mature lexical representations need not be stored in terms of phonemes per se. Because adults’ lexical representations might be encoded in units larger than the phoneme, children’s lexical representations, encoded in units larger than the phoneme, would be no more holistic than those of adults, albeit less specified.