Infants’ vocalizations were analyzed both perceptually and instrumentally. Perceptual analysis was carried out by phonetically trained listeners and instrumental analysis by spectrographic measurement. The data bear on three questions: (a) Where do infant vowel productions fall within adult vowel space? (b) Are there developmental changes in vowel production between 12 and 20 weeks of age? (c) Do young infants exhibit vocal imitation?
A. Infant vowel space: Relating infants’ to adults’ vocalizations
Infants produced 224 utterances meeting the criteria for vowel-like vocalizations. The number of utterances as a function of age (12, 16, and 20 weeks old) was, respectively, 79, 83, and 62. The frequency of different utterance types (/a/-like, /i/-like, and /u/-like vowels) in the corpus was, respectively, 105, 37, and 82.
The values of F1, F2, GA, CD, and F0 were obtained by instrumental measurement for each utterance. Statistical tests indicated that there were no significant differences in the parameter values depending on whether the middle three locations within an utterance (the 1/4, 1/2, and 3/4 points) or all five locations were considered. Therefore, the average of the five locations was used to specify each parameter (F1, F2, etc.) in the table, figures, and statistical analyses that follow.
A total of 5824 acoustic measurements (26 measurements per utterance × 224 utterances) were attempted across the corpus. In the case of 34 utterances, acoustic measurement was not possible, due either to the presence of noise (Velcro noise from the strap that secured the infant in the seat) or to the fact that the infant’s fundamental frequency was high and energy was present at all harmonics, obscuring the formant frequencies. These factors did not prevent phonetic transcription, because the transcriber could perceptually segregate utterances from noise and identify vowel quality for utterances with high fundamental frequencies, but they did prevent accurate spectrographic measurement. These 34 utterances were therefore not analyzed acoustically. Descriptive statistics for each of the six acoustic dimensions (F1, F2, CD, GA, F0, and DUR) on the resulting corpus of 190 utterances are provided in the table, which lists the mean, standard deviation, minimum, and maximum for each utterance type at each age.
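The corpus counts above can be cross-checked with a few lines of arithmetic. The sketch below is only a bookkeeping aid. The split of the 26 measurements per utterance into five parameters at five locations plus one duration value is an assumed reading of the text, not something the text states outright, and all variable names are illustrative.

```python
# Counts reported in the text.
by_age = {"12 weeks": 79, "16 weeks": 83, "20 weeks": 62}
by_type = {"/a/-like": 105, "/i/-like": 37, "/u/-like": 82}
total_utterances = 224
unmeasurable = 34

# ASSUMPTION: 26 = five parameters measured at five locations each,
# plus a single duration (DUR) value per utterance.
per_location_params = ["F1", "F2", "GA", "CD", "F0"]
locations = 5
per_utterance = len(per_location_params) * locations + 1

attempted = per_utterance * total_utterances   # measurements attempted
analyzed = total_utterances - unmeasurable     # utterances analyzed acoustically

print("utterances by age: ", sum(by_age.values()))
print("utterances by type:", sum(by_type.values()))
print("measurements per utterance:", per_utterance)
print("measurements attempted:", attempted)
print("utterances analyzed acoustically:", analyzed)
```

Under that assumed breakdown, the age and vowel-type tallies both sum to the 224 utterances, and the attempted-measurement and analyzed-utterance totals agree with the 5824 and 190 given in the text.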
Means, standard deviations, and ranges for the acoustic measures of infants’ vowels as a function of age.
Figure 1 displays the entire corpus of infant vocalizations. Each utterance (as transcribed by vowel category) is cast in an F1 versus F2 coordinate plot. The values of F1 in the infant corpus range from 487 to 1645 Hz; those of F2 range from 1523 to 4120 Hz. The durations range from 178 to 2195 ms. Both the formant values and the durations are consistent with those reported in the one other study in which young infants’ vowel-like vocalizations were instrumentally measured (Kent and Murray, 1982). As shown, the utterances transcribed as members of each particular vowel category are clustered. Moreover, the three vowel clusters are positioned in a way predicted by the acoustic measurement of adults’ vowels; F1 and F2 values for the three categories are in the appropriate relationship to one another (Peterson and Barney, 1952).
FIG. 1 The corpus of infant utterances plotted in an F1 versus F2 coordinate vowel space. Infants’ utterances are coded by vowel category (/a/-like, /i/-like, or /u/-like) as determined by phonetic transcription.
Figure 2 plots our corpus of infant vowels within the vowel space reported in the classic study of Peterson and Barney (1952). In the Peterson and Barney study, the vowels of 76 speakers, including men, women, and child speakers of American English, were measured. The closed curves shown in Figure 2 are Peterson and Barney’s, drawn by their visual inspection to encompass 90% of the utterances in each vowel category. Superimposed on the graph is a closed curve enclosing approximately 98% of the utterances from the infant corpus obtained in the present study. As shown, infants’ vowels overlap with certain adult vowel categories (particularly /ε/ and /æ/) but extend the vowel space considerably beyond that used by adult and child speakers. This is as expected: infants’ vocal tracts are smaller, which results in higher resonant frequencies and correspondingly higher formant frequencies. Although the infant vowel space is more restricted than that of the adult, illustrating that the infant’s vocal tract is not anatomically capable of producing the full range of formant frequency variation seen in adult and child speakers, infants nonetheless produce vowel-like sounds that show substantial acoustic variation.
FIG. 2 The “vowel space” of 12-, 16-, and 20-week-old infants in relation to the plot published by Peterson and Barney (1952) that was based on vowel productions of 76 men, women, and children.
B. Developmental changes in vocalizations between 12 and 20 weeks of age
Figure 3 displays the vowels of 12-, 16-, and 20-week-old infants in F1/F2 coordinate spaces. In each graph, infants’ vowel utterances are coded according to the transcription provided by the phonetically trained listener. The closed curves, drawn by visual inspection of the graphs, enclose 90% or more of the utterances in each category.
FIG. 3 The /a/-like, /i/-like, and /u/-like vowels produced by 12-, 16-, and 20-week-old infants cast in F1 versus F2 coordinate plots. The closed curves were drawn by visual inspection to enclose 90% or more of the infants’ utterances. Infants’ ...
It is clear from Figure 3 that utterances coded as a particular vowel form a cluster in acoustic space. This is the case even for the youngest infants, the 12-week-olds. For example, vowels with the highest F2 values and relatively low F1 values are coded as /i/, while those with the lowest F1 and F2 values are coded as /u/. The relationship between the acoustic properties and the transcription observed for infants’ vowels is similar to the relationship between acoustic properties and transcription that exists in the categorization of adults’ vowels (Peterson and Barney, 1952).
Figure 3 also reveals that the areas of vowel space occupied by infants’ /a/-, /i/-, and /u/-like vowels become progressively more separated between 12 and 20 weeks of age, due to a tighter clustering of the vowels in each category over time. This developmental shift could be due to anatomical changes that stabilize infants’ articulatory movements over time. This would be compatible with the fact that infants’ vocal tracts are rapidly changing during this period (Sasaki et al., 1977). On the other hand, it is intriguing to consider the possibility that the increasingly tighter clustering seen for categories of infants’ vowels could be due, at least in part, to vocal learning. Perhaps infants are listening to the vowels produced by adult speakers of the language and are striving to produce vowels themselves that perceptually resemble those they hear adults produce. This latter possibility hinges on infants’ abilities to match, with their own vocalizations, the vocalizations they hear another person produce (discussed below).
It is also of interest to examine developmental change in infants’ vocalizations using the featural measures. GA and CD measurements were taken for each of the vowels at each of the five locations. The GA and CD features distinguish adults’ vowels, especially the vowels /a/, /i/, and /u/ (see Syrdal and Gopal, 1986). Figure 4 displays the featural measures of the utterances of 12-, 16-, and 20-week-old infants in compact-diffuse/grave-acute coordinate plots. The closed curves encircling the utterances of a particular type were drawn by visual inspection to encompass 90% or more of the utterances in each category. These plots suggest that, at the earliest age tested, 12 weeks, the /a/, /i/, and /u/ utterances are differentiated by the GA and CD acoustic features. Examination of the space occupied by the three vowels across age, however, suggests that infants’ vowel categories become much more separated over this 8-week period.
FIG. 4 The /a/-like, /i/-like, and /u/-like vowels produced by 12-, 16-, and 20-week-old infants cast in compact/diffuse versus grave/acute coordinate plots. The closed curves were drawn by visual inspection to enclose 90% or more of the infants’ utterances. ...
C. Statistical analysis of the acoustic measurements
The plots in Figures 3 and 4 suggest that the acoustic measures differentiate infants’ vowel categories (as defined by the transcriber). Statistical analysis of each acoustic variable was undertaken to determine which of these differences were statistically reliable. Using a 3 (age: 12, 16, 20 weeks) × 3 (vowel category: /a/, /i/, /u/) analysis of variance (ANOVA), the main effects and interactions were examined for each of the six acoustic measurements (F1, F2, CD, GA, F0, and DUR). When appropriate, follow-up tests were conducted (simple effects and Tukey-HSD post hoc tests).
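For readers less familiar with the F(2,181) notation used in the results that follow, the degrees of freedom of a 3 × 3 design can be worked out directly. The sketch below is illustrative only. It assumes a fully crossed between-groups layout in which each of the 190 acoustically analyzed utterances counts as one observation; the function name and dictionary keys are hypothetical.

```python
def two_way_anova_df(n_obs, levels_a, levels_b):
    """Degrees of freedom for a fully crossed two-factor between-groups ANOVA.

    ASSUMPTION: every cell of the design is non-empty and each observation
    contributes once, so the error term has n_obs - (number of cells) df.
    """
    return {
        "factor_a": levels_a - 1,
        "factor_b": levels_b - 1,
        "interaction": (levels_a - 1) * (levels_b - 1),
        "error": n_obs - levels_a * levels_b,
    }


# 3 ages x 3 vowel categories, 190 acoustically analyzed utterances.
df = two_way_anova_df(n_obs=190, levels_a=3, levels_b=3)
for source, value in df.items():
    print(f"{source}: {value}")
```

Under these assumptions each main effect has 2 degrees of freedom and the error term has 181, which is consistent with the F ratios reported for most of the measures.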
Analysis of the F1 measurements revealed a significant main effect of vowel category, F(2,181)=66.69, p<0.001, and a main effect of age, F(2,181)=5.75, p<0.005. The interaction between age and vowel category was not significant, p>0.20. Follow-up tests revealed that at 12 and 16 weeks, the F1 values for /a/ were significantly higher than for /i/ or /u/; the values of /i/ and /u/ did not differ. By 20 weeks of age, tests revealed that the F1 values for /a/, /i/, and /u/ all differed significantly. Thus, analysis of the F1 data reveals that at each age significant differences in the predicted direction exist among the /a/, /i/, and /u/ utterance types; moreover, the data show that the F1 values of /a/ are separated from /i/ and /u/ before the latter two are differentiated.
Analysis of the F2 measurements revealed a significant effect of vowel category, F(2,181)=40.75, p<0.001. Neither the effect of age, p>0.40, nor the interaction between age and vowel category, p>0.40, was significant. Follow-up tests revealed that the F2 values for /i/ are significantly higher than those for /a/ and /u/ at all ages. The pattern of F2 values shown by infants thus conforms to that shown by adults.
Analysis of the CD measurements revealed a significant effect of vowel category, F(2,181)=32.64, p<0.01. Neither age, p>0.08, nor the interaction between age and vowel category, p>0.20, was significant. Follow-up tests revealed that the vowel /i/ is significantly more diffuse than /a/ and /u/, as expected, at all three ages.
Analysis of the GA measurements revealed a significant effect of vowel category, F(2,180)=52.03, p<0.001. Neither the effect of age nor the interaction between age and vowel category was significant, p>0.40 in both cases. Follow-up tests show that /u/ is significantly more grave than either the /i/ or /a/ vowels. Thus, the patterning of the three vowels on the GA dimension conforms to the pattern shown by adults.
Analysis of the F0 measurements revealed a significant effect of vowel category, F(2,181)=4.61, p<0.05. Neither the effect of age nor the interaction between age and vowel category was significant, p>0.10 in both cases. Follow-up tests revealed that the significant effect was attributable to the fact that the 16-week-olds produced their /u/ vowels with a higher F0 than either their /i/ or /a/ vowels. No other significant differences in F0 were observed.
The main effects of vowel category and age were also examined for the duration (DUR) measurements. We had not predicted any specific variations in duration as a function of either vowel category or age. The analysis revealed that the effect of vowel category was not significant, p>0.10, but that the effect of age was significant, F(2,181)=4.23, p<0.02. The interaction was not significant. Follow-up tests showed that 16-week-olds’ utterances were significantly longer than those of either the 12- or the 20-week-olds.
Taken as a whole, the acoustic measurements show significant effects of both vowel category and age on infants’ vocalizations. Infants’ vowels, even though they occupy a smaller area within the vowel space when compared to adults’ vowels, differ perceptually (as shown by phonetic transcription). Moreover, the perceptual differences in infants’ vowels correlate with acoustic differences that are consistent with the acoustic dimensions that differentiate adults’ vowels (Peterson and Barney, 1952; Syrdal and Gopal, 1986).
D. Vocal imitation
The analyses thus far have demonstrated that young infants produce vowels that phonetically trained listeners can reliably code as /a/-like, /i/-like, and /u/-like, and that these categories vary acoustically, both in their formant values and in acoustic features calculated from the formant values. The fact that infants are capable of producing utterances perceived as /a/-like, /i/-like, and /u/-like by adult listeners allows us to pose the next question: Do infants systematically vary the utterances they produce as a function of the stimulus they hear? In other words, is there evidence that infants are attempting to imitate the adult model?
If infants are capable of vocal imitation, their vocalizations should vary as a function of the three stimulus conditions. The hypothesis of vocal imitation predicts that infants should produce more /a/-like utterances in response to the stimulus /a/ than to the /i/ or /u/ stimulus; similarly, they should produce more /i/-like utterances in response to the stimulus /i/ than to the /a/ or /u/; and finally, they should produce more /u/-like utterances in response to the stimulus /u/ than to the /a/ or /i/. Data regarding imitation will be presented at two levels of analysis: the level of the utterance and the level of the individual subject.
The utterance-level analysis examines the entire corpus of 224 utterances. Figure 5 displays the corpus of 224 utterances in a stimulus-response matrix that provides the number of each infant utterance type (/a/-like, /i/-like, or /u/-like) as a function of stimulus condition (/a/, /i/, or /u/). Recall that infants were assigned randomly to stimulus groups, and that each infant was exposed to a video of the same female producing a vowel at the same rate for the same length of time. The only thing that varied was the particular vowel she produced. If the stimulus had no effect on infants’ productions, the cells in a given row should not vary. Alternatively, if the stimulus affects infants’ responses and infants are attempting to match the stimulus, then the largest cell frequencies will occur on the diagonal: the frequency of /a/ utterances will be at its maximum when the stimulus is /a/, the frequency of /i/ utterances will be at its maximum when the stimulus is /i/, and the frequency of /u/ utterances will be at its maximum when the stimulus is /u/.
FIG. 5 Stimulus × response matrix for the entire corpus of 224 utterances showing number of infant utterances (/a/-like, /i/-like, or /u/-like) occurring in response to the stimulus (adult presentation of /a/, /i/, or /u/). Higher numbers along the diagonal ...
Visual inspection of the cell frequencies in the stimulus-response matrix of Figure 5 suggests that the vowel stimulus strongly affected the type of vowel infants produced in response. In all three rows, the cell with the highest frequency falls on the diagonal. The top row shows that the frequency of /a/ utterances was systematically influenced by the stimulus that infants heard. Of the 105 /a/ utterances produced by infants in the experiment, 66 (62.9%) occurred in response to the stimulus /a/, 25 (23.8%) occurred in response to the stimulus /i/, and 14 (13.3%) occurred in response to the stimulus /u/. A similar pattern emerges in the case of /i/ utterances. Of the 37 /i/ utterances produced by infants, 22 (59.5%) occurred in response to the stimulus /i/, 11 (29.7%) occurred in response to the stimulus /a/, and 4 (10.8%) occurred in response to /u/. The /u/ utterances showed a similar profile. Of the 82 /u/ utterances produced by infants, 44 (53.7%) occurred in response to the stimulus /u/, 18 (22.0%) occurred in response to /a/, and 20 (24.4%) occurred in response to /i/.
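The row percentages quoted above follow directly from the cell counts. As a minimal sketch, the matrix can be rebuilt from the counts given in the text and the percentages and the diagonal pattern recomputed; the data structure and function names here are illustrative, not part of the original analysis.

```python
# Rows: infant utterance type; columns: stimulus heard. Counts from the text.
counts = {
    "/a/": {"/a/": 66, "/i/": 25, "/u/": 14},
    "/i/": {"/a/": 11, "/i/": 22, "/u/": 4},
    "/u/": {"/a/": 18, "/i/": 20, "/u/": 44},
}


def row_percentages(row):
    """Percentage of a row's utterances falling under each stimulus."""
    total = sum(row.values())
    return {stim: round(100 * n / total, 1) for stim, n in row.items()}


def peaks_on_diagonal(matrix):
    """True if every row's largest cell is the matching-stimulus cell."""
    return all(max(row, key=row.get) == utt for utt, row in matrix.items())


for utterance_type, row in counts.items():
    print(utterance_type, sum(row.values()), row_percentages(row))

print("largest cell on the diagonal in every row:", peaks_on_diagonal(counts))
print("total utterances:", sum(sum(r.values()) for r in counts.values()))
```

The same two helpers could be applied unchanged to the matrices for the individual ages.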
Figure 6 displays the stimulus-response matrices for each age individually. Examination of the nine rows in these matrices (3 ages × 3 response categories) shows that eight of the nine are in line with the prediction of vocal imitation.
FIG. 6 Stimulus × response matrices for each individual age. Cell entries are the number of infant utterances (/a/-like, /i/-like, or /u/-like) that occurred in response to the stimulus (adult presentation of /a/, /i/, or /u/). Higher numbers on the diagonal ...
The utterance-level data just presented are informative because they show the distribution of all 224 utterances. However, the utterances entered into the matrix cannot be considered independent of one another, because a single infant can contribute several of them. Therefore, a second analysis was undertaken at the subject level. In this analysis, infants were categorized as “/a/ infants,” “/i/ infants,” or “/u/ infants” depending on the utterance type they most frequently produced. For example, an infant who produced 12 criterion utterances during the course of the experiment, 7 coded as /u/-like, 3 as /a/-like, and 1 as /i/-like, would be classified as a “/u/ infant,” and so on. In classifying infants, only utterances transcribed identically by the two transcribers were considered. This was done to ensure that only infants’ clearest utterances would be used in determining their classification as an /a/, /i/, or /u/ infant. The subject-level classification achieves statistical independence because each infant is listed once and only once in the matrix.
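The classification rule described above amounts to taking the most frequent code among an infant’s agreed-upon utterances. A minimal sketch follows, assuming each infant’s data arrive as a list of vowel codes on which both transcribers agreed. The function name is hypothetical, and returning `None` for an empty list or a tie is an assumption made for illustration, since the text does not say how such cases were handled.

```python
from collections import Counter


def classify_infant(agreed_codes):
    """Return the vowel code an infant produced most often.

    agreed_codes: codes ("a", "i", "u") for utterances that both
    transcribers coded identically.
    ASSUMPTION: an infant with no agreed utterances, or with a tie for
    the most frequent code, is left unclassified (None).
    """
    ranked = Counter(agreed_codes).most_common()
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# The worked example from the text: 7 /u/-like, 3 /a/-like, 1 /i/-like.
example = ["u"] * 7 + ["a"] * 3 + ["i"] * 1
print(classify_infant(example))
```

Because each infant yields exactly one label, the resulting subject-level matrix contains one entry per infant, which is what makes the cells independent.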
Figure 7 displays the subject-classification × stimulus-condition matrix for all 45 infants who produced criterion utterances during the experiment. These data demonstrate that the vowel infants heard systematically affected their classification. In all three rows, the cell with the highest frequency falls on the diagonal, supporting the hypothesis of vocal imitation. A chi-square test of the 3×3 contingency table is significant, χ2
Each row of Figure 7 can also be considered individually. The top row shows that of the 19 infants classified as /a/ infants, 13 (68.4%) had heard the /a/ vowel during the experiment, 3 (15.8%) had heard /i/, and 3 (15.8%) had heard /u/, χ2(2, N=19)=10.53, p<0.005. A similar pattern emerges in the case of /i/ infants. Of the 9 infants classified as /i/ infants, 8 (88.9%) had heard the /i/ vowel during the experiment, and only 1 (11.1%) had heard the /a/ vowel, p<0.025 by the binomial test. A similar pattern is also obtained in the case of /u/ infants. Of the 17 infants classified as /u/ infants, 11 (64.7%) had heard the vowel /u/ during the experiment, 2 (11.8%) had heard /a/, and 4 (23.5%) had heard /i/, χ2(2, N=17)=7.88, p
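The row-wise statistics can be approximated with standard-library Python. This is a sketch under stated assumptions, not a reconstruction of the authors’ exact procedure: the chi-square tests are treated as goodness-of-fit tests against equal expected frequencies across the three stimuli, and the binomial test is treated as a one-tailed test of 8 successes in 9 trials with a chance level of 0.5 (/i/ versus /a/). The function names are illustrative.

```python
import math


def chi_square_gof(observed):
    """Pearson chi-square against equal expected frequencies (ASSUMED null)."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


def chi_square_sf_df2(x):
    """Upper-tail probability of chi-square with 2 df, exactly exp(-x/2)."""
    return math.exp(-x / 2)


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


a_infants = [13, 3, 3]   # heard /a/, /i/, /u/
u_infants = [2, 4, 11]   # heard /a/, /i/, /u/

for label, row in (("/a/ infants", a_infants), ("/u/ infants", u_infants)):
    stat = chi_square_gof(row)
    print(f"{label}: chi2(2, N={sum(row)}) = {stat:.2f}, p = {chi_square_sf_df2(stat):.4f}")

# ASSUMPTION: one-tailed test, chance = 0.5, 8 of 9 /i/ infants heard /i/.
print(f"/i/ infants: binomial p = {binomial_upper_tail(8, 9, 0.5):.4f}")
```

The two chi-square statistics computed this way agree with the values reported in the text to two decimal places, which suggests, though it does not confirm, that an equal-expected-frequency null was used.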
FIG. 7 Stimulus × subject classification matrix for 45 infants. Cell entries are the number of infants classified as /a/ infants, /i/ infants, or /u/ infants (based on their vocalizations) as a function of stimulus condition (adult presentation of /a/, /i/, ...
Figure 8 displays the subject-level matrices for each age considered individually. Examination of the nine rows (3 ages × 3 response categories) shows that all are in line with the prediction of vocal imitation. Chi-square analyses showed that the matrix for the 20-week-olds reached significance, χ2(4, N=13)=21.67, p<0.001. The 12- and 16-week-old matrices did not reach significance when considered individually; however, the data were significant with the larger N provided by collapsing the two younger age groups (12- and 16-week-olds) together, χ2(4, N=32)=15.54, p<0.01. The subject-level analyses support the hypothesis of vocal imitation in infants under 20 weeks of age.
FIG. 8 Stimulus × subject classification matrices for each individual age. Cell entries are the number of infants classified as /a/ infants, /i/ infants, or /u/ infants (based on their vocalizations) as a function of stimulus condition (adult presentation of /a/, ...