

HHS Public Access; Author Manuscript; Accepted for publication in peer-reviewed journal
Lang Sci. Author manuscript; available in PMC 2018 January 1.
Published in final edited form as:
PMCID: PMC5513188

Gesture and intonation are “sister systems” of infant communication: Evidence from regression patterns of language development


This study investigates infants’ transition from nonverbal to verbal communication using evidence from regression patterns. As an example of regressions, prelinguistic infants learning American Sign Language (ASL) use pointing gestures to communicate. At the onset of single signs, however, these gestures disappear. Petitto (1987) attributed the regression to the children’s discovery that pointing has two functions, namely, deixis and linguistic pronouns. The 1:2 relation (1 form, 2 functions) violates the simple 1:1 pattern that infants are believed to expect. This kind of conflict, Petitto argued, explains the regression. Based on the additional observation that the regression coincided with the boundary between prelinguistic and linguistic communication, Petitto concluded that the prelinguistic and linguistic periods are autonomous. The purpose of the present study was to evaluate the 1:1 model and to determine whether it explains a previously reported regression of intonation in English. Background research showed that gestures and intonation have different forms but the same pragmatic meanings, a 2:1 form-function pattern that plausibly precipitates the regression. The hypothesis of the study was that gestures and intonation are closely related. Moreover, because gestures and intonation were expected to change in opposite directions, the correlation between them was predicted to be negative. To test this prediction, speech samples of 29 infants (8 to 16 months) were analyzed acoustically and compared to parent-report data on several verbal and gestural scales. In support of the hypothesis, gestures alone were inversely correlated with intonation. In addition, the regression model explains nonlinearities stemming from different form-function configurations. However, the results failed to support the claim that regressions linked to early words or signs reflect autonomy.
The discussion ends with a focus on the special role of intonation in children’s transition from “prelinguistic” communication to language.

1. Introduction

This paper examines infants’ communication development from 8 to 16 months. The age of approximately 12 months, when the first words appear, divides this 8-month period into the prelinguistic or nonverbal stage (before 12 months) and the linguistic or verbal stage (after 12 months). In the context of infant communication research, the terms “prelinguistic” or “nonverbal” refer to the visual-gestural modality and pragmatic functions. Conversely, the terms “linguistic” or “verbal” refer to the vocal-auditory modality and referential functions such as those encoded in nouns, verbs, etc. Thus, the transition from nonverbal communication to words involves changes from the gestural-visual to the vocal-auditory modality and from pragmatic to referential functions.

Of special interest in this study is intonation or the melody of speech, a system that does not fit neatly into the schema of descriptors discussed above (e.g. nonverbal vs. verbal). For example, with respect to the vocal modality, intonation is linguistic or verbal; but with respect to functions, it is nonverbal. In sum, intonation is a “hybrid” system with properties of communication that are sometimes verbal and sometimes nonverbal. Accordingly, researchers have speculated that intonation acts as a bridge from gestures to words (e.g., Bruner, 1974/1975), but to date little is known about the role of intonation in this milestone event.

To address this issue, the present study analyzed patterns of regression that reflect cognitive-linguistic advances in the age range of interest in this research. I begin with a sketch of regression phenomena. This will serve as the background for a study of regressions in infants’ acquisition of intonation in English.

1.1. Regressions in language development

Regressions are partial or complete losses of previously controlled skills. Temporary periods of decline are followed by an apparent reorganization and a return to pre-regression levels of mastery (Snow, 2006). A canonical example was documented by Petitto (1987) in a study of pointing gestures in two infants learning American Sign Language (ASL). At 10 months, the children used pointing gestures to designate the location of people and objects (deixis) just as hearing children do at the same age. However, at 12 months pointing to persons decreased markedly. For each of the two infants, during a three-to-six-month period between 12 and 18 months, pointing to persons dropped out completely while the children continued to point to objects or locations. Recovery from the regression occurred by 22 months when pointing to persons re-appeared as linguistic pronouns, e.g., the signed equivalents of I and you. The pronoun forms were adultlike except that the meanings were usually reversed (for an interesting discussion of this reversal pattern, see Clark, 1978). Finally, personal pronouns were produced correctly by 27 months.

How can we account for regressions of this type? To address this question, Petitto pointed out that the regression in ASL occurred when the children acquired a “new function” for an “old form” (Slobin, 1982, 1985). The old form – pointing gestures – served the deictic functions that are universal in prelinguistic communication. However, when the children were beginning to produce single signs, they arguably discovered a new function of pointing, namely to mark linguistic personal pronouns. At this juncture, one form (pointing) had two functions (deixis and personal pronouns), a violation of the 1:1 correspondence between forms and functions that children expect (Gleitman & Wanner, 1982; Slobin, 1982, 1985). To maintain the preferred 1:1 relationship, the children stopped pointing to persons until new and old forms and functions could be sorted out (Petitto, 1987).

Summarizing, regressions occur when two systems developing at the same time are closely related to one another. Deixis and pronoun reference in ASL, for example, are said to be related because they share the form of pointing. One of the related systems undergoes a sharp decline or disappearance, namely, personal pronouns, while the other (deixis) triggers the regression but continues to develop without discontinuity. Statistically, then, the regression is signaled by an inverse correlation between the two related systems of interest. Although deixis and pronouns are closely related variables during the regression, they move in different directions.

Among the broader implications for language acquisition, regressions reflect the child’s emerging awareness of a more complex and sophisticated grammar than the simple 1:1 model suggests. For this reason, regressions signal significant advances in the child’s development of language. As Vihman (1993, p. 418) succinctly expressed this conclusion: “Nonlinearity or ‘regression’ in production accuracy marks emergent organization.”

1.1.1. The 2:1 pattern

The regression in ASL was precipitated by a 1:2 form-function pattern. The 2:1 pattern has also been reported (two forms having the same function). An example is the well-known study of the development of past tense morphology in English described by Slobin (1982, 1985). In early stages of acquisition, infants and toddlers often acquire several irregular forms like went and broke – past tense forms which are presumably memorized like other lexical items. At a later stage, when the regular -ed rule is acquired (danced, jumped, played), many children overgeneralize the -ed suffix to irregular forms that they had previously produced correctly; for example, went and broke are replaced by goed and breaked. After age four, children gradually sort out the regular and irregular forms correctly.

Slobin attributed such patterns to an interaction between old and new forms and functions. Children acquired a “new form” (-ed) for an old function (past tense marking). In Petitto’s model, this pattern would correspond to a 2:1 relation between forms and functions (2 forms, 1 function). To maintain the preferred regularity of one form for one function, the old forms (irregular verbs) are suppressed until the new and old forms and functions can be sorted out and reorganized. This is the same kind of model that Petitto posited in her study of the 1:2 pattern in ASL.

However, the 2:1 pattern is controversial because it seems different from its 1:2 counterpart. To a greater extent than the 1:2 pattern, the 2:1 case may entail a redundancy in which, for example, the meaning of past time has a dual representation in the input. As a result of the redundancy, the meaning can be accessed via either one of two channels. If infants benefit from the inherent redundancy of the 2:1 pattern, they might not interpret the pattern as a mismatch between forms and functions, in which case, a regression would not be motivated.

As noted above, however, the comparison between pointing in ASL and tense morphology in English indicated that the 1:2 and 2:1 patterns lead to the same kind of regression. This does not support the hypothesis that the 2:1 and 1:2 patterns are fundamentally different, or that the 2:1 pattern modifies or curtails the regression that is otherwise expected.

It is likely that infants use redundant patterns to facilitate the acquisition of forms and meanings. But the regression has a different motivation on a higher level, namely, to maintain simplicity and regularity in the infant’s emerging language system. Accordingly, the assumption of this study relative to this issue is that one-year-old children will demonstrate a regression in the acquisition of both 1:2 and 2:1 form-function patterns.

The present study analyzes a regression of intonation in the vocalizations of infants learning English, a regression involving the debated 2:1 relation. The principal aim of the study is to determine whether the assumptions and theoretical implications of the form-function model apply to the regression of intonation in English as they did for the regression of past tense morphology in English and pointing in ASL. I begin with a sketch of English intonation and its nonlinear pattern of development in infant vocalizations from 6 to 23 months.

1.2. Intonation

Intonation refers to the vocal effects of pitch that modify meanings on the level of sentences (Crystal, 1991; Cruttenden, 1997). The acoustic correlate of intonation is the pattern of changes in the fundamental frequency of the voice (f0) during sentence production. The perceptual correlate of a change in f0 is “pitch range” expressed in semitones, a logarithmic measure based on the octave scale (Burns & Ward, 1982). For example, a drop in f0 from 200 Hz to 100 Hz is 1 octave or 12 semitones. To illustrate pitch range, Figure 1 displays the acoustic analysis of two non-meaningful monosyllabic utterances produced by infant girls at different stages of development (Snow, 2006). In the display panel for each child, the maximum (vertical arrows) and minimum (horizontal arrows) indicate the beginning and ending f0 values on the falling portion of the fundamental frequency contour which is displayed below the time waveform. The magnitudes of pitch range for TN and AB are, respectively, ten semitones (a wide contour) and four semitones (a narrow contour).

Figure 1
Time waveform and f0 contour of single-word utterances produced by 2 girls. [top panel (TN) max f0 = 728 Hz; min f0 (falling portion) = 407 Hz; pitch range = 10.0 semitones. bottom panel (AB) max f0 = 410 Hz; min f0 = 319 Hz; pitch range = 4.4 semitones. ...
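The semitone conversion described above can be illustrated with a minimal sketch. It assumes the standard definition of 12 semitones per octave (a doubling of f0); the function name `semitones` is hypothetical and is not taken from the study's analysis software. Applied to the f0 values in the Figure 1 caption, it gives pitch ranges close to the reported 10.0 and 4.4 semitones; small differences may reflect rounding of the Hz values.

```python
import math


def semitones(f_max_hz: float, f_min_hz: float) -> float:
    """Pitch range in semitones between two f0 values.

    Assumes 12 semitones per octave, i.e. 12 * log2(f_max / f_min).
    """
    if f_max_hz <= 0 or f_min_hz <= 0:
        raise ValueError("f0 values must be positive")
    return 12.0 * math.log2(f_max_hz / f_min_hz)


# Octave example from the text: 200 Hz down to 100 Hz.
print(round(semitones(200, 100), 1))

# f0 values from the Figure 1 caption (TN: wide contour, AB: narrow contour).
print(round(semitones(728, 407), 1))
print(round(semitones(410, 319), 1))
```

Because the scale is logarithmic, the same difference in Hz corresponds to a smaller pitch range at higher f0 values, which is why ratios rather than differences are compared.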

The figure also illustrates contour direction, that is, whether the pitch change between the minimum and maximum points is falling or rising. Both contours in Figure 1 have a falling pattern of pitch change. To describe the acquisition pattern of pitch range in English, Snow (2006) reported a cross-sectional study of children in the first two years of life. The results by age groups from 6 to 20 months are displayed in Figure 2. The most striking feature of the overall pattern of development is its U-shaped configuration.

Figure 2
Mean pitch range by age group and contour direction: 6 to 20 months

Between 6 and 9 months, infants produce on average a wide pitch range that resembles the contours of older children and adults in experimental contexts (Snow, 1998). At about 10 months, however, children’s intonation weakens abruptly and begins a period of regression. This paper focuses on the U-shaped curve from points labeled a to c in Figure 2.

The regression observed by Snow corroborated earlier studies of intonation in English and other languages (e.g., D’Odorico & Franco, 1991; Levitt, 1993; Marcos, 1987). For example, D’Odorico & Franco used situations (mother present, mother absent) and type of vocalization (e.g., cry, non-cry, initial request, repeated request) to determine the infants’ probable affective and pragmatic meanings. After 9 months, infants failed to demonstrate the consistent associations between intonational forms and meanings that they had demonstrated before 9 months (developments after 12 months are not known because 12 months was the oldest age tested). In addition, the prosodic regression that D’Odorico and colleagues observed in production has also been documented in cross-linguistic studies of speech perception (Levitt, 1993). Further attesting to the generality of nonlinear patterns across languages, the regression pattern in ASL (Petitto, 1987) is similar to the regression of intonation in English. From a methodological perspective, it is of interest that studies using a longitudinal (e.g., D’Odorico & Franco, 1991) or quasi-longitudinal design (e.g., Marcos, 1987) foreshadowed findings that Snow (2006) confirmed in a cross-sectional study. The key findings are remarkably robust across different languages and research designs.

Figure 3 plots Petitto’s summary data on the children’s development of gestures and signs (a′ to d′). To compare the acquisition curves across languages, the ASL data are superimposed on a schematic diagram of the U-shaped pattern of English intonation development which is depicted by bold lines from a to d in Figure 3.

Figure 3
Schematic of U-shaped curves in English and ASL [a – d identify phases in Snow (2006). a′ – d′ identify the corresponding phases in Petitto (1987)].

Visual inspection of Figure 3 shows that the regression pattern of pointing in ASL is remarkably similar to the regression of intonation in English except that the regression of pointing begins at the onset of language and the regression of intonation begins at the onset of gestures. This suggests that the underlying causes of regressions are universal (violation of a 1:1 relation between forms and functions) but the linguistic context that precipitates the regression and the timing of the events may vary across languages such as ASL, English, or Italian. The hypothesis of this study of English-speaking infants is that intonation and gestures (2 forms) share the same pragmatic function (a 2:1 relation), that is, gestures and intonation are “sister systems” by virtue of having a common origin in pragmatics. To support the hypothesis, the following literature review examines the functional development of gestures and intonation in the late prelinguistic and early linguistic periods.

1.4. The pragmatic foundation of infant communication

Intonation contours primarily signal social intent (Dore, 1975), for example, the speaker’s intention to make a statement or to ask a question. Intonation signals such intentions primarily by the direction of pitch change in utterance-final position. For example, the falling contours in Figure 1 convey the speaker’s social intent to make a statement or comment. Typical but not universal meanings associated with rising contours are yes/no questions, queries, and various kinds of requests, especially those that might not be granted.

Given these primitive roots in affective-pragmatic expression, intonation is relevant to the theory of “speech acts” (Austin, 1962; Searle, 1969). “Speech acts” refer to “communicative activities (locutionary acts) defined with reference to the intentions of speakers while speaking (the illocutionary force of their utterances) and the effects they achieve on listeners (the perlocutionary effect of their utterances)” (Crystal, 1991: 323). The related term, pragmatics, is a branch of semiotics (the study of signaling systems) in which communication is defined in relation to the expectations and intentions of the participants with respect to what is being communicated. In recognition of the pragmatic origin of intonation, Dore (1975) referred to intonation contours in infant vocalizations as “Primitive Speech Acts” (PSAs). Collectively, PSAs constitute a small-scale but systematically functional mode of communication that could be considered the beginning of language acquisition (e.g., Crystal, 1984).

As early as five weeks, infants communicate with caregivers by eye contact, smiling, cooing, and facial expression. As a precursor to the infant’s ability to follow the line of sight of pointing gestures, infants at nine months frequently look where adults are looking (Scaife & Bruner, 1975). At the same age, infants demonstrate shifts in eye-gaze to and from the adult and the object of joint attention – the visual counterpart of gestural communication episodes (Trevarthen & Hubley, 1978).

However, it is not until about 10 or 11 months that infants use gestures in a truly communicative way (Bates, Camaioni, & Volterra, 1975). To study the transition from pre-intentional to intentional communication and from nonverbal to verbal interactions, Bates et al. adopted the speech acts model that was discussed above in relation to intonation. For example, “Protoimperatives” entail whole hand pointing (Cochet & Vauclair, 2010) and serve a regulatory function such as enlisting adult intervention to attain a goal. Protoimperatives are generally the precursors of adult yes/no interrogative and imperative sentence types.

The most sophisticated of the pointing gestures are “protodeclaratives,” which entail an extended index finger (Bates, O’Connell, Vaid et al., 1986; Cochet & Vauclair, 2010). Growing out of showing and giving at 10 to 11 months (Bates et al., 1975), these emerging declaratives exemplify the “socially advanced function of commenting in which affective sharing is the content of communication” (Mundy, Kasari, & Sigman, 1992: 377). This is consistent with studies of hand preference (Bates et al., 1986; Cochet & Vauclair, 2010) which indicate that Protodeclaratives relative to Protoimperatives are associated with a greater degree of right-hand preference. This is provocative because the right-hand preference implies lateralization to the left hemisphere – the region of the brain that is specialized for language.

In addition to social intent, speech acts also convey referential intent (Tomasello, 1988), that is, the speaker’s intention to draw the listener’s attention to an object of shared conversational interest. Referential intent corresponds to “indicating” (Bruner, 1974/1975), that is, the speaker’s intent to designate or refer to an object or event of joint attention. “Indicating” is a part of deixis (e.g., Petitto, 1987), the classical grammatical meaning of which is “pointing” or “demonstrating.” It is a term used with reference to the “personal, temporal, or locational aspects of the situation in which an utterance takes place” (Crystal, 1991: 96).

Table 1 lists examples of speech acts in a developmental framework adapted from Clark (1978). All five examples illustrate referential and social intent. Line 1 is an example of a protodeclarative. Lines 2 and 3 illustrate gestures with deictic words that some infants use in intermediate stages of the acquisition of deixis (e.g., AB in Tables 2 and 3). The next 2 lines (4 and 5) illustrate referential intent entirely by vocal means. Line 4 (adapted from Petitto, 1987) is based on the idea that the utterance-final fall in intonation designates the information focus of meaningful or non-meaningful utterances, in this case, the ball. If the identity of the referent can be inferred from the situation, the fall in the child’s intonation contour signals his or her intention to designate that object (the ball) and to make a statement or request about it. In both jargon and meaningful speech, then, the pitch prominence of an utterance vocally designates or “points to” the information focus of the utterance. Line 5, the meaningful counterpart of line 4, illustrates referential intent by intonation and words.

Table 1
The development of deixis and social intent.
Table 2
Demographic data by child.
Table 3
Acoustic and parent report data by child

The comparison between intonation and gesture shows that the two systems differ markedly in form but they have the same pragmatic function. For example, protodeclaratives and falling intonation both convey the speaker’s intention to make a statement or share experience about an object of joint attention.

The cross-modal representation of pragmatics in one-year-olds has an exact counterpart in models of adult communication. Bolinger (1986), for example, concluded that gestures and intonation are closely related components of a single “gestural complex.” Similarly, Bolinger pointed out that if intonation is affected by acquired language disorders, gestures are usually affected as well. “This suggests a close connection between intonation and gesture” (Cruttenden, 1997: 177).

The relation between forms and functions as hypothesized by the Gestural Complex is schematically depicted by the intersection of two circles in Figure 4. The upper circle (gestures) overlaps with the lower circle (intonation) and the domain of overlap is pragmatics.

Figure 4
Schematic diagram of the relation between gestures, intonation, and pragmatics.

Summarizing, intonation and gestures reflect a 2:1 pattern in which the two forms are similar to one another by virtue of having the same pragmatic function. This violation of the preferred 1:1 pattern explains the regression of intonation in English in the same way that a similar violation (a 1:2 form-function relation) accounts for the regression of pointing gestures in ASL. Early communication truly has a “pragmatic basis” (Bruner, 1974/1975) and a foundation in “social cognition” (Tomasello, 1995; Tomasello, Carpenter, & Liszkowski, 2007).

1.5. Prelinguistic versus linguistic communication

Petitto (1987) explored the implications of her ASL study for the relation between verbal and nonverbal communication. She observed that the regression in ASL occurred at about 12 months – a milestone at which deaf infants began to produce their first single signs. Petitto interpreted this co-occurrence as evidence that regressions signal a sharp break between prelinguistic and linguistic periods. In fact, Petitto concluded that infants’ prelinguistic experience is unrelated to linguistic competence. This theoretical position, which posits the complete autonomy of the prelinguistic and linguistic periods, is reminiscent of Jakobson’s (1968) discontinuity model of the relation between the sound systems of babble versus early words.

However, visual inspection of the regression patterns in English and ASL (Figure 3) suggests that regressions (hence major developmental advances) may occur at milestones other than first words or signs. In fact, the hypothesis of the present study is that the regression of intonation in English occurs within the prelinguistic period and before language. If this hypothesis is statistically supported, the results would challenge claims that the periods preceding and following language are autonomous.

The research questions to be addressed are these:

  1. Are gestures correlated with pitch range during the regression of intonation?
  2. Does the correlation between gestures and intonation reflect a significant inverse relation?
  3. Is the basic structure of the regression in English intonation (2:1 form-function pattern) similar to that of the regression of pointing in ASL (1:2 pattern)?
  4. Do the findings support a continuity or discontinuity model of the relation between prelinguistic and linguistic stages of communication development?

2. Methods

2.1. Participants

Twenty-nine infants participated in the study (15 girls and 14 boys). The children were from predominately middle class homes and European backgrounds. The children and their families were recruited by a newspaper advertisement in the vicinity of Lafayette, Indiana. The participants met the following selection criteria: 1) between 8 and 16 months of age, 2) no unusual prenatal, sensory, or developmental concerns, 3) from English-speaking homes (General American dialect), and 4) hearing within normal limits, defined as 20 dB HL or better at 500 Hz, 1000 Hz, 2000 Hz, and 4000 Hz (American National Standards Institute [ANSI], 1969). Thresholds at 30 dB or better at 500 Hz were accepted if all other findings of visual reinforcement audiometry and bilateral tympanometry were within normal limits. The participants’ name, gender, and age are listed in Table 2.

2.2. Procedures and analysis

Each parent completed the MacArthur Communicative Development Inventories (CDI): Words and Gestures form (Fenson, Dale, Reznick et al., 1993) within 30 days of the experimental session. Spontaneous speech samples were audio- and video-taped in a university laboratory context. Each child interacted with his or her mother and an experimenter in an informally structured play setting. The hearing screening immediately followed the experimental play session.

Meaningful and non-meaningful monosyllabic utterances were selected for acoustic analysis using C-Speech (Milenkovic & Read, 1992). Monosyllables were selected because they constituted the only utterance type (based on length and stress pattern) that all 29 children produced at least once in falling and rising contours. The pitch range of each analyzed contour was expressed in semitones. For each child, the average pitch range of falling and rising contours meeting the selection criteria yielded the measure of intonation for that child (the dependent or predicted variable).

The independent variables were based on categories assessed by the MacArthur Communicative Development Inventories (CDI): Words and Gestures form, a parent report instrument that has been validated as a measure of communication development in young children (Dale, Bates, Reznick et al., 1989; Fenson et al., 1993). Dale et al. (1989) observed children in a laboratory setting after collecting parent report data for the same children. They found that parent report was comparable in accuracy and validity to intensive laboratory observations.

The scoring categories of the CDI address comprehension (phrases and words), vocabulary (comprehended and produced), and nonverbal skills (First Communicative Gestures, Games and Routines, Actions with Objects, Pretending to be a Parent, and Imitating other Adult Actions). Examples of First Communicative Gestures include giving, showing, pointing, waving bye-bye, and shaking head “no”. Games and Routines refer to social action games like Peek-a-Boo. Actions with Objects include skills like drinking from a cup, combing or brushing own hair, etc. The last two categories are devoted to activities sometimes called symbolic play, pretend play, and tool use.

In this paper, the CDI language scales are called Phrases Comprehended, Words Comprehended, and Words Produced. In the nonverbal scales, First Communicative Gestures and Games and Routines were combined in a single category called Early Gestures. The last three categories were combined in a single category called Later Gestures. Categories described as “Early” are typically acquired earlier in development than categories designated as “Later.” For example, a 9-month-old child might have several early gestures such as pointing in joint attention episodes but few if any later gestures such as pretending to drink from a cup.

The following three phases of the posited regression trace the logic of the predicted inverse relation between intonation and early gestures, the heart of the statistical evidence.

  1. Preceding the regression, children are actively acquiring gestures and intonation contours. At this point, indices of pitch range and gestures are high or increasing.
  2. Infants begin to suppress intonation in response to their awareness that gestures and intonation have the same pragmatic meanings. While pitch range decreases rapidly, children continue to use many types of gestures. At this juncture, the relationship between gestures and intonation is strong because the magnitude of changes in one variable is linked to the magnitude of changes in the other.
  3. Most importantly, however, the sign of the correlation will be negative because the direction of changes in one variable is linked to the opposite direction of changes in the other. Thus the overall relationship between gestures and intonation is expressed in two parts -- the correlation coefficient, which measures the strength of the relationship, and its sign, which indicates whether the direction of change in the two variables is the same or different. In the present study, the anticipated significant and negative correlation is evidence that the regression has unfolded in the manner predicted by the form-function model.

3. Results

The acoustic and parent report results are listed in Table 3. Raw scores were used as the measures of the CDI scales (percentiles are also listed but they were not used in the analysis). On the CDI Form, for example, AB’s parents checked 16 early gestures (types) that, in their judgment, AB used sometimes or often. This corresponds to the 90th percentile for 12-month-old girls.

The analysis was designed to determine which of the parent-report predictor variables are significantly associated with intonation. The data were organized in two groups – the first half of the age range in this study (8–11 months) and the second half (12–16 months). The two groups developmentally represent the late prelinguistic period and the first half of the single word period. Pearson correlations were computed for the children in each group (see Table 4).
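The grouping and correlation steps can be sketched as follows. This is an illustration only: the records are hypothetical values invented for the example and are not the data in Table 3, the helper names `pearson_r` and `split_by_age` are assumptions, and the significance test used in the study is omitted.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


def split_by_age(records, cutoff_months=12):
    """Split records into a younger group (< cutoff) and an older group (>= cutoff)."""
    younger = [r for r in records if r[0] < cutoff_months]
    older = [r for r in records if r[0] >= cutoff_months]
    return younger, older


# HYPOTHETICAL records: (age in months, Early Gestures raw score, pitch range in semitones)
records = [
    (8, 3, 9.5), (9, 5, 8.8), (10, 8, 7.0), (11, 11, 5.2),
    (12, 12, 4.8), (13, 14, 5.1), (15, 15, 4.6), (16, 17, 5.0),
]

younger, older = split_by_age(records)
r_younger = pearson_r([r[1] for r in younger], [r[2] for r in younger])
r_older = pearson_r([r[1] for r in older], [r[2] for r in older])

# The sign shows the direction of the relation; the magnitude shows its strength.
print(f"8-11 months:  r = {r_younger:+.2f}")
print(f"12-16 months: r = {r_older:+.2f}")
```

In this made-up data set the younger group shows a strong negative coefficient and the older group a weak one, which mirrors the pattern the form-function model predicts but says nothing about the actual results.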

Table 4
Summary of statistics. Predicting pitch range from 8 to 16 months.

The only significant correlations (probability value less than 0.05) were between intonation and both of the gesture scales in infants younger than 12 months. These findings indicate that intonation is significantly related to early and later gestures.

Independent of the statistical significance of correlations is the positive or negative sign of the relationships. Positive correlations, which are unmarked, indicate that the correlated variables change in the same direction, while negative correlations indicate that the variables change in different directions. For infants younger than 12 months, then, all of the CDI predictors had increasing raw scores (reflecting normal development and change) but intonation had decreasing pitch ranges (reflecting the regression process). Thus, all correlations before 12 months signal a negative or inverse relation. For infants in this age range, however, the critical finding that has a bearing on the hypothesis is that the correlations between intonation and the gestural scales are both negative and significant (lines 1 and 2 on the left side of Table 4).

Figure 5 displays the results as a scatterplot of the intonation and early gestures data for all 29 children in the study. The left and right panels represent, respectively, the younger and older groups of infants. In each panel, the individual data are plotted by pitch range, number of early gestures, and age group (two groups for the age range in each panel). Linear trends are also displayed. The trends show that the anticipated inverse relation between intonation and gestures is clear in the period ending at 11 months. After that time, the children enter a plateau that continues throughout the single-word period. From 12 to 16 months, the nearly flat slope indicates that after the children’s first birthday there is little or no developmental relationship between pitch and gestures, and the negative direction of the correlation has all but disappeared.

Figure 5
Scatterplot and linear trend analysis of mean pitch range by number of gestures and age group (Note. To visually distinguish pairs of overlapping or nearly overlapping data points, one or both of the plotting symbols in each pair was moved one pixel to ...
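The linear trends in each panel can be approximated by an ordinary least-squares line fitted separately to the younger and older groups. The sketch below is one plausible way to compute such a trend, not necessarily the procedure used to draw Figure 5. The function name `linear_trend` and all numeric values are hypothetical placeholders chosen only to show a steep negative slope next to a nearly flat one.

```python
def linear_trend(xs, ys):
    """Ordinary least-squares slope and intercept of ys regressed on xs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("predictor values must not all be identical")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Placeholder (number of early gestures, mean pitch range) pairs; NOT the study's data.
younger_panel = [(2, 12.0), (4, 11.0), (6, 9.0), (8, 8.0)]    # 8-11 months
older_panel = [(10, 9.0), (12, 8.0), (14, 9.0), (16, 8.0)]    # 12-16 months

for name, panel in (("younger", younger_panel), ("older", older_panel)):
    gestures = [g for g, _ in panel]
    pitch = [p for _, p in panel]
    slope, intercept = linear_trend(gestures, pitch)
    print(f"{name}: slope={slope:.3f}, intercept={intercept:.3f}")
```

Fitting the two panels separately, rather than one line across all 29 children, is what allows a change in slope at the 12-month boundary to show up at all.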

4. Discussion

Research in semiotics and related fields has shown that, in children learning English, intonation and gestures share a common core of pragmatic meanings. Primitive Speech Acts in children and the Gestural Complex in adults bolster the far-reaching claim that intonation is the vocal counterpart of gestural communication.

Statistical evidence confirmed this close relation between gesture and intonation: Gestures alone (but not any of the verbal scales) were significantly correlated with intonation in the period of peak developmental change (Research question 1). Notably, the correlations in the younger group uniformly have a negative sign. The sign indicates that the CDI predictor (e.g., Early Gestures) changes in a positive way consistent with normal development, whereas the predicted variable, intonation, changes in the opposite direction, a pattern consistent with the regression phenomenon.

To test the regression model, the critical evidence is that intonation and gestures reflect a statistically significant inverse relationship (Research question 2). Collectively, the findings support the hypothesis that gestures and intonation develop in a parallel way from 8 to 11 months and bear a close albeit inverse relation to one another.

Contrary to the hypothesis, however, the predictors of intonation included not only early gestures but later gestures as well. Later gestures such as symbolic play were not expected to develop substantially at 8 to 11 months because they are associated with more advanced stages of linguistic and cognitive development than the present study investigated. However, some of these late developing CDI scales, especially Actions with Objects, overlap partly with aspects of pragmatics as represented in the scale of Early Communicative Gestures.

Pragmatics encompasses social intent (Bates et al., 1975) and referential intent (Tomasello, 1988). Behaviors such as pointing (Early Communicative Gestures) express both social and referential intent. In contrast, behaviors such as drinking from a cup or appropriately using a comb (Actions with Objects) express social intent indirectly at best but clearly entail referential intent, as is the case for words, place-holders for words, and intonation (e.g., Clark, 1978; Petitto, 1987). Some of the advanced gestural scales of the CDI, then, might measure at least one aspect of pragmatics that is relevant to prelinguistic children. Thus, the conclusions of this study are more general than the hypothesis had anticipated, that is, the correlate of intonation at 9½ months is pragmatics broadly defined so as to include the speaker’s social and referential intent. As the reader might recall, Crystal (1991) pointed out that some analysts consider referential intent or deixis to be a part of pragmatics. The outcomes of this study support the following classification of semiotics (where curly brackets indicate categories and straight brackets indicate subcategories): {pragmatics [deixis, social intent], semantics, and syntactics}.

The regression of intonation was sparked by a 2:1 form-function configuration. The introduction summarized a similar pattern of regression in ASL except that the regression was sparked by a 1:2 configuration (Research question 3). The similar results across ASL and English suggest that either form-function mismatch can trigger a regression. It is likely that the children used to their advantage the redundancy that is represented especially in the 2-to-1 pattern. However, young children in this age range seem to value regularity to an even greater extent than they benefit from redundancy. Thus, a means of maximizing regularity and minimizing exceptions, such as a regression strategy, is motivated regardless of the type of form-function configuration the children encounter.

The regression model explains nonlinearities in a range of learning circumstances. However, the model does not apply uniformly to all types of nonlinearity. In general, nonlinear patterns of development occur at junctures of transition and change (Parladé & Iverson, 2011); but some nonlinear patterns are driven by different mechanisms than those that give rise to regressions. To take an example from infant speech development, the proportion of labial consonants produced undergoes a nonlinear change in the transition from babbling to single words. From a developmental perspective, labials are the simplest class of consonants to master (McGregor, 2015), perhaps because they are richly represented by auditory and visual cues (Vihman, 1996). Accordingly, infants initially produce a greater frequency of labials than the native language input would predict. The frequency decreases during the babbling period as the child’s phonology grows in complexity, sophistication, and diversity. At the beginning of word production, the frequency of labials increases and approaches or surpasses the high frequency that was characteristic of the initial babbling period (Boysson-Bardies, Vihman, Roug-Hellichius et al., 1992). At the transition to speech, in which the child first attempts to produce words, lexical retrieval requires a substantial outlay of cognitive resources. Infants seem to return to an earlier stage of phonological development in order to simplify the overall processing demands of word retrieval. This idea is expressed succinctly by Boysson-Bardies et al. (p. 384): “A limited resource assumption may account for a tendency to ‘return’ to more basic adjustments when the infant tries to approximate a complex target.” In contrast, the hallmark of regressions is that children suppress exceptions to familiar patterns in the interest of maximizing the regularity of the grammar, a conceptual type of simplification.

4.1 Intonation in the transition to language

As the reader might recall, Petitto interpreted her findings (regression at 12 months) as evidence for a discontinuity model of the relation between prelinguistic development and language. Petitto argued that the sudden change reflected a leap from one autonomous mode of communication to another, that is, from prelinguistic to linguistic communication. The implication for language acquisition is that the infant’s perceptual learning in the first year (e.g., Jusczyk, 1997) is unrelated to lexical development in the second year.

The present study, however, yielded the opposite findings and supports the opposite conclusion. The data showed that regressions can occur at developmental milestones other than first words or single signs. Indeed, the regression of intonation occurred within the prelinguistic period at least two months before the onset of words. The milestone at this juncture is the onset of intentionality, a cognitive nonverbal stage that is arguably as significant a development as the onset of referential words (Research question 4). Instead of being an autonomous and monolithic block before words, the prelinguistic period is itself divided into more than one stage constituting a continuous step-like pathway to language.

Continuity of this type is in keeping with a large body of research, dating from the resurgence of infant development studies in the 1970s, which shows that the prelinguistic period predicts future language development (Iverson, Capirci, & Caselli, 1994; Trevarthen & Aitkin, 2001; Bates, O’Connell, & Shore, 1987). Iverson & Goldin-Meadow (2005, p. 367) nicely summarize this position using gestures as an example: “Changes in gesture thus not only predate but also predict changes in language, suggesting that early gesture may be paving the way for future developments in language.”

Bruner (1974/1975) was among the first to propose that prelinguistic gestures are closely related to language proper. He was also the first to recognize the limitations of that claim. Indeed, Bruner concluded that the gap between gestures and words is too large for infants to cross in a single leap. At least one more step is needed. That step, Bruner speculated, is intonation. Instead of comprising a single step (nonverbal to verbal communication), the transition is modelled in two steps: nonverbal communication to intonation to words. The intermediary role of intonation allows for a more continuous transition to language than the single-step model permits.

Bruner’s conclusions add to converging evidence that intonation has a pivotal role in the transition from gestures to words. The beginning pages of the introduction foreshadowed this idea by characterizing intonation as a hybrid system which does not fit neatly into the traditional classifying schema based on descriptors like nonverbal versus verbal. Rather, intonation has one foot planted in nonverbal pragmatics and the other in the vocal modality. In sum, the “hybrid” system serves as a bridge which begins with gesture and ends at the threshold of language.

4.2 Conclusions

The major conclusion of this investigation is that intonation is the vocal counterpart of gesture. This sister relation between intonation and gesture is owing to the common core of pragmatic meanings that the two systems share. The implication for the theory of regressions is that there are 2 forms (pitch contours and gestures) for a single function (pragmatics). The 2:1 mismatch between forms and functions sparks a regression – the first step toward progress and reorganization. Similar findings in cases involving a 1:2 configuration show that regressions are not tied to a single form-function type. Finally, the intermediary role of intonation in infant speech development supports the view that traditional boundaries between prelinguistic and linguistic communication are considerably more gradual and “fuzzy” than autonomous models would suggest.


Intonation and gesture are sister systems that spring from a common pragmatic source.

Complex form-function relations, e.g., 2 forms for 1 function, spark the regression of intonation.

Regressions in acquisition are signs of progress and emergent organization.

Infant intonation is an intermediary step between nonverbal and verbal communication.


This research was funded in part by a Kinley Trust grant from the Purdue Research Foundation and an R03 grant (DC04365-02) from the National Institute on Deafness and Other Communication Disorders. Portions of this study were presented at the 28th Annual Child Phonology Conference, University of Washington, June 22–23, 2007. I would like to thank Heather Balog, Betsy Evers, Amy Hanrahan, Violette Hawa, Joshua Kelly, Jennifer King, Elizabeth Kompagne, Tara Robinson, Erin Wright, and Jessica Yodor for their contributions to the data collection, perceptual analysis, and acoustic analysis phases of this research.




  • Austin JL. How to do things with words. Oxford: Oxford University Press; 1962.
  • Bates E, Camaioni I, Volterra V. The acquisition of performatives prior to speech. Merrill-Palmer Quarterly. 1975;21:205–226.
  • Bates E, O’Connell B, Vaid J, Sledge P, Oakes L. Language and hand preference in early development. Developmental Neuropsychology. 1986;2:1–15.
  • Bolinger D. Intonation and its parts: Melody in spoken English. Stanford: Stanford University Press; 1986.
  • de Boysson-Bardies B, Vihman M, Roug-Hellichius L, Durand C, Landberg I, Arao F. Material evidence of infant selection from the target language: A cross-linguistic study. In: Ferguson CA, Menn L, Stoel-Gammon C, editors. Phonological development: Models, research, implications. Timonium, Maryland: York Press; 1992. pp. 369–391.
  • Bruner J. From communication to language: a psychological perspective. Cognition. 1974/1975;3:255–287.
  • Burns EM, Ward WD. Intervals, scales, and tuning. In: Deutsch D, editor. The psychology of music. New York: Cambridge University Press; 1982.
  • Clark EV. From gestures to word: On the natural history of deixis in language acquisition. In: Bruner JS, Garton A, editors. Human Growth and Development. 1978.
  • Cochet H, Vauclair J. Pointing gestures produced by toddlers from 15 to 30 months: different functions, hand shapes and laterality patterns. Infant Behavior and Development. 2010;33:431–441.
  • Cruttenden A. Intonation. 2nd ed. Cambridge: Cambridge University Press; 1997.
  • Crystal D. Baby Talk (VHS video) BBC: Public Broadcasting System; 1984.
  • Crystal D. A dictionary of linguistics and phonetics. 3rd ed. Oxford: Blackwell; 1991.
  • Dale P, Bates E, Reznick S, Morisset C. The validity of a parent report instrument of child language at twenty months. Journal of Child Language. 1989;16:239–250.
  • D’Odorico L, Franco F. Selective production of vocalization types in different communication contexts. Journal of Child Language. 1991;18:475–499.
  • Dore J. Holophrases, speech acts, and language universals. Journal of Child Language. 1975;2:20–40.
  • Fenson L, Dale PS, Reznick JS, Thal D, Bates E, Hartung JP, Pethick S, Reilly JS. MacArthur Communicative Development Inventories User’s Guide and Technical Manual. San Diego: San Diego State University; 1993.
  • Gleitman L, Wanner E. Language acquisition: The state of the state of the art. In: Wanner E, Gleitman L, editors. Language acquisition: The state of the art. Cambridge: Cambridge University Press; 1982.
  • Iverson JM, Capirci O, Caselli MC. From communication to language in two modalities. Cognitive Development. 1994;9:23–43.
  • Iverson JM, Goldin-Meadow S. Gesture paves the way for language development. Psychological Science. 2005;16:367–371.
  • Jakobson R. Child language, aphasia, and phonological universals. Keiler AR, editor. The Hague: Mouton; 1968.
  • Jusczyk PW. The discovery of spoken language. Cambridge, MA: MIT Press; 1997.
  • Levitt AG. The acquisition of prosody: Evidence from French- and English-learning infants. In: de Boysson-Bardies B, de Schonen S, Jusczyk P, McNeilage P, Morton J, editors. Developmental neurocognition: Speech and face processing in the first year of life. Dordrecht: Kluwer; 1993. pp. 385–398.
  • Marcos H. Communicative functions of pitch range and pitch direction. Journal of Child Language. 1987;14:255–268.
  • McGregor WB. Linguistics: An introduction. 2nd ed. London: Bloomsbury Publishing; 2015.
  • Milenkovic P, Read C. CSpeech Version 4. Department of Electrical Engineering. Madison: University of Wisconsin; 1992.
  • Mundy P, Kasari C, Sigman M. Nonverbal communication, affective sharing, and intersubjectivity. Infant Behavior and Development. 1992;15:377–381.
  • Parladé MV, Iverson JM. The interplay between language, gesture, and affect during communicative transition: A dynamic systems approach. Developmental Psychology. 2011;47:820–833.
  • Petitto L. On the autonomy of language and gesture: Evidence from the acquisition of personal pronouns in American Sign Language. Cognition. 1987;27:1–52.
  • Scaife M, Bruner JS. The capacity for joint visual attention in the infant. Nature. 1975;253:265–266.
  • Searle J. Speech acts. Cambridge: Cambridge University Press; 1969.
  • Slobin D. Universal and particular in the acquisition of language. In: Wanner E, Gleitman LR, editors. Language acquisition: The state of the art. Cambridge: Cambridge University Press; 1982. pp. 128–170.
  • Slobin D. The crosslinguistic study of language acquisition. Volume 1: The data; Volume 2: Theoretical issues. Hillsdale, NJ: Lawrence Erlbaum Associates; 1985.
  • Snow D. Children’s imitations of intonation contours: are rising tones more difficult than falling tones? Journal of Speech, Language, and Hearing Research. 1998;41:576–587.
  • Snow D. Regression and reorganization of intonation between 6 and 23 months. Child Development. 2006;77:281–296.
  • Tomasello M. The role of joint attention processes in early language development. Language Sciences. 1988;10:69–88.
  • Tomasello M. Joint attention as social cognition. In: Moore C, Dunham PJ, editors. Joint attention: Its origin and role in development. Hillsdale, NJ: Erlbaum; 1995. pp. 103–130.
  • Tomasello M, Carpenter M, Liszkowski U. A new look at infant pointing. Child Development. 2007;78:705–722.
  • Trevarthen C, Aitkin KJ. Infant intersubjectivity: Research, theory, and clinical implications. Journal of Child Psychology and Psychiatry. 2001;42:3–48.
  • Trevarthen C, Hubley P. Secondary intersubjectivity: Confidence, confiding, and acts of meaning in the first year. In: Lock A, editor. Action, gesture, and symbol. London: Academic Press; 1978. pp. 183–229.
  • Vihman MM. The construction of a phonological system. In: de Boysson-Bardies B, de Schonen S, Jusczyk P, McNeilage P, Morton J, editors. Developmental neurocognition: Speech and face processing in the first year of life. Dordrecht: Kluwer; 1993. pp. 411–419.
  • Vihman MM. Phonological development: The origins of language in the child. Cambridge, MA: Blackwell; 1996.