Appl Psycholinguist. Author manuscript; available in PMC 2011 January 1.
Published in final edited form as:
Appl Psycholinguist. 2010 January 1; 31(1): 209–224.
doi:  10.1017/S0142716409990221
PMCID: PMC2854417

When speech is ambiguous gesture steps in: Sensitivity to discourse-pragmatic principles in early childhood


Young children produce gestures to disambiguate arguments. This study explores whether the gestures they produce are constrained by discourse-pragmatic principles: person and information status. We ask whether children use gesture more often to indicate the referents that have to be specified, i.e., 3rd person and new referents, than the referents that do not have to be specified, i.e., 1st/2nd person and given referents. Chinese- and English-speaking children were videotaped while interacting spontaneously with adults, and their speech and gestures were coded for referential expressions. We found that both groups of children tended to use nouns when indicating 3rd person and new referents but pronouns or null arguments when indicating 1st/2nd person and given referents. They also produced gestures more often when indicating 3rd person and new referents, particularly when those referents were ambiguously conveyed by less explicit referring expressions (pronouns, null arguments). Thus Chinese- and English-speaking children show sensitivity to discourse-pragmatic principles not only in speech but also in gesture.

Young children often underspecify their intended referents starting in the two-word stage and continuing until 4 to 5 years of age, whether or not their ambient language permits underspecification (Allen, 2000; Serratrice, 2005; Valian, 1991). For example, a child might say “φ eat cookies” (φ refers to the omitted eater) or “I like this one” (“this one” refers to a particular puzzle), even when it is not clear from the context who is doing the eating or what the child likes. However, children routinely gesture when they talk (Goldin-Meadow, 2003; McNeill, 1992, 2005), and might be using gesture to compensate for their underspecification in speech. The child could, for example, point to the girl munching on a cookie while saying “eat cookies,” or point to his favorite puzzle while saying “I like this one.”

Gesture and speech develop together during the early language learning period (e.g., Volterra, Caselli, Capirci, & Pizzuto, 2005). After age 2, children construct an integrated speech-gesture system as they acquire their language (Butcher & Goldin-Meadow, 2000; Mayberry & Nicoladis, 2000) and distribute information across speech and gesture modalities. They convey information in gesture that is not conveyed in speech; for example, when engaging in spontaneous conversations (Goldin-Meadow & Butcher, 2003; Iverson & Goldin-Meadow, 2005; Özçalışkan & Goldin-Meadow, 2005); when telling a story (Demir & So, 2006); or when solving a problem (Church & Goldin-Meadow, 1986; Goldin-Meadow, Alibali & Church, 1993; Goldin-Meadow, 2005).

The question we address here is whether children use their gestures to clarify a referent that is ambiguous in speech but ought to be (on discourse grounds) specified. Specifically, we ask whether the way young children gesture is constrained by the discourse-pragmatic rules underlying the language they are learning. We look, in particular, at two discourse-pragmatic features that determine whether a referent needs to be specified: person (1st/2nd vs. 3rd) and information status (given vs. new) (Clancy, 1993; Greenfield & Smith, 1976). A referent needs to be specified by overt arguments, like nouns, when it is a 3rd person (as opposed to 1st or 2nd person) or when it was not previously mentioned (i.e., new as opposed to given information). Imagine that a child has been talking to his mother about a puzzle and says, “I like this one.” It is perfectly clear which object the child likes in this context. Because the puzzle is given information, it does not need to be fully specified. Now consider a child who has not been talking about the puzzle and wants to tell his mother that he likes the puzzle. Because the puzzle is new information, it needs to be specified in order for the child to be fully informative. He could say, “I like this puzzle.” However, he could also use the less specified sentence along with a gesture; for example, “I like this,” said while pointing at the puzzle. If children's gestures are discourse appropriate (in this case, sensitive to the new/given status of the referent), the child should be more likely to point at the puzzle in the second scenario than in the first.

Do children use gestures more often to indicate referents that have to be specified than referents that can be underspecified or omitted? To address this question, we videotaped Chinese- and English-speaking children spontaneously interacting with an adult, and examined their sensitivity to 1st/2nd vs. 3rd person and to given vs. new information in their expression of referents in speech and gesture. English is a subject-prominent language (Bloom, 1990; Hyams & Wexler, 1993) and, as such, does not generally permit argument omission1. In contrast, Chinese is a null argument language that allows argument omission governed by discourse-pragmatic factors (Li & Thompson, 1979; Tsao, 1990). This grammatical difference between English and Chinese allows us to investigate whether children's use of gesture is sensitive to discourse-pragmatic principles. The following discourse example demonstrates the grammatical difference between English and Chinese.

Speaker A:  φ1  chi1fan1   le   mei2you3?
            φ1  eat lunch  ASP  not
            Have you eaten lunch?
Speaker B:  φ2  he2   Li3 xian1sheng4  chi1fan1
            φ2  with  Mr. Lee          eat lunch
            I have eaten lunch with Mr. Lee.
Speaker B:  φ3  ji4de     bu4  ji4de     ta1  le?
            φ3  remember  not  remember  him  INT
            Do you remember him?
Speaker A:  φ4  dang1ran2  ji4de     φ5  le
            φ4  of course  remember  φ5  INT
            Of course I remember him.

In Chinese, underspecified or omitted arguments are used to refer to entities that are retrievable from the discourse; nouns are used to refer to entities that are not retrievable (e.g., Huang, 1984, 1994; Lee & Naigles, 2005; Li & Thompson, 1979; Tsao, 1990). 1st and 2nd person are always active in discourse and are thus retrievable and can be omitted (e.g., φ1, φ2, φ3, φ4) (Chafe, 1994, 1996; Dimitriadis, 1995). In contrast, 3rd person is not always active in discourse; whether it is explicitly mentioned depends on whether it is new or given information. If the 3rd person referent was not previously mentioned (i.e., it is new information), it is not retrievable from discourse and should be expressed in an overt argument (e.g., Li3 xian1sheng4). If, however, the 3rd person referent was previously mentioned (i.e., it is given information), it can be omitted (e.g., φ5) or presented in a less explicit form such as a pronoun (e.g., ta1)2 (Chafe, 1994, 1996; Dimitriadis, 1995).

Unlike Chinese, English is relatively strict in terms of representing arguments and generally does not allow argument omission (see Chomsky, 1981). As a result, explicit forms must be used to indicate even the given information in the preceding example (i.e., 1st and 2nd person). However, as in Chinese, discourse factors do influence the choice of referential expression in English (e.g., Chafe, 1976, 1994, 1996; Huang, 1994, 1995; Levinson, 1987, 1991). For example, less explicit forms are typically used instead of more explicit forms for previously mentioned referents (I and you are used, rather than the speakers' names, in the preceding example). Thus, personal pronouns tend to be used in English in situations where omission is possible in Chinese.

In the study reported here, we expected to replicate previous findings in the literature and find that both Chinese- and English-speaking children would show sensitivity in their speech to the two discourse-pragmatic principles, person and information status, that underlie their respective languages. The question of interest is whether children also show sensitivity to these two discourse features in the gestures that they produce along with speech.



The participants were 6 English-speaking and 6 Mandarin Chinese-speaking children, living in Chicago, USA, and Nanjing, China, respectively, in middle-class homes. The English-speaking children were, on average, 4;1 (years;months) old, ranging from 3;7 to 5;2. The Chinese-speaking children were, on average, 3;11, ranging from 2;10 to 4;11.3 None of the children had any major sensory or hearing problems, and none of the children or caregivers knew a conventional sign language. Families were recruited from postings and were paid for their participation.


The children participated in free play activities and spontaneous conversations with their caregivers (mothers, fathers or grandparents) and an experimenter. Both caregivers and experimenter were instructed to interact naturally with the children. A bag of toys, books, pictures, and puzzles was brought to each taping session to facilitate communication. The session lasted for approximately 45 minutes for each child (ranging from 30 minutes to an hour, depending on the attention span of the child) and was videotaped.

Speech coding

All conversations between children and caregivers/experimenter were transcribed by research assistants who were native speakers of English or Mandarin-Chinese. All transcripts were then checked by a second coder who was also a native speaker. Breaths, pauses, and speech dysfluencies such as self-interruptions, self-corrections, and repetitions were included in the transcriptions. The stream of speech was segmented into utterances. Utterances that contained syntactic questions, imitations, unintelligible sounds, songs, or poems were excluded from the analyses.

Our unit of speech analysis was the clause. A clause is a grammatical unit that expresses propositions4, and includes a predicate5 (Crystal, 1980; Hartmann & Stork, 1972; Pei & Gaynor, 1954). We analyzed clauses containing predicates that described actions involving a subject and either a direct object (e.g., I eat an apple, wo3 chi1 ping2guo3) or indirect object (I go to school, wo3 qu1 xue3xiao1). Utterances containing more than one clause connected by a conjunction, for example, and (hai3you2), or but (bu2guo4), were separated into two clauses.

The following types of clauses were excluded from the database: (1) clauses that did not contain either a direct or indirect object, e.g., I go (wo3 qu4); (2) clauses containing copula verbs, e.g., is (shi1), because copulas are optional in Chinese; (3) ditransitive clauses containing both direct and indirect objects, e.g., I give a pen to you (wo3 ge2 ni3 bi3), because our goal was to compare clauses of equal syntactic complexity; (4) clauses with serial verb structure (applicable only in Chinese), e.g., wo3 na2 zhe4ge4 jiao1 ge2 ni3 (I hold this give to you), again to ensure equality in syntactic complexity; (5) clauses containing grammatical omission of subjects (applicable only in English), including imperatives (e.g., Open this door!); wanna questions (e.g., Wanna eat this?); implied first person declaratives in past tense (e.g., Got it!); and progressive participles in response to questions (e.g., Brushing teeth).

Thus, each clause contained a subject and either a direct or indirect object that could potentially be expressed in a complete description of the action. We identified the subject and object within each clause and assigned each to one of the following categories according to the form of the expression used to refer to it: null argument, personal pronoun, e.g., he, she, it (ta1)6, demonstrative pronoun, e.g., this, that (zhe4ge4, ne4ge4), or noun, e.g., cat, dog (mao1, gou2)7. Subjects and objects were also classified according to whether the referent was a 1st or 2nd vs. 3rd person. Referents were further classified according to information status: 1st and 2nd persons were assumed to be given information (Chafe, 1994, 1996); a 3rd person referent was considered given if it was mentioned somewhere in the preceding 20 utterances and new if it had not been mentioned (Chafe, 1987; Du Bois, 1987).
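The information-status rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the coding procedure the authors actually ran: the function name, the representation of the transcript as a list of sets of referent labels (one set per utterance), and the restriction to spoken mentions are all assumptions made for the example.

```python
from typing import List, Set


def information_status(person: int, referent: str,
                       prior_utterances: List[Set[str]],
                       window: int = 20) -> str:
    """Classify a referent as "given" or "new".

    person: 1, 2, or 3.
    prior_utterances: hypothetical representation of the transcript so far,
        one set of referent labels per earlier utterance, oldest first.
    window: how many preceding utterances to search (20 in the text).
    """
    # 1st and 2nd person referents are assumed to be given information.
    if person in (1, 2):
        return "given"
    # A 3rd person referent is given only if it was mentioned within the window.
    recent = prior_utterances[-window:] if window > 0 else []
    mentioned = any(referent in utterance for utterance in recent)
    return "given" if mentioned else "new"
```

Under this sketch, a puzzle mentioned 20 utterances ago still counts as given, while one mentioned 21 utterances ago counts as new.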

Gesture coding

We analyzed the gestures that co-occurred with the relevant clauses and determined whether the subject and object within each clause were identified in gesture. We followed Goldin-Meadow and Mylander (1984; see also Iverson & Goldin-Meadow, 2005, and Özçalışkan & Goldin-Meadow, 2005) in excluding hand movements that involved direct manipulation of an object (e.g., placing a toy on a floor) or were part of a ritualized game (e.g., putting a puzzle in a puzzle slot) from the database. Gestures were of three types: (1) object-referring iconic gestures8 that bear a resemblance to the referents they represent (e.g., two hands flapped at shoulders, classified as a reference to bird); (2) pointing gestures that refer to objects, people, or places by singling out the referent (e.g., index finger point to a bottle, classified as a reference to bottle); and (3) hold-up gestures that refer to objects by raising them in the air (e.g., hold-up bottle, classified as a reference to bottle). The purpose of a hold-up gesture is not to manipulate the object, but to draw the interlocutor's attention to the object (Gullberg, de Bot, & Volterra, 2008).

Each gesture was then assigned a semantic meaning. The semantic meaning of a gesture was determined by its form in conjunction with the speech in the clause with which it occurred. For example, two hands flapping at shoulders produced in conjunction with the clause, “The bird eats a worm,” was assumed to refer to the bird. If the gesture was not accompanied by a clause containing a word that expressed its referent, context or form was used to determine the gesture's meaning. The meaning of a point or a hold-up gesture depended on the context of interpretation; for example, a point at a puzzle was assumed to refer to the puzzle. The meaning of an iconic gesture depended on its form; for example, a curved palm moving toward the mouth was assumed to refer either to a glass or to the action of drinking.9

The proportion of referents conveyed in gesture was calculated as the total number of referents conveyed in gesture, divided by the total number of referents conveyed in speech and/or gesture. All proportions were subjected to an arcsine transformation before statistical analysis.
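As a rough illustration of this measure, the sketch below computes the proportion and applies an arcsine transformation. The text does not say which variant of the transformation was used; the common form asin(sqrt(p)) is assumed here, and the function names are hypothetical.

```python
import math


def gesture_proportion(n_in_gesture: int, n_total: int) -> float:
    """Referents conveyed in gesture divided by referents conveyed in
    speech and/or gesture."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_in_gesture <= n_total:
        raise ValueError("n_in_gesture must lie between 0 and n_total")
    return n_in_gesture / n_total


def arcsine_transform(p: float) -> float:
    """Arcsine square-root transform (assumed variant), in radians."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    return math.asin(math.sqrt(p))
```

The transform stretches proportions near 0 and 1, which is why it is often applied before an ANOVA on proportion data.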


A subset of each transcript (20%) in English and Chinese was independently coded by a second research assistant, who was a bilingual speaker of English and Chinese and was trained to code speech and gesture. Reliability was 98% for the English-speaking children (N=120) and 97% for the Chinese-speaking children (N=140) for identifying target clauses; 100% for the English-speaking children (N=236) and 100% for the Chinese-speaking children (N=262) for classifying references to subjects and objects according to speech form (noun, pronoun, etc.); 100% for the English-speaking children (N=236) and 100% for the Chinese-speaking children (N=262) for determining 1st, 2nd, and 3rd person status of the referents; 90% for the English-speaking children (N=236) and 93% for the Chinese-speaking children (N=262) for determining information status of the referents; 85% for the English-speaking children (N=236) and 84% for the Chinese-speaking children (N=262) for identifying gestures; 95% for the English-speaking children (N=201) and 90% for the Chinese-speaking children (N=220) for determining types of gestures; and 92% for the English-speaking children (N=201) and 88% for the Chinese-speaking children (N=220) for identifying the semantic meaning of gestures.
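The reliability figures above read as simple percent agreement between two coders, although the text does not name the statistic. Under that assumption, a minimal sketch of the computation might look as follows; the function name and the example labels are hypothetical.

```python
from typing import Hashable, Sequence


def percent_agreement(coder_a: Sequence[Hashable],
                      coder_b: Sequence[Hashable]) -> float:
    """Percentage of items to which two coders assigned the same code.

    Assumes both sequences list the codes for the same items in the same order.
    """
    if len(coder_a) != len(coder_b) or len(coder_a) == 0:
        raise ValueError("both coders must code the same non-empty set of items")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)
```

Percent agreement does not correct for chance agreement; a chance-corrected index would need the full table of codes, which the text does not report.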


We analyzed all clauses containing predicates that described actions involving both subjects and either direct or indirect objects10. There were no significant differences in the number of clauses produced by the English-speaking children (M=80.67, SD=41.01) and the Chinese-speaking children (M=120, SD=43.16), t(10)=1.62, ns. However, the Chinese-speaking children produced gestures for a greater proportion of the referents they indicated (M=.18, SD=.05, ranging from .13 to .24) than the English-speaking children (M=.08, SD=.04, ranging from .04 to .15), t(10)=3.26, p<.009.
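The clause-count comparison can be checked from the reported summary statistics alone. The sketch below assumes an independent-samples t test with pooled variance and 6 children per group (consistent with the reported 10 degrees of freedom); the function name is hypothetical. It is not applied to the gesture proportions, because those tests were run on arcsine-transformed values that the text does not report.

```python
import math
from typing import Tuple


def pooled_t(m1: float, sd1: float, n1: int,
             m2: float, sd2: float, n2: int) -> Tuple[float, int]:
    """Independent-samples t statistic (pooled variance) from summary stats."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


# Clauses per child: Chinese-speaking vs. English-speaking group.
t, df = pooled_t(120, 43.16, 6, 80.67, 41.01, 6)
print(f"t({df}) = {t:.2f}")  # prints "t(10) = 1.62"
```

The result matches the reported t(10)=1.62 for the number of clauses.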

The goal of this study was to explore the role of two discourse-pragmatic features, person and information status, in referential expressions in speech and gesture. We look first at how discourse-pragmatic features influenced lexical choices in speech and we then turn to gesture.


We first examined how subjects and objects were expressed in speech. Figure 1 shows the distribution of lexical choices in speech in the English- and Chinese-speaking children. We conducted a repeated-measures ANOVA with proportion of arguments expressed as the dependent variable, type of referential expression (null, personal pronoun, demonstrative pronoun, noun) as a within-subject independent variable, and language (English, Chinese) as a between-subject independent variable. We found a significant effect of referential expression, F(3, 30)=12.18, p<.001, a marginal effect of language, F(1, 10)=3.62, p<.08, and a significant interaction, F(3, 30)=9.82, p<.001. As expected, the Chinese-speaking children produced null arguments (i.e., they omitted arguments) more often than the English-speaking children, t(10)=8.94, p<.001. The English-speaking children produced personal pronouns more often than the Chinese-speaking children, t(10)=4.46, p<.001. In fact, the proportion of null arguments that the Chinese-speaking children produced was similar to the proportion of personal pronouns that the English-speaking children produced. No significant differences were found between the groups in demonstrative pronouns, t(10)=.64, ns, or full nouns, t(10)=1.58, ns.

Figure 1
Distribution of different forms of referential expressions in speech produced by Chinese-speaking and English-speaking children.

Thus, children in the two language groups did not differ in how often they produced explicitly specified referents––they used nouns equally often. They differed only in the type of less specified expressions they used––English-speaking children used personal pronouns as their preferred form, Chinese-speaking children used null arguments. The two groups of children had learned to use the less specified term appropriate to the language each was acquiring. Because null arguments, personal pronouns, and demonstrative pronouns are all less explicit than nouns, we grouped them together into a non-noun category in the following analyses.

We next ask whether person and information status affect the explicitness of the children's referential expressions. We classified referents into three types: (1) 1st/2nd person (which were assumed to be given); (2) 3rd person given; (3) 3rd person new. We found that .37 (SD=.07) of the referents that the Chinese-speaking children produced were 1st/2nd person, .31 (SD=.09) were 3rd person given, and .32 (SD=.09) were 3rd person new; comparable numbers for the English-speaking children were .36 (SD=.07), .31 (SD=.03), and .33 (SD=.06).

We expected that the children would be sensitive to discourse-pragmatic factors. We hypothesized that, of the three categories, 3rd person referents that were new to the context would be the least known to a listener and thus ought to be explicitly specified more often than 3rd person referents that were given, followed by 1st/2nd person referents. Figure 2 presents the proportion of nouns and non-nouns that the children used in each referential category.

Figure 2
Distribution of nouns and non-nouns for 1st/2nd person, 3rd person given, and 3rd person new referents produced by Chinese-speaking children and English-speaking children.

We conducted a repeated-measures ANOVA with the proportion of non-nouns that the children produced as the dependent variable, referential category (1st/2nd person, 3rd person given, and 3rd person new) as a within-subject independent variable, and language (Chinese, English) as a between-subject independent variable. We found a significant effect of referential category, F(2,20)=121.50, p<.0001, no effect of language, F(1,10)=.54, ns, and no interaction, F(2,20)=.10, ns. Bonferroni-adjusted pairwise comparisons showed that children in both language groups produced non-nouns more often when referring to 1st/2nd person than when referring to 3rd person given, p<.0001, and 3rd person new, p<.0001. They also produced non-nouns more often when referring to 3rd person given than when referring to 3rd person new, p=.001.

Thus, children in both language groups tended to use less specified forms (pronouns, null arguments) for referents that did not need to be specified (1st/2nd person, 3rd person given). Importantly, null arguments were used in the same way in Chinese-speaking children as personal pronouns in English-speaking children: 61% (SD=.10) of the null arguments that the Chinese-speaking children used referred to 1st/2nd person and 3rd person given referents, referents that did not need to be specified. Similarly, 67% (SD=.15) of the personal pronouns that the English-speaking children produced referred to 1st/2nd person and 3rd person given referents.

Children in both language groups tended to use nouns when they needed to, that is, when discourse required that the referents be specified (3rd person new referents). Still, 40% of the 3rd person new referents that they produced were conveyed by non-nouns and thus were underspecified. Our next question was whether the children used gesture to help disambiguate these underspecified forms.


Children in both groups produced all three types of gestures and in roughly the same proportions––pointing (M=.65, SD=.11), hold-up (M=.30, SD=.12), iconic (M=.05, SD=.06) in the Chinese-speaking children; pointing (M=.53, SD=.29), hold-up (M=.37, SD=.32), iconic (M=.10, SD=.12) in the English-speaking children.

Figure 3 displays the proportion of expressions indicating 1st/2nd person, 3rd person given, and 3rd person new referents that were accompanied by gesture in the two groups of children. We conducted a repeated-measures ANOVA with proportion of expressions accompanied by gesture as the dependent variable, referential category (1st/2nd person, 3rd person given, 3rd person new) as a within-subject independent variable, and language (English, Chinese) as a between-subject independent variable. We found a significant effect of referential category, F(2,20)=34.68, p<.0001, a significant effect of language, F(1,10)=60.50, p<.0001, and no interaction, F(2,20)=2.79, ns. Overall, Chinese-speaking children produced gestures more often than English-speaking children, p<.0001 (perhaps because Chinese caregivers produce more gestures when interacting with their children than American caregivers, Goldin-Meadow & Saltzman, 2000). However, both groups of children produced gestures more often when indicating 3rd person new referents than when indicating 3rd person given referents, p=.024, and 1st/2nd person referents, p<.0001. They also produced gestures more often when indicating 3rd person given referents than when referring to 1st/2nd person referents, p=.002.11

Figure 3
Proportion of 1st/2nd person, 3rd person given, 3rd person new referents accompanied by gesture in Chinese-speaking children (left) and English-speaking children (right).

We are now able to address our final question––are gesture and speech working as an integrated system to specify referents? We focused on referents that need to be specified (i.e., new 3rd persons), which were, in fact, frequently accompanied by gesture (see Figure 3). We asked whether non-nouns were accompanied by gesture more often than nouns––the pattern we would expect if the children were using their gestures to adjust for the fact that non-nouns are underspecified relative to nouns.

Figure 4 presents the proportion of 3rd person new referents conveyed by nouns or non-nouns (null arguments, pronouns) that were accompanied by gestures. We conducted a repeated-measures ANOVA with proportion of referential expressions accompanied by gesture as the dependent variable, type of referential expression (non-noun, noun) as a within-subject independent variable, and language (Chinese, English) as a between-subject independent variable. We found a significant effect of type of referential expression, F(1,10)=20.28, p=.001, a significant effect of language, F(1,10)=23.33, p=.001, and no interaction, F(1,10)=1.22, ns. Not surprisingly given the earlier analyses, Chinese-speaking children produced gestures proportionally more often than English-speaking children. More importantly, when indicating 3rd person new referents, both groups of children produced gestures more often with non-nouns than with nouns, as we would expect if gesture is being used to further specify underspecified referents.

Figure 4
Proportion of 3rd person new referents conveyed by nouns (black bars) and non-nouns (null arguments, pronouns, white bars) that were accompanied by gesture in Chinese-speaking children (left) and English-speaking children (right).
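The per-child measure behind this comparison can be sketched as follows. The record layout (dictionaries with `category`, `form`, and `gesture` keys), the category labels, and the function name are all assumptions made for illustration; they are not the authors' coding format.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping

# Forms grouped as less explicit than nouns, following the text.
NON_NOUN_FORMS = {"null", "personal_pronoun", "demonstrative_pronoun"}


def gesture_rate_by_form(referents: Iterable[Mapping]) -> Dict[str, float]:
    """Proportion of 3rd person new referents accompanied by gesture,
    computed separately for nouns and non-nouns.

    Each referent is a hypothetical record with keys:
      "category": "1st_2nd", "3rd_given", or "3rd_new"
      "form":     "noun" or one of NON_NOUN_FORMS
      "gesture":  True if a gesture identified the referent
    """
    counts = defaultdict(lambda: [0, 0])  # form group -> [with gesture, total]
    for ref in referents:
        if ref["category"] != "3rd_new":
            continue  # only referents that need to be specified
        group = "non-noun" if ref["form"] in NON_NOUN_FORMS else "noun"
        counts[group][0] += int(ref["gesture"])
        counts[group][1] += 1
    return {group: g / n for group, (g, n) in counts.items()}
```

Computing these two rates for each child gives the values that would enter the noun vs. non-noun comparison, after transformation.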

To summarize, both Chinese- and English-speaking children produced gestures more often to indicate referents that should be specified in discourse, particularly when those referents were conveyed by potentially ambiguous words.


Our study explored whether English- and Chinese-speaking children display sensitivity to two discourse-pragmatic features––person and information status––in speech and in gesture. In terms of speech, we found that, despite the fact that Chinese is a null argument language and English is a subject-prominent language, Chinese-speaking and English-speaking children produced fully specified referring expressions (i.e., nouns) required by the discourse equally often. However, the children displayed sensitivity to the discourse requirements of their respective languages in the underspecified expressions they produced––Chinese-speaking children omitted arguments for their underspecified forms, whereas English-speaking children produced pronouns. Importantly, both groups of children produced their underspecified forms in appropriate discourse contexts––more often for referents that did not need to be fully specified (1st/2nd person and 3rd person given referents) than for referents that did need to be specified (3rd person new referents). Thus, irrespective of the language they were learning, children produced fully specified forms (nouns) more often when expressing referents that needed to be specified than when expressing referents that did not need to be specified (3rd person new referents > 3rd person given referents > 1st/2nd person referents).

Previous research has found that children learning null argument languages are particularly sensitive to discourse-pragmatic features (e.g., Allen, 2000; Allen & Schroder, 2003; Serratrice, 2005). Italian- and Inuktitut-speaking children tend to use overt arguments to indicate 3rd person and new referents but null arguments to indicate 1st or 2nd person and given referents. Similar findings have been reported in other languages (Korean: Clancy, 1993; Hindi: Narasimhan, Budwig & Murty, 2005; Romance: Paradis & Navarro, 2003). Our study builds on this research by extending the phenomenon to Chinese, another null argument language. Interestingly, children who are learning a subject-prominent language, like English, appear to be as sensitive to discourse-pragmatic principles as children learning Chinese, a null argument language, suggesting that sensitivity to discourse is universal across language learners.

Importantly, we found that the children also displayed sensitivity to discourse in their gestures, producing precisely the same pattern as they displayed in speech. They produced gestures for referents that needed to be specified more often than they produced gestures for referents that did not need to be specified (3rd person new referents > 3rd person given referents > 1st/2nd person referents). Thus, the children were paying attention to discourse-pragmatic information when deciding when to use gesture.

Moreover, when children used underspecified forms in their speech to refer to referents that needed to be specified (i.e., when they used null arguments and pronouns, as opposed to nouns, to refer to 3rd person new referents), the children produced gestures along with these non-nouns. In other words, gesture stepped in to clarify potentially ambiguous speech and did so equally often in children learning English and Chinese (Allen, 2008; but see Guerriero, Oshima-Takane & Kuriyama, 2006, for a comparison of children learning English and Japanese).

How do young children develop sensitivity to discourse-pragmatic principles? Previous work has found that caregivers use discourse-pragmatic strategies when talking to their children (Clancy, 1993; Paradis & Navarro, 2003). However, few studies have investigated the relation between parental input and the development of children's sensitivity to discourse-pragmatic features. One exception is a longitudinal study by Guerriero, Oshima-Takane, and Kuriyama (2006). They followed English- and Japanese-speaking children for more than a year, observing conversations between the children and their parents. The English-speaking parents showed consistent language-specific discourse patterns in their referential expressions in speech, but the Japanese-speaking parents did not. In turn, the English-speaking children developed discourse-pragmatic strategies earlier than the Japanese-speaking children. These findings suggest that children may learn about discourse-pragmatic features in their language from their parents' speech.

But unlike the children in our study, adults do not appear to routinely use gesture to clarify potentially ambiguous referring expressions. So, Kita and Goldin-Meadow (in press) showed English-speaking adults vignettes of two stories and asked them to retell the stories to an experimenter. Since none of the protagonists or objects in the story was present, the adults could not use points at real-world objects to indicate referents in the story. However, they did use points at space to indicate particular referents. A gesture was considered to identify a referent if it was produced in the same location as the previous gesture for that referent. The adults frequently used gesture location to identify referents. However, they used gesture to identify referents that were already specified in speech, and not to clarify referents that were ambiguous in speech (even though they produced a number of expressions that did not fully specify the referent). In other words, the adults did not use gesture to disambiguate speech, as the children in our study did.

Note, however, that there are many differences between the So et al. (in press) study and ours. First, the participants in So et al. were telling stories; our participants were engaged in spontaneous conversation. The second difference, which follows from the first, is that the participants in So et al. were pointing at empty spaces, which were used to stand for particular referents; our participants were pointing at real objects and people in the room. Finally, the participants in So et al. were adults; our participants were children. The differences between adults and children may stem from the different types of discourse examined in the two studies (displaced story telling vs. here-and-now conversation). Thus, it is possible that adults do use gesture to disambiguate their underspecified speech when they engage in spontaneous conversations where points can be directed at real-world objects.

Alternatively, using gesture to disambiguate underspecified speech may be a characteristic of early childhood, one that disappears as children become more proficient speakers. Under this view, children use gesture differently from adults simply because they have not yet fully mastered lexical specification in speech. This phenomenon would then be another instance of gesture preceding, and perhaps propelling, advances in speech (Iverson & Goldin-Meadow, 2005) and in other cognitive tasks (Goldin-Meadow, 2003; Goldin-Meadow, Alibali & Church, 1993). Comparable data from adults who speak null argument and subject-prominent languages and who are engaged in here-and-now conversation are needed to fully test this view.

Our findings also have clinical implications. Children with language impairments often have difficulty producing sentences with complex argument structure (Grela, 2003). In line with our findings, these children might be able to use gesture to specify referents that they are not able to specify in speech. Indeed, children whose language development is impaired for a variety of reasons (focal brain injury: Sauer, Levine & Goldin-Meadow, 2009; Down syndrome: Caselli, Vicari, Longobardi, Lami, Pizzoli, & Stella, 1998; Stefanini, Caselli, & Volterra, 2007; Stefanini, Recchia, & Caselli, 2008; Specific Language Impairment: Evans, Alibali & McNeill, 2001; Fex & Mansson, 1998) have been shown to use gestures to compensate for their communicative deficiencies.

To summarize, we have found that children in the early stages of language learning use speech and gesture to identify referents and do so in accordance with discourse-pragmatic principles. They use nouns and gestures more often when indicating referents that need to be specified than when indicating referents that can be inferred from context. Moreover, when speech is less specific than it needs to be, gesture is often used to fill the breach, whether the child is learning English or Chinese.


This research was supported by the Provost Research Funding R-581-000-074-133 to W.C. So at National University of Singapore and NIH RO1 00491 to S. Goldin-Meadow at the University of Chicago. We also thank Wenping Xue, Stacy Steine, Zachary Mitchell and Carolyn Mylander for help in collecting data in Nanjing, China, and Chicago, USA; Lim Jia Yi, Tee Can Shou Joseph, Lee Yingqi, Tan Wenlin, Elizabeth Sarah Ragen, Chew Xin Ying Ivane, and Kirrthana Krishnamoorthy for help in coding the data. The coding system for parts of speech and gesture was established in the first author's dissertation.


1. Under some circumstances, argument omission is permitted in English; see examples in the Method section.

2. In this example, ta1 is more appropriate than a null argument to identify Mr. Lee. A null argument might be mistakenly understood as the event of Speaker B having lunch with Mr. Lee. Note, however, that we did not aim to study the differences between pronouns and null arguments in Chinese in the present study. Both pronouns and null arguments were considered less explicit forms of referential expressions.

3. The English-speaking children were somewhat older than the Chinese-speaking children. In order to be certain that age was not responsible for any differences found between the groups, we redid all of the analyses on 3 Chinese- and 3 English-speakers matched for age. We found that the patterns in this matched sample were identical to those reported below.

4. A proposition is the meaning content of units within the clause.

5. A predicate is the portion of a clause, excluding the subject, that expresses something about the subject.

6. The pronunciation of the pronouns referring to animate and inanimate entities is the same, i.e., ta1.

7. Any combination of a demonstrative pronoun and a noun was assigned to the noun category.

8. Iconic gestures are also known as characterizing (Goldin-Meadow & Mylander, 1984) or representational (Gullberg, de Bot, & Volterra, 2008) gestures.

9. The children produced very few iconic gestures overall: 5% of the English-speaking children's gestures and 10% of the Chinese-speaking children's gestures were iconic.

10. Following the criteria described in the speech coding section, we excluded 461 clauses in the English-speaking children and 684 clauses in the Chinese-speaking children.

11. As in our previous analyses, Chinese-speaking children used null arguments in the same way as English-speaking children used personal pronouns. The Chinese-speaking children produced more gestures with their null arguments for referents that needed to be specified (3rd person new, M=.45, SD=.17) than for referents that did not need to be specified (3rd person given, M=.08, SD=.07; 1st/2nd person, M=.01, SD=.03). English-speaking children showed precisely the same pattern for personal pronouns: 3rd person new, M=.26 (SD=.27) vs. 3rd person given, M=.04 (SD=.07), and 1st/2nd person, M=.01 (SD=.01).


  • Allen SEM. A discourse-pragmatic explanation for argument representation in child Inuktitut. Linguistics. 2000;38(3):483–521.
  • Allen S, Schroder H. Preferred argument structure in early Inuktitut spontaneous speech data. In: Du Bois JW, Kumpf L, Ashby W, editors. Preferred argument structure: Grammar and architecture for function. Amsterdam: John Benjamins; 2003.
  • Allen SEM. Interacting pragmatic influences on children's argument realization. In: Bowerman M, Brown P, editors. Crosslinguistic perspectives on argument structure: Implications for learnability. Mahwah, NJ: Erlbaum; 2007. pp. 191–210.
  • Bloom P. Subjectless sentences in child language. Linguistic Inquiry. 1990;24:721–734.
  • Bosch P. Agreement and anaphora: A study of the roles of pronouns in discourse and syntax. London: Academic Press; 1983.
  • Butcher C, Goldin-Meadow S. Gesture and the transition from one- to two-word speech: When hand and mouth come together. In: McNeill D, editor. Language and Gesture. Cambridge: Cambridge University Press; 2000. pp. 235–257.
  • Caselli MC, Vicari S, Longobardi E, Lami L, Pizzoli C, Stella G. Gestures and words in early development of children with Down Syndrome. Journal of Speech, Language and Hearing Research. 1998;41:1125–1135. [PubMed]
  • Chafe WL. Givenness, contrastiveness, definiteness, subjects, topics, and point of view. In: Li CN, editor. Subject and Topic. New York: Academic Press; 1976. pp. 25–55.
  • Chafe WL. Discourse, consciousness, and time: the flow and displacement of conscious experience in speaking and writing. Chicago: The University of Chicago Press; 1994.
  • Chafe WL. Inferring identifiability and accessibility. In: Fretheim T, Gundel J, editors. Reference and referent accessibility. Amsterdam: John Benjamins; 1996. pp. 37–46.
  • Clancy PM. Preferred argument structure in Korean acquisition. In: Clark EV, editor. Proceedings of the twenty-fifth annual Child Language Research Forum. Stanford, CA: CSLI Publications; 1993.
  • Chomsky N. Lectures on government and binding. Dordrecht, the Netherlands: Foris; 1981.
  • Church RB, Goldin-Meadow S. The mismatch between gesture and speech as an index of transitional knowledge. Cognition. 1986;23:43–71. [PubMed]
  • Crystal D. A first dictionary of linguistics and phonetics. Boulder, CO: Westview; 1980.
  • Demir E, So WC. What's hidden in the hands? How children use gesture to convey arguments in a motion event. In: Brugos A, Clark-Cotton MR, Ha S, editors. Proceedings of the 31th Annual Boston University Conference on Language Development; Somerville, MA: Cascadilla Press; 2006. pp. 172–183.
  • Dimitriadis A. When pro-drop languages don't: On overt pronominal subjects in Greek. Penn Working Papers in Linguistics. 1995;2:45–60.
  • Du Bois JW. The discourse basis of ergativity. Language. 1987;63:805–855.
  • Evans J, Alibali M, McNeill NM. Divergence of verbal expression and embodied knowledge: Evidence from speech and gesture in children with specific language impairment. Language and Cognitive Processes. 2001;16(2):309–331.
  • Fex B, Mansson AC. The use of gestures as a compensatory strategy in adults with acquired aphasia compared to children with specific language impairment (SLI). Journal of Neurolinguistics. 1998;11:191–206.
  • Fox B. Discourse structure and anaphora. Cambridge: Cambridge University Press; 1987.
  • Garrod S. Anaphora resolution. In: Smelser NJ, Baltes PB, editors. International encyclopedia of the social and behavioral sciences. Amsterdam: Elsevier; 2001. pp. 490–494.
  • Givon T. Universals of discourse structure and second language acquisition. In: Rutherford WE, editor. Language universals and second language acquisition. Amsterdam: Benjamins; 1984. pp. 109–136.
  • Goldin-Meadow S, Alibali MW, Church RB. Transitions in concept acquisition: Using the hand to read the mind. Psychological Review. 1993;100(2):279–297. [PubMed]
  • Goldin-Meadow S, Butcher C. Pointing toward two-word speech in young children. In: Kita S, editor. Pointing: Where language, culture, and cognition meet. Mahwah, NJ: Erlbaum; 2003. pp. 85–107.
  • Goldin-Meadow S, Saltzman J. The cultural bounds of maternal accommodation: How Chinese and American mothers communicate with deaf and hearing children. Psychological Science. 2000;11:311–318. [PubMed]
  • Goldin-Meadow S, Mylander C. Gestural communication in deaf children: The effects and non-effects of parental input on early language development. Monographs of the Society for Research in Child Development. 1984;49(3) no.207. [PubMed]
  • Goldin-Meadow S. Hearing gesture: How our hands help us think. Cambridge, MA: Harvard University Press; 2003.
  • Goldin-Meadow S. The two faces of gesture: Language and thought. Gesture. 2005;5:239–255.
  • Grela BG. Production based theories may account for subject omission in normal children and children with SLI. Journal of Speech-Language Pathology and Audiology. 2003;27:221–228.
  • Gullberg M, de Bot K, Volterra V. Gestures and some key issues in the study of language development. Gesture. 2008;8(2):149–179.
  • Guerriero AMS, Oshima-Takane Y, Kuriyama Y. The development of referential choice in English and Japanese: a discourse-pragmatic perspective. Journal of Child Language. 2006;33:823–857. [PubMed]
  • Gundel JK, Hedberg N, Zacharski R. Cognitive status and the form of referring expressions in discourse. Language. 1993;69:274–307.
  • Hartmann RRK, Stork FC. Dictionary of language and linguistics. London: Applied Science; 1972.
  • Huang CTJ. On the distribution and reference of empty pronouns. Linguistic Inquiry. 1984;15:531–574.
  • Huang CTJ. On null subjects and null objects in generative grammar. Linguistics. 1994;33:1081–1123.
  • Hyams N, Wexler K. On the grammatical basis of null subjects in child language. Linguistic Inquiry. 1993;24(3):421–459.
  • Iverson JM, Goldin-Meadow S. Gesture paves the way for language development. Psychological Science. 2005;16:367–371. [PubMed]
  • Kendon A. Gesture and speech: How they interact. In: Weimann JM, Harrison RP, editors. Nonverbal interaction. Beverly Hills, CA: Sage; 1983.
  • Lee J, Naigles LR. Input to verb learning in Mandarin Chinese: A role for syntactic bootstrapping. Developmental Psychology. 2005;41:529–540. [PubMed]
  • Levinson S. Pragmatic reduction of the binding conditions revisited. Journal of Linguistics. 1991;27:107–161.
  • Levinson SC. Pragmatics and the grammar of anaphora: A partial pragmatic reduction of binding and control phenomena. Journal of Linguistics. 1987;23:379–434.
  • Li CN, Thompson S. Third-person pronouns and zero-anaphora in Chinese discourse. In: Givon T, editor. Syntax and semantics: Discourse and syntax. Vol. 12. New York: Academic; 1979. pp. 311–335.
  • Lyons J. Semantics 2. Cambridge: Cambridge University Press; 1977.
  • Mayberry R, Nicoladis E. Gesture reflects language development: Evidence from bilingual children. Current Directions in Psychological Science. 2000;9(6):192–196.
  • McNeill D. Hand and Mind: What Gestures Reveal about Thought. Chicago: University of Chicago Press; 1992.
  • McNeill D. Gesture and Thought. The University of Chicago Press; 2005.
  • Narasimhan B, Budwig N, Murty L. Argument realization in Hindi caregiver-child discourse. Journal of Pragmatics. 2005;37:461–495.
  • Özçalişkan S, Goldin-Meadow S. Gesture is at the cutting edge of early language development. Cognition. 2005;96:B101–113. [PubMed]
  • Paradis J, Navarro S. Subject realization and crosslinguistic interference in the bilingual acquisition of Spanish and English: what is the role of input? Journal of Child Language. 2003;30:371–393. [PubMed]
  • Pei MA, Gaynor F. A dictionary of linguistics. New York: Philosophical Library; 1954.
  • Sauer E, Levine SC, Goldin-Meadow S. Early gesture predicts language delay in children with pre- and perinatal brain lesions. 2009 under review. [PMC free article] [PubMed]
  • Serratrice L. Syntax and pragmatics in the acquisition of subjects in Italian. Paper presented at the Ninth meeting of the International association for the Study of Child Language; University of Madison; 2002.
  • Serratrice L. The role of discourse pragmatics in the acquisition of subjects in Italian. Applied Psycholinguistics. 2005;26:437–462.
  • So WC, Kita S, Goldin-Meadow S. Using the hands to identify who does what to whom: speech and gesture go hand-in-hand. Cognitive Science in press. [PMC free article] [PubMed]
  • Stefanini S, Caselli MC, Volterra V. Spoken and gestural production in a naming task by young children with Down Syndrome. Brain and Language. 2007;101(3):208–221. [PubMed]
  • Stefanini S, Recchia M, Caselli MC. The relationship between spontaneous gesture production and spoken lexical ability in children with Down syndrome in a naming task. Gesture. 2008;8(2):197–218.
  • Tomasello M, Anselmi D, Farrar MJ. Young children's coordination of gestural and linguistic reference. First Language. 1985;5:199–210.
  • Tsao FF. Sentence and clause structure in Chinese: a functional perspective. Taipei: Student Book; 1990.
  • Valian V. Syntactic subjects in the early speech of American and Italian children. Cognition. 1991;40:21–81. [PubMed]
  • Volterra V, Caselli MC, Capirci O, Pizzuto E. Gesture and the emergence and development of language. In: Tomasello M, Slobin DI, editors. Beyond nature-nurture: Essays in honor of Elizabeth Bates. Mahwah, NJ: Erlbaum; 2005. pp. 3–40.
  • Wilcox MJ, Howse P. Children's use of gestural and verbal behavior in communicative misunderstandings. Applied Psycholinguistics. 1982;3:15–27.