HHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal
J Exp Child Psychol. Author manuscript; available in PMC 2010 May 6.
Published in final edited form as:
PMCID: PMC2865246

Toddlers’ referential understanding of pictures


Pictures are referential in that they can represent objects in the real world. Here we explore the emergence of understanding of the referential potential of pictures in the second year of life. In Study 1, 15-, 18-, and 24-month-old children learned a word for a picture of a novel object (e.g., “blicket”) in the context of a picture-book interaction. Later they were presented with the picture of a blicket along with the real object it depicted and asked to indicate “a blicket.” Many of the 24- and 18-month-olds, and even the 15-month-olds, indicated the real object as an instance of a “blicket,” consistent with an understanding of the referential relation between pictures and objects. In Study 2, children were tested with an exemplar object that differed in color from the depicted object to determine if they would extend the label they had learned for the depicted object to a slightly different category member. The 15-, 18- and 24-month-old participants failed to make a consistent referential response. The results are discussed in terms of whether pictorial understanding at this age is associative or symbolic.

Pictures are among the most common symbols infants and young children are exposed to early in life. The majority of children in western societies regularly encounter pictures in children’s books, family albums, magazines, and so on. Previous research establishes that by 30 months of age children are able to use pictures referentially, as symbols for and sources of information about the world (DeLoache & Burns, 1994), but little is known about when in development this capacity first emerges, and what limitations might accompany it.

There is abundant evidence that children perceive the similarity between pictures and their referents very early in life. For instance, infants as young as 3 months can recognize their mother’s face in color photographs (Barrera & Maurer, 1981; de Schonen & Mathivet, 1990), and by 5 months they can detect similarities between and also discriminate between two- and three-dimensional stimuli (DeLoache, Strauss, & Maynard, 1979; Dirks & Gibson, 1977; Rose, 1977; Slater, Rose, & Morison, 1984).

There is also abundant evidence suggesting a lack of appreciation of the symbolic nature and use of pictures in the first two years. Perner (1991) reported that his 16-month-old son attempted to step into a picture of a shoe. This kind of behavior towards pictures suggests that at this age young children treat pictures as "things-of-action" rather than as "objects-of-contemplation" (Werner & Kaplan, 1964). This claim has now been experimentally documented in several studies (Callaghan, Rochat, MacGillivray, & MacLellan, 2003, 2004; DeLoache, Pierroutsakos, Uttal, Rosengren, & Gottlieb, 1998; Murphy, 1978; Pierroutsakos & DeLoache, 2003; Yonas, Granrud, Chov, & Alexander, 2005). When young infants are presented with a highly realistic color photograph of an object, they touch, rub, pat, and scratch at the depicted object and sometimes even grasp at it (DeLoache et al., 1998). Children’s manual exploration of pictures decreases from 9 to 18 months (with the largest decrement between 9 and 15 months), as it is replaced by referential behaviors, such as pointing and labeling. The decline in manual behaviors toward pictures may reflect a beginning appreciation that pictures are representations for things other than themselves (DeLoache et al., 1998). It is also possible that at this early stage children have simply firmed up the distinction between the behavioral affordances of 2D and 3D entities, but do not yet take pictures as symbols.

By preschool age children clearly take pictures as symbols in that they can make use of the representational relation between a picture and its referent. For example, 3- and 4-year-olds understand that a drawing can have a different meaning or interpretation depending on the creator’s intention (Bloom & Markson, 1998; Gelman & Ebeling, 1998), and even 30-month-olds can use a depicted situation to form a representation of a real situation in order to guide behavior (DeLoache & Burns, 1994). To explicitly appreciate the representational relation between a picture and its referent, one needs to have the ability to form meta-representations (Perner, 1991), which emerges between 3 and 4 years of age (Leslie, 1987; Perner, 1991), and aspects of explicit reasoning about pictures as symbols continue to develop throughout the preschool and elementary years (e.g., Beilin & Pearlman, 1991; Flavell, Flavell, Green, & Korfmacher, 1990; Robinson, Nye, & Thomas, 1994; Uttal, Gentner, Liu, & Lewis, 2008; Zaitchik, 1990).

There is a paucity of research on toddlers’ understanding of pictures as symbols. Three recent studies suggest that 15- to 24-month-old toddlers are able to apply information they hear in relation to a picture to its referent (Ganea, Bloom Pickard, & DeLoache, 2008; Preissler & Carey, 2004; Simcock & DeLoache, 2006). One of these studies (Ganea et al., 2008) showed that 15-month-olds are capable of transferring a novel word from a picture to its referent. After a book-reading interaction in which they learned the label “blicket” for a novel depicted object, they identified which of two real objects was a “blicket.” Clearly, children learned the mapping between the word and the picture, and used the similarity between the picture and its referent to choose which object was the blicket. However, this study does not provide strong evidence that children assumed that the word ostensively taught with respect to a picture actually referred to the object. The child may have merely been choosing the best of two bad choices—after all, shown a bone and a bowl of milk and asked “Which is the dog?,” 3-year-olds will indicate the bone (Markman, 1989), but that does not license the conclusion that they take “dog” to refer to bones. Similarly, the 15-month-olds may have considered the real blicket as the best of two bad options for the word they learned in relation to a picture of the blicket.

Preissler and Carey (2004) provided a stronger test of whether toddlers taught a new word in relation to a picture take that word to apply to the picture’s real world referent. In their study, 18- and 24-month-old children were taught an unfamiliar label (“whisk”) for a small line drawing of an unfamiliar object (a whisk). Subsequently, they were presented with a pair of stimuli (a real whisk and the same simple drawing for which they had learned the label) and asked to indicate “whisk” (i.e., “Can you show me a whisk?”). If children simply associate the word with the picture with which it was paired, or take it to refer only to the picture, then they should indicate the picture itself, as this is actually a choice presented to them. However, if children understand the referential relation between the word and the picture, and between the picture and its referent, they should never indicate only the picture as the referent for the word. Rather they should choose the real object, either alone or together with the picture.

The results were striking: of 50 18- and 24-month-olds tested, only one selected the picture alone, in spite of the fact that they had initially learned the label for the line drawing, and had repeatedly experienced the pairing of the label and the small line drawing of the whisk. The remaining children all chose the real whisk, with half selecting the whisk alone and half the real whisk and its picture. These data are consistent with the conclusion that by 18 months of age, very young children who hear a novel word applied to a picture assume that the word refers to the real object that the picture depicts. This interpretation is bolstered by the finding that the observed pattern of responding is not inevitable. Using the same paradigm, Preissler (2008) found that children with Autism Spectrum Disorder (ASD) were making associative mappings between words, pictures and objects and failing to generalize a label learned for a picture to its corresponding real referent. That is, on the same task, the children with ASD rarely picked the real object alone and over half the time picked the picture alone when asked to indicate the whisk from a choice of the picture and a real whisk.

Study 1 has two goals. First, we explore how robust the Preissler and Carey (2004) findings are by seeking to replicate them in the picture book procedure used by Ganea et al. (2008). The procedures used to teach children a novel word in these two studies differ in two important ways. In the picture book procedure the word learning training is more naturalistic, with pictures of the novel object labeled in the context of looking through a picture book in which other familiar entities are labeled as well. Also, the pictures are high-quality, very realistic photographs rather than highly schematic black-and-white line drawings. Second, we ask whether children younger than those tested by Preissler and Carey (2004), namely 15-month-olds, perform like older toddlers on this task, demonstrating a symbolic understanding of both words and pictures, or whether they respond as do children with ASD (Preissler, 2008), which would suggest a developmental transition from associative to symbolic understanding in the age range of 15 to 24 months.

Study 1

Study 1 seeks to replicate the picture book word learning paradigm of Ganea et al. (2008) showing that 15-, 18- and 24-month-olds will apply a label learned in the context of a picture-book interaction to the pictured item. It extends these findings by exploring whether children take the picture that was paired with the word during learning to be a better referent for the word than is the actual object itself. Children first learned the novel name “blicket” for a picture of a novel object, and then they were shown the picture and its real referent and asked to indicate “a blicket.” The dependent measure is whether a child indicates the object, the picture, or both when asked to show “a blicket.”



Three groups of children were tested: eighteen 15-month-olds (range 14.6 – 16.8 months, M = 15.7 months, 10 girls), sixteen 18-month-olds (range 17.9 – 19.3 months, M = 18.5 months, 8 girls), and sixteen 24-month-olds (range 23.1 – 26.07 months, M = 24.6 months, 8 girls). Thirteen additional children were excluded (four 15-month-olds, five 18-month-olds, four 24-month-olds) because of fussiness or failure to complete the training procedure.


A picture book contained 13 cm × 18 cm color photographs of six familiar objects (stuffed dog, toy phone, plastic cup, toy car, toy hammer, and ball) and two novel objects (a chrome wire egg holder and a large white plastic egg holder adorned with two red strings). Each page was laminated on a cardboard backing. Two pictures – one of a novel object and one of a familiar object – were visible at a given time, on opposite pages (see sample in Figure 1). This was true throughout the book except for the last pair of pages, on which both novel objects were visible on adjacent pages, each once on the left page and once on the right page. Each novel object was depicted 4 times throughout the book.

Figure 1
Sample picture pages from the books used in Study 1.

The pictures used during the training and test phases (ball, cup, and the two novel objects) were identical to those that the children had seen in the picture book. The two novel objects were a small metal spiral egg cup, and an oval, white plastic object used for holding eggs.


Infants sat in a sassy seat that clipped onto a small table. The child’s parent(s) were present in the testing room for the duration of the session. The procedure had four phases: book reading, training, object familiarization, and test.

Book Reading Phase

The experimenter sat next to the infant as in a normal book-reading interaction. In this phase, the child was taught a novel name for 1 of 2 novel objects in the picture book. For each familiar picture, the experimenter labeled the depicted object once (e.g., “Look, it’s a ball.”). For the target novel object, the experimenter labeled the depicted object in the same fashion, saying “Look, this is a blicket,” and repeated the label 3 times. For the distractor novel object, the experimenter referred to the object (“Look at that, yeah, see that!”) without labeling it, to equate attention to the distractor picture with attention to the target picture.

Training Phase

This phase was intended to familiarize the children with the nature of the test questions. After the experimenter finished reading the book, she sat across the table from the participant. Infants were first presented with two pictures of familiar objects, and asked, “Show me the ball/cup”, counterbalanced for the object that they were asked about. If a child did not respond to the first question, the experimenter attempted to elicit a response using slightly different phrases (“Can you give me the cup?” or “Can you give mommy the cup?”). She continued to use whichever phrase elicited a response in subsequent trials.

To assess whether the children had learned the novel label for the depicted object during the book-reading interaction, they were then shown a picture of the novel target and a picture of the novel distractor and asked to indicate the “blicket.” Children’s choices of the target picture were positively reinforced; if they chose incorrectly, they were given feedback as to the correct picture. Most children reached criterion of two correct successive responses in 2 trials, with all three age groups taking an average of 2.3 to 2.5 trials to complete the training. The 6 children who failed to indicate the correct picture on two successive trials (out of a maximum of 4 trials) were not included in the final analysis, being among those who were replaced in the final sample (see description of final sample).

Object Familiarization Phase

To reduce the chance of a test response based solely on object novelty, the children were familiarized with the two novel objects one at a time, with order counterbalanced. Each test object was presented but not labeled (the experimenter simply said, “Look at this.”), and the children were allowed to explore it for a few seconds.

Test Phase

The children were then asked three test questions. The first explored children’s referential understanding of pictures and words, testing whether the depicted object is equally or more acceptable as a referent for the newly learned word than is the picture that had been the ostensive referent in the learning phase. The second was used to determine the extent to which these young children might have a bias to indicate 3-D objects over pictures, irrespective of whether the object is a candidate bearer of the label. The order of presentation of these two tests was counterbalanced. The third test provided an additional measure of children’s ability to extend a word from a picture to its real referent. In this last test children were presented with the two novel objects and asked for the “blicket.” The extension test was always asked last, because it was essential that children not hear the label in relation to the real novel object before testing for their referential understanding of pictures and words.

For all three tests administered, if a child simply grabbed a test item and explored it without clearly indicating a response, the experimenter removed both objects and repeated the question. Only intentional behaviors, such as pointing at an item, or showing or giving an item to the experimenter while making eye contact with her, were taken to reflect application of the label.

Picture-Object Test

The child was simultaneously shown the picture of the target object and the real target object and asked to indicate “a blicket” (“Can you show me a blicket?”).

Real Object-Bias Test

The child was shown a picture of the target object and the real distractor object and asked to show the “blicket” (“Can you show me the blicket?”). The correct answer on this test was to indicate the picture. Thus, this test was a measure of any general tendency to prefer objects over pictures and provided important information to evaluate performance on the Picture-Object Test. Selection of the real distractor object on this test would indicate a simple preference for objects over pictures.

Extension Test

The child was presented with the two real objects (target and distractor) and asked to show the “blicket.” This test was a measure of children’s application of the newly learned word to the real target object, as in Ganea et al. (2008).


All coding was done from the videotapes of the children’s behavior by two independent raters. Throughout the training and test phases of the study, only intentional responses were counted as relevant to the referent the child assigned the new word “blicket.” The criteria for establishing that a child had made an intentional response were the same as those used by Preissler and Carey (2004). Children had to give or slide an item to the experimenter, point to it, or pick it up and show it to the experimenter while making eye contact. If a child simply grabbed an item, played with or explored an item without clearly indicating it to the experimenter, this was not coded as an intentional response. Across the three tests, overall children responded intentionally 79% of the time (157 responses out of 199 total responses), p < .01, binomial test. Chi-square analyses indicated that the 15-month-olds responded unintentionally (23 unintentional of 77 responses) more often than did the 18- and 24-month-olds (7 unintentional of 62 responses, 12 unintentional of 60 total, respectively), all ps <.01.

For example, with respect to the specific choices that children made on the Picture-Object Test, if a child indicated that an object was correct (by pointing at it or sliding it to the experimenter) and then took the picture to play with it, this was coded as a “real object alone” response. If a child indicated the picture and then played with the object, this was coded as a “picture alone” response. If a child pointed to both items, or took the object and placed it on top of the picture, and then made eye contact with the experimenter, this was coded as “both.” Responses coded as “both” were also examined with respect to whether children indicated both items simultaneously (i.e., child puts the object on top of the picture or points to both while saying “there are two blickets”) or sequentially (i.e., child points to picture, then takes object and puts it on top of picture, then indicates both to the experimenter, or child points to one item and then points to the other).

Intercoder reliability on the total number of test trials in Study 1 was high – the two coders agreed on 95% of the test choices. The few disagreements were easily resolved by a third person.

Results and Discussion

We approach the results in three steps. First, we determined whether children have a general object bias. If a child has a basic tendency to select real objects over pictures (as indicated by choice of the object on the Real Object-Bias Test), the child’s choice of the object on the Picture-Object Test would provide no information concerning whether the term “blicket” is extended to a real object even when the taught-on picture is available as a response. The choice of the real object could simply be the result of a preference for objects over pictures.

There was no effect of test order on children’s performance: children’s choices were not affected by whether they received the Picture-Object Test first or second.

Replicating Preissler and Carey’s (2004) data, virtually no 18- or 24-month-old children indicated the real novel object over the picture that had been used in the labeling phase when asked which was the blicket on the Real Object-Bias Test (only one 18-month-old did so). However, one third of the 15-month-olds (6/18) displayed a real object bias (see Figure 2).

Figure 2
Proportion of children in each age group who chose the real object on the Real Object-Bias Test in Studies 1 and 2.

Next, we analyzed children’s responses to the Picture-Object Test. We assessed whether children with no object bias would indicate the real object as a blicket when the response of the picture of the blicket was available to them, or whether they would, as did children with autism, often choose the picture alone. As can be seen in Figure 3, at each age, children frequently indicated the real object as a blicket, either alone or together with the picture (75% of 15-month-olds; 69% of 18-month-olds; 75% of 24-month-olds).

Figure 3
Percentage of choice results on the Picture-Object Test in Studies 1 and 2 after controlling for children’s object preference (the results for the 15-month-olds in Study 2 are not shown because the sample of children with no object preference ...

Next we assessed whether children actually preferred the real object to the picture, as did the normally developing 18- and 24-month-olds in the Preissler and Carey (2004) study. They did not: as can be seen from Figure 3, at no age were they significantly more likely to choose the object alone than the picture alone.

The results from Study 1 show that toddlers accept the real object as a blicket, even when presented with a choice between the real object and the picture that had been labeled during training. The pattern of responding on the Picture-Object Test is what would be expected if the child were choosing at chance (i.e., choices distributed between object, picture, and both). However, children almost never indicated both test items on the other two types of tests (there was only one response that included “both” on the Extension Test, and none on the Real Object-Bias Test). Thus, in light of all the other trials in which children did not indicate both objects, we take choices of “both” on the Picture-Object Test to indicate that the child accepts both the real object and the picture of the object as blickets.

Of the children (16 total across the three ages) who made “both” choices in Study 1, 10 children indicated the test items sequentially (e.g., child pointed to picture, then took object and put it on top of picture, also indicating the object, or child pointed to one item and then pointed to the other), and 3 children indicated the test items simultaneously (e.g., child pointed to both at the same time while saying “two blickets,” or, child took object and put it on top of picture, indicating them both). The response of 3 children could not be coded with respect to whether the specific choice was done sequentially or simultaneously because of technical difficulties with the tape (the coding was based on the original online recording of the experimenter).

On the Extension Test, the majority of children indicated the correct target object when presented with the real blicket and the real distractor object: 78% (14 out of 18) of the 15-month-olds, 81% (13 out of 16) of the 18-month-olds, and 94% (15 out of 16) of the 24-month-olds (p < .05, binomial test, for each age group). These results show that all age groups were successful at transferring the novel word to the real novel object when it was presented in isolation from the picture. The results of the Extension Test replicate the findings reported by Ganea et al. (2008): 15- to 24-month-old toddlers, when taught a new label for a pictured novel entity, will extend that label to the real entity that was pictured, given a choice between that real entity and another real object. The results of the Picture-Object Test extend that finding to a case where the real entity is paired with the actual picture taught on, showing that the extension of the label to the real object in the Extension Test is not merely a choice of the best of two bad options.
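
The binomial tests reported for the Extension Test can be checked from the counts given above. The following is a minimal sketch, not the authors' analysis script: it assumes a chance level of .5 (two response options) and a two-sided exact test, neither of which is stated in the text, and the function name `binomial_two_sided_p` is purely illustrative.

```python
from math import comb


def binomial_two_sided_p(successes: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of every outcome that is no more likely than
    the observed one under the null hypothesis (assumed chance level p).
    """
    probs = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = probs[successes]
    # A small relative tolerance guards against floating-point ties.
    total = sum(pr for pr in probs if pr <= observed * (1 + 1e-9))
    return min(1.0, total)


# Counts of correct choices on the Extension Test in Study 1 (from the text).
extension_counts = {
    "15-month-olds": (14, 18),
    "18-month-olds": (13, 16),
    "24-month-olds": (15, 16),
}

for group, (correct, n) in extension_counts.items():
    p_value = binomial_two_sided_p(correct, n)
    print(f"{group}: {correct}/{n} correct, two-sided p = {p_value:.4f}")
```

Under these assumptions each age group's p-value falls below .05, in line with the reported result.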

Nonetheless, the present results do not fully replicate the pattern of responding that Preissler and Carey (2004) observed. Unlike in their study, the children did not prefer the real object to the pictured object when asked to indicate a blicket on the Picture-Object Test. In the present study, the real object and the pictured object were equally favored responses, whereas in Preissler and Carey’s study the 18- to 24-month-olds significantly preferred the real object as the referent for the new label. The difference between the two studies is statistically reliable: in Preissler and Carey’s study, only 1 of 50 18- to 24-month-olds indicated the picture alone on Picture-Object trials, whereas in this study 8 of 31 18- to 24-month-olds did so, χ2 (1, 81) = 8.70, p < .01.
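
The between-study comparison can be reproduced from the reported counts (1 of 50 versus 8 of 31 picture-alone responses). The sketch below is an illustration rather than the authors' code. It assumes a 2 × 2 chi-square test with Yates continuity correction, a choice the text does not state but which yields the reported value of 8.70; the uncorrected Pearson statistic for the same table is larger. The helper name `yates_chi_square` is hypothetical.

```python
from math import erfc, sqrt


def yates_chi_square(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Chi-square test for a 2x2 table [[a, b], [c, d]] with Yates correction.

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Continuity correction: shrink |ad - bc| by n/2, never below zero.
    corrected = max(0.0, abs(a * d - b * c) - n / 2)
    statistic = n * corrected**2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 df.
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


# Rows: study; columns: picture-alone response vs. any other response.
# Preissler and Carey (2004): 1 of 50; present study: 8 of 31.
statistic, p_value = yates_chi_square(1, 49, 8, 23)
print(f"chi-square(1, N = 81) = {statistic:.2f}, p < .01: {p_value < 0.01}")
```

With these counts the corrected statistic rounds to 8.70, matching the value reported above.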

There are two major differences between the two studies, either or both of which may have contributed to the difference in results. First, the pictures in the present research were large (13 cm × 18 cm), highly realistic color photographs of the objects, whereas those in Preissler and Carey’s study were small (5 cm × 5 cm) schematic black-and-white line drawings. The highly realistic photographs may be better candidates to be bearers of names. That is, whereas the perceptual similarity between a highly realistic photograph and the real object depicted therein may be easier to compute than is the similarity between a schematic line drawing and the entity it represents, pointing to a line drawing and saying “this is a whisk” may paradoxically promote assigning the referent of “whisk” to an unseen real object, because the little cardboard drawing is obviously not a whisk or anything else for that matter.

A second difference between the two studies is in the word learning training. During training, the children in Preissler and Carey’s (2004) study received more pairings between the novel word (“whisk”) and the pictured whisk, and each training trial involved choosing the whisk from among 2 or 7 pictured objects. This training was meant to provide many associative pairings between the word and the picture, and the focus was deliberately only on the word “whisk” and the picture that provided the ostensive definition of it. In the present study, the words were introduced in a much more naturalistic manner, in the context of a book in which many different familiar entities were named. The child did not extensively practice choosing the blicket from many different pictured objects. This difference in training may have led to a more robust pairing of the word “whisk” with the picture of the whisk than of the word “blicket” with the picture of the blicket. And again, paradoxically, this more robust pairing may lead to a more confident extension of the word to the real object.

Future studies could explore these two hypotheses: more schematic drawings could be used in the picture book paradigm, and more realistic photographs could be used in the Preissler and Carey (2004) training regime. Such studies would inform our understanding of the processes through which toddlers establish representations of pictured entities. Nonetheless, the present study adds to the growing literature showing that very young children extend words taught by ostensive definition on pictures to the pictured entities.

Just as in previous research, the word learning of the 15-month-olds was less proficient than that of the older children. In Ganea et al. (2008), the youngest children were more affected by perceptual similarity between the pictured item and the real referent than were the two older groups. And in the present study, a third of the 15-month-olds failed to indicate the pictured blicket when given a choice between the picture and a real distractor entity (the Real Object-Bias Test). Both of these findings may reflect a less robust representation of the pictured blicket being formed during learning by the 15-month-olds than by the older children. Still, the 15-month-olds in the current study did succeed overall at indicating the real blicket when it was paired with the real distractor on the Extension Test. Also, if they succeeded on the Real Object-Bias Test, their pattern of responding on the Picture-Object Test was identical to that of the older children. Thus, in contrast to the findings for children with autism, there was no evidence that 15-month-olds form a merely associative mapping between the word and the entity paired with it during learning. If they did, they should more often include the picture when asked to indicate which of two entities is a blicket (if a picture choice is available). Thus, the 15-month-olds, the youngest yet tested in such a paradigm, showed the same evidence for symbolic understanding of both the picture and word as did the older children. We return to alternative interpretations of the observed pattern of responding in the General Discussion.

If younger children form less robust representations of the pictured entities, they should be more affected by a mismatch between the perceptual properties of the entity depicted in the picture and those of the real entities the word may apply to, both on the extension tests (as already shown by Ganea et al., 2008) and on the Picture-Object Test. Study 2 tests this prediction.

Study 2

Study 2 asked whether children will prefer a real object as a referent for a newly heard word learned in the context of a picture when that object fails to match the pictured entity in some salient way. Children were tested on the same test of pictorial understanding as in Study 1. That is, children again learned the name “blicket” for a novel depicted object. Then they were shown the picture for which they learned the label and a real object that belonged to the same category as the depicted object but was of a different color, and were asked to show the experimenter “a blicket.” Thus, the level of perceptual similarity between the picture and the candidate real referent was lower than in Study 1, in which children were tested with the picture and an identical real referent.



Three groups of children were tested: fifteen 15-month-olds (range 15.0–17.1 months, M = 15.8 months, 8 girls), sixteen 18-month-olds (range 17.8–19.4 months, M = 18.5 months, 7 girls), and sixteen 24-month-olds (range 23.1–25.0 months, M = 24.2 months, 7 girls). Eleven additional children were excluded (five 15-month-olds and six 18-month-olds) because of fussiness or failure to complete the training procedure.


Materials were the same as in Study 1, with the only difference that the real novel objects were different colors (blue egg cup and golden metal spiral) than the depicted objects (white egg cup and chrome metal spiral).


The procedure was the same as in Study 1 with one important change: in the testing session, the children were shown novel exemplars of the objects that they had seen depicted in the book. For the familiarization trials between the book reading and the tests, the two real objects presented for inspection differed in color from those depicted in the pictures. For example, if the children saw a picture of a white blicket in the picture book, then on the Picture-Object Test they saw a picture of the white blicket and a real blue blicket. On the Real Object-Bias Test they saw a picture of the white blicket and the real distractor object, which differed in color from the distractor depicted in the book. On the Extension Test they saw the real blue blicket and the same differently colored real distractor object.

Coding was the same as in Study 1, with 99% agreement between coders.

Results and Discussion

Children’s responses to the test questions were scored as in Study 1. As in Study 1, many of the younger children had an object bias, as shown by their selection of the real distractor object rather than the pictured blicket when asked to indicate the blicket on the Real Object-Bias Test. Although real object biases were more frequent in Study 2 than in Study 1 (see Figure 2), the difference was not statistically significant, χ2 (1, 97) = 1.38.

The perceptual similarity between the picture labeled “blicket” during training and the real blicket on the Picture-Object Test had a large effect on children’s choices. Because the sample of 15-month-olds who did not have an object bias was very small (N = 7), we conducted this analysis only for the 18- and 24-month-olds. (Of the seven 15-month-olds with no object bias, two selected the picture alone, four selected the object alone, and one indicated both the picture and the object.) As can be seen in Figure 3, the 18-month-olds were more likely to indicate the picture alone as a “blicket” in Study 2 than were those in Study 1, χ²(1, N = 30) = 4.16, p < .05. The 24-month-olds’ responses were evenly divided between object-alone and picture-alone choices; “both” responses were less frequent (2 of 13 children) than in Study 1 (8 of 16 children), but the difference was not significant according to a chi-square analysis.
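As a rough check on the chi-square statistics reported above, the following sketch converts a statistic with one degree of freedom into a p-value and computes a Pearson chi-square for a 2 × 2 table. The function names and the example cell counts are hypothetical and are not taken from the studies; the absence of a continuity correction is likewise an assumption, since the text does not specify one.

```python
import math


def chi2_p_value_df1(chi2: float) -> float:
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom.

    With df = 1 the statistic is a squared standard normal deviate, so
    P(X > chi2) = erfc(sqrt(chi2 / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2.0))


def chi2_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square for the 2 x 2 table [[a, b], [c, d]].

    No continuity correction is applied (an assumption, not from the text).
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom


# Statistics as reported in the text, each with 1 degree of freedom.
reported = [
    ("object bias, Study 1 vs. Study 2", 1.38),
    ("18-month-olds, picture-alone choices", 4.16),
]
for label, stat in reported:
    print(f"{label}: chi2 = {stat:.2f}, p = {chi2_p_value_df1(stat):.3f}")

# Hypothetical cell counts, for illustration only (not data from the studies).
example = chi2_2x2(9, 6, 3, 12)
print(f"hypothetical table: chi2 = {example:.2f}, p = {chi2_p_value_df1(example):.3f}")
```

With one degree of freedom the .05 critical value is about 3.84, so a statistic of 4.16 falls below p = .05 whereas 1.38 does not, in line with the significance statements in the text.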

As can be seen in Figure 4, the perceptual similarity between the picture and the real object also affected the responses on the Extension Test. Replicating Ganea et al. (2008), the younger children were less likely to pick the real blicket in Study 2 than in Study 1, and indeed, the performance of the 15- and 18-month-olds did not differ from chance (66% and 63% correct, respectively). The 24-month-olds were less affected by perceptual similarity on this measure: 86% correctly indicated the blicket in spite of its being a different color from the pictured one (p < .05, binomial test), essentially the same level of performance as in Study 1.
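The comparison against chance described above can be illustrated with an exact binomial test. The sketch below is a minimal version under stated assumptions: chance is taken to be 50% (two response options), the test is two-sided, and the counts in the example are hypothetical, because the text reports only percentages. The function name is also an assumption.

```python
from math import comb


def binomial_two_sided_p(successes: int, n: int, chance: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely than the
    observed one under the null hypothesis of responding at `chance`.
    """
    probs = [
        comb(n, k) * chance**k * (1 - chance) ** (n - k)
        for k in range(n + 1)
    ]
    observed = probs[successes]
    # Small relative tolerance guards against floating-point ties.
    total = sum(p for p in probs if p <= observed * (1 + 1e-7))
    return min(1.0, total)


# Hypothetical counts, for illustration only (not the studies' raw data).
for correct, n in [(10, 16), (12, 14)]:
    p = binomial_two_sided_p(correct, n)
    print(f"{correct}/{n} correct ({correct / n:.0%}): p = {p:.4f}")
```

Under these assumed counts, a proportion in the low 60s out of 16 children is not distinguishable from chance, whereas a proportion in the mid 80s out of 14 children is, which is the pattern the paragraph describes.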

Figure 4
Proportion of children in each age group who responded correctly on the Extension Test in the two studies.

In sum, Study 2 confirmed that the perceptual mismatch between the picture that provided an ostensive definition of the newly heard word and a candidate real object that could be taken as a referent for that word greatly decreased the likelihood that 18- and even 24-month-olds applied the word to the real object.

These data, together with those of Preissler and Carey (2004), raise a paradox. In their study, the perceptual mismatch between the picture labeled “whisk” and the real whisk was great: the picture was a schematic cartoon, a small black and white line drawing, whereas the real whisk was silver, large, and not even exactly the same shape as the drawing. That mismatch was far greater than the one between the picture of the blicket and the real blicket, which differed only in color, given the high quality of the realistic photos used here. Yet in Preissler and Carey’s data, 18- and 24-month-olds were more likely to avoid the picture-alone choice than those in the present research. These data are consistent with the suggestion made above that highly detailed, realistic photos may be better candidate bearers of names than are schematic line drawings, thus leading to the paradoxical finding that a word introduced as referring to a small line drawing is more likely to be extended to the real object that the picture depicts.

General Discussion

In spite of the fact that infants perceive the similarity between pictures and the objects they depict and also distinguish 2D entities from 3D entities (DeLoache, Strauss, & Maynard, 1979; Dirks & Gibson, 1977; Rose, 1977; Slater, Rose, & Morison, 1984), it is still an open question when and how they come to grasp the symbolic function of pictures. Achieving this understanding is a complex developmental process. It is not until 18 to 24 months of age that children prefer upright to inverted pictures (DeLoache, Uttal, & Pierroutsakos, 2000) and point at depicted objects rather than manually explore them (DeLoache et al., 1998; Murphy, 1978). Also, by 24 months children can follow a request to put a toy at a place specified to them on a picture (DeLoache & Burns, 1994), and can use information provided with a picture (e.g., “the toy is hiding there,” pointing to the cupboard on a picture of a room) to find the object in the depicted room (Suddendorf, 2003), and reliably do so by 30 months of age (DeLoache & Burns, 1994). It is not until age 4 that children solve a “false photograph task” that is structured the same as the classic false belief task. That is, at around 4 years children understand that a photograph will depict an object at the place where it was when the photo was taken, even if the object had subsequently been moved in reality (Zaitchik, 1990). Nevertheless, even at age 4 children can show confusion about the properties of pictures and depicted objects (Beilin & Pearlman, 1991; Robinson, Nye, & Thomas, 1994), and the consequences of actions on pictures and objects (Flavell, Flavell, Green, & Korfmacher, 1990).

Notwithstanding this extended developmental process, the present studies add to the evidence that even younger children are able to use information gained from photographs to guide behavior in the real world. Fifteen-, 18-, and 24-month-olds can apply a label learned for an object depicted in a picture book to the real object (Ganea et al., 2008), and 18- and 24-month-olds can learn and imitate a novel action sequence from a picture book (Simcock & DeLoache, 2006; Simcock & Dooley, 2007). And in the present study, as in Preissler and Carey (2004), 18- to 24-month-old toddlers included a real object in the extension of a new word, “blicket,” that had been taught in relation to a picture of that object, even when given a choice between the taught-on picture and the real object. The present study extended this finding to even younger children, 15-month-olds.

Not only do toddlers extend words taught on pictures to their real world referents, but a series of studies by Callaghan (2000) suggests that until age 2.5, children may succeed in identifying the referents of pictures only when labels for those pictures are known, or when the pictures have been labeled (as in the present studies). Callaghan showed 2.5- and 3-year-olds pictures and instructed them to “Find this one. Where’s this one?” while indicating the picture, then showed children two real objects, one of which had been depicted in the picture. The younger children failed to identify the object that was previously pointed at in the picture, showing better performance only when the pictured object was labeled, which is consistent with work showing that labeling highlights a picture’s symbolic status (Preissler & Bloom, 2007). These findings, together with those of the present studies and those of Preissler and Carey (2004), leave open three subtly different interpretations of toddlers’ grasp of pictures and words as symbols.

According to a lean interpretation, the relation between a word and the objects in its extension is associative and the relation between a picture and the entity it depicts is merely one of perceptual similarity. On this view, when pictures of novel entities are given novel names, children form an association between the label and a perceptual representation of the picture, and then they apply this label to any stimulus that is perceptually similar to the stored representation. This account makes sense of the effects of perceptual similarity between the pictures and their referents observed in the present and other related studies (Callaghan, 2000; Ganea et al., 2008; Simcock & DeLoache, 2006). However, this account is not consistent with Preissler and Carey’s findings, for in their study the word was repeatedly associated with the line drawing of the whisk, and yet children rejected the line drawing as a referent for the word when given a choice between it and a real whisk. Preissler and Carey (2004) concluded that the relation between words and their referents is symbolic, as is the relation between pictures and the entities they depict. In addition, children with ASD provide data that clearly support the lean associative interpretation (see Preissler, 2008), as they mapped a novel word just to a picture, and failed to generalize to the real world referent. Given this existence proof that the hypothesized pattern of responding consistent with the lean interpretation is possible, it is significant that normally developing toddlers respond totally differently.

However, there is a medium lean interpretation of these findings, in which the word is indeed interpreted symbolically, but the picture is not. That is, the word might be assigned a referent upon ostensive definition of the picture, and that referent is represented in terms of the perceptual features of the picture. Other entities are then categorized in accord with how well they match the stored representation. This account makes sense of Callaghan’s finding that labels are necessary for young children to extend pictures to real world entities, and it also accounts for the role of similarity between the pictures and the depicted objects in these studies.

The results of the current paper, along with those of Preissler and Carey (2004), pose some problems for this account. When the child forms a representation of the perceptual stimulus (i.e., the picture) that is ostensively indicated as the “blicket,” what leads the child to ignore those perceptual features that specify the stimulus as 2D (and small, and black and white, and a line drawing, in Preissler and Carey, 2004)? If the picture is rejected because real objects are better candidates for word meanings than are pictures, and it is only the word that has symbolic content, then the real object alone should be strongly preferred in the present studies (for it is both a real object and an excellent match to the stored stimulus). But this is not what we observed.

Finally, the richest interpretation, the one we tentatively favor, is that by 15 months of age, under these circumstances, when pictures are labeled, both the word and the picture are taken as symbols for real world entities. That is, when children hear a word ostensively referring to a picture, they know that both the word and the picture refer to a real, 3-D entity.

When a child’s attention is drawn to a picture, the child needs some cue in order to know that his or her communicative partner intends this particular picture to be taken as a symbol (after all, not all pictures are symbols). These cues are varied, and interact in subtle ways. First, how realistic and detailed the picture is may be positively related to the likelihood that the picture itself is taken to be a bearer of a name. Another cue is labeling itself. When we point to a picture and label it “a blicket,” we are cueing the child to take it as a symbol for real objects, so long as the child knows that words refer to real entities and that the picture is not such an entity. This might explain the important role of labeling that Callaghan (2000) observed in the studies described above.

Consistent with this analysis, children generally perform better on symbolic tasks when the symbols are introduced as part of a social interaction and their status as representations is highlighted (Callaghan et al., 2004; Szechter & Liben, 2004; Troseth, 2003). The role of ostensive communicative behaviors for learning symbols has also been clearly demonstrated in the domain of word learning (Akhtar, Carpenter, & Tomasello, 1996; Baldwin, 1991; Bloom, 2000; Tomasello, 1999). Further research examining which specific aspects of ostensive communication are involved in children’s understanding of pictures may reveal important information about mechanisms underlying symbolic development. One recent finding suggests that even a non-directional social cue, such as a positive facial expression, can facilitate young children’s interpretation of symbols (Leekam, Solomon, & Teoh, in press).

So far there are no decisive arguments that adjudicate between the medium lean and the rich interpretation of the findings to date concerning toddlers’ capacity to extend words from pictures to the objects depicted. What is clear is that they are indeed able to do so, and that they do so even when the picture that had been paired with the new word during learning is available as a choice. Thus, the foundations of understanding word-picture and picture-object relations are in place before children reach their second birthday.


We thank the parents and children who graciously participated. We are grateful to Themba Carr, Kristen Knecht, Jasmine DeJesus, Ashley Foster, and Carina Wind for help with data collection. This research was supported by NSF grant 0440254 to PAG and JSD, and by NIH grant HD-25271 to JSD.


  • Akhtar NM, Carpenter M, Tomasello M. The role of discourse novelty in children’s early word learning. Child Development. 1996;67:635–645.
  • Baldwin DA. Infants’ contribution to the achievement of joint reference. Child Development. 1991;62:875–890.
  • Beilin H, Pearlman EG. Children’s iconic realism: Object versus property realism. In: Reese HW, editor. Advances in child development and behavior. vol. 23. New York: Academic Press; 1991. pp. 73–111.
  • Barrera ME, Maurer D. Recognition of mother's photographed face by the three-month-old infant. Child Development. 1981;52:714–716.
  • Bloom P. How children learn the meanings of words. Cambridge, Massachusetts: The MIT Press; 2000.
  • Bloom P, Markson L. Intention and analogy in children's naming of pictorial representations. Psychological Science. 1998;9:200–204.
  • Callaghan TC. Factors affecting children's graphic symbol use in the third year. Language, similarity, and iconicity. Cognitive Development. 2000;15:185–214.
  • Callaghan TC, Rochat P, MacGillivray T, MacLellan C. The social construction of pictorial symbols in 6- to 18-month-old infants. Manuscript submitted for publication. 2003.
  • Callaghan TC, Rochat P, MacGillivray T, MacLellan C. Modeling Referential Actions in 6- to 18-Month-Old Infants: A Precursor to Symbolic Understanding. Child Development. 2004;75:1733–1744.
  • de Schonen S, Mathivet E. Hemispheric asymmetry in a face discrimination task in infants. Child Development. 1990;61:1192–1205.
  • DeLoache JS, Strauss M, Maynard J. Picture perception in infancy. Infant Behavior and Development. 1979;2:77–89.
  • DeLoache JS, Burns NM. Early understanding of the representational function of pictures. Cognition. 1994;52:83–110.
  • DeLoache JS, Pierroutsakos SL, Uttal DH, Rosengren KS, Gottlieb A. Grasping the nature of pictures. Psychological Science. 1998;9:205–210.
  • DeLoache JS, Uttal DH, Pierroutsakos SL. What's up? The development of an orientation preference for picture books. Journal of Cognition and Development. 2000;1:81–95.
  • Dirks JR, Gibson E. Infants' perception of similarity between live people and their photographs. Child Development. 1977;48:124–130.
  • Flavell JH, Flavell ER, Green FL, Korfmacher JE. Do young children think of television images as pictures for real objects? Journal of Broadcasting & Electronic Media. 1990;34:399–419.
  • Ganea PA, Bloom-Pickard M, DeLoache JS. Transfer between picture books and the real world by very young children. Journal of Cognition and Development. 2008;9:46–66.
  • Gelman SA, Ebeling KS. Shape and representational status in children's early naming. Cognition. 1998;66:835–847.
  • Leekam SR, Solomon TL, Teoh YS. Adults' social cues facilitate young children's use of signs and symbols. Developmental Science. (in press)
  • Leslie AM. Pretense and representation: The origins of "theory of mind." Psychological Review. 1987;94:412–426.
  • Markman EM. Categorization and naming in children: Problems of induction. Cambridge, MA: The MIT Press; 1989.
  • Murphy CM. Pointing in the context of a shared activity. Child Development. 1978;49:371–380.
  • Perner J. Understanding the representational mind. Cambridge, MA: The MIT Press; 1991.
  • Pierroutsakos SL, DeLoache JS. Infants' manual exploration of pictorial objects varying in realism. Infancy. 2003;4:141–156.
  • Preissler MA. Associative learning of pictures and words by low-functioning children with autism. Autism. 2008;12:229–246.
  • Preissler MA, Bloom P. Two year olds understand the duality of pictures. Psychological Science. 2007;18:1–2.
  • Preissler MA, Carey S. Do both pictures and words function as symbols for 18- and 24-month-old children? Journal of Cognition and Development. 2004;5:185–212.
  • Robinson EJ, Nye R, Thomas GV. Children's conceptions of the relationship between pictures and their referents. Cognitive Development. 1994;9:165–191.
  • Rose SA. Infants' transfer of response between two-dimensional and three-dimensional stimuli. Child Development. 1977;48:1086–1091.
  • Simcock G, DeLoache JS. The effect of iconicity on re-enactment from picture books by 18- to 30-month-old children. Developmental Psychology. 2006;42:1352–1357.
  • Simcock G, Dooley M. Generalization of learning from picture books to novel test conditions by 18- and 24-month-old children. Developmental Psychology. 2007;43:1568–1578.
  • Slater A, Rose D, Morison V. New-born infants' perception of similarities and differences between two- and three-dimensional stimuli. British Journal of Developmental Psychology. 1984;2:287–294.
  • Suddendorf T. Early representational insight: Twenty-four-month-olds can use a photo to find an object in the world. Child Development. 2003;74:896–904.
  • Szechter L, Liben L. Parental guidance in preschooler's understanding of spatial-graphic representations. Child Development. 2004;75:869–885.
  • Tomasello M. The cultural origins of human cognition. Cambridge, MA: Harvard University Press; 1999.
  • Troseth GL. TV guide: Two-year-old children learn to use video as a source of information. Developmental Psychology. 2003;39:140–150.
  • Uttal DH, Gentner D, Liu LL, Lewis A. Developmental changes in children's understanding of the similarity between photographs and their referents. Developmental Science. 2008;11(1):156–170.
  • Werner H, Kaplan B. Symbol Formation. An organismic-developmental approach to language and expression of thought. New York: John Wiley & Sons, Inc; 1964.
  • Yonas A, Granrud CE, Chov MH, Alexander AJ. Picture Perception in Infants: Do 9-Month-Olds Attempt to Grasp Objects Depicted in Photographs? Infant Behavior and Development. 2005;8:147–166.
  • Zaitchik D. When representations conflict with reality: The preschooler's problem with false beliefs and "false" photographs. Cognition. 1990;35:41–68.