Two experiments investigated the role of language in children’s spatial recall performance. In particular, we assessed whether selecting an intrinsic reference frame could be improved through verbal encoding. Selecting an intrinsic reference frame requires remembering locations relative to nearby objects independent of one’s body (egocentric) or distal environmental (allocentric) cues, and does not reliably occur in children under 5 years of age (Nardini, Burgess, Breckenridge, & Atkinson, 2006). The current studies tested the relation between spatial language and 4-year-olds’ selection of an intrinsic reference frame in spatial recall. Experiment 1 showed that providing 4-year-olds with location-descriptive cues during (Exp. 1a) or before (Exp. 1b) the recall task improved performance both overall and specifically on trials relying most on an intrinsic reference frame. Additionally, children’s recall performance was predicted by their verbal descriptions of the task space (Exp. 1a control condition). Non-verbally highlighting relations among objects during the recall task (Exp. 2) supported children’s performance relative to the control condition, but significantly less than the location-descriptive cues. These results suggest that the ability to verbally represent relations is a potential mechanism that could account for developmental changes in the selection of an intrinsic reference frame during spatial recall.
Language is a tool that can augment cognition, with effects demonstrated in areas such as numerical representation (e.g., Miura, Kim, Chang, & Okamoto, 1988), categorization (e.g., Bowerman & Choi, 2003), analogical reasoning (e.g., Gentner, 2003), and spatial reasoning (e.g., Haun, Rapold, Janzen, & Levinson, 2011; Pyers, Shusterman, Senghas, Spelke, & Emmorey, 2010). Recently, there has been increased interest in understanding relations between language and spatial cognition over development. Research indicates that providing task-relevant verbal cues while children perform spatial tasks can bolster children’s performance (Dessalegn & Landau, 2008, 2013; Loewenstein & Gentner, 2005; Shusterman, Lee, & Spelke, 2011) and that children’s spatial language production abilities predict their spatial skills (Hermer-Vazquez, Moffet, & Munkholm, 2001; Pruden, Levine, & Huttenlocher, 2011). However, the specific way in which language relates to spatial cognition is not universally agreed upon (e.g., Nardini et al., 2006; Ratliff & Newcombe, 2008), and this relation has previously only been tested in a limited range of spatial tasks. In this paper, we extend the literature by investigating the role of language in a type of spatial skill that has been given little prior consideration: children’s selection among spatial reference frames to recall object locations, with a focus on the intrinsic reference frame.
Providing spatial language in the context of spatial tasks can promote preschool-aged children’s spatial performance (e.g., Dessalegn & Landau, 2008, 2013; Loewenstein & Gentner, 2005; Shusterman et al., 2011). Loewenstein and Gentner (2005) showed that providing children with spatial terms such as ‘top’ and ‘middle’ during or just prior to a relational mapping task increased children’s performance. Dessalegn and Landau (2008) found similar effects with 4-year-olds using a different type of spatial task, the feature binding task, which assessed memory for visual feature conjunctions. Providing children with spatial terms specifying the location of one colored feature relative to the other (e.g., “the red is on the left”) improved their performance in remembering the bindings between the features. These effects are not limited to terms that describe spatial relations. Language can also highlight pragmatic information about the relevance of cues for solving spatial tasks. For example, Shusterman et al. (2011) tested 4-year-old children in a disorientation search task that requires integration of featural cues (i.e., a uniquely-colored wall in the task space) with geometry to successfully reorient. In this task, children typically do not use features until 6 years of age (Hermer-Vazquez et al., 2001; but see Cheng & Newcombe, 2005); however, Shusterman et al. found that telling participants “the red wall can help you find the sticker” helped younger children solve the task.
Additionally, children’s abilities to produce spatial language predict their spatial skills (Hermer-Vazquez et al., 2001; Pruden et al., 2011; Simms & Gentner, 2008). Children’s production of words such as ‘left’, ‘right’, and ‘middle’ was positively associated with performance on spatial tasks that depend on these spatial relations (Hermer-Vazquez et al., 2001; Simms & Gentner, 2008). Furthermore, the total number of spatial words that children produced in a free-play setting correlated longitudinally with their spatial skills across a variety of spatial measures (Pruden et al., 2011). These two types of results (those showing that language input boosts performance and those showing correlations between spatial word production and spatial skills) have been interpreted as evidence that spatial language enables children to verbally encode task-relevant spatial information, thereby improving spatial task performance.
Whereas language has been shown to enhance children’s performance across some spatial tasks, limited attention has been given to the effects of language on reference frame selection. The flexible use of reference frames in spatial recall develops throughout early childhood (Nardini et al., 2006), in parallel with improvements on the disorientation search task (Hermer-Vazquez et al., 2001). By 2 years of age, children can select among egocentric or allocentric reference frames, remembering object locations relative to themselves or environmental cues (e.g., Acredolo, 1978; Bai & Bertenthal, 1992; Newcombe, Huttenlocher, Drummey, & Wiley, 1998). However, children struggle to utilize one particular type of allocentric reference frame, an intrinsic reference frame, until 5 to 6 years of age (Nardini et al., 2006). Selecting an intrinsic reference frame requires using the configuration among objects to anchor memory without relying on the body or distal environmental cues. Investigating the role of language in children’s intrinsic reference frame selection is of particular interest because preliminary evidence suggested language may not contribute to the development of this spatial ability (Nardini et al., 2006), even though this ability shows a similar developmental trajectory to other spatial abilities that relate to language (Hermer-Vazquez et al., 2001).
To assess 3- to 6-year-old children’s use of an intrinsic reference frame in recall, Nardini et al. (2006) adapted a task by Simons and Wang (1998). On each trial, a toy was hidden under one of thirteen cups arranged on a table, with unique landmarks along two edges of the table. On all trials the intrinsic reference frame was available and constant, as children could use the stable configuration of objects (i.e., cups and landmarks) within the array to remember the location. Across trials the alignment of the other reference frames varied through changes in the child’s position (disrupting egocentric alignment) and/or changes in the array’s position (disrupting room-centered alignment). Children performed best when all three reference frames (intrinsic, egocentric, room-centered) maintained alignment from hiding to search, and worst when the three were misaligned—trials that relied most heavily on the intrinsic reference frame. Additionally, only 5- and 6-year-olds performed significantly above chance on trials in which the array rotated between hiding and search, suggesting that children’s reliance on the intrinsic reference frame is not well developed before 5 years of age.
Although language was not the focus of their study, Nardini et al. (2006) also assessed whether verbal encoding during the task contributed to this developmental change. They tested verbal encoding during one “surprise” trial on which, following the delay, the experimenter asked the child to describe the toy’s location. The location was selected to be relatively easy to describe, as it was directly between two landmarks. Children’s use of landmarks to describe the location increased with age: 0% of 3-year-olds, 19% of 4-year-olds, 29% of 5-year-olds, and 71% of 6-year-olds. Of most interest was performance by 5-year-olds, as only a few of them described the landmarks, but as a group they performed above chance on the recall task (this effect held even when analyzing only children who did not use verbal descriptions). The authors concluded that verbal encoding was not necessary for the development of reference frame selection, but noted that a single trial was not a comprehensive method for testing whether language could support intrinsic reference frame selection.
In this paper, we investigated the influence of spatial language on children’s use of an intrinsic reference frame during recall. We extend the literature not only by testing the effects of language on a spatial skill that has received limited attention, but also by contributing to the debate on the effects of language on spatial skills more generally. Some researchers have theorized that language is necessary for the development of advanced spatial cognitive skills (e.g., Hermer-Vazquez et al., 2001; Pyers et al., 2010) while other theorists have argued against a central role for language (Learmonth, Newcombe, Sheridan, & Jones, 2008; Nardini et al., 2006; Ratliff & Newcombe, 2008). Proponents of these contrasting perspectives have approached their research using different methodologies. The former perspective has provided evidence through giving children verbal cues during spatial tasks (e.g., Shusterman et al., 2011) and through assessing whether production of spatial terms outside of the spatial task context correlates with spatial performance (e.g., Hermer-Vazquez et al., 2001). The latter perspective has tested children at a particular age level who are believed to lack the necessary words to solve the spatial tasks or who show no evidence of verbal encoding during the task (Learmonth et al., 2008; Nardini et al., 2006). We combine these approaches by providing children with verbal cues to support selection of an intrinsic reference frame in recall (similar to the former perspective) and by assessing children’s spatial language production to describe the spatial task space (similar to the latter perspective).
We used a recall task similar to Nardini et al. (2006) to evaluate children’s reference frame selection and spatial descriptions. We tested 4-year-olds because this age was just prior to the reliable use of the intrinsic reference frame in Nardini et al.’s study and because multiple studies have shown relations between spatial skills and language at 4 years of age (Dessalegn & Landau, 2008, 2013; Loewenstein & Gentner, 2005; Pruden et al., 2011; Shusterman et al., 2011). Experiment 1 tested whether verbal encoding of spatial relations among objects contributed to children’s selection of an intrinsic reference frame during recall. To address this question, Experiment 1a examined: (1) whether providing children with verbal cues that specified relations among objects on the testing array helped children select an intrinsic reference frame; and (2) whether children’s abilities to produce accurate verbal descriptions of spatial relations predicted recall performance on trials depending on the intrinsic reference frame, in the absence of verbal cues from the experimenter. Experiment 1b tested the effect of providing verbal cues before (rather than during) the recall task. Experiment 2 investigated whether visual cues provided comparable support to children’s performance as the verbal cues in Experiment 1a.
Experiment 1a tested whether verbal encoding of spatial locations could support children’s selection of an intrinsic reference frame during recall. We tested this question in three ways with the current experiment. First, we administered a recall task similar to Nardini et al. (2006) and evaluated whether providing 4-year-olds with location-descriptive verbal cues would enhance their intrinsic reference frame use in recall (compared to a control condition with non-descriptive cues). Second, we examined children’s descriptions of the locations within our testing array to assess the types of cues children may spontaneously verbally encode, and we compared individual differences in these descriptions for children in the control condition to recall performance using the intrinsic reference frame. Finally, as a manipulation check, we administered a comprehension task to ensure that children understood the location-descriptive verbal cues. We hypothesized that children who received location-descriptive verbal cues during the recall task would perform better than those who did not because these cues provide the necessary information to verbally encode object locations relative to an intrinsic reference frame. Furthermore, this benefit should be most pronounced on trials that rely most on the intrinsic reference frame (through misalignment with other reference frames). Additionally, we hypothesized that at the group level, children would not provide much task-relevant language to describe locations in the testing array but that individual differences in children’s production of task-relevant language would predict their recall performance (in the control condition), parallel to the predicted effect of location-descriptive verbal cues. Lastly, we predicted that children would successfully understand the verbal cues in the comprehension task.
Thirty-nine 4-year-olds (M = 4.44 years, SD = 0.24 years, 21 girls and 18 boys) participated. An additional 15 children participated but were excluded for parental interference during the language production task (7, described further below), non-compliance during testing (4), ending before completing all tasks (3), or recall performance more than 2.2 standard deviations below the mean (1). Participants were recruited from a research-affiliated database. Parents provided informed consent in accordance with the university’s Institutional Review Board. All participants received a small toy after participation.
The test apparatus included an array of cups and stuffed animal landmarks situated on a short rotating table, shown in Figure 1. The rotating table was 1.0 m in diameter and 0.4 m in height. The hiding locations were five cups (three red and two yellow; 9 cm diameter, 11 cm tall) placed on blue plates (8 cm diameter) aligned in a pentagon shape on the table. The alignment of cups in a pentagon shape was designed to aid children in identifying the spatial relations among the cups. Four small stuffed animals (approximately 12 cm tall and 13 cm wide at the base; see Figure 1) familiar to children (dog, cow, pig, and frog) were used as landmarks. The placement of the landmarks near the cups was chosen to facilitate verbal description of the hiding location (see Figure 1).
Four X-marks (20 cm per side) were formed with duct tape on the floor of the testing room, centered along each curtained wall of the testing space with gaps of approximately 55 cm to the curtain and 25 cm to the edge of the table (see Figure 2). Two X-marks indicated the position for the participant to stand during hiding and search events; the other two X-marks provided symmetry so that participants could not use the marks as side-distinctive landmarks. A small toy (approximately 4 cm tall and 2 cm wide) served as the target for hiding events. Solid blue curtains hung from ceiling to floor surrounding the task space to block external landmarks in the room. The rotating table was centered within this task space, approximately 1 m from each side of the curtained wall. A small digital video camera was mounted on the ceiling directly above the table to record the sessions.
The experiment involved three tasks: recall, production, and comprehension. Children completed the production task first, followed by the recall task, and then the comprehension task. This order avoided influence of the cues provided by the experimenter (in the comprehension task in both conditions, and the recall task in the descriptive cues condition) on performance in the other tasks. Additionally, the production task was administered first to allow assessment of children’s descriptions when they had limited experience in the task space.
The recall task included two manipulations: a rotation manipulation (within-subjects) and a verbal cues manipulation (between-subjects). The rotation manipulation tested children’s use of the intrinsic reference frame for recall, following Nardini et al. (2006; see also Simmering, Miller, & Patterson, 2011). There were four types of trials, shown in Figure 2, which involved the following manipulations during the delay: 1) the participant remained in the same position and the table remained stationary (neither-move); 2) the participant walked 90° to their right while the table remained stationary between hiding and search (child-move); 3) the participant walked 90° to their right and the table rotated 90° counter-clockwise (i.e., in the same direction; both-move); and 4) the participant remained in the same position while the table rotated 90° clockwise (table-move). Note that, in all cases, the relative positions among the cups and landmarks on the table (i.e., the intrinsic reference frame) did not change; only the alignment of the configuration relative to the room and/or child was manipulated.
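The logic of the four trial types can be made explicit by asking, for each type, which reference frames remain aligned between hiding and search. The following is a minimal sketch under simplifying assumptions that are ours rather than part of the original design description: rotations are expressed in degrees with counter-clockwise as positive (viewed from above), walking 90° to the right around the table is treated as a +90° change in the child's position, and "egocentric alignment" is taken to mean that the child sees the array from the same relative viewpoint. The names `TrialType` and `frame_alignment` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrialType:
    name: str
    child_turn_deg: int  # child's movement around the table (counter-clockwise positive)
    table_turn_deg: int  # table rotation (counter-clockwise positive)


# The four rotation types described in the text.
TRIAL_TYPES = [
    TrialType("neither-move", 0, 0),
    TrialType("child-move", 90, 0),
    TrialType("both-move", 90, 90),
    TrialType("table-move", 0, -90),
]


def frame_alignment(trial):
    """Return which reference frames stay aligned from hiding to search."""
    # Room-centered frame: aligned only if the table did not rotate in the room.
    room = trial.table_turn_deg % 360 == 0
    # Egocentric frame (assumed): aligned if the array did not rotate relative to the child.
    egocentric = (trial.table_turn_deg - trial.child_turn_deg) % 360 == 0
    # Intrinsic frame: cups and landmarks never move relative to each other.
    return {"intrinsic": True, "egocentric": egocentric, "room": room}


for trial in TRIAL_TYPES:
    print(trial.name, frame_alignment(trial))
```

Under these assumptions, only the table-move trials leave the intrinsic frame as the sole aligned frame, which is why they serve as the critical test of intrinsic reference frame selection.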
We counterbalanced the four rotation types such that half of the participants completed the task in the forward order (neither-move, table-move, child-move, both-move) and half completed the task in the backward order (both-move, child-move, table-move, neither-move). This order alternated between relative levels of difficulty (Nardini et al., 2006; a pattern replicated by Simmering et al., 2011 with the apparatus used here). Note that the intrinsic reference frame could be used in all rotation types, but the table-move trials depended most on it because other reference frames (egocentric and room-centered) are misaligned between hiding and search. The hiding locations for each participant were pseudo-randomly assigned prior to the experiment with the constraint that, for each rotation type, one trial used the single red cup (location 1), one trial used one of the two yellow cups (location 2 or 5), and one trial used one of the two neighboring red cups (location 3 or 4; see Figure 1). The same location was never used on consecutive trials.
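The counterbalancing and location constraints described above can be illustrated with a short sketch. It is not the procedure actually used to generate the assignments; it assumes (our reading of the text) three trials per rotation type presented in blocks, a random order of the three location groups within each block, and simple rejection sampling to enforce the rule that no location repeats on consecutive trials. The function name `make_schedule` and its parameters are hypothetical.

```python
import random

FORWARD = ["neither-move", "table-move", "child-move", "both-move"]
# One trial per group within each rotation type: the single red cup,
# one of the two yellow cups, and one of the two neighboring red cups.
LOCATION_GROUPS = [(1,), (2, 5), (3, 4)]


def make_schedule(order="forward", rng=None, max_tries=1000):
    """Return a list of (rotation_type, hiding_location) pairs for one child."""
    if order not in ("forward", "backward"):
        raise ValueError("order must be 'forward' or 'backward'")
    rng = rng or random.Random()
    rotation_order = FORWARD if order == "forward" else FORWARD[::-1]
    for _ in range(max_tries):
        schedule = []
        for rotation in rotation_order:
            # Pick one location from each group, then shuffle within the block (assumed).
            locations = [rng.choice(group) for group in LOCATION_GROUPS]
            rng.shuffle(locations)
            schedule.extend((rotation, loc) for loc in locations)
        # Accept only schedules in which no location is used on consecutive trials.
        if all(a[1] != b[1] for a, b in zip(schedule, schedule[1:])):
            return schedule
    raise RuntimeError("no valid schedule found; increase max_tries")


for rotation, location in make_schedule("forward", random.Random(1)):
    print(rotation, location)
```

Passing a seeded `random.Random` instance makes a given child's schedule reproducible, which is convenient when assignments are prepared before the session.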
For the verbal cues manipulation, children were randomly assigned to either the descriptive condition or the control condition. In the descriptive condition, the experimenter provided verbal cues describing the location of the target object relative to the nearby landmark(s) on each trial (listed in Table 1). In the control condition, the experimenter did not specify the location of the target object relative to the nearby landmark(s) (i.e., “I’m hiding the toy here”); the same cues were given on each trial in this condition.
The procedures for the recall task are explained first, as this was the central task, followed by the production and the comprehension tasks (note that children performed the production task before the recall task).
Before the recall task, children chose whether the caregiver stayed with them during the task; if not, the caregivers returned to a waiting area outside of the curtained space. When a caregiver stayed, the experimenter ensured that s/he stood in a position outside of the child’s view (in the corner of the task space to the left and slightly behind the child). On trials in which the child changed position within the task space, the experimenter and caregiver moved with the child to maintain the same relative positions, to ensure that they could not serve as landmarks.
The experimenter explained the task as follows: “In this game, I am going to hide the toy in one of these cups and then you will turn around, and we’ll count to 10 together; when we say 10, you will turn back around and find the toy.” The task began with a practice trial in which the experimenter hid the toy in the single red cup (location 1) closest to the child and then had the child turn around, so that they could not see the array for 10 seconds. During the delay, the experimenter and child counted to 10. After the delay, the experimenter instructed the child to turn back around and find the toy. The child was encouraged to try to find the toy with his or her first search, but was allowed to keep searching until finding the toy. Eighty-five percent of children correctly found the toy with their first search on the practice trial; the practice trial was repeated for children who selected the incorrect location.
Next the child completed the four rotation types (see Figure 2) in either the forward or backward order described above. Before each trial, depending on the rotation type, the experimenter explained to the child whether he or she would move to the other X mark and/or the table would rotate. On neither-move trials, the procedure was identical to the practice trial (i.e., no changes in position during the delay). On child-move trials, the experimenter guided the participant 90° to the right around the table to stand on the second X-mark during the delay. On both-move trials, the experimenter rotated the table 90° counterclockwise and then helped the child walk 90° to their right (the same direction as in the child-move trials) during the delay. These two changes resulted in the child having the same view of the table during hiding and search (but from a different location in the room; see Figure 2). On table-move trials, the experimenter rotated the table 90° clockwise during the delay (the opposite direction of the both-move trials; see Figure 2). On all trials when the child moved to a new location (i.e., child- and both-move), the child was asked to keep his or her eyes closed while walking, and the experimenter blocked the child’s view of the table with a clipboard held next to his or her head. On all trials when the table moved (i.e., table- and both-move), the experimenter reminded the child after the delay and before they searched for the toy that the table had rotated. At the end of each trial, the experimenter had the child move back to their original position and/or rotated the table back to its original position in the child’s view.
The experimenter provided verbal cues while hiding the toy. In the descriptive condition, the experimenter described the hidden toy’s location relative to its proximity to the nearest landmark(s) (e.g., “The toy is hidden by the frog”; see Table 1). Children repeated the location descriptions provided by the experimenter to ensure that they encoded the description. In the control condition, the experimenter hid the toy while saying “Look, I am hiding the toy here” and then had the child turn around for the delay.
For the production task, the experimenter instructed the caregiver not to talk to the child about the task or materials, explicitly instructing them to avoid discussing the cups or the stuffed animal landmarks during the session. Additionally, the experimenter highlighted the importance of control in the study and explained that we are interested in children’s spontaneous responses. Caregivers were told they could prompt the child to give more clues (e.g., by saying “can you tell me more?”), but that they should not give directive prompting that would help the child in describing the toy’s location (e.g., not asking “what color is the cup?” or “what animal is it by?”). In some cases, a caregiver did not follow these instructions and the children were excluded from analyses (see Participants section) due to the possible influence on children’s performance.
At the beginning of the production task, the experimenter explained the instructions to the child (with the appropriate term substituted for “mom” if it was not the mother who accompanied the child): “In this game your mom will turn around so that she cannot see the table. I will hide a toy in one of these cups and then we will have your mom turn back around. Then you will use your words and help your mom find the toy. In this game, you want to help your mom so she can find the toy.” The experimenter instructed the child to use their words and not to point. If the child did not give a sufficiently descriptive response (e.g., “it’s under a cup”), the experimenter prompted the child up to two times by saying “do you have anything else to say to help your mom?” and then the caregiver was told to guess the location. The child completed five trials, one for each hiding location, in a randomized order.
After the recall task, children completed a comprehension assessment as a manipulation check. The experimenter instructed the child to turn away from the table, and then hid the toy under a cup while out of the child’s view. After hiding the toy, the experimenter instructed the child to turn back around and gave the child a descriptive verbal cue (e.g., “the toy is hidden by the dog and cow”; see Table 1). If the child did not find the toy with his or her first search, they were encouraged to keep searching. Each child completed five trials in a random order, one for each hiding location.
The child’s responses in the production task were transcribed online by a second experimenter. All transcripts were later checked (from video recordings) by a different research assistant and corrected if any inconsistencies were found. The children’s responses were coded by three trained research assistants; each research assistant coded data from approximately one third of the participants. For each trial, the coders reported whether the child mentioned (A) the color of the hiding location, (B) the name of the stuffed animals that served as landmarks, and (C) relational terms (e.g., by, next to, between) to describe the object’s location. Approximately twenty percent of the data (i.e., nine participants) were double-coded for reliability; coders agreed on 96% of trials and disagreements were resolved by the first author.
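To make the three coding categories concrete, the sketch below tags a transcribed utterance for color, landmark, and relational terms. It is only an illustration: the coding reported here was done by trained research assistants, and a keyword match cannot judge whether a mention is accurate or contextually appropriate. The term lists are assumptions drawn from the materials (red and yellow cups; dog, cow, pig, and frog landmarks) and from the example relational terms given above; the function name `code_description` is hypothetical.

```python
import re

# Assumed term lists based on the apparatus and the examples in the text.
COLOR_TERMS = {"red", "yellow"}
LANDMARK_TERMS = {"dog", "cow", "pig", "frog"}
RELATION_TERMS = {"by", "next to", "between"}


def code_description(utterance):
    """Flag whether an utterance mentions color, landmark, or relational terms."""
    words = re.findall(r"[a-z]+", utterance.lower())
    # Pad with spaces so that whole words (and the phrase "next to") can be matched.
    text = " " + " ".join(words) + " "

    def mentions(terms):
        return any(f" {term} " in text for term in terms)

    return {
        "color": mentions(COLOR_TERMS),
        "landmark": mentions(LANDMARK_TERMS),
        "relation": mentions(RELATION_TERMS),
    }


print(code_description("It's under a cup"))
print(code_description("It's in the red one next to the frog"))
```

An utterance such as “it’s under a cup” yields no codes in any category, which corresponds to the insufficiently descriptive responses that triggered a prompt in the production task.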
During the recall and comprehension tasks, the second experimenter recorded the locations and order of the cups that the child searched under. Approximately twenty percent of the sessions (i.e., nine participants) were blind coded from video by a different research assistant; the blind coded responses matched the originals on 94% of the trials and disagreements were resolved by the third author.
In this section, we describe the results of the recall task followed by the production task, then the comprehension task. First, we tested whether there was an effect of the verbal cues manipulation on performance in the recall task. Next, we evaluated the types of cues that children provided in the production task, assessing whether individual differences in production predicted performance in the recall task (control condition only). Lastly, we considered performance in the comprehension task to ensure that children could correctly interpret the verbal cues.
As the dependent measure, we calculated the proportion of trials on which the participant selected the correct cup as their first choice; mean proportion correct is shown across rotation types and verbal cue conditions in Figure 3. As a preliminary analysis, to test for sex differences, we conducted a two-sample t-test on mean proportion correct and found no significant difference (p = .79); thus, we excluded this factor from further analysis.
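The dependent measure can be sketched as follows for a single child. The trial records are invented purely for illustration, and the record layout (rotation type, hidden location, first searched location) is an assumption rather than the format used in the study. Chance is taken as 1/5 because there were five cups.

```python
from collections import defaultdict

CHANCE = 1 / 5  # five possible hiding locations


def proportion_correct(trials):
    """Proportion of trials per rotation type on which the first search was correct.

    trials: iterable of (rotation_type, hidden_location, first_search_location).
    """
    hits = defaultdict(int)
    counts = defaultdict(int)
    for rotation, hidden, first_search in trials:
        counts[rotation] += 1
        hits[rotation] += int(hidden == first_search)
    return {rotation: hits[rotation] / counts[rotation] for rotation in counts}


# Hypothetical data for one child (not from the study).
example_trials = [
    ("neither-move", 1, 1), ("neither-move", 5, 5), ("neither-move", 3, 3),
    ("table-move", 1, 1), ("table-move", 2, 5), ("table-move", 4, 3),
]
for rotation, proportion in proportion_correct(example_trials).items():
    print(rotation, round(proportion, 2), "above chance" if proportion > CHANCE else "at or below chance")
```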
As shown in Figure 3, our results followed the qualitative pattern across rotation types found by Nardini et al. (2006) and Simmering et al. (2011), with the highest performance on neither-move trials, followed by child-move, both-move, and table-move trials. Furthermore, similar to Nardini et al., the 4-year-olds in our control condition did not perform significantly above chance on table-move trials (see 95% confidence intervals in Figure 3). Most notably, children’s performance was generally higher in the descriptive condition.
We conducted a repeated measures two-way ANOVA on mean proportion correct with rotation type (neither-, child-, both-, table-move) as a within-subject factor and verbal cues condition (descriptive, control) as a between-subject factor. There were significant main effects of rotation type (F(3,111) = 13.82, p < .001, ηp2 = .272) and verbal cues condition (F(1,37) = 15.15, p < .001, ηp2 = .290), and a marginal two-way interaction (F(3,111) = 2.64, p = .052, ηp2 = .066). A Tukey post-hoc test (p = .05) following up on the rotation type main effect revealed that the children performed significantly better on the neither-move rotation type (M = .83, SD = .20) than on the other rotation types: child-move (M = .66, SD = .32), both-move (M = .54, SD = .35), and table-move (M = .48, SD = .36). Additionally, children performed significantly better in the child-move rotation type than in the table-move. The main effect of verbal cues is evident in Figure 3, with better performance in the descriptive (M = .75, SD = .29) versus control (M = .52, SD = .34) conditions.
We were particularly interested in whether providing children with location-descriptive verbal cues specifically supported performance on trials that depend most on the intrinsic reference frame. We tested this by conducting an independent-samples t-test comparing performance on the table-move rotation type across conditions. As predicted, we found significantly better performance in the descriptive condition (t(37) = 4.07, p < .001, ηp2 = .309; see table-move bars in Figure 3). Thus, providing children with verbal cues describing the location of the nearest landmark(s) relative to the hiding location helped them select an intrinsic reference frame during the recall task.
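As a check on the reported effect sizes, partial eta squared can be recovered from a test statistic and its degrees of freedom using the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error), with F = t² and df_effect = 1 for a t-test. The sketch below applies this identity to the statistics reported above; because the inputs are already rounded, the results may differ from the reported values in the third decimal place. The function names are hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def partial_eta_squared_from_t(t_value, df_error):
    """Same quantity for a t-test, using F = t**2 with one effect degree of freedom."""
    return partial_eta_squared(t_value ** 2, 1, df_error)


# Test statistics as reported in the text.
effects = {
    "rotation type": partial_eta_squared(13.82, 3, 111),
    "verbal cues condition": partial_eta_squared(15.15, 1, 37),
    "rotation x condition": partial_eta_squared(2.64, 3, 111),
    "table-move t-test": partial_eta_squared_from_t(4.07, 37),
}
for label, value in effects.items():
    print(f"{label}: {value:.3f}")
```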
In this section, we analyzed the types of cues children provided in the production task and assessed whether their production predicted recall performance. We first examined whether there were differences in the rate at which children mentioned color, landmarks, and relational terms. This analysis allowed us to assess the types of cues that children may verbally encode during the recall task. Additionally, we analyzed whether there were condition differences in the rates at which children mentioned these types of terms to examine whether there were a priori differences across the conditions that may have contributed to differences in recall performance (i.e., differences not accounted for by the verbal cues manipulation). Finally, we compared children’s production of landmark and relational terms to their recall performance on table-move trials to test for the predicted correlation across tasks.
Three children (one from the descriptive and two from the control conditions) were excluded from this analysis for providing no data (i.e., they did not speak during any trial of the production task), resulting in a sample size of 36 participants. As a preliminary analysis, we used independent-samples t-tests to assess whether there were sex differences in the rate at which children mentioned color, landmark, or relational terms and found no significant difference (ps > .312).
Next, we examined whether there were differences in the proportion of trials on which children mentioned the various types of cues (color, landmarks, relational terms), collapsing across conditions. A Friedman test showed that children mentioned color (M = .66, SD = .43) more often than landmarks (M = .44, SD = .43) or relational terms (M = .41, SD = .40), χ2(2, N = 37) = 7.6, p = .022, but landmarks and relational terms did not differ. In our testing apparatus (see Figures 1 and 2), encoding color more frequently would not support recall performance as well as encoding the landmarks because color was not a unique cue. This low rate of children’s mentioning of landmark and relational terms may account for some of the limitations in children’s spontaneous use of verbal encoding to support intrinsic reference frame selection.
We also assessed whether there were a priori differences across conditions in children’s mentioning of color, landmark, and relation terms. We found no differences in frequency of mentioning landmarks or relational terms (ps > .494), and a marginal difference for color terms (t(34) = 1.98, p = .055) such that children in the descriptive condition (M = .80, SD = .37) mentioned color terms more often than did children in the control condition (M = .53, SD = .44). As noted above, color is not a term that could strongly support recall performance, suggesting that the condition effect in the recall task (described above) could not be explained through differences in the children assigned to each condition.
Lastly, and most central to our hypotheses, we evaluated whether children’s performance in the production task predicted their recall performance in the control condition. We were particularly interested in the table-move rotation type, which relied most heavily on selecting an intrinsic reference frame. To test whether production and recall performance were related, we calculated the number of trials on which each child mentioned each type of cue (color, landmark, relation term) correctly and compared each of these separately to their proportion correct on table-move trials. A Pearson’s correlation (using one-tailed p-values) showed that mentioning relation terms in the production task significantly predicted performance on the table-move trials of the recall task (r(17) = .409, p = .045). The correlation with mentioning landmarks also approached significance (r(17) = .376, p = .062), but the correlation with color terms did not (r(17) = −.026, p = .459). These results suggest that children’s production of the information most relevant for identifying the intrinsic reference frame—the relations among landmarks and cups—related most strongly to their use of an intrinsic reference frame during recall.
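For readers who wish to check the correlations above, the standard conversion from a Pearson correlation to a t statistic, t = r√(df / (1 − r²)) with df = n − 2, can be sketched as follows. The function name is illustrative. Only the r values and degrees of freedom reported above are used as inputs. The sketch stops at the t statistic because the Python standard library provides no t distribution; p-values would have to be obtained from a table or a statistics package.

```python
import math


def t_from_r(r, df):
    """Convert a Pearson correlation r to its t statistic, where df = n - 2."""
    return r * math.sqrt(df / (1.0 - r * r))


# Inputs are the correlations reported in the text, each with 17 degrees of freedom.
t_relational = t_from_r(0.409, 17)
t_landmark = t_from_r(0.376, 17)
t_color = t_from_r(-0.026, 17)

print(round(t_relational, 2), round(t_landmark, 2), round(t_color, 2))
```

Because df = n − 2, the reported r(17) implies that 19 children contributed to each correlation.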
The purpose of this task was to ensure that children could interpret the verbal cues provided during the recall task in the descriptive condition, in case our manipulation was not effective. Given the positive effect of location-descriptive cues on the recall task, we also asked whether there was any carry-over to the comprehension task. We calculated the proportion of trials on which children selected the correct cup as their first choice and found that children performed well above chance (.20) on the comprehension task (M = .79, 95% CI [.72, .86]). Next, we compared performance between conditions from the recall task (descriptive, control) to see if exposure to verbal cues during recall influenced performance on the comprehension task. Recall that children’s experience in the comprehension task did not differ between conditions, but children in the descriptive condition heard the same cues in both tasks. An independent-samples t-test showed no significant differences between the descriptive (M = .85, 95% CI [.75, .95]) and control (M = .74, 95% CI [.63, .85]) conditions (p = .138).
Experiment 1a assessed whether verbal encoding could support children’s performance in selecting an intrinsic reference frame for spatial recall. Our first goal was to test the effect of providing 4-year-olds with location-descriptive verbal cues during the recall task on their intrinsic reference frame selection. As predicted, children in the descriptive condition performed better than those in the control condition, both overall and specifically on the table-move trials. These results show that helping children verbally encode the spatial locations relative to landmarks facilitated their overall recall performance and was particularly helpful for selecting an intrinsic reference frame during recall.
The second goal was to assess whether children’s production of spatial language was related to their selection of the intrinsic reference frame during recall. The results showed that 4-year-olds as a group produced more color terms, which were not as descriptive in this task array (see Figures 1 and 2), than landmark or relational terms. This may explain the relatively poor performance of 4-year-olds in selecting an intrinsic reference frame in general (Nardini et al., 2006; Simmering et al., 2011). Additionally, the results showed that children who used more relational terms to describe the toy’s location performed better on the trials of the recall task that relied most on the intrinsic reference frame (table-move trials). These findings suggest that verbally encoding spatial relations supports the use of an intrinsic reference frame during recall and could be the source of the developmental changes found by Nardini et al. (2006).
The final goal was to assess whether children successfully comprehended the verbal cues used in the descriptive condition of the recall task. We predicted that 4-year-olds as a group would use these cues to find the toy even when they did not see it being hidden, and our prediction was confirmed. We also found no difference in performance following the recall task using location-descriptive versus non-descriptive verbal cues. This lack of difference is partly attributable to children’s overall high level of performance, which did not leave much room for improvement relative to the control condition.
Overall, the results of Experiment 1a showed that verbally encoding the hidden object’s location relative to landmarks supported children’s selection of an intrinsic reference frame in recall, and that individual differences in children’s ability to describe relations in the task array predicted selection of an intrinsic reference frame during recall. These findings are consistent with previous research showing that verbally encoding task-relevant spatial information can enhance young children’s spatial task performance (Dessalegn & Landau, 2008; Loewenstein & Gentner, 2005; Shusterman et al., 2011), but do not address the mechanism by which language supports intrinsic reference frame selection. The positive effects of providing descriptive verbal cues could be due to: (a) supplying a verbal code that children otherwise would not have accessed to encode the spatial relations (e.g., children’s relatively under-developed use of the term “by”; Hund & Plumert, 2007) or (b) leading children to verbally encode the location on every trial by requiring them to repeat the experimenter-provided cues. Experiment 1b was designed to differentiate these possibilities, to understand the mechanism by which language skills relate to reference frame selection in recall.
In Experiment 1b, we tested whether children would benefit from location-descriptive cues if these cues were provided before rather than during the recall task. This allowed us to evaluate whether the results from Experiment 1a depended on verbal encoding on every trial, or if simply exposing children to a description of the relevant relations in advance could support their recall performance. Our manipulation was similar to that of Loewenstein and Gentner (2005), who showed that exposing preschool-aged children to relational language prior to a relational mapping task helped them perform better in the relational mapping task. Thus, we predicted that exposing 4-year-olds to the location-descriptive cues in the comprehension task would support selection of an intrinsic reference frame during the subsequent recall task, therefore leading to improved performance relative to the control condition from Experiment 1a.
Twenty 4-year-olds (M = 4.44, SD = 0.22, 16 girls and 4 boys) participated in the study. An additional four children participated but were excluded from analyses for not completing the task (3) or non-compliance (1). Participants were recruited using the same methods as in Experiment 1a.
The materials, design, and procedure were identical to the control condition of Experiment 1a, except that participants completed the comprehension task first, followed by the recall task. Twenty percent of the sessions (i.e., four participants) were blind coded from video by a different research assistant for reliability; the blind coded responses matched the original on 98% of the trials and disagreements were resolved by the third author.
As in Experiment 1a, we calculated the mean proportion of correct first searches in each rotation type, shown in Figure 4. An independent samples t-test showed no significant sex differences in performance (p = .100), thus we excluded this factor from further analyses. Children’s performance followed the same qualitative pattern across rotation types as in Experiment 1a (see also Nardini et al., 2006; Simmering et al., 2011), but overall performance was higher than in the control condition of Experiment 1a, with all means significantly above chance. A one-way ANOVA with rotation type as a within-subjects factor showed a significant main effect (F(3, 57) = 13.23, p < .001, ηp2 = .410). Follow-up Tukey HSD tests (p < .05) revealed that performance was significantly higher on neither- and child-move trials than on both- and table-move trials (see Figure 4), with no other differences.
To test whether hearing location-descriptive verbal cues prior to the recall task helped children select the intrinsic reference frame, we conducted a two-way ANOVA on mean proportion correct with experiment (1a-control, 1b) as a between-subjects factor and rotation type as a within-subjects factor, and only report effects of experiment. This analysis revealed a significant main effect of experiment (F(1, 38) = 6.49, p = .015, ηp2 = .146), with higher performance in 1b (M = .65, SD = .29) than in 1a-control (reported above); the interaction was not significant (p = .459). Similar to Experiment 1a, we also conducted a planned independent samples t-test on proportion correct across experiments (1a-control, 1b) for the table-move rotation type only. As predicted, we found that exposure to the location-descriptive verbal cues during the comprehension task significantly improved children’s performance on the table-move trials compared to no exposure (t(38) = 2.13, p = .039, ηp2 = .107; cf. table-move bars in Figures 3 and 4).
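The effect sizes reported here can be recovered from the test statistics with the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error), where a t statistic is treated as F = t² with one effect degree of freedom. The following is a minimal sketch of that arithmetic; the function name is illustrative and the inputs are the values reported above.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of experiment: F(1, 38) = 6.49
eta_main_effect = partial_eta_squared(6.49, 1, 38)

# Planned t-test on table-move trials: t(38) = 2.13, treated as F(1, 38) = t squared
eta_table_move = partial_eta_squared(2.13 ** 2, 1, 38)

print(round(eta_main_effect, 3), round(eta_table_move, 3))
```

Rounded to three decimals, both results agree with the effect sizes given in the text (.146 and .107).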
Although we had no specific predictions regarding children’s performance in the comprehension task in the current experiment, we can compare across experiments to assess whether the order of presentation (before versus after recall) significantly affected performance. The level of performance on the comprehension task in the current experiment (M = .79, 95% CI [.72, .86]) was roughly in between children’s performance in the two conditions from Experiment 1a. A one-way ANOVA comparing these three conditions (1a-descriptive, 1a-control, 1b) showed no significant effect (p = .248).
Experiment 1b assessed whether pre-exposure to location-descriptive verbal cues would facilitate 4-year-olds’ selection of the intrinsic reference frame in the subsequent recall task. As predicted, children’s recall performance, both overall and on table-move trials only, was better with pre-exposure (Experiment 1b) relative to no exposure (Experiment 1a-control). Results across these experiments support our hypothesis that children can use verbal encoding to enhance their use of an intrinsic reference frame in recall if they can access task relevant spatial information through language. However, it is possible that the verbal cues did not lead to verbal encoding, but rather simply drew children’s attention to the relevant visual information, which was then visually encoded. We tested this possibility in Experiment 2 by attempting to draw children’s attention to the landmarks non-verbally.
The results from Experiment 1 showed that children’s selection of an intrinsic reference frame for recall was predicted by their production of relational terms in the task space, and could be supported by providing location-descriptive cues before or during recall. The similar results between Experiments 1a and 1b indicate that children do not need to hear and repeat the cues on every recall trial to benefit, raising the question of whether verbal encoding is the mechanism by which our language manipulations improved their performance. An alternative explanation could be that the experimenter’s verbal description of the landmarks and relations simply drew children’s attention to information that could help them perform the task. Similarly, our correlational results could reflect children’s attention to the landmarks and relations across both production and recall: the children who did not verbally describe these features of the task space may lack the relevant verbal information (especially relational terms; cf. Hund & Plumert, 2007) or may not attend to the relevant visual information. In either of these cases, the experimenter’s location-descriptive verbal cues in Experiments 1a and 1b would improve performance by providing the verbal terms and/or by drawing attention to the visual information.
Experiment 2 tested whether drawing children’s attention to spatial relations by visually highlighting relations among objects during encoding would support 4-year-olds’ selection of the intrinsic reference frame in the recall task. We used the same recall task as in Experiment 1a, but rather than providing verbal cues during the hiding event, we visually highlighted the relevant features of the array by lifting the landmark(s) closest to the hiding location, moving them on top of the hiding location, and then returning them to their original location (see Figure 1). This visual highlighting manipulation was intended to parallel the location-descriptive verbal cues condition in Experiment 1a: like the verbal cues describing the hiding location relative to the nearest landmark(s), it visually drew attention to the relevant landmark(s) and showed the relation between the landmark(s) and the nearest cup. If children’s attention to these relations (rather than verbal encoding) helped them select an intrinsic reference frame in Experiment 1a, then exposure to non-verbal highlighting of the relations should improve children’s performance similarly. However, if the location-descriptive language supported verbal encoding, then non-verbally highlighting the relations should be less effective.
Twenty 4-year-olds (M = 4.43, SD = 0.16, 4 girls and 16 boys) participated in the study. One additional child participated but was excluded for non-compliance. Participants were recruited through the same methods as in Experiment 1.
The materials and recall task were identical to Experiment 1 with one exception: during the recall task, all children received the visual highlight manipulation. On each trial of the recall task, the experimenter highlighted the location of the nearby landmarks without using verbal cues. The experimenter lifted the landmark corresponding to the hiding location (e.g., for hiding location 4, lifting the frog) and held it on top of the cup for approximately 3 seconds before replacing it on the table. The experimenter repeated this action twice to parallel the approximate length and repetition of the descriptive condition of Experiment 1a. Twenty percent of the sessions (i.e., four participants) were blind coded from video by a different research assistant for reliability; the blind coded responses matched the original on 98% of the trials and disagreements were resolved by the third author.
We again used proportion correct on the recall task across conditions as our dependent measure, shown in Figure 5. To test for sex differences, we conducted an independent-samples t-test and found no significant differences (p = .435) and thus excluded this factor from further analyses. As Figure 5 shows, children showed the same qualitative pattern across rotation types as before, but overall performance was slightly higher than in the control condition of Experiment 1a, with all means in the current experiment significantly above chance. A one-way ANOVA with rotation type as a within-subjects factor revealed a significant main effect (F(3, 57) = 13.38, p < .001, ηp2 = .413). Follow-up Tukey HSD tests (p < .05) showed that performance was significantly higher on neither- and child-move trials than on both- and table-move trials (see Figure 5); no other differences were significant.
To test whether visually highlighting relations among objects supported children’s selection of an intrinsic reference frame, we conducted a two-way ANOVA on mean proportion correct with experiment (1a-control and 2) as a between-subjects factor and rotation type as a within-subjects factor, and only report effects of experiment. The ANOVA revealed a significant main effect of experiment (F(1, 37) = 5.715, p = .028, ηp2 = .133), but no significant interaction (p = .270). The results showed that children in Experiment 2 (M = .63, SD = .31) performed significantly better than children in the control condition of Experiment 1a (see means above). Following Experiment 1, we also conducted a planned independent samples t-test on the table-move trials and found no significant difference (p = .186; cf. table-move bars in Figures 3 and 5). These findings suggest that visually highlighting relations among objects on the testing array increased children’s recall performance overall, but not selection of the intrinsic reference frame in particular.
Finally, we were interested in whether our visual highlight manipulation was as effective as the descriptive condition of Experiment 1a. We conducted a two-way ANOVA on mean proportion correct with experiment (1a-descriptive, 2) as a between-subjects factor and rotation type as a within-subjects factor, and report only effects of experiment. We found a significant main effect of experiment (F(1, 37) = 4.35, p = .043, ηp2 = .105), with overall higher performance in the descriptive condition of Experiment 1a, and a trend toward an interaction between experiment and rotation type (F(3, 111) = 2.26, p = .085, ηp2 = .057). As in prior analyses, we also conducted a planned independent samples t-test on the table-move trials only and found that performance was significantly higher in Experiment 1a than Experiment 2, t(37) = 3.05, p = .004. Thus, visually highlighting the landmarks was not as supportive of children’s performance as providing location-descriptive verbal cues, especially on trials that depend most on the intrinsic reference frame.
Experiment 2 tested whether visually highlighting the landmarks near the hiding location during encoding would support children’s intrinsic reference frame selection in recall. Comparison with the control group of Experiment 1a showed an overall benefit, but no significant difference on trials depending most on the intrinsic reference frame. Furthermore, comparison with the descriptive condition of Experiment 1a showed that verbal cues resulted in significantly higher performance, both overall and specifically on the trials depending most on the intrinsic reference frame. Thus, a visual analog to the verbal cues from Experiment 1a did not support children’s performance as well as the verbal cues did. These results lend further support to the explanation that verbal encoding drove the effects in Experiment 1.
The current studies examined verbal encoding as a potential mechanism underlying improvements in children’s abilities to select an intrinsic reference frame to recall object locations. Nardini et al. (2006) showed a transition in intrinsic reference frame selection between 4 and 5 years of age, but their “surprise trial” results suggested that verbal encoding was not driving this developmental change. These findings contrasted with research suggesting that children’s language abilities account for developmental changes on a range of spatial tasks (Dessalegn & Landau, 2008, 2013; Hermer-Vazquez et al., 2001; Loewenstein & Gentner, 2005; Pruden et al., 2011; Shusterman et al., 2011; Simms & Gentner, 2008). One possible explanation for this variation across tasks could be that language supports various types of spatial skills differently over development. For instance, spatial recall may rely less on language than object recognition (Dessalegn & Landau, 2008, 2013) or reorientation (Hermer-Vazquez et al., 2001). Alternatively, it could be the case that effects of language were not apparent in Nardini et al.’s task because their paradigm was not designed to test language specifically. Our studies used a similar recall task to Nardini et al. to examine whether language supports children’s use of an intrinsic reference frame in recall.
Experiment 1a showed that providing 4-year-olds with location-descriptive verbal cues on every trial of a spatial recall task improved recall performance relative to the control condition, both overall and specifically on trials that relied most on the intrinsic reference frame. Additionally, children’s use of relational language to describe hiding locations during an initial production task predicted their subsequent recall performance on trials that relied most on the intrinsic reference frame (in the control condition). These findings suggest that developmental changes in children’s verbal encoding of spatial relations could account for improvements in intrinsic reference frame selection for recall. Experiment 1b showed that pre-exposing children to location-descriptive cues in a comprehension task prior to the recall task increased children’s performance relative to the control condition from Experiment 1a, both overall and specifically on trials relying most heavily on the intrinsic reference frame. Taken together, the results from Experiments 1a and 1b show that if children are provided with the appropriate verbal cues to encode task-relevant spatial information, either before or during the recall task, they can use this language to support their selection of an intrinsic reference frame. These results are consistent with the hypothesis that verbal encoding can support children’s selection of an intrinsic reference frame, similar to the relation between language and other spatial skills (e.g., Shusterman et al., 2011).
Experiment 2 was designed to assess whether the effects in Experiment 1 were specific to language or reflected a more general increase in attention to relations within the array. We found that visually highlighting the landmarks improved children’s overall performance relative to the control condition, but not on trials that relied most on the intrinsic reference frame. Furthermore, compared with the descriptive condition from Experiment 1a, children’s performance in Experiment 2 was significantly lower, both overall and on the table-move trials specifically. These findings suggest that increasing children’s attention to locations on the testing array can promote recall performance, but may not be as effective as language in facilitating intrinsic reference frame selection. In the sections that follow, we consider why language was more effective than visual cues in supporting children’s use of the intrinsic reference frame and whether these effects of language generalize across all spatial skills or are specific to certain types.
Why did the location-descriptive verbal cues promote children’s selection of an intrinsic reference frame more than the visual highlight cues? It is possible that the verbal and visual cues differed only quantitatively (i.e., language was more salient) or they differed qualitatively (i.e., language changed how children approached the task). In line with the quantitative explanation, the location-descriptive verbal cues and visual highlight cues may have differed in the extent to which they drew children’s attention to relevant cues. If this is the case then using a different visual highlight manipulation may prove more effective. Perhaps using a manipulation similar to Learmonth et al. (2001, 2002), increasing the size and/or stability of the landmarks, might have improved children’s performance more. Another possibility is that our visual highlight manipulation did not draw children’s attention to the spatial relations specifically because moving the landmark temporarily disrupted this relation; this also suggests that a different type of visual highlight (e.g., pointing, drawing a line between the cup and landmark) could be more effective.
A more qualitative explanation would be that the location-descriptive verbal cues were more effective because only language can effectively convey relational or pragmatic information. Language can convey relational information that is non-apparent simply from perceptual information (Gentner, 2003), which may then support spatial cognition in qualitatively different ways. For example, children and adults who lack relational language tend to perform poorly on spatial tasks that require relational reasoning (Gentner, Özyürek, Gürcanli, & Goldin-Meadow, 2013; Pyers et al., 2010). Additionally, language may better reveal pragmatic information about the relevance of particular cues (Shusterman et al., 2011). In the context of our task, children may have interpreted our verbal descriptions as a way to perform the task better, whereas our visual highlight did not communicate the relevance of the landmarks cues for the recall task.
Additionally, verbal cues may lead children to encode and to hold in memory a different kind of information compared to visual cues. In our task, a child who has verbally encoded the hiding location only has to remember the verbal cue to the location and then use the cue to find the hidden object, regardless of the orientation of the array relative to the child. However, a child who has visually encoded the hiding location may have a mental “snapshot” of the array in memory, which then has to be updated to align with the changed viewpoint at search. This latter form of encoding would be most impaired on recall trials that rely most on the intrinsic reference frame. Overall, our results seem more consistent with verbal encoding, but we cannot definitively rule out a quantitative explanation.
One last question we consider is whether language facilitates performance on all types of spatial tasks, or whether language might be more important for certain tasks than for others. To date, the spatial tasks that have been shown to relate to language typically depend on representing relations among two or three objects with at least one object having a unique identity (e.g., Dessalegn & Landau, 2008; Hermer-Vazquez et al., 2001; Shusterman et al., 2011). For example, in the disorientation search task, one needs to remember the hiding location relative to one colored wall, and in our spatial reference frame task one could solve the task by remembering the hidden toy’s location relative to one or two landmarks. In these types of spatial tasks, verbal encoding is an effective strategy because there are many words one can use to encode the relations among the objects (e.g., by, next to, in front of, between). However, verbal encoding may be more difficult in certain tasks, such as the radial maze task (e.g., Aadland, Beatty, & Maki, 1985; Foreman, Warry, & Murray, 1990) or the task used by Nardini et al. (2006), where there are multiple identical objects in positions that are difficult to disambiguate verbally, or more distant landmarks with relations that are not easily described. Future research will be needed to examine the generalizability of verbal encoding as a strategy across different spatial tasks where verbally encoding simple relations among two or three objects is insufficient.
In conclusion, the results of the current investigation indicate that children’s verbal encoding of spatial relations contributes to their selection of an intrinsic reference frame in spatial recall. Language is a mechanism that supports many cognitive abilities over development, but may not be the only mechanism instrumental in the development of cognitive processes. In the future, it will be important to tease apart the effects of language and other cognitive processes when evaluating how language supports spatial cognition over development.
Thanks to the families who participated in this research, as well as the research assistants who aided in data collection and/or commented on manuscript drafts. Special thanks to Victoria Kay for copy editing before submission. Participant recruitment was funded by the Eunice Kennedy Shriver National Institute of Child Health & Human Development of the National Institutes of Health (R03-HD067481) and the Waisman Intellectual and Developmental Disabilities Research Center grant (P30HD03352). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Subsets of the data included here were presented at the 69th biennial meeting of the Society for Research in Child Development, the 53rd Annual Meeting of the Psychonomic Society, and the 8th Biennial Meeting of the Cognitive Development Society.