The purpose was to determine the number of semantic neighbors, namely semantic set size, for 88 nonobjects (Kroll & Potter, 1984) and determine how semantic set size related to other measures and age.
Data were collected from 82 adults and 92 preschool children in a discrete association task. The nonobjects were presented via computer, and participants reported the first word that came to mind that was meaningfully related to the nonobject. Words reported by two or more participants were considered semantic neighbors. The strength of each neighbor was computed as the proportion of participants who reported the neighbor.
Results showed that semantic set size was not significantly correlated with objectlikeness ratings or object decision reaction times from Kroll and Potter (1984). However, semantic set size was significantly negatively correlated with the strength of the strongest neighbor(s). In terms of age effects, adult and child semantic set sizes were significantly positively correlated and the majority of numeric differences were on the order of 0–3 neighbors. Comparison of actual neighbors showed greater discrepancies; however, this varied by neighbor strength.
Semantic set size can be determined for nonobjects. Specific guidelines are suggested for using these nonobjects in future research.
Word learning entails the formation of at least two new representations, namely lexical (whole word sound form as an integrated unit) and semantic (meaning, Gupta & MacWhinney, 1997). In addition, associations must be created between these new representations as well as between each new representation and existing similar representations of known words (Gupta & MacWhinney, 1997). Recent work suggests that the number of lexically similar real words influences word learning by adults, typically developing children, and children with delayed phonological development (Storkel, 2001, 2003, 2004a, 2004b; Storkel, Armbruster, & Hogan, 2006; Storkel & Maekawa, 2005). The number of words that are lexically similar to a given word is referred to as neighborhood density. Past word learning research has shown that adults and children tend to learn words with many lexical neighbors more quickly than words with few lexical neighbors (Storkel, 2001, 2003, 2004a; Storkel et al., 2006; Storkel & Maekawa, 2005). Although acquisition of semantic representations has been studied extensively, the influence of the number of semantically similar real words on word learning has yet to be documented. As a result, it is difficult to determine whether the number of similar known words has a parallel influence on the acquisition of lexical and semantic representations. One barrier to this research may be the lack of appropriate stimuli. Specifically, many word learning studies use novel words created by pairing a nonword (i.e., novel phonological sequence) with a nonobject (i.e., novel referent). In this case, the characteristics of the nonobjects determine the semantically similar known words, namely the semantic neighbors. To our knowledge, there are no sets of nonobjects for which semantic neighbors have been established. 
The goal of this note was to determine the semantic neighbors for an already constructed set of 88 black and white line drawings of nonobjects (Kroll & Potter, 1984) so that these stimuli could be used in future research. Although the intent of the authors was to use the stimuli in word learning research, these stimuli also might be useful for other paradigms that rely on nonobjects, including artificial grammar learning (e.g., Nakamura, Plante, & Swisher, 1990; Till & Goldstein, 1980), language use with novel items (e.g., Campbell, Brooks, & Tomasello, 2000), object decision (e.g., Hashimoto, McGregor, & Graham, 2007), nonverbal memory (e.g., Fazio, 1998; Leonard et al., 2007), nonverbal learning (e.g., Mou, Anderson, Vaughan, & Rouse, 1989), and visual perception (e.g., Arnell & Jolicoeur, 1997).
The set of 88 nonobjects was originally developed for use as distracter items in an object decision task (Kroll & Potter, 1984). In the object decision task, participants were asked to decide whether the picture presented was of a real or not real object. The nonobjects were “created by tracing parts of drawings of real objects and regularizing the resulting figures” (Kroll & Potter, 1984, p. 41). The appendix of the original article provides the line drawings of the nonobjects, which are identified by a number. Digitized versions also are available for use with a computer (Brooks & Bieber, 1988). All nonobjects will be referred to by their original reference numbers. The appendix of the original article also includes ratings of objectlikeness.
This set of nonobjects was selected because the large size should make it possible to identify nonobjects that have many versus few semantic neighbors. In addition, the data available in the original article can be used to examine how the number of semantic neighbors relates to visual aspects of the nonobject such as objectlikeness, as indexed by objectlikeness ratings, and visual processing time, as indexed by reaction time in the object decision task. Moreover, these nonobjects have been used in past research with adults (e.g., Arnell & Jolicoeur, 1997; Dean & Young, 1997) and children (e.g., Hashimoto et al., 2007).
An agreed upon method for determining the semantic neighbors of words has remained elusive. There are a variety of methods that have been used with real words including computation of co-occurrence of the target word with other words in the language (e.g., Burgess, 1998; Landauer & Dumais, 1997; Lund & Burgess, 1996), linguistic analysis of the target word in terms of characteristics such as semantic features, categories, and/or synonyms (e.g., Felbaum, 1998), or word associations generated by adults (e.g., Kiss, Armstrong, Milroy, & Piper, 1973; Nelson, McEvoy, & Schreiber, 1998) and/or children (e.g., Entwisle, 1966; Palermo & Jenkins, 1964). How can these methods be applied to novel words (i.e., nonword-nonobject pairs)?
Computing co-occurrence of a novel word with other words in the language is not possible because a novel word, by our definition, does not occur in the language. Thus, the co-occurrence of a novel word with any other word in the language is always zero prior to exposure. Therefore, the co-occurrence method could not be used. Likewise, the linguistic analysis method may be difficult to use for novel words because the items would not occur in the resources typically used to determine semantic neighbors (e.g., dictionary; thesaurus). Consequently, the linguistic analysis approach did not seem appropriate for this study.
The association method seemed the most appropriate for the existing set of 88 nonobjects. When the association method is used with real words, a large group of participants (e.g., approximately 100) are presented with the real words, either in printed or spoken form, and asked to generate the first word that comes to mind that is meaningfully related to or frequently associated with the given word (Nelson et al., 1998). Participants are only allowed to report one word in response to the given word because previous work has shown that this leads to more reliable measures of the number of semantic neighbors than allowing multiple responses (Nelson, McEvoy, & Dennis, 2000; Nelson & Schreiber, 1992). For this reason, the task is often referred to as a discrete association task, rather than a free association task. In addition, only responses reported by two or more participants are counted as neighbors of the given word because responses reported by only one participant are potentially idiosyncratic, impacting reliability (Nelson & Schreiber, 1992). Variations of this method have been used with adults and children as young as 4 years (e.g., Entwisle, 1966; Kiss et al., 1973; Nelson et al., 1998; Palermo & Jenkins, 1964). Furthermore, past research has shown that the number of semantic neighbors determined in this way, termed semantic set size, impacts memory and language processing (e.g., Buchanan, Westbury, & Burgess, 2001; Locker, Simpson, & Yates, 2003; Nelson, Schreiber, & McEvoy, 1992; Nelson & Zhang, 2000; Yates, Locker, & Simpson, 2003).
In what sense can a nonobject have a neighbor? Samuelson and Smith (2000b) argue that knowledge can be thought of as the integration of the dynamic processes of perceiving and remembering. Specifically, when an object or event is perceived, other relevant objects or events are remembered. This allows integration of current experiences with past experiences, such that regularities across experiences emerge, leading to stability of knowledge. This view predicts that when a nonobject is perceived, relevant past experiences or objects can be recalled. In this sense, those relevant objects and experiences would be considered semantic neighbors of the nonobject, potentially influencing learning about the nonobject. Obviously, perceptual properties (e.g., shape) of the nonobjects would likely play a critical role in determining which known objects are remembered upon viewing a nonobject. This is appropriate given that recent theories assert perception and category learning are intertwined rather than separable (e.g., Goldstone & Barsalou, 1998), and there is clear evidence that perceptual properties are relevant to semantic categories (e.g., Jones, Smith, & Landau, 1991; Landau, Smith, & Jones, 1988; Macario, 1991; Soja, Carey, & Spelke, 1991). In addition, other evidence also suggests that perceptually obvious properties can be used to infer nonobvious properties, such as actions or functions (e.g., round things can roll, Samuelson & Smith, 2000a; Sheya & Smith, 2006). Thus, we might expect a varied array of semantic neighbors for nonobjects.
The semantic set size literature also points to another variable that may be important to consider: neighbor strength. The strength of a semantic neighbor is determined by counting the number of participants who reported the word as a neighbor of the given word and then dividing by the total number of participants. Strength appears to influence memory performance (e.g., Nelson et al., 1992; Nelson & Zhang, 2000). Perhaps more importantly, it has been suggested that there is an inherent relationship between semantic set size and neighbor strength for real words. Specifically, real words with a large semantic set size are argued to have weaker neighbors whereas real words with a small semantic set size tend to have stronger neighbors (Buchanan et al., 2001). That is, when participants report many different words as semantic neighbors of a given word, yielding a large semantic set size, it is suggested that fewer participants are reporting the same word as a semantic neighbor, reducing neighbor strength. In complement, when participants report only a few different words as semantic neighbors of a given word, yielding a smaller semantic set size, it is suggested that many participants are likely reporting the same word as a semantic neighbor, increasing neighbor strength. For this reason, computation of neighbor strength in combination with semantic set size will provide a better understanding of the structure of the semantic neighborhoods of the selected nonobjects, which may impact the interpretation of future word learning data.
There is much debate concerning the relationship between adult and child categories in general and semantic neighborhoods in particular. For example, Murphy (2002) in a comprehensive review suggests that adult and child semantic neighborhoods are similar in structure but that adults and children differ in experience with members of the neighborhood and in processing capacity or fluency. In addition, Entwisle (1966) documents changes in association responses between adults and school-age children. Observed patterns include that the frequency of the three most frequent associates increases with age, that idiosyncratic responses (i.e., those reported by only one participant) decrease with age, and that younger children are more likely to provide noun associates to words of all grammatical classes than older children and adults. This opens the possibility that semantic set size and specific neighbors of the nonobjects could vary by age, making stimulus selection for developmental research problematic. This study explores this possibility by comparing adult and child semantic set sizes and neighbors.
The purpose of the current study was to determine the semantic set size for each nonobject in a set of 88 so that these nonobjects could be used in future research. Semantic set size was determined using a discrete association task with undergraduate college students and preschool children. Specific goals were to:
Two groups of native English speakers participated sequentially. The first group consisted of 82 undergraduate college students (M age = 19 years, SD = 1.3 years, range 17 to 26 years). This adult group was relatively balanced in gender (52% women; 48% men). The majority of adults were white (84%). The students reported a normal developmental history and never received physical, speech, language, hearing, cognitive, social, or academic treatment services based on self-report.
The second group consisted of 92 preschool children (M age = 4;6, SD = 8 months, range 3;2 to 6;4). There were 21 3-year-olds (M age = 3;7, SD = 3 months, range 3;2 to 3;11), 45 4-year-olds (M age = 4;5, SD = 3 months, range 4;0 to 4;11), 25 5-year-olds (M age = 5;2, SD = 2 months, range 5;0 to 5;9), and 1 6-year-old. This child group was relatively balanced in gender (57% girls; 43% boys). The majority of children were white (74%). Parents reported a normal developmental history, and the children never received physical, speech, language, hearing, cognitive, or social treatment services. In addition, speech and language development were screened through administration of standardized tests of articulation and vocabulary (Brownell, 2000; Goldman & Fristoe, 2000). All children scored above one standard deviation below the mean on the Goldman-Fristoe Test of Articulation – 2 (M percentile rank = 61, SD = 23, range 18–98) and on the Receptive One Word Picture Vocabulary Test (M standard score = 108, SD = 10, range 86–134).
The full set of 88 nonobjects developed by Kroll and Potter (1984) was used with adult participants. The nonobjects were “created by tracing parts of drawings of real objects and regularizing the resulting figures” (Kroll & Potter, 1984, p. 41). No further description of the development of the nonobjects was provided by Kroll and Potter (1984).
After the adult data were analyzed, a subset of 47 nonobjects was quasi-randomly selected for use with child participants to shorten the task to better accommodate the attention span and motivation of young children. Nonobject selection was not fully random because an attempt was made to select nonobjects that represented the full range of adult semantic set sizes. Percentiles were computed for the adult semantic set size data and used as a way of tracking selection of nonobjects from the full range of adult semantic set sizes. Of the 47 selected nonobjects, 17 (36%) had adult semantic set sizes at the 25th percentile or below, 17 (36%) had adult semantic set sizes between the 26th and 74th percentiles, and 13 (28%) had adult semantic set sizes at the 75th percentile or above.
In general, procedures followed those used in past discrete association research (Nelson et al., 1998), and relatively similar procedures were used for adults and preschool children.
Adult participants were seated in front of a computer. Presentation of instructions and pictures as well as recording of responses was accomplished using experimental control software (i.e., DirectRT, Jarvis, 2002). Printed instructions were provided, indicating that the participant would see pictures on the computer screen and that some of the pictures would be familiar (i.e., training items) and some would be unfamiliar (i.e., target nonobjects). Adults were instructed to type in the first word that came to mind that was meaningfully related or strongly associated with the picture. They were instructed to respond quickly to encourage them to respond with the very first word that came to mind.
Five pictures of black and white line drawings of real objects (Snodgrass & Vanderwart, 1980) were presented first as training items to check that participants understood the instructions. The training items were: bear, kite, pumpkin, shirt, television. The picture appeared on the screen with the printed prompt “What is the first word that comes to mind that is meaningfully related or strongly associated with this picture?” and a response box. Adults typed their responses using the computer’s keyboard. The nonobject pictures were then presented in random order as determined by the software, using the same procedures.
The procedures for child participants were similar to those for adults except that the examiner read the instructions to the child and the wording was changed to be more easily understood by children. Specifically, children were told to say the first word that they thought of that was like the picture, and the response prompt was changed to be in line with this. In addition, the examiner typed the child’s response. Training items also were changed for children because it was thought that children might need more prompting to respond to nonobjects. Thus, real objects were not used as training items. Instead, five nonobjects that were not selected in the target set of 47 were used for training. The training items were nonobjects 4, 9, 72, 74, and 83. Finally, two types of additional prompts were provided to children. The first additional prompt was provided if the child failed to respond to an item or responded with “I don’t know.” In these instances, a no response was recorded, but the child was told that there was no right answer and that it was OK to guess. The second additional prompt was provided if the child responded with more than one word. When this occurred, the entire response was typed by the examiner, but the child was reminded to respond with one word. For most children, these prompts were sufficient and the only other feedback given was general encouragement. One child received additional prompting during training to respond with real words instead of nonwords.
For both adults and children, typed responses from the experimental control software were imported into Microsoft Excel. Responses were spell checked with misspelled words being corrected. In the case where the target word was unclear, the response was left misspelled. Responses were analyzed separately for adults and children. For each group, the number of individuals who reported a particular response to a given nonobject was tallied. Responses that were given by only one individual were not included in any further analyses. In determining the number of participants who reported a particular response, consolidation of equivalent responses was allowed. For the adult data, this consisted of ignoring inflectional morphology (e.g., plural vs. singular; verb tense) and spelling variations that might not be detected by spell check. For example, responses of “sock” and “socks” would be treated as one response, and the number of participants producing either response would be computed. In addition, responses of “screwdriver” and “screw driver” would be treated as one response, and the number of participants producing either response would be computed.
For the child data, the same procedure of ignoring inflectional morphology and spelling variation also was used. However, greater consolidation was performed for child data because of the tendency of the children to respond with more than one word. If the “main” response word was the same (e.g., usually the noun or verb or the more informative word in the phrase), longer responses were combined with shorter single word responses. For example, “cutter thing” was combined with “cutter” because “cutter” was judged as the more informative word in the phrase. Similarly, “silly vacuum cleaner” was combined with “vacuum” because “vacuum” was judged to be the main word in the phrase. In addition, child versions of words were combined with adult versions of words. For example, “nana” and “banana” were combined. Ambiguous responses, where a main word could not be determined, were not combined. For example, “hat whale” and “whale” were not combined because it could not be determined whether “hat” or “whale” should be considered the main word. Likewise, more specific responses were not combined with more general responses when these responses indicated different items. For example, “train” and “train track” were not combined, and “crocodile” and “crocodile mouth” were not combined.
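The consolidation and tallying steps described above can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure: the decisions were made by hand in Excel, and the lookup table `CANONICAL`, the helper names, and the sample `raw` list are assumptions introduced here. The table contains only the example pairs given in the text, and the sample list pools responses that would in practice belong to different nonobjects.

```python
from collections import Counter

# Hypothetical lookup table holding the consolidation examples from the text.
# In the study these judgments were made manually, case by case.
CANONICAL = {
    "socks": "sock",                   # inflectional morphology ignored
    "screw driver": "screwdriver",     # spelling variation ignored
    "cutter thing": "cutter",          # main word of a longer child response
    "silly vacuum cleaner": "vacuum",  # main word of a longer child response
    "nana": "banana",                  # child form combined with adult form
}

def consolidate(response):
    """Map a raw response to its consolidated form (unchanged if no rule applies)."""
    key = response.strip().lower()
    return CANONICAL.get(key, key)

def tally(responses):
    """Count how many participants gave each consolidated response."""
    return Counter(consolidate(r) for r in responses)

# Hypothetical raw responses. "train track" and "hat whale" stay separate,
# as in the text, because no consolidation rule covers them.
raw = ["sock", "Socks", "screw driver", "screwdriver",
       "train", "train track", "hat whale", "whale"]
counts = tally(raw)
print(counts["sock"], counts["screwdriver"], counts["train"], counts["whale"])  # 2 2 1 1
```

An explicit table keeps every consolidation decision visible and reviewable, which matches the manual, judgment-based approach described above better than an automatic stemmer would.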
To what extent does this greater consolidation for children impact the data? For children, there were 4,324 total responses (including “I don’t know”). Of these, only 6% (i.e., 256 responses) were considered for consolidation. Of these consolidation decisions, 48% affected which responses were classified as semantic neighbors of the nonobject (i.e., semantic set size), whereas 52% affected the tally of the number of children who reported the response as a semantic neighbor (i.e., neighbor strength). In total, 44% of these consolidation decisions resulted in actual consolidation and a corresponding change in the data. In 56% of cases, the data were not consolidated. In addition, the adult data were searched for any child consolidated responses, and any responses found in the adult data also were consolidated. The only child consolidated responses identified in the adult data were those consisting of a shorter and a longer form of a word (e.g., “phone” and “telephone”). These responses were consolidated for both children and adults.
For each nonobject, data processing yielded an adult list and a child list of responses reported by two or more participants as well as the number of participants that reported each response. Three dependent variables were computed from each list (i.e., adult vs. child). All data are available on-line in an Excel file at the ASHA website.
Semantic set size was determined for each nonobject by counting the number of different semantic neighbors reported by two or more participants.
Neighbor strength was computed for each neighbor by dividing the number of participants who reported the neighbor in response to the nonobject by the total number of participants (i.e., 82 for adults; 92 for children). Because at least two participants had to report a word for it to be considered a neighbor, the lowest possible value for neighbor strength is 0.02 (i.e., 2/82 for adults; 2/92 for children).
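As a minimal sketch of these two definitions, the Python below computes semantic set size and neighbor strength for one nonobject. The function name `semantic_neighbors` and the ten sample responses are hypothetical; only the two-participant threshold and the division by the total number of participants come from the text.

```python
from collections import Counter

def semantic_neighbors(responses, n_participants, min_reports=2):
    """Return {neighbor: strength} for responses given by at least
    `min_reports` participants; None marks a missing response."""
    counts = Counter(r for r in responses if r is not None)
    return {
        word: count / n_participants
        for word, count in counts.items()
        if count >= min_reports
    }

# Hypothetical responses from 10 participants to a single nonobject.
responses = ["knife", "knife", "tool", "knife", "open",
             "tool", "ring", None, "beer", "knife"]

neighbors = semantic_neighbors(responses, n_participants=10)
set_size = len(neighbors)  # semantic set size = number of distinct neighbors

print(set_size)   # 2
print(neighbors)  # {'knife': 0.4, 'tool': 0.2}

# Lowest possible strength in the study, rounded to two decimals.
print(round(2 / 82, 2), round(2 / 92, 2))  # 0.02 0.02
```

Responses given by a single participant ("open", "ring", "beer" in this made-up list) are dropped before either measure is computed, but the denominator remains the full number of participants.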
For the 47 nonobjects with both adult and child data, the proportion of overlap in the actual neighbors was determined separately for four groups of neighbors based on child neighbor strength. First, the neighbors of each nonobject were classified into one of the four child neighbor strength groups. These neighbor strength groups were determined based on the child neighbor strength median (Mdn = 0.02), mean (M = 0.05) and standard deviation (SD = 0.07). The four neighbor strength groups were: (1) strength = 0.02 (i.e., the median), (2) strength = 0.03–0.05 (i.e., the mean), (3) strength = 0.06–0.11 (i.e., mean to mean + 1 standard deviation), (4) strength > 0.11 (i.e., greater than 1 standard deviation above the mean). The rationale for this approach was that weaker neighbors (i.e., neighbors reported by few children) would potentially be less reliable than stronger neighbors (i.e., neighbors reported by many children), even if the comparison group were a second group of similar aged children. There were 463 total neighbors reported by children across nonobjects. The distribution by strength group was 53% with strength 0.02, 34% strength 0.03–0.05, 6% strength 0.06–0.11, and 7% strength > 0.11.
Each child semantic neighbor of a given nonobject was compared to the adult semantic neighbors for the same nonobject and scored as an adult neighbor or not. The proportion of child neighbors that also were adult neighbors was then computed for each nonobject. As an illustration of the full procedure, nonobject 68 had the following child neighbors with child strength in parentheses: cup holder (0.02), glasses (0.02), milk (0.02), pancake (0.02), roof (0.02), refrigerator (0.04), book (0.05), door (0.05), house (0.08), and computer (0.21). Each neighbor was assigned to a neighbor strength group: strength 0.02 = cup holder, glasses, milk, pancake, roof; strength 0.03–0.05 = refrigerator, book, door; strength 0.06–0.11 = house; strength > 0.11 = computer. Then, each neighbor was scored as 1 or 0 depending on whether it also was an adult neighbor. This leads to the following scoring: strength 0.02 = cup holder 0, glasses 1, milk 0, pancake 0, roof 0; strength 0.03–0.05 = refrigerator 0, book 1, door 0; strength 0.06–0.11 = house 0; strength > 0.11 = computer 1. Finally, the proportion of child neighbors that were adult neighbors was computed. This leads to the following proportions: strength 0.02 = 0.20, strength 0.03–0.05 = 0.33, strength 0.06–0.11 = 0.00, strength > 0.11 = 1.00. This procedure was repeated for all 47 nonobjects.
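The worked example for nonobject 68 can be reproduced with the short Python sketch below. The child neighbors and strengths are taken from the text. The adult set lists only the three neighbors that the text scores as shared, because the full adult list for this nonobject is not given here. The function names and group labels are illustrative assumptions.

```python
from collections import defaultdict

def strength_group(strength):
    """Assign a child neighbor strength (rounded to 2 decimals) to one of the four groups."""
    s = round(strength, 2)
    if s <= 0.02:
        return "0.02"
    if s <= 0.05:
        return "0.03-0.05"
    if s <= 0.11:
        return "0.06-0.11"
    return ">0.11"

def overlap_by_group(child_neighbors, adult_neighbors):
    """Proportion of child neighbors that are also adult neighbors, per strength group."""
    scores = defaultdict(list)
    for word, strength in child_neighbors.items():
        scores[strength_group(strength)].append(1 if word in adult_neighbors else 0)
    return {group: sum(s) / len(s) for group, s in scores.items()}

# Child neighbors of nonobject 68 with child strength, as reported in the text.
child_68 = {
    "cup holder": 0.02, "glasses": 0.02, "milk": 0.02, "pancake": 0.02,
    "roof": 0.02, "refrigerator": 0.04, "book": 0.05, "door": 0.05,
    "house": 0.08, "computer": 0.21,
}
# Only the adult neighbors that overlap with the child list (scored 1 in the text).
adult_68 = {"glasses", "book", "computer"}

proportions = overlap_by_group(child_68, adult_68)
print({g: round(p, 2) for g, p in proportions.items()})
# {'0.02': 0.2, '0.03-0.05': 0.33, '0.06-0.11': 0.0, '>0.11': 1.0}
```

A strength group with no child neighbors for a given nonobject simply does not appear in the result, so it would contribute no proportion for that nonobject.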
Mean objectlikeness ratings for all 88 nonobjects were taken from Kroll and Potter (1984), Table A-1 (p. 61). The objectlikeness ratings were obtained from 100 undergraduates who rated on a 7-point scale the degree to which the nonobject resembled a real object. A rating of 1 indicated that the nonobject “looked very much like a real object,” whereas a rating of 7 indicated that the nonobject “looked nothing like a real object” (Kroll & Potter, 1984, p. 60).
Mean object decision reaction times for 60 of the nonobjects were taken from Kroll and Potter (1984), Table A-1, Experiment 1 (p. 61). In the Experiment 1 object decision task, 12 adults saw black and white pictures on a projection screen and pressed one of two buttons to indicate whether the picture was a real object or not. Only 60 of the 88 nonobjects were tested.
Table 1 displays the mean, standard deviation, range, and percentiles for the semantic set size data for adults and children. The adult data are displayed twice: once for the full set of 88 nonobjects and once for the subset of 47 nonobjects for comparison to the child data. The values for adult data appear relatively close to the values for the child data, although comparability of adult and child data will be explored in greater detail in a later section of these results. In addition, data reported from past adult studies involving real word association tasks (Nelson et al., 1998) are included for comparison in Table 1. As shown in Table 1, the range of semantic set sizes for nonobjects is somewhat limited compared to past work with real words. Past work with real words has yielded semantic set size ranges from approximately 1 to 34 neighbors (Nelson et al., 1998), whereas the semantic set size range for the nonobjects is approximately half that. Likewise, the standard deviation reported for the nonobjects is approximately half that reported for real words (Nelson & Zhang, 2000). Note that these past real word studies tended to have more participants than in the current study with sample sizes ranging from 94 to 206 participants (M = 149; SD = 15) and have not used pictures to elicit responses, which could constrain responses to a particular meaning of a word. Although the range of semantic set sizes for nonobjects is more limited than that for real words, the variation that is present may still be sufficient for selecting nonobjects with many versus few semantic neighbors.
Table 2 shows the mean, standard deviation, and range of objectlikeness ratings and object decision reaction times for the nonobjects as originally reported in Kroll and Potter (1984). The relationship between semantic set size and objectlikeness ratings and between semantic set size and object decision reaction times was examined via correlations to determine whether semantic set size was distinct from these other measures. For the full set of 88 nonobjects, adult semantic set size was not significantly correlated with objectlikeness ratings, r (1, 88) = 0.09, p > 0.40, r² = 0.01, or object decision reaction time, r (1, 60) = −0.08, p > 0.50, r² = 0.01. The same finding was obtained when the analysis was confined to adult semantic set size of the subset of 47 nonobjects, r (1, 47) = −0.01, p > 0.90, r² < 0.01 for objectlikeness ratings, and r (1, 30) = −0.15, p > 0.40, r² = 0.02 for object decision reaction time. A similar pattern is observed in the child data. Child semantic set size was not significantly correlated with objectlikeness ratings, r (1, 47) = 0.03, p > 0.80, r² < 0.01, or object decision reaction time, r (1, 30) = −0.05, p > 0.80, r² < 0.01. Taken together, both adult and child semantic set sizes appeared to be relatively independent of object properties.
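For readers who want to run the same kind of check on their own stimulus subsets, the following Python sketch computes a Pearson correlation and the squared correlation reported alongside each r. The function `pearson_r` and the six data pairs are hypothetical illustrations. The actual per-item values are in the original appendix and the on-line data file, and the source does not say which software produced the reported statistics.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical semantic set sizes and objectlikeness ratings for six nonobjects.
set_sizes = [4, 7, 9, 10, 12, 15]
ratings = [3.2, 4.5, 2.8, 5.1, 3.9, 4.0]

r = pearson_r(set_sizes, ratings)
r_squared = r ** 2  # proportion of variance shared by the two measures
print(round(r, 2), round(r_squared, 2))

# The squared correlations in the text follow directly from the reported r values.
print(round(0.09 ** 2, 2))     # 0.01
print(round((-0.15) ** 2, 2))  # 0.02
```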
For a better understanding of the relationship between objectlikeness ratings and semantic set size, qualitative data were explored for adults (Appendix A) and children (Appendix B). Nonobjects were split at the median objectlikeness rating (i.e., 4.1) and coded as being more like a real object (i.e., rating of 4.0 or lower) or less like a real object (i.e., rating of 4.1 or higher). Then, each group of nonobjects was searched for pairs of nonobjects that had the same (or similar) objectlikeness rating but differed in semantic set size by 3 or more neighbors (i.e., 1 SD). A representative subset for adults and children is shown in Appendices A and B. Review of those appendices suggests that objectlikeness ratings seem to capture the cohesiveness of the nonobject as well as the interpretability of the parts of the nonobject. For example, as shown in Appendix A, nonobject 3, a nonobject that looks less like a real object, has recognizable parts but the relationship of the parts to one another does not seem as cohesive as other nonobjects that were rated as more objectlike. In addition, nonobject 2, a nonobject that looks less like a real object, does not seem to have any obvious parts. It looks more like an abstract design than an actual object.
Turning to the semantic neighbors shown in the appendices, nonobjects with lower set sizes seemed to have neighbors that fell into a few cohesive themes, whereas nonobjects with higher set sizes seemed to have neighbors that fell into more varied themes. For example, nonobject 8 in Appendix A had several neighbors related to tools or kitchen utensils (e.g., knife, can opener, tool, bottle opener), the action of some of these tools (e.g., open), and the object these tools would be used on (e.g., beer). There were a few neighbors that did not fit this theme, consisting of terms describing perceptual features (e.g., ring) and other terms that do not fit the theme (e.g., handle, hook). In contrast, nonobject 60 in Appendix A showed several themes. One theme related to fishing (e.g., net, hook, fishing). Another related to kitchen utensils (e.g., scoop, ladle, spoon). A third related to sewing (e.g., sew, pin, safety pin). A fourth related to music (e.g., instrument, trombone). Two neighbors remain, one describing perceptual features (e.g., long), and another related to office supplies (e.g., paperclip). A similar pattern is observed for children (see Appendix B).
Taken together, the quantitative and qualitative analyses converge on the conclusion that objectlikeness ratings and semantic set size are capturing different properties of the nonobjects. Objectlikeness ratings seem to capture something about the plausibility or typicality of the nonobject in the real world; whereas, semantic set size seems to capture similarity to known objects, features, and functions.
Table 3 displays the means, standard deviations, and range of neighbor strength for the four strongest neighbors for adults and children. Only the four strongest neighbors were examined because the smallest semantic set size was four neighbors. Thus, cutting off the strength analysis at four neighbors avoided missing data. For the full set of 88 nonobjects, adult semantic set size was significantly negatively correlated with adult neighbor strength of the first strongest, r (1, 88) = −0.68, p < 0.0001, r2 = 0.46, and second strongest neighbors, r (1, 88) = −0.34, p = 0.001, r2 = 0.11. In contrast, adult semantic set size was positively, but not significantly, correlated with adult neighbor strength for the third strongest, r (1, 88) = 0.03, p > 0.70, r2 < 0.01, and fourth strongest neighbors, r (1, 88) = 0.04, p > 0.70, r2 < 0.01.
Analysis of the adult data for the subset of 47 nonobjects yielded similar findings: r (1, 47) = −0.76, p < 0.0001, r2 = 0.58 for adult semantic set size and strength of first neighbor; r (1, 47) = −0.39, p < 0.01, r2 = 0.15 for second neighbor; r (1, 47) = 0.13, p > 0.30, r2 = 0.02 for third neighbor; r (1, 47) = 0.10, p > 0.50, r2 = 0.01 for fourth neighbor. This is illustrated in Table 4, which reports the neighbor strength for the first through fourth strongest neighbors for words with many versus few semantic neighbors as defined by a median split of adult semantic set size. For adults, nonobjects with many semantic neighbors (i.e., higher semantic set size) tended to have significantly weaker first neighbors than nonobjects with fewer semantic neighbors (i.e., lower semantic set size), t (36) = 4.95, p < 0.001 (Note that adjusted degrees of freedom are reported because the assumption of equal variances for this t-test was not met). Comparison of second, third, and fourth strongest neighbors showed no significant difference between nonobjects with many versus few semantic neighbors, all t (45) < 2.0, all p > 0.08, although there was a trend for nonobjects with many semantic neighbors to have weaker second neighbors than nonobjects with fewer semantic neighbors.
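The adjusted degrees of freedom mentioned above are consistent with an unequal-variance (Welch-type) t test, although the exact procedure is not named here. The following is a minimal Python sketch under that assumption. The function name `welch_t` and the strength values are hypothetical and are not the study's data.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for an unequal-variance two-sample t test.

    Assumption: Welch's test with the Welch-Satterthwaite degrees of
    freedom; the source only says that adjusted df were reported.
    """
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)  # sample variances (n - 1 denominator)
    se2 = va / na + vb / nb            # squared standard error of the difference
    t = (mean(a) - mean(b)) / sqrt(se2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


# Hypothetical first-neighbor strengths, for illustration only
few_neighbors = [0.40, 0.55, 0.30, 0.62, 0.48, 0.35]
many_neighbors = [0.20, 0.22, 0.18, 0.25, 0.21, 0.19]
t, df = welch_t(few_neighbors, many_neighbors)
print(f"t({df:.0f}) = {t:.2f}")
```

When the two groups have unequal variances, the adjusted df falls below the pooled value of n1 + n2 − 2, which may be why df = 36 is reported for the first neighbor while df = 45 is reported for the remaining comparisons.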
The pattern for children was relatively similar. Child semantic set size was significantly negatively correlated with child neighbor strength of the first strongest neighbor, r (1, 47) = −0.50, p < 0.0001, r2 = 0.25. The correlation between child semantic set size and neighbor strength was negative but not significant for the second strongest neighbor, r (1, 47) = −0.14, p > 0.30, r2 = 0.02. Finally, the correlation between child semantic set size and neighbor strength was positive for the third and fourth strongest neighbors, and was significant for the fourth strongest neighbor: r (1, 47) = 0.11, p > 0.40, r2 = 0.01 for third strongest neighbor; r (1, 47) = 0.34, p = 0.02, r2 = 0.12 for fourth strongest neighbor. As illustrated in Table 4, nonobjects with many child semantic neighbors (i.e., higher semantic set size) tended to have significantly weaker first neighbors than nonobjects with fewer child semantic neighbors (i.e., lower semantic set size), t (34) = 3.83, p = 0.001. Comparison of second, third, and fourth strongest neighbors showed no significant difference between nonobjects with many versus few semantic neighbors, all t (45) < 1.8, all p > 0.07, although there was a trend for nonobjects with many semantic neighbors to have weaker second neighbors than nonobjects with fewer semantic neighbors.
Taken together, the structure of the semantic neighborhoods of novel words with many versus few neighbors differs in terms of the strength of the first (and possibly second) strongest neighbors for both adults and children, and this will need to be taken into consideration when interpreting results from future research using these nonobjects.
All analyses of the overlap between adult and child semantic set size focus on only the subset of 47 nonobjects that were administered to both adults and children. Three levels of overlap were examined: relative ranking, number of neighbors, and neighbor identity.
Adult semantic set size was significantly positively correlated with child semantic set size, r (1, 47) = 0.33, p < 0.05, r2 = 0.11. Specifically, as adult semantic set size increased so too did child semantic set size. Thus, we might expect that nonobjects with many semantic neighbors based on adult responses also would have many semantic neighbors based on child responses. Likewise, nonobjects with small adult semantic set sizes would likely have small child semantic set sizes.
Although adult and child semantic set sizes were significantly correlated, it is possible that the actual semantic set size could differ between adults and children. This possibility was explored by comparing adult and child semantic set sizes in a t test analysis. Adult and child semantic set sizes did not differ significantly, t (1, 46) = 0.14, p > 0.80, suggesting that the number of semantic neighbors is similar regardless of whether adult or child data are used. Agreement between adult and child semantic set sizes was examined more thoroughly by computing absolute difference scores. Specifically, the child semantic set size was subtracted from the adult semantic set size, and then the absolute value of this difference was taken. The number of positive (i.e., adult set size larger than child) and negative differences (i.e., adult set size smaller than child) was counted before taking the absolute value and was equal (50% positive differences and 50% negative differences). For the 47 nonobjects, 15% showed no difference between adult and child semantic set sizes, 21% differed by 1, 15% differed by 2, 17% differed by 3, 13% differed by 4, 13% differed by 5, and 6% differed by 6 or 7.
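The difference-score procedure described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `difference_summary` and the set sizes below are hypothetical, not values from the study.

```python
from collections import Counter


def difference_summary(adult_sizes, child_sizes):
    """Summarize adult-minus-child set size differences.

    Returns the count of positive differences, the count of negative
    differences, and the percentage of nonobjects at each absolute
    difference (rounded to whole percents).
    """
    signed = [a - c for a, c in zip(adult_sizes, child_sizes)]
    positive = sum(1 for d in signed if d > 0)  # adult set size larger
    negative = sum(1 for d in signed if d < 0)  # adult set size smaller
    counts = Counter(abs(d) for d in signed)
    n = len(signed)
    percent = {k: round(100 * v / n) for k, v in sorted(counts.items())}
    return positive, negative, percent


# Hypothetical set sizes for six nonobjects, for illustration only
adult = [10, 12, 8, 14, 9, 11]
child = [10, 9, 10, 13, 12, 11]
pos, neg, pct = difference_summary(adult, child)
print(pos, neg, pct)
```

Applying the same cumulative logic to the percentages reported above, 15% + 21% + 15% = 51% of the nonobjects differed by 0–2 neighbors, and adding the 17% that differed by 3 gives 68% within 0–3 neighbors.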
Overall, approximately 30% of child neighbors (SD = 17%, range 0% to 71%) also were adult neighbors. Recall that child neighbors of each nonobject were classified into one of four neighbor strength groups (i.e., 0.02 vs. 0.03–0.05 vs. 0.06–0.11 vs. > 0.11). As hypothesized, the degree of overlap between child and adult neighbors tended to increase as strength increased. Specifically, approximately 15% (SD = 20) of child neighbors were adult neighbors for strength 0.02; increasing to 30% overlap (SD = 29) for strength 0.03–0.05; increasing to 66% overlap (SD = 46) for strength 0.06–0.11; and finally reaching 89% overlap (SD = 30) for strength > 0.11. To examine whether overlap varied significantly by neighbor strength group, each strength group was compared to every other using paired t tests and Bonferroni correction for multiple comparisons (n = 6 comparisons). Results showed that the proportion of child neighbors that were adult neighbors for the strength 0.02 group was significantly smaller than for any other strength group, all t < −3.00, all corrected p < 0.02. In addition, the proportion of child neighbors that were adult neighbors for the strength 0.03–0.05 group was marginally significantly smaller than for the strength 0.06–0.11 group, t (21) = −2.85, corrected p = 0.05, and significantly smaller than for the strength > 0.11 group, t (20) = −5.02, corrected p < 0.001. Finally, the proportion of child neighbors that were adult neighbors for the strength 0.06–0.11 group was similar to that for the strength > 0.11 group, t (8) = −2.53, corrected p = 0.21. Thus, the percentage of child neighbors that were adult neighbors was lowest for the strength 0.02 group, increased for the strength 0.03–0.05 group, and reached the highest level for the strength 0.06–0.11 and strength > 0.11 groups.
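The overlap computation and the correction for six pairwise comparisons can be illustrated with a short Python sketch. It assumes that strengths are rounded to two decimals and that the Bonferroni correction multiplies each p value by the number of comparisons (capped at 1). The function names and the example neighbors are hypothetical.

```python
from itertools import combinations

GROUPS = ["0.02", "0.03-0.05", "0.06-0.11", ">0.11"]


def strength_group(strength):
    """Map a child neighbor strength (rounded to 2 decimals) to its group."""
    if strength <= 0.02:
        return "0.02"
    if strength <= 0.05:
        return "0.03-0.05"
    if strength <= 0.11:
        return "0.06-0.11"
    return ">0.11"


def overlap_by_group(child_neighbors, adult_words):
    """Proportion of child neighbors in each group that are adult neighbors.

    child_neighbors maps word -> child strength; adult_words is a set.
    Groups with no child neighbors yield None.
    """
    counts = {g: [0, 0] for g in GROUPS}  # [shared, total] per group
    for word, strength in child_neighbors.items():
        g = strength_group(strength)
        counts[g][1] += 1
        if word in adult_words:
            counts[g][0] += 1
    return {g: (s / n if n else None) for g, (s, n) in counts.items()}


def bonferroni(p, n_comparisons):
    """Bonferroni-corrected p value, capped at 1.0."""
    return min(1.0, p * n_comparisons)


# Four strength groups give six pairwise comparisons
n_comparisons = len(list(combinations(GROUPS, 2)))

# Hypothetical neighbors for one nonobject, for illustration only
child = {"hook": 0.15, "net": 0.08, "spoon": 0.04, "snake": 0.02, "worm": 0.02}
adult = {"hook", "net", "ladle"}
overlap = overlap_by_group(child, adult)
print(n_comparisons, overlap)
```

Under this correction, an uncorrected p of about 0.035 (an illustrative value, not one reported here) would become 0.035 × 6 = 0.21 after correction.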
Qualitative data were examined to better understand neighbor overlap. Specifically, the adult and child neighbors of the 7 nonobjects that showed no difference between adult and child set sizes were examined. These are shown in Appendix C. As observed in Appendix C, overlap varied widely across these 7 nonobjects from a low of 0% overlap for nonobject 81 to a high of 50% for nonobjects 52 and 82. In some cases, the lack of overlap might be attributable to life experiences and interests. For example, for nonobject 81, adults reported several words that were likely unknown by preschool children (e.g., binocular, experiment), and children reported words that were likely less salient interests for college students (e.g., slide, merry-go-round). However, this clearly cannot explain all discrepancies, as there are cases where either adults or children report words for which the other group should have relatively similar knowledge and familiarity (e.g., face, eye for adults for nonobject 81). In other cases, the strict criteria for overlap (i.e., same neighbor reported by both groups) might underestimate overlap. That is, similarity between responses can be identified even though the exact same words were not reported. For example, for nonobject 41, adults reported “chop” and “shoe” and children reported “ax” and “sock.”
Given the low overlap between adults and children and the wide age range of child participants (i.e., 3;2 to 6;4), it was important to explore how well the previously reported child data reflected responses from both younger and older children. Child participants were divided at the median age (4;6) into equal groups (n = 46) of younger (i.e., 3;2 to 4;5) and older (i.e., 4;6 to 6;4) children. Then, each of the already identified child neighbors was examined to determine whether at least one younger and one older child reported the word as a neighbor of the nonobject. On average, 66% of child neighbors (SD = 15%; range 22 – 92%) were reported by both a younger and an older child. Moreover, the proportion of child neighbors that were reported by both younger and older children was not significantly correlated with child semantic set size, r (47) = 0.10, p > 0.50, r2 = 0.01. Turning to the complementary data, approximately 16% of child neighbors (SD = 10%; range 0 – 40%) were reported only by younger children, whereas approximately 17% of child neighbors (SD = 13%; range 0 – 56%) were reported only by older children. The majority (M = 84%, SD = 27%, range 0 – 100%) of these age-specific neighbors had a child strength of 0.02. Most of the remaining age-specific neighbors (M = 14%, SD = 26%, range 0 – 100%) had a child strength of 0.03. Rarely did an age-specific neighbor have a child strength of 0.04 or greater (exceptions are: 3 age-specific neighbors with child strength 0.04, 2 with 0.05, and 1 with 0.08). Taken together, the child data tended to reflect responses by both younger and older children, especially for strength of 0.04 or greater. Examples of younger and older child overlap are shown in Appendix C.
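The younger versus older classification described above could be implemented along the following lines. This is a sketch under stated assumptions: ages are written in years;months notation, the split point is the reported median of 4;6, and the function names and example responses are hypothetical.

```python
def age_in_months(age):
    """Convert a 'years;months' age such as '4;6' to months."""
    years, months = age.split(";")
    return int(years) * 12 + int(months)


def classify_neighbors(responses, cutoff="4;6"):
    """Sort child neighbors by which age group(s) reported them.

    responses: (age, word) pairs for one nonobject, restricted to words
    already identified as child neighbors. Children younger than the
    cutoff count as 'younger'; all others count as 'older'.
    """
    cut = age_in_months(cutoff)
    younger, older = set(), set()
    for age, word in responses:
        if age_in_months(age) < cut:
            younger.add(word)
        else:
            older.add(word)
    return {
        "both": younger & older,
        "younger_only": younger - older,
        "older_only": older - younger,
    }


# Hypothetical responses for one nonobject, for illustration only
responses = [
    ("3;4", "ball"), ("5;1", "ball"),
    ("4;5", "slide"),
    ("4;6", "hook"), ("6;0", "hook"),
]
result = classify_neighbors(responses)
print(result)
```

Dividing the size of the "both" set by the total number of child neighbors for a nonobject gives the proportion that is averaged across nonobjects in the text.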
The goal of this research note was to determine the number of semantic neighbors, namely semantic set size, for an existing set of 88 nonobjects and to better understand this measure by exploring how semantic set size related to other variables and how semantic set size varied by age. Results showed that semantic set size for the nonobjects was more limited in range than semantic set sizes previously reported for real words. However, the range of semantic set sizes for the nonobjects may be sufficient to define nonobjects with many versus few semantic neighbors for future research. In terms of the relationship between semantic set size and other measures, semantic set size was relatively independent of objectlikeness and object decision time. Qualitative analysis revealed that objectlikeness ratings seemed to capture the plausibility or cohesiveness of the nonobjects, whereas semantic set size captured the real world objects, properties, or functions that participants judged to be similar to the nonobject. In contrast, semantic set size was significantly correlated with neighbor strength, paralleling past work with real words (Buchanan et al., 2001). Therefore, any interpretation of the effect of semantic set size in future research with these nonobjects would need to be qualified by the potential influence of neighbor strength. That is, the strongest neighbors of nonobjects with many neighbors tended to be weaker than the strongest neighbors of nonobjects with few neighbors.
In terms of the similarity in semantic set size between adults and preschool children, relatively good agreement was observed when numeric values were considered. Specifically, adult and child semantic set sizes were significantly correlated and the majority of differences (i.e., 68%) were on the order of 0–3 neighbors. Comparison of actual neighbors showed greater discrepancies, with the amount of discrepancy varying by neighbor strength. Stronger child semantic neighbors (i.e., strength > 0.05) were more likely to be adult semantic neighbors than weaker child semantic neighbors. How can this quantitative (i.e., set size) similarity be reconciled with this qualitative (i.e., specific neighbors) difference? One possibility is that the nonobjects afford similarity to few versus many real objects or properties and this is constant across age, yielding similar semantic set sizes across children and adults. However, the actual real objects or properties that are remembered when viewing a nonobject change with development, leading to different neighbors being reported by children or adults. This idea fits current models of categorization where the items recalled when viewing an object or listening to a word are influenced by the recent past experiences as well as past acts of categorization (Samuelson & Smith, 2000b). In this way, the recent past of preschool children (e.g., playing during recess) likely differs from that of college undergraduates (e.g., attending biology lab), influencing the specific words recalled when viewing the nonobjects (e.g., slide and merry-go-round for children for nonobject 81 but binocular and experiment for adults). Likewise, children have less experience categorizing objects than adults simply by their age. In this way, children and adults might attend to different perceptual features when viewing the nonobjects, leading to recall of different known objects and properties.
Finally, methodological differences cannot be ruled out as an explanation for the qualitative discrepancies between children and adults. Recall that children and adults received slightly different instructions and slightly different practice items. It is possible that these differences led the two groups to report different neighbors, although the qualitative data analysis supports alternative explanations for at least some neighbor differences.
Future researchers may wish to use these data to either systematically vary or control the properties of the nonobjects used in a particular paradigm, such as word learning, artificial grammar learning, language use with novel items, object decision, nonverbal memory, nonverbal learning, or visual perception. Several factors need to be considered in using these data to select stimuli. In particular, one needs to consider the overlap between the methods of this study and the methods of the future study. In addition, the relationship between semantic set size and neighbor strength needs to be kept in mind. Finally, developmental research will need to consider what level of overlap (i.e., relative ranking, numeric value, or specific neighbors) is desired across the ages tested. Each of these issues is explored, in turn.
The range of semantic set sizes for these nonobjects was smaller than what has typically been observed for real words. It is possible that this discrepancy arises from methodological differences between this study of nonobjects and previous studies of real words. Specifically, pictures were used in this study to elicit semantic neighbors, whereas past studies of real words present the printed or spoken word without a specific referent. It is possible that presenting a picture to elicit a semantic neighbor restricts the neighbors that participants report. For example, for a real word with multiple meanings, presentation of a picture would likely restrict the semantic neighbors reported to just those that are relevant to the one meaning depicted in the picture, potentially reducing the number of neighbors reported. Based on this explanation, one would predict that elicitation of semantic neighbors of real words using pictures would result in a reduction of semantic set size when compared to elicitation of real words using printed or spoken words. This suggests that the way that semantic set size is determined may influence the obtained values. Thus, one needs to consider how well a future research task would match the methods of the current study. That is, a future research task that presents these nonobjects in their current form and provides no additional information would be very similar to the methods of this study. Therefore, the semantic set size information reported in this study would likely be valid. In contrast, a future research task that presents these nonobjects in an altered form (e.g., adds color to the nonobjects; constructs three dimensional representations of the nonobjects) or provides additional information (e.g., “This is a ______. It is a type of ____. It is used for ____.”) would be different from the methods of this study. In this case, the validity of the semantic set size information reported in this study is somewhat questionable. 
It is possible that participants in this study would have reported different neighbors of these nonobjects if more detail had been provided during the discrete association task. This hypothesis warrants further investigation, but researchers should be cognizant of possible contextual influences on semantic set size. Likewise, researchers should consider the appropriateness of these two-dimensional black and white images for the target age of their participants.
When interpreting the results from future research studies manipulating semantic set size with these nonobjects, the correlation between semantic set size and neighbor strength should be considered. It might be possible to select nonobjects that vary in semantic set size but are matched for neighbor strength for the first and second strongest neighbors. However, if this is not possible given the correlation between semantic set size and neighbor strength or because of other constraints in nonobject selection, the difference in neighbor strength should be considered as an alternative explanation for the obtained results. That is, any obtained effect could be due to differences in the number of neighbors, differences in strength of the strongest neighbors, and/or a combination of number of neighbors and strength.
For developmental research, our results suggest relatively comparable semantic set sizes for adults and children. Thus, developmental research that is interested primarily in manipulating (or controlling) the number of semantic neighbors should be feasible. In selecting stimuli for future research, potential differences can be minimized with one of two methods. One way to minimize potential differences across age is to select nonobjects that have many versus few neighbors for both adults and children. That is, select nonobjects where the classification of semantic set size as large or small is the same regardless of the participant group chosen. An alternative method is to select stimuli with many versus few semantic neighbors with a gap between the two types of nonobjects. This method may be needed if nonobjects are selected from the set of 41 that were only tested on adults. Recall that 51% of the nonobjects showed a difference of 0–2 neighbors between adult and child semantic set sizes. Thus, if nonobjects with many versus few semantic neighbors were selected so that there was a gap of four neighbors, a reduction of 0–2 neighbors for nonobjects with many neighbors and a corresponding increase of 0–2 neighbors for nonobjects with few neighbors would still maintain a distinction between the stimuli. With this method, nonobjects with many neighbors might be defined as an adult semantic set size of 13 or more (N.B., 20 of the 88 nonobjects meet this criterion) and nonobjects with few neighbors might be defined as having adult semantic set sizes of 8 or fewer (N.B., 20 of the 88 nonobjects meet this criterion). Note that these definitions approximately correspond to the 75th and 25th percentiles of adult semantic set size respectively. If a further decrease in the possibility that nonobject classification would be altered for different ages was desired, a larger gap could be used. 
Specifically, 68% of the nonobjects showed a difference of 0–3 neighbors between adult and child semantic set sizes. A gap of 6 would yield cut-offs of adult semantic set sizes of 14 and higher (N.B., 9 of the 88 nonobjects meet this criterion) and 7 and lower (N.B., 11 of the 88 nonobjects meet this criterion) for many versus few semantic neighbors respectively. Note that these operational definitions correspond to the 90th and 10th percentiles of adult semantic set size respectively.
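The gap-based selection rule can be expressed as a short Python sketch. The cut-offs are those suggested above (13 or more versus 8 or fewer, or 14 or more versus 7 or fewer). The function names and the example set sizes are hypothetical and do not come from the study's data.

```python
def classify_by_gap(set_sizes, many_min=13, few_max=8):
    """Split nonobjects into 'many' and 'few' groups, skipping the gap.

    set_sizes maps a nonobject ID to its adult semantic set size.
    Nonobjects whose set size falls strictly between few_max and
    many_min are left unclassified.
    """
    many = sorted(k for k, v in set_sizes.items() if v >= many_min)
    few = sorted(k for k, v in set_sizes.items() if v <= few_max)
    return many, few


def still_distinct(many_min, few_max, shift):
    """Check that the two groups stay apart if set sizes move toward
    each other by up to `shift` neighbors on each side."""
    return many_min - shift > few_max + shift


# Hypothetical adult set sizes keyed by nonobject number
sizes = {1: 14, 2: 9, 3: 7, 4: 13, 5: 11, 6: 8}
many, few = classify_by_gap(sizes)
print(many, few)                 # nonobjects 2 and 5 fall in the gap
print(still_distinct(13, 8, 2))  # gap of four tolerates shifts of 0-2
print(still_distinct(14, 7, 3))  # gap of six tolerates shifts of 0-3
```

The `still_distinct` check mirrors the reasoning in the text: with cut-offs of 13 and 8, a shift of 2 neighbors on each side leaves 11 versus 10, so the classification is preserved, whereas a shift of 3 would not be tolerated without the wider cut-offs of 14 and 7.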
Developmental research that is interested primarily in manipulating (or controlling) the actual identity of the semantic neighbors is more challenging. Recall that the overlap between adult and child neighbors varied by neighbor strength and was rather low overall (i.e., only 30% of child neighbors were adult neighbors). If a wide age range is being used (e.g., preschool children to adults), then one of the following two approaches might be useful. One approach is to select nonobjects with more neighbors reported by both adults and children (i.e., high overlap) or to select neighbors of nonobjects that were reported by both adults and children, depending on the specific goal and design of the study. A second approach is to select nonobjects with strong neighbors or select the strongest neighbors of the chosen nonobjects, depending on the specific goal and design of the study. Results of this study indicate that neighbors with strength of 0.06 or greater should be used as a minimum strength criterion because few child neighbors with strength below 0.06 are likely to be adult neighbors (N.B., 37 of the 47 nonobjects tested with children have at least one neighbor with strength of 0.06 or greater). Moreover, strength greater than 0.11 may be desirable because a larger proportion of these child neighbors also are adult neighbors (N.B., 23 of the 47 nonobjects tested with children have at least one neighbor with strength greater than 0.11).
A narrower developmental window may be more feasible for developmental research that is interested primarily in manipulating (or controlling) the actual identity of the semantic neighbors. Specifically, the overlap between younger and older preschool children in this study was relatively high with the majority of neighbors being reported by both younger and older preschool children. In pursuing this type of research, the recommendations detailed above may still prove useful. That is, one can select stimuli with high overlap between younger and older children. Alternatively, strength could be considered when selecting stimuli. Neighbors with child strength of 0.04 or greater tended to be reported by both younger and older preschool children (N.B., 45 of the 47 nonobjects tested with children had at least one neighbor with strength of 0.04 or greater).
The goal of this research note was to determine the semantic set size for 88 nonobjects for use in future research. Results showed that it was possible to use a discrete association task with adults and preschool children to determine semantic set size for these nonobjects, although semantic set size was not independent of neighbor strength. Moreover, adult and child semantic set sizes were relatively comparable, even though specific neighbors reported by adults and children varied by neighbor strength. Guidelines for using these nonobjects in future research were offered.
This research was supported by DC 08095, DC 00052, DC 05803, HD02528. We are grateful to the following individuals for their contributions to data collection, data processing, and manuscript preparation: Teresa Brown, Deborah Christenson, Jennie Fox, Andrea Giles, Jennica Kilwein, Jill Hoover, Junko Maekawa, Shannon Rogers, Mariza Rosales, Allison Wade, and Courtney Winn.