We develop and evaluate a model of behavior on the Give-N task, a commonly-used measure of young children’s number knowledge. Our model uses the knower-level theory of how children represent numbers. To produce behavior on the Give-N task, the model assumes children start out with a base-rate that makes some answers more likely a priori than others, and that this base-rate is updated on each experimental trial in a way that depends on the interaction between the experimenter’s request and the child’s knower-level. We formalize this process as a generative graphical model, so that the parameters (including the base-rate distribution and each child’s knower-level) can be inferred from data using Bayesian methods. Using this approach, we evaluate the model on previously published data from 82 children spanning the whole developmental range. The model provides an excellent fit to these data, and the inferences about the base-rate and knower-levels are interpretable and insightful. We discuss how our modeling approach can be extended to other developmental tasks, and can be used to help evaluate alternative theories of number representation against the knower-level theory.
A basic challenge in understanding human cognitive development is to understand how children acquire number concepts. Since the time of Piaget (1952), number has been one of the most active areas of research in the field. One prominent current theory about the origin of integer concepts is the ‘knower-level’ theory (Carey, 2001; Carey & Sarnecka, 2006; Wynn, 1990, 1992; see also Le Corre, Van de Walle, Brannon, & Carey, 2006; Le Corre & Carey, 2007; Sarnecka, Kamenskaya, Yamana, Ogura, & Yudovina, 2007; Sarnecka & Gelman, 2004).1
The knower-level theory asserts that children learn the exact cardinal meanings of the first three or four number words in order. That is, children begin by learning the meaning of “one” first, then “two”, then “three”, and then (for some children) “four”, at which point they make an inductive leap, and infer the meanings of the rest of the words in their counting list. In the terminology of the theory, children start as NN-knowers (for “No Number”) or “Pre-number-knowers”, progress to one-knowers once they understand “one”, through two-knower, three-knower and (for some children) four-knower levels, until they eventually become CP-knowers (for “Cardinal Principle”). Thus, the cardinal meanings of “one”, “two”, “three” and sometimes “four” are learned in a completely different way than the meanings of “five” and higher number words. The former are learned gradually, one at a time; the latter are learned all at once, by induction (see Carey, 2001, 2004; Carey & Sarnecka, 2006, for reviews). Our concern is mainly with the early part of this process, involving the learning of small-number words.
An important task for the knower-level theory is a widely-used one known as the ‘Give-N’ task (e.g., Frye, Braisby, Lowe, Maroudas, & Nicholls, 1989; Fuson, 1988; Schaeffer, Eggleston, & Scott, 1974; Wynn, 1990, 1992). In this task, children are simply asked to give some number of objects (usually small toys) to the experimenter (or an experimenter substitute, such as a puppet). The behavioral data are just a set of question-answer pairs, recording how many toys the child was asked to give and how many they actually gave.
The knower-level theory makes a number of strong predictions about children’s performance on the Give-N task. For example, it predicts that children at a given knower-level, when asked about a higher number whose exact meaning they do not know, will avoid giving any set size they can name. In practice, this means that children’s guesses about unknown number words will be lower-bounded by their knower-level. This is because children learn the number words in sequence. For example, if a child understands only the number “one” (i.e., is a “one-knower”), they might mistakenly give 3 toys when asked for 2, but they will not give 1 toy, because they know what “one” means, and they know that none of the other number words means 1 (Wynn, 1990, 1992).
Following this line of reasoning, the performance of a child on the Give-N task should be highly diagnostic in assessing their knower-level, and so the task potentially provides an important developmental measure. It is not easy, however, to determine knower-levels from raw Give-N data, because there are task-specific influences on behavior that need to be accounted for in determining knower-level. For example, it is empirically quite likely that a no-number-knower, whatever they are asked for, will give 1 toy, or 2 toys, or a small handful of toys, or the whole basket-full of them. So, if the basket of toys the child selects from has 15 toys in total, answers like 1, 2, 3, and 15 are more likely than numbers like 8, 9, or 10, but this is just a task-specific quirk of the Give-N procedure.
This behavior is a problem for diagnosis because, for example, it might lead a two-knower to give (apparently correctly) 3 toys when asked for three, but only because it is a default number to give when the instructions are not meaningful, not because they actually understand the concept “three”. The same two-knower is very unlikely to give 8 toys when asked, though, because that is not a default response.
More generally, it is not straightforward to test directly the predictions of the knower-level theory against Give-N data, because the theory itself does not provide a complete description of behavior. It provides a detailed theory of how children represent numbers, but does not fully explain how these representations lead to behavior on developmental tasks. What is observed experimentally is a (potentially complicated) mixture of the representations of number concepts children have, and the decision processes they use to transform their understanding to action. Accordingly, our ability to understand Give-N task behavior, measure children’s knowledge of number concepts, and evaluate the knower-level theory itself in detail, all depend on having a complete model of task performance.
This paper develops such a behavioral model, and evaluates the model directly against previous data. We begin by describing the data, and then the model, at first intuitively, and then formally. We present the results of applying the model to the data, and finish with a discussion of possible extensions and applications of our approach.
We consider previous data, presented as Dataset 1 by Sarnecka and Lee (2009), including 82 monolingual speakers of English, ages 2–4 years (mean 3 years, 7 months; range 2–11 to 4–6), tested at preschools in Irvine, California, or at a university cognitive development lab in Cambridge, Massachusetts. As part of their participation in other studies, each child completed an intransitive counting task, where the experimenter simply asked them to “count to ten”. Our data include only those children who counted to 10 perfectly. Thus, we can be sure that every child was familiar with the number words “one” through “ten.”
Table 1 provides some examples of Give-N behavior, by detailing all the data for 3 of the 82 children in the full data set. (The full data set is presented in the appendix.) Each row in Table 1 corresponds to a question, asking the children for “one”, “two”, “three”, “four”, “five”, “six”, “eight” or “ten” toys. The entries in the columns for each child correspond to how many toys they actually gave when asked each question. Multiple entries are the same child’s responses over multiple trials asking for the same number. So, for example, Child A gave 2, 5, and 5 toys on the three trials where “two” toys were requested.
There are several interesting features to note in the sample data in Table 1. One is that, if a child correctly gives a number, they also tend to avoid giving that number of toys when asked for a different number word. For example, Child B responds correctly when asked for “one” and “two”, and also does not give 1 or 2 toys when asked for larger numbers, even though they make many errors for these higher number questions. This pattern suggests that Child B understands the meanings of the words “one” and “two”. It is not clear, however, whether they understand the meaning of “three”, because, although they always give 3 when asked, they also give 3 in error when asked for “eight” and “ten”. As these examples make clear, an application of the knower-level theory must account for both aspects of number knowing: giving the correct number when asked, and not giving that number when asked for something else.
A second observation about the data in Table 1 is that errors follow a non-obvious distribution. Some errors fall near the target number and could be attributed to miscounting, or to the use of estimation rather than counting (e.g., when Child B is asked for “five” but gives 6). But other errors fall far from the target, as when Child A and Child C give 10 and 15 toys for the word “five.” In fact, giving all 15 toys is a common error.
Finally, Table 1 also shows how behavior can adhere to the knower-level theory assumption that children learn the number words in order. While Child A seems only to understand “one”, Child B seems to understand “one” and “two”, and Child C seems to understand “one”, “two”, “three” and “four”. All of these interesting features of the data set need to be explained by our model.
One powerful way to build models of simple human inferences—such as how many toys to give—is to adopt a rational, or computational-level perspective (Marr, 1982). The idea is to use rational principles, in this case the framework for inference provided by Bayesian statistics, as a working theoretical assumption about the goals of cognitive processes. We do not assume that children are doing formal Bayesian computations, but rather that the goal of human cognitive processes is to approximate these computations. In this sense, a computational model provides an account of why cognition behaves as it does, without making a commitment to what actual processes humans use, or how those processes are implemented in neural systems.
This approach has been successfully employed throughout psychological modeling, in areas including vision, causal learning, property induction, categorization, and decision-making (see Chater, Tenenbaum, & Yuille, 2006; Griffiths, Kemp, & Tenenbaum, 2008; McClelland, 2009, for overviews). It is ideal for our purposes, because we seek a way of relating the knower-level theory of number-concept representation to behavioral data, so that we can use experimental data to draw conclusions about developmental states. Adopting a computational approach lets us build a quantitative model of behavior, with a rational justification, but without speculative theorizing about mental processes.
Using the computational Bayesian perspective, we assume that the child’s decision-making process on the Give-N task has four parts:
Initially, a child has a ‘base-rate’ distribution which expresses the probability of giving each possible number of toys. This base-rate distribution can be thought of as the child’s a priori bias toward or against each possible response, even before any particular number has been requested. Behaviorally, for a Give-N task with 15 toys, the base-rate represents the probabilities that children would give 1,…,15 toys if they were asked to give objects in a completely non-numerical way (e.g., if they were asked “Can you give me fish?” and English did not make a singular-plural distinction).2 In Bayesian terms, the base-rate distribution is the prior the child has over appropriate Give-N behavior, in the absence of any other information.
When the experimenter gives an instruction, this is used to update the base-rate probabilities, and create a new distribution of likely responses. In Bayesian terms, the instruction is the datum on which inference about appropriate Give-N behavior is based.
The updated distribution will depend critically on the child’s knower-level. In Bayesian terms, the knower-level theory provides the likelihood function through which data updates prior to posterior beliefs. We describe this carefully below.
Finally, the child will give some number of toys. This is the actual behavior observed in the task, and recorded by the data. The probability of each possible response is expressed by the updated distribution. In Bayesian terms, observed behavior is sampled from the updated belief distribution.
Two concrete examples of this decision-making process are shown in Figure 1. One example is shown in each row, as a sequence of the four stages connected by arrows. Both relate to a child we assume to be a three-knower, with a fixed base-rate. This base-rate is shown by the leftmost bar graphs, and gives the initial probability that the child will give 1,…,15 toys. Giving a ‘handful’ of toys (i.e., 1,…,4) is most likely; giving all of the toys (i.e., 15) is also more likely; the other possibilities (i.e., 5,…,14) are less likely, but still possible. To begin with, we are just assuming a plausible base-rate to help us explain the model using concrete examples. Later we will use the model to infer the base-rate from the actual data.
In the first row of Figure 1, the child is asked to “give two” toys. This child, being a three-knower, knows what “one”, “two” and “three” mean. So they are very likely to give 2 toys, and very unlikely to give 1 or 3 toys. This is reflected in the updated belief distribution. The other possible responses (4,…,15 toys) do not change in their relative probabilities, although they will change their absolute probabilities, because the probabilities for 1, 2 and 3 have changed. That is, because 4 and 15 were more likely than 5–14 in the base-rate, they will still be more likely after updating, although all the numbers 4–15 are less likely in absolute terms. All of these changes can be seen in the rightmost bar graph in the first row. Giving 2 is very likely, giving 1 or 3 is not at all likely, and giving 4, …,15 is unlikely, with 4 and 15 slightly more likely than the other responses. In this example, the most likely response is 2. Thus a three-knower who is asked for “two” will probably respond correctly.
The second row of Figure 1 shows an example of another trial with the same hypothetical three-knower. This time, the child is asked to “give five” toys, but “five” is a number they do not know. All they know is that “five” does not mean 1, 2 or 3. So the responses 1, 2, and 3 will become much less probable, but all of the other numbers (i.e., 4,…,15) will retain the same relative probabilities to each other. All of these changes can be seen in the rightmost bar graph in the second row. Giving 4 becomes the most likely response, followed by 15, followed by the other numbers the child does not know (i.e., 5,…,14). The numbers they do know (i.e., 1, 2, and 3) are very unlikely responses. The actual behavior produced by the child is again just a sample from the updated belief distribution. In this example, the most likely response is 4. Thus, a three-knower who is asked for “five” will probably respond incorrectly.
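The two rows of Figure 1 can be reproduced numerically with a minimal Python sketch. The base-rate weights and the evidence value of 30 used here are illustrative assumptions chosen to mimic the qualitative shape described above, not values read from the figure, and the function name `update_base_rate` is hypothetical.

```python
def update_base_rate(base_rate, question, knower_level, evidence):
    """Update a base-rate over 1..N toys for a single request.

    Numbers above the knower-level keep their base-rate weight. Known
    numbers are boosted if they are the one requested and suppressed
    otherwise. The result is renormalized to sum to one.
    """
    weights = []
    for k, p in enumerate(base_rate, start=1):
        if k > knower_level:
            weights.append(p)
        elif k == question:
            weights.append(p * evidence)
        else:
            weights.append(p / evidence)
    total = sum(weights)
    return [w / total for w in weights]


# Assumed base-rate shape: a handful (1-4) and the whole basket (15) are favored.
raw = [10, 10, 10, 10] + [2] * 10 + [6]
base_rate = [w / sum(raw) for w in raw]

# A three-knower asked for "two" (first row) and for "five" (second row).
asked_two = update_base_rate(base_rate, question=2, knower_level=3, evidence=30.0)
asked_five = update_base_rate(base_rate, question=5, knower_level=3, evidence=30.0)

most_likely_two = asked_two.index(max(asked_two)) + 1
most_likely_five = asked_five.index(max(asked_five)) + 1
print(most_likely_two, most_likely_five)
```

Under these assumed values, the most likely response is 2 when “two” is requested and 4 when “five” is requested, and the responses 4,…,15 keep the same probabilities relative to each other in both cases.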
Figure 2 presents the graphical model we used to implement our model. Graphical models are a standard approach to implementing probabilistic models in machine learning and statistics, and have more recently been used as a framework for implementing and analyzing models of cognition (see Lee, 2008; Lee & Wagenmakers, 2008; Shiffrin, Lee, Kim, & Wagenmakers, 2008, for overviews). In graphical models, variables are represented by nodes in a graph, and their connections show how they relate to each other. In Figure 2, observed variables (i.e., data) are shown as shaded nodes, and unobserved variables (i.e., model parameters to be inferred) are shown as unshaded. Discrete variables are indicated by square nodes, and continuous variables are indicated by circular nodes. Stochastic variables are indicated by single-bordered nodes, and deterministic variables (included for conceptual clarity) are indicated by double-bordered nodes. Finally, encompassing plates are used to denote independent replications of the graph structure within the model.
In our implementation of the knower-level model in Figure 2, the data are the observed qij and gij variables, which give the number asked for (the ‘question’) and the answer (the number ‘given’) respectively, for the ith child on their jth question. The base-rate probabilities are represented by the vector π, which is updated to π′, from which the number given is sampled. The updating occurs using the number asked for, the knower-level zi of the child, and an evidence value υ that measures the strength of the updating. The base-rate and evidence parameters, which are assumed to be the same for all children, are given vague priors (i.e., ones that allow for a very large range of possible inferences).
The updating rule that defines π′ decomposes into three basic cases, as explained in discussing the examples in Figure 1. If a number k is greater than the knower-level zi then, whatever number q they are being asked for, the updated probability remains proportional to the base-rate probability πk for that number. If a number k is within the child’s knower-level range zi, it either increases in probability by a factor of υ if it is the number q being asked for, or decreases in probability by a factor of υ if it is not. For a child who is a CP-knower, their range encompasses all of the numbers. The final part of the graphical model relates to the behavior step, with the number of toys given being a draw from the probability distribution π′ representing the updated beliefs.
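Written out as a formula, the three cases read as follows. This is a transcription of the verbal rule, with π_k the base-rate probability of giving k toys. Coding a CP-knower as a knower-level at least as large as the number of toys available, and writing the final draw as a categorical distribution, are notational assumptions.

```latex
\pi'_{ijk} \propto
\begin{cases}
  \pi_k & \text{if } k > z_i, \\
  \upsilon \, \pi_k & \text{if } k \le z_i \text{ and } k = q_{ij}, \\
  \pi_k / \upsilon & \text{if } k \le z_i \text{ and } k \ne q_{ij},
\end{cases}
\qquad
g_{ij} \sim \mathrm{Categorical}\left(\pi'_{ij}\right).
```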
The graphical model in Figure 2 provides a generative probabilistic model of behavior on the Give-N task. This means it provides a formal account of how data from the task are produced or generated. The model starts the generating process from the unknown psychological variables—the base-rate distribution and the evidence value parameters, which are the same for all children, and the child’s knower level parameter, which varies child-by-child—and then says how these variables interact with the task instruction (i.e., the question asked) to produce the observed behavior (i.e., the number of toys given).
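To make the generative reading concrete, the following Python sketch runs the model forward: it fixes a base-rate and an evidence value, assigns each simulated child a knower-level, and draws the number of toys given for each request. The base-rate weights, the evidence value of 30, the coding of knower-levels as 0 (NN), 1 to 4, and 15 (CP), and all function names are assumptions made for illustration.

```python
import random


def update_base_rate(base_rate, question, knower_level, evidence):
    """Apply the three-case updating rule and renormalize."""
    weights = []
    for k, p in enumerate(base_rate, start=1):
        if k > knower_level:
            weights.append(p)
        elif k == question:
            weights.append(p * evidence)
        else:
            weights.append(p / evidence)
    total = sum(weights)
    return [w / total for w in weights]


def simulate_child(base_rate, knower_level, evidence, questions, rng):
    """Generate one child's (question, given) pairs from the model."""
    answers = list(range(1, len(base_rate) + 1))
    data = []
    for question in questions:
        updated = update_base_rate(base_rate, question, knower_level, evidence)
        given = rng.choices(answers, weights=updated, k=1)[0]
        data.append((question, given))
    return data


rng = random.Random(0)  # seeded so the simulation is repeatable

# Assumed base-rate: a handful (1-4) and the whole basket (15) are favored.
raw = [10, 10, 10, 10] + [2] * 10 + [6]
base_rate = [w / sum(raw) for w in raw]

levels = [0, 1, 2, 3, 4, 15]          # NN, one- to four-knower, CP
questions = [1, 2, 3, 4, 5, 6, 8, 10]  # the number words used in the task

simulated = []
for _ in range(5):
    level = rng.choice(levels)
    simulated.append((level, simulate_child(base_rate, level, 30.0, questions, rng)))

for level, data in simulated:
    print(level, data)
```

Inference runs this process in reverse: given only the question-answer pairs, it asks which base-rate, evidence value and knower-levels could plausibly have produced them.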
The great strength of generative models is that by formalizing the process that produced data, inference can automatically be done using Bayesian methods.3 Intuitively, Bayesian inference works out what the base-rate, knower-level and evidence value must have been, to have produced the data that were actually observed. It does this simultaneously for all of the psychological parameters, and for all of the children. Because it knows what behavior any fixed set of parameters would produce, it can take actual observed behavior and infer what the parameters must have been. We think generative models have an advantage in our context, because the inferences they make come from a clearly articulated formal account of how observed behavior was produced. This puts the modeling emphasis on psychological theorizing, rather than data analysis.
The graphical model implementation in Figure 2 is an especially convenient formalization of our generative model, because it makes it easy to do fully Bayesian inferences. We achieve this using standard WinBUGS software (Spiegelhalter, Thomas, & Best, 2004), which applies Markov Chain Monte Carlo computational methods (see, for example Chen, Shao, & Ibrahim, 2000; Gilks, Richardson, & Spiegelhalter, 1996; MacKay, 2003) to make inferences about model parameters and data. In particular, we applied our model to the data by collecting 5 independent chains of 5,000 samples, each with 1,000 samples of burn-in. The standard measure of convergence—which basically measures between- to within-chain sample variability—was between 0.99 and 1.01 for π, υ and all 82 zi variables, indicating good convergence (e.g., A. Gelman, Carlin, Stern, & Rubin, 2004, pp. 296–297).
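For readers unfamiliar with the convergence measure, the sketch below computes one common form of it, the Gelman-Rubin potential scale reduction, for a single scalar parameter. It is not necessarily the exact statistic WinBUGS reports, and the chains here are synthetic stand-ins generated for illustration, not samples from our model.

```python
import random
import statistics


def potential_scale_reduction(chains):
    """Gelman-Rubin R-hat for equal-length chains of one scalar parameter."""
    n = len(chains[0])
    chain_means = [statistics.fmean(chain) for chain in chains]
    # W: average of the within-chain sample variances.
    within = statistics.fmean([statistics.variance(chain) for chain in chains])
    # B: n times the sample variance of the chain means.
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


rng = random.Random(1)

# Five chains that all sample the same distribution (well mixed).
mixed = [[rng.gauss(0.0, 1.0) for _ in range(5000)] for _ in range(5)]

# Five chains stuck around different values (not converged).
stuck = [[rng.gauss(float(j), 1.0) for _ in range(5000)] for j in range(5)]

rhat_mixed = potential_scale_reduction(mixed)
rhat_stuck = potential_scale_reduction(stuck)
print(rhat_mixed, rhat_stuck)
```

Values close to 1 indicate that the chains agree with each other about as well as each agrees with itself, whereas chains that have settled in different regions give values well above 1.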
We report the results in four parts. First we report the base-rate distribution inferred by the model. Second, we report the degree to which evidence (in the form of the experimenter’s request) changes the base-rate distribution. Third, we report on the model’s ability to assign a knower-level for each child. Each of these analyses comes immediately from the posterior distribution over the π, υ and zi variables provided by the graphical model. Fourth, we examine the posterior prediction our model makes about data, which is a standard Bayesian way to examine the goodness-of-fit between model and data.
Figure 3 shows the inferred base-rate,4 which represents children’s predisposition to give each possible number of toys, before they are asked for any particular number. The distribution accords surprisingly well with what we might intuitively expect. That is, children seem predisposed to give either a small number of toys (1,…,5) or all 15 toys, rather than something in between.
We want to emphasize that this base-rate is entirely inferred from the data, under the generative model of behavior we developed using the knower-level theory. Any possible combination of probabilities summing to one was given equal prior probability in our modeling; we did not “insert” this, or any other, base-rate, into the model in any way. The fact that a highly reasonable and interpretable base-rate was inferred is one piece of suggestive evidence that the model is a useful one for Give-N data.
The posterior distribution for the evidence υ was approximately Gaussian distributed, with a mean of 29.2 and standard deviation of 7.4. This is a sensible result that is straightforward to interpret. It means that the instructions provided to children in the Give-N task had the effect, under our model, of increasing or decreasing the probability of any given response by a factor of about 30.
Figure 4 shows the posterior distribution over the six knower-levels (NN-, one-, two-, three-, four- and CP-knowers) for each child, ordered from the smallest expected value to the largest. The noteworthy feature of this result is that most of the children are classified with high certainty into a single knower-level. There are exceptions (e.g., Child 3 in the first row; Child 78 in the second row), but, for the most part, there is confidence in a single classification (indicated by a single, high peak for each child). In fact, over 89% of the children have a posterior mass for a single knower-level that is at least twice as likely as any other alternative, and more than 68% of children have a classification that is fully 10 times more likely than any other.
When inferring a discrete latent variable like a knower-level, highly-peaked posteriors are a suggestive indication that the model is a useful one. When models are badly mis-specified, Bayesian inference tends to mix over a wide range of possibilities to try and fit the data, making interpretation difficult. What the peaked distributions in Figure 4 show is that the model leads to confident predictions about the knower-level of most children.
We also note that, in those cases where the posterior distribution shows uncertainty about a child’s knower-level, that uncertainty is invariably distributed over neighboring knower-levels. For example, the model shows uncertainty about whether Child 3 is a two-knower or three-knower. There is no case where the posterior distribution covers two levels that are not adjacent. For example, there is no case where the distribution is split between two-knower and CP-knower. This is not an assumption built into the model, which treats the knower-levels as a set of nominally scaled possibilities. Accordingly, the patterns of uncertainty seen in Figure 4 provide suggestive support to the claims of knower-level theory that children learn number concepts in order.
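The way a child's responses single out one knower-level can be illustrated with a simplified Python sketch. Unlike the full model, which infers the base-rate, the evidence value and all knower-levels jointly by sampling, this sketch holds the base-rate and evidence fixed at assumed values, uses a uniform prior over the six levels (an assumption), and scores a small hypothetical set of responses by direct enumeration.

```python
import math


def update_base_rate(base_rate, question, knower_level, evidence):
    """Apply the three-case updating rule and renormalize."""
    weights = []
    for k, p in enumerate(base_rate, start=1):
        if k > knower_level:
            weights.append(p)
        elif k == question:
            weights.append(p * evidence)
        else:
            weights.append(p / evidence)
    total = sum(weights)
    return [w / total for w in weights]


def knower_level_posterior(data, base_rate, evidence, levels):
    """Posterior over knower-levels for one child under a uniform prior."""
    log_likelihoods = []
    for level in levels:
        total = 0.0
        for question, given in data:
            updated = update_base_rate(base_rate, question, level, evidence)
            total += math.log(updated[given - 1])
        log_likelihoods.append(total)
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_likelihoods)
    unnormalized = [math.exp(value - top) for value in log_likelihoods]
    norm = sum(unnormalized)
    return [u / norm for u in unnormalized]


# Assumed base-rate and evidence value, for illustration only.
raw = [10, 10, 10, 10] + [2] * 10 + [6]
base_rate = [w / sum(raw) for w in raw]
levels = [0, 1, 2, 3, 4, 15]  # NN, one- to four-knower, CP

# Hypothetical child: correct for "one" to "three", errors of 4 and 15 above that.
data = [(1, 1), (2, 2), (3, 3), (5, 4), (8, 15)]

posterior = knower_level_posterior(data, base_rate, 30.0, levels)
best_level = levels[posterior.index(max(posterior))]
print(best_level, posterior)
```

For this hypothetical child the three-knower level receives the most posterior mass: the correct answers for “one” to “three” count against lower levels, while giving 4 toys for “five” counts against the four-knower and CP-knower levels.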
Finally, we assess the model more directly, using posterior prediction. This is a standard Bayesian approach, comparing the probability of data according to the model with the data actually observed. In a sense, posterior predictive analysis is a way of assessing goodness-of-fit, but it is important to understand that it automatically accounts for model complexity in ways that approaches like maximum-likelihood fitting do not. Each prediction is the average across the entire parameter space, as weighted by the posterior distribution for parameters, not the prediction that gives the maximum agreement at a specific set of parameter values. In this way, the posterior predictive guards against over-fitting, and constitutes a principled and useful way to evaluate whether a model provides an adequate account of data.
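The averaging step can be sketched in a few lines of Python. Given posterior samples of the base-rate and evidence value, the prediction for a question at a given knower-level is the mean of the updated distributions across samples. The two hand-made samples below are hypothetical placeholders standing in for real sampler output, and the function names are assumptions.

```python
def update_base_rate(base_rate, question, knower_level, evidence):
    """Apply the three-case updating rule and renormalize."""
    weights = []
    for k, p in enumerate(base_rate, start=1):
        if k > knower_level:
            weights.append(p)
        elif k == question:
            weights.append(p * evidence)
        else:
            weights.append(p / evidence)
    total = sum(weights)
    return [w / total for w in weights]


def posterior_predictive(samples, question, knower_level):
    """Average the updated response distribution over posterior samples.

    `samples` is a list of (base_rate, evidence) pairs, one per sample.
    """
    n_answers = len(samples[0][0])
    average = [0.0] * n_answers
    for base_rate, evidence in samples:
        updated = update_base_rate(base_rate, question, knower_level, evidence)
        for index, probability in enumerate(updated):
            average[index] += probability / len(samples)
    return average


# Two hypothetical posterior samples (a real analysis would use thousands).
raw = [10, 10, 10, 10] + [2] * 10 + [6]
peaked = [w / sum(raw) for w in raw]
uniform = [1.0 / 15] * 15
samples = [(peaked, 25.0), (uniform, 35.0)]

prediction = posterior_predictive(samples, question=2, knower_level=3)
print(prediction)
```

Because every sample contributes in proportion to its posterior weight, a prediction is only sharp where the model predicts the same behavior across the whole range of plausible parameter values.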
Figure 5 shows the posterior predictions of the model for the NN-, one-, two-, three-, four- and CP-knower-levels. Each level corresponds to a panel, and each panel is organized with each possible question (i.e., how many toys are asked for) along the x-axis and each possible answer (i.e., how many toys are given) along the y-axis. This organization means that each cell corresponds to a possible question and answer combination. The shading of each cell corresponds to the posterior probability of that number of toys being given when that question is asked, with darker shading indicating greater probability. The overlaid circles represent the behavioral data for those children classified into each knower-level by the posterior inferences presented earlier, with their size showing how many toys were actually given when questions were asked.5
It is clear from this analysis that the model provides an excellent account of the data, because the larger circles representing data almost always fall on darkly shaded regions, showing that the model expects this behavior. Note that Figure 5 shows the posterior prediction of the model for all possible question and answer pairs, including for questions that were not asked as part of the current data set. As a consequence, there are many dark squares without circles in Figure 5, corresponding to predictions the model makes for questions where data are not available. Obviously, these cases correspond to gaps in the available data, not failures in the prediction of the model.
The benefit of showing the full range of model predictions in Figure 5 is that it makes graphically clear how the model formalizes the key assumptions of the knower-level theory, and how those assumptions are borne out by the experimental data. For the NN-knowers, the model is able to capture the non-obvious pattern of errors we noted earlier, giving highest probability to the numbers 1 through 5 and 15, as observed in the data. The base-rate is responsible for these good predictions because, for a NN-knower, the experimenter’s instructions provide no additional information, and the base-rate is the sole guide for behavior.
For one- through four-knowers, the model predicts that all of the numbers that are understood will be used correctly. That is, they will tend to be given when asked for, and they will not be given in error when asked for a different number. Those numbers larger than the knower-level, however, continue to follow base-rate probabilities. In the posterior predictive display in Figure 5, this leads to a distinctive pattern whereby predictions for small numbers are largely on the diagonal (i.e., correct responses), but numbers above the knower-level have predicted errors consistent with the base-rate. The super-imposed data show that this pattern of predictions reflects actual behavior very well. There are only a few data points that violate the expected pattern, and those are explained by the probabilistic nature of our account of decision-making, as captured by the evidence parameter.
Finally, a similar story holds for CP-knowers, who are inferred to understand all of the numbers. The model predicts correct behavior for all of the numbers, and the data again show very few exceptions.
A basic goal of empirical science is to relate formal models to experimental data. Among the many benefits of this endeavor are the ability to make inferences about unobserved but substantively meaningful parameters, and the ability to make direct predictions about empirical observations. Our development of a model using the knower-level theory of number development was motivated in this way, and we think it is successful.
Our results show how the formal model allows us to find the base-rate distribution for the Give-N task, a measure of how much task instructions influence behavior, and the knower-level of each child. The base-rate quantifies the ‘chance’ distribution for the Give-N task. From the outset, it seemed unlikely that this distribution was uniform, because some responses seem more likely than others due to the nature of the Give-N task. But, because chance responding is never directly observed, it would be extremely difficult to quantify the appropriate distribution without a formal model. Thus, the ability of the model to infer the base-rate shown in Figure 3 provides an insight into the nature of the Give-N task that otherwise would not be available. Similarly, knowing how much instructions in the Give-N task influence behavior is useful task-specific information.
Perhaps the greatest benefit of being able to infer the base-rate and evidence, however, is that it enables knower-level theory to be applied cleanly to the problem of measuring children’s understanding of number words. This is seen in the ability of the model to make inferences about knower-levels, as shown in Figure 4. Assessing knower-levels has previously been done by applying ad hoc heuristics to behavioral data, and has failed to account for the non-obvious chance distribution captured by our base-rate. For this reason, applying our model provides a sharper inference about an important developmental variable.
Our posterior predictive assessment of model fit shows how we are able to assess the knower-level theory directly in terms of observed raw data. This is possible because our model provides a complete generative account of how behavior on the Give-N task is produced. The knower-level theory is the cornerstone of this account, but is supplemented with simple rational assumptions that specify how children transform their understanding of number concepts into actual behavior. Without these additional mechanisms, empirical evaluation of the knower-level theory would have to rely on less-direct statistical tests of properties of the data, and would not be amenable to making quantitative predictions about Give-N behavior. For these reasons, we think our model is a good example of the benefits of adopting a generative approach to psychological modeling.
It would be straightforward to apply our model to data from alternatives to the Give-N task, such as the ‘What’s-On This-Card?’ task, in which children produce number words for sets presented visually (R. Gelman, 1993; Le Corre et al., 2006; Le Corre & Carey, 2007). We would expect the base-rate inferred from these data to be quite different, but the model itself, including the key knower-level theoretical commitments, to be unchanged. Indeed, one way to understand the benefits of our model is that it separates, in a formal way, the task-specific base-rate effects on behavior from the effects coming from a child’s understanding of number concepts. This separation serves to ‘factor out’ the task-specifics, and focus on the fundamentally important psychological concept of knower-levels.
Finally, the model-based approach we have adopted has the potential to contribute to the most basic questions of theory evaluation and comparison. An alternative theory of how children initially represent exact numbers involves an analog magnitude scale (e.g., Dehaene, 1997; Gallistel, 1990). There are various possibilities, including mechanisms based on scalar estimation and counting processes (e.g., Cordes, Gallistel, & Gelman, 2001; Whalen, Gallistel, & Gelman, 1999), for using this theory to develop a model of Give-N task behavior. With a rival to the current model in place, it would be possible to evaluate both directly against experimental data, using standard quantitative criteria measuring their descriptive adequacy and predictive ability (see Myung, Forster, & Browne, 2000; Shiffrin et al., 2008). While formal model-based evaluations are certainly not the only criteria for choosing between competing theories, they can provide important evidence that is difficult to obtain by other means. Accordingly, we believe models like the one we have presented constitute an important, but currently under-developed, line of research needed to evaluate and improve our theories of how children represent numbers.
This research was supported by NICHD Grant 00234342 to the second author. Massachusetts data collection was supported by NSF REC Grant 0337055 to Elizabeth Spelke and Susan Carey. We thank Josh Tenenbaum and two reviewers for their very helpful comments. We also thank the children and families who participated in the original studies, the preschools hosting that research, and UCI Cognitive Development Lab Manager Emily Carrigan and research assistants John Cabiles, Alexandra Cerutti, Jyothi Ramakrishnan, Sarah Song, Dat Thai and Gowa Wu for their help with data collection.
1Of course, the knower-level theory is not the only well-developed account of how children represent numbers. We discuss how the modeling approach we adopt in this paper can address more general questions of theory evaluation and comparison in the Discussion.
2We thank a reviewer for this psychological interpretation of the base-rate.
3It is important to distinguish between the two distinct ways Bayesian inference is being used in our study. One is as a theoretical assumption about how the child uses instructions as data to update their base-rates. The other is as a statistical framework for relating a cognitive model to behavioral data, for the purposes of inferring parameter values and producing model predictions. These two uses are quite independent. It would be possible to develop a cognitive model of Give-N behavior that did not involve Bayesian assumptions about the mind, and it would be possible—although technically challenging—to do statistical analysis on our model of Give-N behavior using standard frequentist approaches.
4Technically, Figure 3 shows the posterior predictive distribution for the base-rate. This is a convenient way to summarize visually the most important properties of the 15-dimensional joint posterior distribution of parameters using the one-dimensional data space of numbers 1,…,15.
5To classify each child using the posterior distributions in Figure 4, we took a conservative approach, and assigned the first knower-level with posterior mass greater than the prior mass (i.e., the first knower-level for which the data provided positive evidence). Generally, of course, this procedure just gives each child their obvious classification based on Figure 4 (e.g., Child 2 is classified as a NN-knower), but in the rarer ambiguous cases our approach is conservative (e.g., Child 79 and Child 1 are also classified as NN-knowers, despite there being some possibility they are one-knowers).
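The classification rule in note 5 can be sketched as a short Python function. The uniform prior of 1/6 per level, the fallback used when no level exceeds its prior mass, and the example posteriors are assumptions made for illustration.

```python
LABELS = ("NN", "one", "two", "three", "four", "CP")


def classify_knower_level(posterior, prior=None):
    """Return the first knower-level whose posterior mass exceeds its prior mass.

    `posterior` lists the masses in the order of LABELS. A uniform prior is
    assumed when none is supplied.
    """
    if prior is None:
        prior = [1.0 / len(posterior)] * len(posterior)
    for label, post_mass, prior_mass in zip(LABELS, posterior, prior):
        if post_mass > prior_mass:
            return label
    # Assumed fallback: if posterior and prior coincide, take the largest mass.
    return LABELS[posterior.index(max(posterior))]


# Hypothetical ambiguous child: most mass on one-knower, some on NN-knower.
ambiguous = classify_knower_level([0.30, 0.60, 0.10, 0.0, 0.0, 0.0])

# Hypothetical clear-cut child: nearly all mass on two-knower.
clear = classify_knower_level([0.01, 0.02, 0.90, 0.07, 0.0, 0.0])

print(ambiguous, clear)
```

The ambiguous child is classified as a NN-knower even though the one-knower level has more posterior mass, which is the sense in which the rule is conservative.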