When we look around us, we encounter environments characterized by numerous objects and people at varying distances. The scene depicted in the accompanying figure is typical. It shows a garden in Southern California occupied by flora, people, and artifacts such as buildings and walkways. The scene is busy and cluttered. The objects have multiple parts and are located at various distances from the observer. Nearer objects obscure farther objects: the trees hide part of the building's roof, and many of the people on the patio cannot be seen. In some instances, color, texture, size, and shape can serve as information for the unity of objects. The leaves on the trees, for example, are all roughly similar in appearance, and they are perceived as grouping together. In other instances, however, we see objects as unified despite considerable discrepancies in these kinds of visual information: The girls on the stairs wear blue shorts and red shirts, yet we do not see these distinctions in color as denoting four “parts” of girls, but instead we see them as belonging to objects in common. In real-world scenes, motion of observers and of objects provides additional information to determine the contents of our surroundings. We can move through the environment, obtain new perspectives, and see parts of objects invisible from previous vantage points. And as objects move, we can track them across periods of temporary invisibility, often predicting their reappearance.
These facts about the visual world are at once ordinary and remarkable. They are ordinary because virtually every sighted observer experiences the environment as composed of separate, distinct objects of varying complexity and appearance. Yet they are nonetheless remarkable because of the intricate cortical machinery needed to produce them (Zeki, 1993): several dozen areas of the brain, each responsible for processing a distinct aspect of the visual scene or coordinating the outputs of other areas, working in concert to yield a more-or-less seamless and coherent experience. Action systems are likewise elaborate (Gibson, 1950): eyes, head, and body, each with distinct control systems, working in concert to explore the visual environment.
In this article I consider theory and research on the development of infants' perception of the visual environment, in particular object perception. The garden example illustrates some of the issues faced by researchers who wish to better understand these processes. Visual perception has a “bottom-up” foundation built upon coding and integrating distinctive kinds of visual information (variations in color, luminance, distance, texture, orientation, motion, and so forth). Visual perception also relies on “top-down” knowledge that observers bring to the scene, knowledge that aids in interpreting potentially ambiguous juxtapositions of visual attributes (such as the blue and red clothing on the girls). Both operate continuously in mature, sighted individuals who have had sufficient time and experience with which to learn about specific kinds of objects.
Young infants have not had as much time and experience with which to learn about objects, yet they inhabit the same world as adults and they encounter the same kinds of visual information. How do infants interpret visual scenes? Are they restricted to bottom-up processing, lacking the cognitive capacity to interpret visual information in a meaningful fashion, however that might be defined? Or might there be some capacity to perceive objects that is independent of visual experience? These questions have long been dominated by a tension between arguments for a learned or constructed vs. an unlearned or innate system of object knowledge in humans. This article will examine the question of infants' object perception by considering these theoretical perspectives, and will attempt to answer it with evidence from modeling of developmental processes and from empirical studies. I will restrict discussion to the developmental origins, in humans, of the ability to represent objects as coherent and complete across space and time (that is, despite partial or full occlusion), a definition of object perception akin to the object concept that was originally described by the eminent developmental psychologist Jean Piaget (more on this subsequently). The limited scope necessarily omits many interesting literatures on other topics related to object knowledge, such as object identity, numerosity, animacy, object-based attention, and so forth. Nevertheless, there has been a great deal of research effort directed at object concept development, and these investigations continue to bear on the question of infants' object perception by providing an increasingly rich base of evidence.