Recent behavioural work suggests that newborn face preferences are derived from a general, non-specific attentional bias toward patterns with more features in the upper versus lower half. In the current study we predicted that selectivity for the specific geometry of the face may emerge during the first 3 months of life as a product of perceptual narrowing, leading to the construction of the first broadly defined face category segregating faces from other visual objects which may share with faces one or more visual properties. This was investigated behaviourally, using a standard preferential looking paradigm, and electrophysiologically, using high-density ERPs. Behavioural results indicated that, at 3 months, the top-heavy property is no longer a crucial factor in determining face preferences. ERP results showed evidence of differentiation between the two stimuli only for the N700. No differentiation was found for earlier components that are thought to reflect the adult-like structural encoding stage of face processing in infants (N290 and P400). Together, ERP and behavioural results suggest that, by 3 months, the perceptual narrowing process has led to a behavioural response specific to the geometry of the human face, but that this response is not purely perceptual in nature. Rather, it seems related to the acquired salience of this stimulus category, which may reflect the high degree of familiarity and/or the social value faces have gained over the infants’ first 3 months of life.
It has been reported in a number of behavioural studies that newborns’ visual attention is spontaneously and preferentially attracted toward face-like configurations. Specifically, it has been shown that when newborns are presented with upright and upside-down schematic face-like configurations (Johnson & Morton, 1991; Mondloch et al., 1999; Valenza, Simion, Macchi Cassia, & Umiltà, 1996) or real face images (Macchi Cassia, Turati & Simion, 2004) that are matched for level of complexity, amount of energy, or both, they prefer those stimuli that display the upright structure of the face. These findings have been taken as evidence that, from birth, infants selectively respond to the geometry of the face, as defined by the correct relative location of the internal features for the eyes and the mouth (three dark blobs arranged in a triangular-shaped configuration with one vertex pointing down). Moreover, newborns’ face preference phenomenon has been taken as one of the strongest pieces of evidence supporting the existence at birth of a biologically determined, experience-independent neural mechanism dedicated to face processing (Farah, 2000; Farah, Rabinowitz, Quinn & Liu, 2000; Johnson & de Haan, 2001; Johnson & Morton, 1991). Based on a well-known case study demonstrating a lack of plasticity in the development of face recognition abilities, Farah (Farah et al., 2000) claimed that the anatomical cortical localization of face recognition is explicitly specified in the human genome. Alternatively, the developmental model of face processing proposed by Johnson and Morton (1991; see also Johnson & de Haan, 2001) explains newborns’ face preference as a result of the existence at birth of a specific face-detecting mechanism located subcortically rather than cortically.
During the first 2–3 months of life, this subcortical mechanism (i.e., Conspec) would act as a guide, biasing visual input to the developing cortex and thus favouring the emergence of cortical circuitry specialized for face processing.
Regardless of whether the neural localization of face processing originates cortically or subcortically, both the aforementioned claims assume that a highly specific starting point is necessary to initiate development in the face domain. More recent behavioural work with newborns has questioned this assumption, suggesting that face processing at birth is mediated by general, rather than domain-specific perceptual processes. This work showed that newborns’ face preference derives from a number of more general attentional biases that cause certain structural properties of a visual stimulus to be preferred, rather than from a content-determined bias for the unique geometry of the face (Simion, Macchi Cassia, Turati & Valenza, 2003). In fact, much evidence is now available supporting the contention that the visual structural properties embedded in the face are capable of producing a preferential response also when they are embedded in non-face stimuli (Macchi Cassia, Valenza, Pividori & Simion, 2002; Simion, Valenza, Macchi Cassia, Turati & Umiltà, 2002). Specifically, one of these structural properties (i.e., up-down asymmetry) relates to the up-down asymmetrical distribution of the inner facial features along the horizontal plane (i.e., faces typically display two features, the eyes, in the upper part and one feature, the mouth, in the lower part) (Macchi Cassia, Turati & Simion, 2004; Simion et al., 2002; Turati, Simion, Milani & Umiltà, 2002).
The results of a recent study, in which natural and scrambled real face images were used as stimuli, showed that newborns attend equally long to a face as to a non-face “top-heavy” stimulus that shares with faces the same up-down asymmetrical distribution of the elements (Macchi Cassia et al., 2004). In this study, newborns showed a preference for an upright over an upside-down face (Macchi Cassia et al., 2004, Exp. 1), and for a non-face top-heavy pattern over a non-facelike bottom-heavy pattern (Macchi Cassia et al., 2004, Exp. 2). Importantly, when newborns were presented with a face and a non-facelike top-heavy pattern that were equated for the number of features appearing in the upper and lower halves (i.e., two features and one, respectively), they did not manifest any spontaneous preference (Macchi Cassia et al., 2004, Exp. 3; see Table 1). This demonstration that the face-like arrangement of the inner features displayed by the natural face did not affect newborns’ visual behaviour was taken by the authors as evidence that what is classically interpreted as a specific inborn preferential response to faces is in fact the result of a more general preference for any class of top-heavy visual stimuli displaying more patterning in the upper portion. Indeed, stimuli within this category include, but are not limited to, faces. The authors suggested that this attentional proclivity toward the top-heavy stimulus category likely derives from endogenous constraints of the newborns’ visual system (i.e., an upper versus lower visual field advantage in visual sensitivity similar to that observed in adults; Skrandies, 1987), which render top-heavy patterns more easily detectable for newborns than other stimuli.
The evidence provided by these studies with newborns seems to fit well with the perceptual narrowing account of the development of face processing proposed by Nelson (2001, 2003), according to which the ability to perceive faces narrows with development based on experiential input. This model suggests that the face specificity assumed in the adult neural and perceptual system arises from a general-purpose perceptual system that, during development, becomes progressively tuned to upright human faces, due to the extensive experience with this stimulus category reliably provided by the species-typical environment. The narrowing of the perceptual window would thus produce a more precisely defined face category comprised of the type of faces seen most often in the environment, for which more efficient strategies of perceptual processing may be utilized. The neurophysiological counterpart of this process of perceptual narrowing would be an increase in the selectivity and localization of the cortical circuits involved in face processing (Johnson, 2000). Over time, these circuits would pass from being activated by a broader range of stimuli to responding to only certain kinds of stimuli, thus giving rise to a more localized and specialized neural response. Note that this recently proposed model of the development of face processing differs from that previously proposed by Johnson and Morton (1991; Johnson & de Haan, 2001) in that it assumes that general rather than specific initial input is sufficient to set the stage for the development of the face recognition system into its adult-like, specialized form. Some of the strongest evidence supporting the idea of a perceptual narrowing process with respect to face perception comes from behavioural and event-related potential (ERP) studies that compared processing of human faces to that of nonhuman primate faces in infants over the first year of life, and adults, testing what is known as the “other-species effect”.
Behavioural studies showed that although 6-month-olds, 9-month-olds, and adults are all equally capable of discriminating two human faces, 6-month-olds are superior to older infants and adults in discriminating monkey faces (Pascalis, de Haan & Nelson, 2002). In line with these results are findings from ERP studies showing increasing selectivity, between 6 and 12 months, to upright human (versus monkey) faces at the two components that have been proposed as possible “developmental precursors of the adult N170” (de Haan, Johnson & Halit, 2003), namely the N290 and P400 (de Haan, Pascalis & Johnson, 2002; Halit, de Haan & Johnson, 2003). The N170 is a well-documented, face-specific component that is used as an electrophysiological marker for specialized mechanisms for face processing in adults (after Bentin, Allison, Puce, Perez, & McCarthy, 1996). Evidence concerning the development of the N290 and P400 suggests that, between 6 and 12 months, the face processing system becomes more finely tuned to upright human faces.
In addition to these findings, some earlier behavioural studies examining infants’ preference for face-like stimuli offer indirect evidence supporting the idea that, with increasing experience with faces, infants’ preferential responses to faces become tuned to more specific characteristics of this stimulus category. For example, it has been reported that the preference that newborns show for an upright highly schematic face-like configuration (i.e., three squared blobs arranged as facial features) over its inverted counterpart disappears by 6 weeks (Johnson, Dziurawiec, Ellis, & Morton, 1991; Mondloch et al., 1999). Moreover, 12- but not 6-week-old infants show a preference for a positive contrast schematic face over a negative contrast version of the same face (Dannemiller & Stephens, 1988; Mondloch et al., 1999). Together, these results suggest that, between birth and 12 weeks of age, infants’ behavioural responses start to depend more on the extent to which various characteristics of the stimuli resemble those included in real faces. Relevant to the aim of the present study, this evidence could be interpreted as a demonstration that, by 12 weeks, up-down asymmetry by itself is no longer sufficient to drive infants’ face preference, but rather other characteristics of the face stimuli have begun to play a more crucial role.
The goal of the current study was to provide direct support for the hypothesis that a process of perceptual narrowing, analogous to that reported to take place for faces of our own species between 6 and 9 months of life, may take place even earlier in development, leading to the construction of the first broadly defined face category segregating faces from other visual objects which may share with faces one or more visual properties. Specifically, we hypothesize that perceptual and neural selectivity for the specific geometry of the face may emerge during the first 3 months of life, arising from the non-specific attentional bias toward top-heavy patterns that is present at birth (Macchi Cassia et al., 2004). Because this attentional bias causes faces to be a frequent input to the developing face processing system, we could expect that by 3 months of age faces might have emerged as a separate class of stimuli from other geometrically similar stimuli (i.e., top-heavy patterns). At the behavioural level, this perceptual narrowing process would result in an increased attention-triggering value of faces as compared to other non-face top-heavy visual stimuli, relative to what has been found in newborns. At the neural level, we would expect to observe a difference in the neural response to the two different object categories of faces and other non-face top-heavy stimuli.
Both these hypotheses were investigated in the current study by using a between-subjects design, in which behavioural and ERP data were obtained from two different groups of 3-month-old infants. To determine whether, at 3 months of age, infants preferentially orient their visual attention to faces as compared to other non-face top-heavy patterns, we used the same behavioural paradigm - preferential looking - and the same set of natural and scrambled top-heavy and bottom-heavy images derived from real faces as those used in the newborn study by Macchi Cassia et al. (2004) (see Table 1). In the ERP study, we attempted to gain converging evidence for the differential processing of faces and non-face top-heavy stimuli in 3-month-olds by recording ERPs as infants viewed natural and scrambled top-heavy real face images.
The driving question of the behavioural study was whether the general structural property of up-down asymmetry is still a driving factor in determining face preference at 3 months. To address this, a single group of infants was presented, within a preferential looking paradigm, with three pairs of stimuli, each of which was intended to test a specific aspect of the general driving question.
Pair 1 -an upright and an inverted real face image- was used to determine if a preference for the face can be observed at 3 months when stimuli derived from real face images are used. The presence of a face preference in infants older than 2 months has been investigated using schematic stimuli of varying complexity in a few dated studies, which provided conflicting results (e.g., Dannemiller & Stephens, 1988; Johnson et al., 1991; Koopman & Ames, 1968; Maurer & Barrera, 1981; Mondloch et al., 1999; also see Maurer, 1985 for a review). In particular, when highly schematic upright and inverted face-like configurations were used in the two most recent of these studies, the “face” preference was no longer observed at 6 weeks (Johnson et al., 1991; Mondloch et al., 1999). In contrast, we predicted that, when real face images are used, such preference should be present even at 3 months.
Pair 2 -a top-heavy and a bottom-heavy non-face configuration- was presented in order to verify whether the attentional bias toward top-heavy non-face patterns, as has been found with newborns (Simion et al., 2002), is still active to any extent in 3-month-old infants. Understanding the power of the top-heavy property in inducing a preference for non-face-like stimuli was necessary for interpreting the role of the top-heavy property in producing the face preference predicted to be observed in Pair 1, as well as for making predictions regarding the possible outcome of the direct comparison performed in Pair 3 between an upright face and a top-heavy non-face pattern. If a preference for the top-heavy over the bottom-heavy pattern was present in Pair 2, two possible outcomes could be predicted for Pair 3. Three-month-old infants tested with Pair 3 could either show no preference, as newborns do (Macchi Cassia et al., 2004), or manifest a preference for the upright face. A non-preference in Pair 3 would indicate that the geometry of the face has not gained any additional attention-triggering value with respect to the top-heavy property. On the other hand, a preference for the top-heavy stimulus in Pair 2 accompanied by a preference for the upright face in Pair 3 would suggest that, even though the bias toward top-heavy patterns still exerts itself, by the age of 3 months the face geometry has become an even more influential property. Finally, should we observe a non-preference in Pair 2, we could conclude that the top-heavy factor no longer plays a role in driving infants’ visual preferences. In that case, we would expect to observe a preference for the face in Pair 3.
The final sample consisted of 20 infants (5 females) with a mean age of 92 days (range 78–97 days). All infants were born full-term and were of normal birthweight. An additional 27 infants were tested but their data were removed from the final sample because they did not provide data for all three stimulus pairs (n=3), they showed a strong position bias (i.e., they looked more than 95% of the time to one side) for one or more of the three stimulus pairs (n=15), experimenter error (n=2), or fussiness (n=7).
Three high-quality grayscale photographs of young female faces were digitally modified in the same way as in the newborn study by Macchi Cassia et al. (2004), so as to create 4 versions of each face differing exclusively in the spatial positioning of the inner features, the outline being equal, for a total of 12 stimuli. The faces in the original photographs were cropped just below the neck, with the hair and ears removed, and they served as one of the 4 versions of the stimuli, namely the upright face (UF). From UF, a second version of the stimuli, namely the inverted face (IF), was created by a 180° rotation of the inner region of the face. The other two versions of the stimuli were created by displacing and rearranging the inner facial features in such a way as to obtain a scrambled top-heavy (ST) configuration, containing more features in the upper than in the lower half, and a scrambled bottom-heavy (SB) configuration, containing more features in the lower than in the upper half (see Table 2). The ST and SB stimuli differed exclusively in the up-down positioning of the inner features. Neither ST nor SB resembled a face, in that the natural orientation of the eyes, eyebrows, and nose was modified, and the spatial position of each feature was altered. Moreover, the ST stimulus contained the same overall amount of visual information as the UF stimulus, with such information being equally asymmetrically distributed across the horizontal plane in the two configurations. Thus, these two stimuli differed exclusively in the presence or absence of the typical relations among the inner features that characterize the face geometry.
The stimuli were presented side by side against a black background on two computer screens and were separated by a 12 cm gap. They subtended a horizontal angle of approximately 11° and a vertical angle of 15° when viewed from a distance of about 65 cm.
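As an illustration of the viewing geometry, the physical size of a stimulus follows from its visual angle and the viewing distance. The sketch below is not part of the original method; the function name `stimulus_size_cm` is hypothetical, and it assumes a flat stimulus centred on the line of sight.

```python
import math


def stimulus_size_cm(angle_deg, distance_cm):
    """Physical extent (cm) that subtends `angle_deg` at `distance_cm`.

    Assumes the stimulus is flat and centred on the line of sight.
    """
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)


# Values reported in the text: about 11 x 15 degrees at about 65 cm.
width_cm = stimulus_size_cm(11, 65)
height_cm = stimulus_size_cm(15, 65)
print(f"width = {width_cm:.1f} cm, height = {height_cm:.1f} cm")
```

Under these assumptions the reported angles correspond to images of roughly 12.5 cm by 17.1 cm on the screen.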
Infants were placed in an infant seat located in a dimly lit experimental chamber, approximately 65 cm from a black wooden panel. The panel had two square holes where the black screens of two side-by-side computer monitors appeared; luminance and contrast were comparable for the stimuli. A video camera situated above the monitors recorded the infant’s face and looking behaviour. Infants’ eyes were aligned with two multicolored shapes that appeared one in the center of each screen, and that were used to attract the infant’s gaze at the start of each trial. To prevent interference from extraneous and distracting stimuli, peripheral vision was limited by two black panels placed on both sides of the infant.
Infants were tested using a standard preferential-looking paradigm (Berlyne, 1958; Fantz, 1958). All infants viewed two 20-second bilateral presentations of three stimulus pairs, for a total of 6 trials. Pair 1 consisted of the UF and IF stimuli, Pair 2 consisted of the ST and SB stimuli, and Pair 3 consisted of the UF and ST stimuli. In order to control for the presence of a possible position bias in the infant’s looking behaviour, the left/right position of the stimuli was reversed between the first and second presentation of each pair. At the start of the testing session, the infant’s attention was attracted with a noise from behind the black panel to the center of the computer monitors. Before each pair, a multicolored fixation point was presented on each screen for 6 seconds. If needed, during this time, the infants’ attention was re-attracted to the monitors with a noise from behind the panel. The two presentations of each pair were separated by a blank, 2-second black screen. The order of pair presentation, as well as the initial left/right position of the stimuli, was counterbalanced among participants. The two stimuli within each pair were versions derived from the same original face. The three pairs presented to each infant contained versions of a different face, and the nature of the face used for each pair was counterbalanced between participants.
Videotapes of eye movements were recorded and subsequently analyzed frame by frame to the nearest 33 ms by a coder who was blind to the specific position of the stimuli on each trial. The coder recorded, separately for each stimulus and each trial, the total fixation time, i.e., the sum of all fixations. As a measure of inter-observer reliability, the duration of single fixations was recorded by a second coder for 50% of the infants in the sample. The average level of agreement was 95%.
In order to test whether infants showed a preference for either of the two stimuli within each pair, a preference score, expressed as a percentage, was computed. Each infant’s looking time at the UF stimulus for Pairs 1 and 3 and at the ST stimulus for Pair 2 was divided by the total time spent looking at either stimulus within the pair and subsequently converted into a percentage score. Hence, only scores significantly above 50% indicated a preference for the stimulus under consideration.
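The preference score described above can be sketched as follows. This is a minimal illustration rather than the authors' analysis script; the function name and the use of per-presentation lists in milliseconds are assumptions.

```python
def preference_score(target_ms, other_ms):
    """Percentage of total looking time spent on the target stimulus.

    `target_ms` and `other_ms` are lists of total fixation times (ms),
    one entry per presentation of the pair (two presentations here).
    """
    target_total = sum(target_ms)
    total = target_total + sum(other_ms)
    if total == 0:
        raise ValueError("no looking time recorded for this pair")
    return 100.0 * target_total / total


# Hypothetical infant: 7 s and 5 s on the target, 5 s and 3 s on the other stimulus.
score = preference_score([7000, 5000], [5000, 3000])
print(score)  # 60.0
```

A score of 50 corresponds to equal looking at both stimuli, which is why chance is set at 50% in the tests that follow.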
Three preliminary one-way Analyses of Variance (ANOVAs), one for each of the considered stimuli, performed on the preference scores manifested by the three groups of infants who saw Pair 1, 2 or 3 as the first stimulus pair within the testing session, revealed that order of pair presentation did not affect infants’ visual preferences. To determine whether preference scores differed from chance (50%) in each of the three stimulus pairs, three separate one-sample t-tests were applied, one for each pair. Preference scores were significantly above chance for the UF stimulus in Pair 1 (M = 58.09%, SD = 16.03, t(19) = 2.26, p < 0.05) and Pair 3 (M = 58.56%, SD = 17.02, t(19) = 2.25, p < 0.05). In contrast, preference scores for the ST stimulus in Pair 2 did not differ significantly from chance (M = 46.8%, SD = 15.12, t(19) = 0.95, p = 0.36). Crucially, a t-test for dependent samples revealed that the mean preference scores for the UF stimulus in Pair 1 and Pair 3 did not significantly differ (t(19) = 0.07, p = 0.94) (see Table 2).
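The one-sample t statistics can be recomputed from the reported means and standard deviations as a consistency check. The sketch below uses only the summary values given in the text (n = 20, chance = 50%); the function name is hypothetical, and p-values are not computed.

```python
import math


def one_sample_t(mean, sd, n, mu0=50.0):
    """t statistic for a one-sample t-test against `mu0`, from summary statistics."""
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error


# Means and SDs as reported for the three stimulus pairs (df = n - 1 = 19).
for label, mean, sd in [("Pair 1", 58.09, 16.03),
                        ("Pair 2", 46.80, 15.12),
                        ("Pair 3", 58.56, 17.02)]:
    print(label, round(one_sample_t(mean, sd, 20), 2))
```

The sign of the Pair 2 statistic is negative because the mean lies below 50%; the text reports its absolute value.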
Statistical analyses demonstrated that, as with newborns (Macchi Cassia et al., 2004, Exp. 1), 3-month-old infants looked longer at the real image of a canonical upright face than at the atypical image of the same face with the inner portion 180° rotated (Pair 1). Nevertheless, 3-month-olds’ visual behaviour differed from that of newborns in two important aspects. First, when presented with scrambled up-down asymmetrical configurations, infants did not prefer the top-heavy over the bottom-heavy configuration (Pair 2), as newborns did (Macchi Cassia et al., 2004, Exp. 2). Second, also differing from newborns (see Macchi Cassia et al., 2004, Exp. 3), 3-month-olds maintained the preference for the upright canonical face even when the face was paired with a non-face top-heavy configuration matched for the number of features appearing in the upper part (Pair 3). This latter finding was further strengthened by the observation that this preference for the upright face did not differ (i.e., was not less pronounced) from the preference that infants manifested for the same stimulus when it was paired with the inverted face.
Overall, the evidence suggests that, although the face preference phenomenon observed in newborns is still present in an apparently unchanged form at 3 months of age, at this later age the phenomenon is driven by a different type of bias with respect to the one that produces face preference in newborns. Infants at 3 months appear to base their preferential response to faces on the specific structural information conveyed by this stimulus category, rather than on the amount of patterning appearing in the upper as compared to the lower part of the stimulus. This evidence suggests that, by the age of 3 months, faces have emerged as a separate class of stimuli from other top-heavy visual objects, and are capable of selectively triggering infants’ visual attention.
To explore the degree to which a similar narrowing of the visual perceptual window to faces manifests itself at the neurophysiological level at 3 months, we next conducted an ERP study.
Recent ERP evidence suggests that by 3 months some degree of specialization of cortical processing of faces has emerged, as reflected by the N290 and the P400 discriminating between faces and matched noise stimuli (Halit, Csibra, Volein & Johnson, 2004). Furthermore, at 3 months the N290 and P400 have been shown to discriminate human from monkey faces (Halit et al., 2003), suggesting that some amount of perceptual narrowing to human faces may have already taken place.
The aim of the ERP portion of the current study was to determine if, at 3 months of age, electrophysiological evidence can be obtained that would support the view that faces are differentiated from other non-face top-heavy visual configurations that are matched with face stimuli for the amount of patterning appearing in the upper versus lower half of the image. Our prediction was that the behavioural differentiation between these two stimulus categories, as revealed by the visual preference for the upright face – UF stimulus – over the scrambled top-heavy configuration – ST stimulus – observed in the behavioural portion of the current study, would also manifest itself neurophysiologically as a difference between the ERPs elicited by the two stimuli at one or more time points during the 1200 millisecond recording.
The final sample consisted of 15 infants (9 females) with a mean age of 89.5 days (range 85–95 days). An additional 20 infants were tested but were excluded from the final sample due to excessive eye and/or body movements that resulted in recording artifacts (n=8) or fussiness that resulted in too few trials being recorded (n=12).
The stimuli were 16 upright faces (UF) and 16 scrambled top-heavy configurations (ST), created by modifying an analogous number of grayscale photographs of young female faces in the same way as in the behavioural study, for a total of 32 stimuli.
ERPs were recorded using a Geodesic Sensor Net (V2.0) consisting of 63 evenly distributed electrodes embedded in small sponges. EEG was recorded continuously and referenced to a single vertex electrode. Signals were amplified using an EGI NetAmps amplifier (Eugene, OR) with a gain of 10,000, a sampling rate of 250 Hz, and a bandpass filter of 0.1–100 Hz. Impedances were checked on-line prior to recording and were considered acceptable when they were below 50 kΩ.
After application of the sensor net, infants passively viewed the stimuli while seated on the caregiver’s lap in a dimly lit experimental chamber similar to that used for the behavioural study. Infants were positioned approximately 65 cm from a black wooden panel, which had a hole where the black screen of the computer monitor appeared. Both the infant and the caregiver were observed at all times by an experimenter via a video camera positioned just above the monitor. Online judgments were made by the experimenter so as to present the stimuli only when the infant was attending to the monitor.
Stimuli were presented for 500 ms with an experimenter controlled inter-stimulus interval that lasted no less than 1500 ms. Stimuli from each of the two conditions were presented with equal probability and the order of presentation was random with the constraints that (1) each unique image in the set was shown before any was repeated, and (2) stimuli from the same condition were not repeated more than three times in succession. Stimuli presentation continued until the infant became too fussy or bored to attend, with a maximum of 96 trials, 48 for each experimental condition. The average number of total trials viewed by the infants was 82.
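One way to generate a presentation order satisfying the two constraints is rejection sampling over shuffled passes through the image set. The sketch below is an assumption about how such an order could be built, not the procedure actually used; it reads constraint (2) as allowing at most three consecutive trials from the same condition, and all names are hypothetical.

```python
import random


def max_run(conditions):
    """Length of the longest run of identical consecutive items."""
    longest = current = 0
    previous = None
    for cond in conditions:
        current = current + 1 if cond == previous else 1
        previous = cond
        longest = max(longest, current)
    return longest


def make_trial_order(n_per_condition=16, n_passes=3, max_repeat=3, seed=None):
    """Return a list of (condition, image_index) trials.

    Each pass shows every unique image once before any image repeats;
    a shuffled pass is redrawn until no run exceeds `max_repeat`,
    including runs that span the boundary with the previous pass.
    """
    rng = random.Random(seed)
    images = [(cond, i) for cond in ("UF", "ST") for i in range(n_per_condition)]
    order = []
    for _ in range(n_passes):
        while True:
            block = images[:]
            rng.shuffle(block)
            tail = order[-max_repeat:]
            if max_run([cond for cond, _ in tail + block]) <= max_repeat:
                order.extend(block)
                break
    return order


trials = make_trial_order(seed=1)
print(len(trials))  # 96
```

With 16 images per condition and three passes this yields the maximum of 96 trials, 48 per condition; a session that ends early simply uses a prefix of the list.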
Continuous EEG recordings were processed offline with a 30-Hz low-pass filter. The EEG was then divided into individual segments of 1300 ms (100 ms of pre-stimulus recording, 500 ms of stimulus presentation and 700 ms of post-stimulus recording), and baseline corrected to the average voltage during the 100 ms prior to stimulus onset. Due to infant intolerance of the lower EOG electrodes (sensors 63 and 64), these were removed from the GSN, rendering it necessary to visually inspect the data for EOG artifact. Trials were also visually inspected and subsequently excluded from further analysis if they contained motion artifact such as head turning. Finally, trials were rejected if they contained more than 9 total bad channels. In the remaining trials, individual bad channels were replaced using spherical spline interpolation. Individual subject averages were computed separately for the UF and ST conditions (mean = 20 trials per condition), and then re-referenced to the average reference¹.
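The segmentation and baseline-correction step can be illustrated for a single channel. The sketch below assumes the 250 Hz sampling rate and the 100 ms pre-stimulus and 1200 ms post-onset window given in the text; the function names and the plain-list representation of the signal are hypothetical, and filtering, artifact rejection and interpolation are omitted.

```python
SAMPLING_RATE_HZ = 250
PRE_MS = 100    # pre-stimulus baseline
POST_MS = 1200  # 500 ms stimulus + 700 ms post-stimulus


def ms_to_samples(ms, rate_hz=SAMPLING_RATE_HZ):
    """Convert a duration in milliseconds to a number of samples."""
    return round(ms * rate_hz / 1000)


def extract_epoch(signal, onset_index):
    """Cut one 1300 ms segment around a stimulus onset and baseline-correct it.

    `signal` is a list of voltages for one channel; `onset_index` is the
    sample at which the stimulus appeared.
    """
    n_base = ms_to_samples(PRE_MS)
    start = onset_index - n_base
    stop = onset_index + ms_to_samples(POST_MS)
    if start < 0 or stop > len(signal):
        raise ValueError("segment extends beyond the recording")
    segment = signal[start:stop]
    baseline = sum(segment[:n_base]) / n_base  # mean of the 100 ms before onset
    return [value - baseline for value in segment]


# Toy signal: constant 5 units before onset, 7 units afterwards.
toy = [5.0] * 100 + [7.0] * 400
epoch = extract_epoch(toy, onset_index=100)
print(len(epoch), epoch[0], epoch[25])  # 325 0.0 2.0
```

At 250 Hz each sample spans 4 ms, so a 1300 ms segment contains 325 samples, of which the first 25 form the baseline.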
Inspection of the grand-averaged waveforms revealed a series of seven well-defined components that were subsequently analyzed within the following time windows: P1 (90–170 ms), N1 (100–300 ms), N290 (250–450 ms), P400 (450–650 ms), Nc (300–700 ms), N700 (600–800 ms) and N900 (700–1100 ms). Electrode groupings, as well as time windows capturing these components, were chosen based on visual inspection of both the grand-averaged and individually-averaged ERPs. The peak latency and amplitude of the P1, N1, N290, P400, Nc and N700, as well as the average latency and amplitude of the N900, were derived by averaging the peak or average amplitude and latency from each channel within each electrode grouping (for electrode groupings see Figs. 1, 2 and 3). These electrode grouping averages were then analyzed separately for each component by means of 2 × 2 repeated-measures ANOVAs with Stimulus type (UF, ST) and Hemisphere (left, right) as within-subject factors, using Greenhouse-Geisser adjusted degrees of freedom. Post-hoc paired t-tests were also performed when necessary using Bonferroni corrections for multiple comparisons.
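The peak measures described above can be sketched as a search for the extreme value within a component's time window on each channel, followed by averaging across the channels of a grouping. This is an illustrative reading of the procedure, not the original analysis code; the function names are hypothetical, and each epoch is assumed to be a list of samples at 250 Hz starting 100 ms before stimulus onset.

```python
import math


def peak_in_window(epoch, start_ms, end_ms, polarity, rate_hz=250, pre_ms=100):
    """Return (latency_ms, amplitude) of the peak within a time window.

    `polarity` is "negative" for components such as the N290 and
    "positive" for components such as the P400.
    """
    step_ms = 1000 / rate_hz
    first = math.ceil((start_ms + pre_ms) / step_ms)
    last = math.floor((end_ms + pre_ms) / step_ms)
    window = epoch[first:last + 1]
    amplitude = max(window) if polarity == "positive" else min(window)
    latency_ms = (first + window.index(amplitude)) * step_ms - pre_ms
    return latency_ms, amplitude


def group_peak(epochs_by_channel, start_ms, end_ms, polarity):
    """Average peak latency and amplitude over the channels of one grouping."""
    peaks = [peak_in_window(e, start_ms, end_ms, polarity)
             for e in epochs_by_channel]
    n = len(peaks)
    return (sum(p[0] for p in peaks) / n, sum(p[1] for p in peaks) / n)


# Two toy channels with negative peaks at 300 ms and 320 ms after onset.
ch1 = [0.0] * 325
ch1[100] = -5.0
ch2 = [0.0] * 325
ch2[105] = -3.0
print(group_peak([ch1, ch2], 250, 450, "negative"))  # (310.0, -4.0)
```

For the N900, where average rather than peak amplitude was analyzed, the same window indexing would apply with the mean of `window` taken in place of its extreme value.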
Both the UF and the ST stimuli elicited the typical positive and negative deflections over the occipital leads known to occur in response to the onset of the stimulus (P1 and N1) (see Fig. 1). Following these early deflections, both a negative (N290) and a positive component (P400) were visible over occipital-temporal electrodes (see Fig. 2). A sequence of three later-occurring negative components was also observed; the first – Nc – was apparent over frontal–central electrode sites (see Fig. 3), the second – N700 – was a well-defined peak occurring at posterior electrodes (see Fig. 1), and the third – N900 – was a more gradual negative deflection occurring again over frontal–central electrodes (see Fig. 3). Means and standard deviations of peak amplitudes and latencies of the first six components are reported in Table 3. Due to the gradual nature of the N900, average amplitude instead of peak amplitude was analyzed for this component. Therefore, Table 3 reports the mean peak latency and average amplitude of the N900.
Neither latency nor amplitude of the P1 differentiated the UF from the ST stimuli (see Fig. 1).
No effect of Stimulus type or Hemisphere was found for N1 latency or amplitude (see Fig. 1).
Neither latency nor amplitude of the N290 differentiated the UF from the ST stimuli. However, a main effect of Hemisphere (F(1, 13) = 6.05, p = 0.02; see footnote 2) showed that the latency to peak of the N290 was shorter over the right than the left hemisphere regardless of stimulus type (see Fig. 2).
As with the N290, no latency or amplitude differences were found between the UF and the ST stimuli for the P400. However, for both latency (F(1, 13) = 5.03, p < 0.05) and amplitude (F(1, 13) = 4.74, p < 0.05; see footnote 2) a main effect of Hemisphere revealed that, regardless of stimulus type, the P400 was of shorter latency and greater amplitude over the right hemisphere as compared to the left hemisphere (see Fig. 2).
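Because each factor in the repeated-measures design has only two levels, a main effect such as that of Hemisphere is equivalent to the square of a paired t statistic computed on per-subject values collapsed across the other factor. With two levels the sphericity assumption holds trivially, so the Greenhouse-Geisser adjustment leaves the degrees of freedom at (1, n − 1). The sketch below illustrates this relationship. The function name and the latency values are invented for illustration and are not data from the study.

```python
import math
import statistics

def two_level_main_effect(level_a, level_b):
    """F statistic and degrees of freedom for a two-level within-subject factor.

    level_a and level_b hold one value per subject, already averaged over
    the levels of the other within-subject factor.
    """
    diffs = [a - b for a, b in zip(level_a, level_b)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    t_value = statistics.mean(diffs) / standard_error
    # For a two-level factor, F(1, n - 1) equals the squared paired t.
    return t_value ** 2, (1, n - 1)

# Hypothetical per-subject peak latencies (ms), averaged over stimulus type.
left_hemisphere = [310.0, 320.0, 305.0, 330.0]
right_hemisphere = [300.0, 312.0, 301.0, 318.0]

f_value, dof = two_level_main_effect(left_hemisphere, right_hemisphere)
print(f_value, dof)
```

With 14 subjects entering the analysis, this formulation yields the (1, 13) degrees of freedom reported for the N290 and P400.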
No effect of Stimulus type or Hemisphere was found for Nc latency or amplitude (see Fig. 3).
N700 latency showed a main effect of Stimulus type (F(1, 14) = 4.88, p < 0.05), due to the UF stimuli eliciting a shorter latency to peak than the ST stimuli over both hemispheres (see Fig. 1). No significant main effects or interactions were found for the amplitude of the N700.
The latency of the N900 did not differ between the two stimulus types or between the two hemispheres. For N900 average amplitude there was a significant interaction between Stimulus type and Hemisphere (F(1, 14) = 4.68, p < 0.05), but all follow-up comparisons were nonsignificant (see Fig. 3).
Overall, ERP data showed that no stimulus effects were found in the early visual responses represented by the P1 and N1. More interestingly for the purpose of the present study, the findings provided no evidence for a differential response between the face and the non-face top-heavy configuration at the level of the N290 or P400. Rather, the two components showed the same topographical effects for both stimuli, peaking earlier over the right hemisphere and, in the case of the P400, also displaying larger amplitude over this hemisphere. Similarly, no differences between faces and top-heavy stimuli were found at the level of the later-occurring negative components (the Nc and the N900), with the sole exception of the N700, which peaked earlier to faces as compared to scrambled top-heavy configurations.
The aim of the current study was to investigate the degree of face specificity of 3-month-old infants’ looking preference and electrophysiological responses. Face specificity was investigated behaviourally by determining the extent to which the visual preference for the face at 3 months is a specific attentional response to the face geometry, or a response to a more general structural property of faces, namely “top-heaviness”, as was found with newborns (Macchi Cassia et al., 2004; Turati et al., 2002). Electrophysiologically, face specificity was investigated by the extent to which the ERP waveforms elicited by canonical faces and non-face top-heavy patterns differ at 3 months.
As predicted, results showed that a visual preference for the face still exists at 3 months when realistic face stimuli are used. Most crucially, this preference was present not only when the face was compared to an inverted face, which has fewer elements than the face in the upper part, but also when it was compared to a scrambled face, which possesses the same number of elements in the upper half as the face. These findings are consistent with evidence provided by a recent study in which a between-subjects design and different methodology were used to investigate the same phenomenon (Turati, Valenza, Leo & Simion, 2005). Moreover, the data from the current study provide more clear-cut evidence that up-down asymmetry does not recruit 3-month-old infants’ attention when it is embedded in non-face stimuli. This finding was ambiguous in the study by Turati et al. (2005, Exp. 2), in which two different pairs of geometrical top-heavy and bottom-heavy patterns were used, and a preference for the top-heavy version of only one of the two pairs was observed.
Overall, the behavioural findings are consistent with our predictions that, by 3 months of age, the specific geometry of the face has gained the ability to recruit attention in a way that is not present at birth; specifically, the face geometry now overwhelms up-down asymmetry as the major contributor to the face preference. In contrast, our ERP investigation provided less clear, albeit intriguing, results. Here we failed to observe differences between the canonical faces and the non-face top-heavy stimuli for the N290 and P400 components. Rather, these two components showed the same topographical effects for both stimuli, in that they peaked earlier for both the face and the non-face top-heavy stimuli in the right as compared to the left hemisphere. Assuming that the N290 and P400 are the “developmental precursors to the adult N170” (de Haan et al., 2002; Halit et al., 2003; Halit et al., 2004), our demonstration that canonical faces and scrambled top-heavy configurations are not differentiated does not provide any support for the hypothesis that, at the basic level of face detection, our two stimulus types are treated as belonging to different object categories. Instead, although null results are difficult to interpret, our data may suggest that one or more of the perceptual properties shared by the two stimuli may have caused the brain to respond to them in a similar manner. This interpretation appears strengthened by the finding that both stimuli evoked the same enhanced N290 and P400 responses over the right as compared to the left hemisphere.
This evidence of similar inter-hemispheric differences for the two stimuli is in keeping with the demonstration of a right hemisphere advantage in face recognition as early as 4 months, as observed by de Schonen and colleagues in studies using a split visual field method (de Schonen, Deruelle, Mancini & Pascalis, 1996; de Schonen, Gil de Diaz & Mathivet, 1986). Specifically, these studies have demonstrated that face (Deruelle & de Schonen, 1998), as well as non-face (Deruelle & de Schonen, 1991), pattern discrimination based on configural differences is carried out more effectively by the right hemisphere as compared to the left. Assuming that, at 3 months, the right hemisphere is already more involved than the left in configural processing of face-like stimuli, this would imply for our data that, with regard to their structural, configural properties, our two stimuli are not yet able to be differentiated.
With regard to the perceptual narrowing process of face perception, this could mean that, although by the age of 2 months the infant brain may have started to form a face template which would allow all face stimuli to be processed by the same specialized cortical circuitry, as many sources of evidence seem to suggest (Johnson and Morton, 1991; de Schonen et al., 1986; Tzourio-Mazoyer et al., 2002), by 3 months this template may not yet be specific enough to reject the processing of all non-face stimuli. Should this be the case, it would be reasonable to assume that the scrambled face used in the present study may be too perceptually similar to a face, containing the same top-heavy property, facial outline and features, to be excluded by the immature face template of a 3-month-old infant. This hypothesis could be further explored in future ERP studies in which the amount of face-similarity of the top-heavy stimulus used in the face versus non-face comparison should be systematically diminished.
Alternatively, the lack of differentiation between the faces and the scrambled top-heavy stimuli for the N290 and P400 could be an indication that these components reflect the detection of the specific features of the face, which are present in both stimuli. Indeed, one interpretation offered for the functional properties of the N170 is that this component may reflect the activation of an internal eye-detector (Bentin et al., 1996). Assuming that the N290 and P400 are associated with the processes reflected by the N170, the same interpretation could be extended to these infant components. However, two sources of evidence speak against the possibility that the failure of the N290, and possibly the P400, to differentiate between our two stimuli may be due to the presence of the same facial features, and specifically the eyes, in both stimuli.
The first source of evidence comes from the only existing functional neuroimaging study to investigate face processing in infants, showing that some of the areas involved in face processing in adults can also be activated by faces by 2–3 months of age (Tzourio-Mazoyer et al., 2002). Specifically, mapping the brain activity by means of positron emission tomography (PET) while infants viewed faces versus a set of 3 light diodes, this study found greater activation to faces in the inferior occipital gyrus as well as the inferior temporal gyrus, which in adults is the location of the fusiform face area (FFA; Kanwisher, McDermott & Chun, 1997). This activation was predominant in the right hemisphere. While in adults the inferior occipital gyrus is thought to be involved in the perception of facial features, the fusiform gyrus seems to respond to the invariant aspects of faces (like the specific face configuration and identity) (Haxby, Hoffman & Gobbini, 2000). Thus, these PET data may suggest that by 3 months infants’ face processing involves more than simple feature detection. Although the low spatial resolution of ERP data prevents us from positing any direct link between the observed ERP waveforms and the activation of specific brain areas, the above-mentioned evidence seems to suggest that the similar inter-hemispheric differences observed for the upright faces and the scrambled top-heavy stimuli used in the current study are unlikely to be due solely to the fact that the same facial features are present in both stimuli. Rather, the other common property of the two stimuli, up-down asymmetry, may have prompted the developing face-specific circuitry in the right hemisphere to respond similarly to both stimuli.
A second source of evidence also makes it unlikely that the observed lack of differentiation between our two stimuli for the N290 and P400 is due to the presence of the same features in both stimuli. This comes from an ERP study on 4-month-old infants’ gaze discrimination (Farroni, Csibra, Simion & Johnson, 2002). In this study, the N290 was found to discriminate between a photographic image of a face with direct gaze and the same face with an averted gaze. Given the sensitivity of this component to such a minor stimulus change (i.e., the location of the pupil within the eye), we would argue that the same component should also be sensitive to the misorientation of each of the local facial features in the scrambled top-heavy stimulus used in the current study. Again, this reasoning would suggest that the structural property of up-down asymmetry shared by our two stimuli, rather than the presence of the same facial features, may have acted as a critical factor in producing the ERP responses to each stimulus, therefore resulting not only in the observed non-differentiation of the stimuli at the “infant face-sensitive” components, but also in the enhanced processing observed for these components in the right as compared to the left hemisphere for both stimuli.
We would argue that the present findings do not conflict with previous reports, which tested infants’ face sensitivity for the N290 and P400 components at 3 months in similar ERP paradigms, using the same 63-channel recording system. While it could be posited that subtle discrepancies in aspects of ERP recording paradigms (e.g., mean number of artefact-free trials per condition or duration of stimulus presentation) could account for differences in obtained results, it seems more likely that these differences originate from the nature of the stimuli used in each study. For example, Halit et al. (2004) reported that the N290 and P400 components discriminated between faces and matched visual noise stimuli, which did not share any specific perceptual properties with faces, apart from low-level visual information. Our findings also seem not to conflict with evidence provided by Halit et al. (2003) showing that these two components were sensitive to the species of the face. In this case, although both human and monkey faces share the same top-heavy structure, the monkey faces may not have served as an adequate control for human faces with regard to other possible perceptual differences (e.g., shape of face outline, amount of contrast, etc).
In addition to the early perceptual components, we also observed a series of three later-occurring negative components. The first of these components, the Nc, is a well-studied, mid-latency, negative deflection occurring at frontal-central electrodes. It has been reported to be of larger amplitude to infrequent compared to frequent stimuli in infants from 4 to 7 months (Courchesne, Ganz & Norcia, 1981; Richards, 2003), but also of larger amplitude to familiar as compared to novel faces, and to familiar as compared to novel toys, at 6 months (de Haan & Nelson, 1997, 1999). Overall, the Nc is interpreted as reflecting the infant’s allocation of attention, with greater negativity to the most “salient” stimulus (Nelson, 1994; Nelson & Monk, 2001; Reynolds & Richards, 2005). In contrast to the results obtained in the behavioural part of the current study, which speak to the salience of the upright canonical face for 3-month-old infants, the Nc response was found not to be enhanced to this stimulus. Two possible interpretations of this discrepancy between the behavioural and ERP measures could be proposed. First, it could be argued that the amount of attentional resources, as reflected by the Nc, required to process a stimulus that is salient due to its familiarity, such as the upright face, is equal to the amount of attentional resources required to process a stimulus that may also be salient due to its novelty, such as the scrambled top-heavy configuration, which resembles, but is not, a face. A second possibility could be that the time window captured by the Nc, as well as the later N900, is too early to allow for the manifestation of the difference between the two stimuli, which emerges at a later time window in behaviour.
This reasoning appears to be strengthened by the observation that the visual preference for the upright face (UF) over the scrambled top-heavy (ST) stimulus observed in the two consecutive 20-second presentations of Pair 3 in the behavioural preferential looking task does not present itself within the first 1200 ms of the two presentations (UF mean preference score = 42%, SD = 24.93; vs. the chance level of 50%: t(19) = 1.43, p = 0.17).
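The comparison against chance reported above is a one-sample t-test, and its statistic can be approximately recovered from the summary values alone. The sketch below is an illustration rather than the authors' analysis code: the function name is hypothetical, the sample size of 20 is inferred from the 19 degrees of freedom, and the mean of 42% is a rounded value, so the result only approximates the reported statistic.

```python
import math

def one_sample_t(sample_mean, sample_sd, n, chance_level):
    """t statistic and degrees of freedom for a test against a fixed value."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - chance_level) / standard_error, n - 1

# Summary values taken from the text; n = 20 is inferred from t(19).
t_value, dof = one_sample_t(42.0, 24.93, 20, 50.0)
print(round(abs(t_value), 2), dof)
```

The sign of the statistic is negative because the mean preference score falls below 50%; the text reports its magnitude.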
Inspection of the grand means shows that the Nc appears to resolve at approximately 700 ms, corresponding to the peak of a well-defined negative component occurring over occipital leads, the N700. This component peaked earlier to the face as compared to the scrambled top-heavy configuration over both hemispheres. To our knowledge, the only published study to have intentionally analyzed the N700, conducted by Reynolds and Richards (2005), found no differentiation when novelty and frequency of stimulus presentation were manipulated. Based on its occipital location, the authors proposed that the N700 is associated with early visual processing. However, its temporal location would suggest that it is specifically related to the stimulus offset, which in our study, as well as in the study by Reynolds and Richards (2005), occurred at 500 ms. The stimuli used in our ERP comparison were matched for low-level visual properties, as indicated by the fact that the early P1-N1 complex seen in response to the stimulus onset did not differentiate between them. Therefore, we would suggest that the difference observed at the later-occurring N700 does not purely reflect a difference in sensory processing, but may rather reflect the attentional disengagement induced by the disappearance of the stimuli. If this is the case, our findings may indicate that disengagement of attention occurs more quickly from the face than from the scrambled top-heavy configuration, possibly due to the high familiarity of the face in contrast to the novelty of the scrambled stimulus. Interpreted in this vein, the results of the current study would resonate with those obtained in an ERP study examining the effects of familiarity on 8-month-olds’ face processing (Nelson, Thomas, de Haan, & Wewerka, 1998). Although found inadvertently while analyzing the Nc at midline electrodes, in this study the N700 peaked earlier at the occipital lead to the familiarized face as compared to the novel face.
The final component of interest in our data was the N900, a late negative component occurring at frontal-central electrodes, which, like the Nc, did not differentiate between the face and the scrambled top-heavy stimulus. Although no evidence regarding this component has previously been reported, we propose that the N900 may be a continuation of the Nc, which in our data appears to be bifurcated by the polarity reversal produced by the simultaneously occurring N700 elicited over occipital leads. Although possible, it seems unlikely that the N900 could be a type of late slow wave (Nelson, 1994, 1997; Richards, 2003), as slow waves are typically observed as starting at roughly 1000 milliseconds and resolving as late as 2000 milliseconds.
Overall, the discrepancy between behavioural and ERP findings obtained in the present study appears relevant to the question of the functional relationship between infant overt responses and the underlying neurophysiological processes. To our knowledge, only one study in the field of infant face processing has been published to date using a similar dual approach of investigating both behavioural and electrophysiological responses within infants of the same age (i.e., 6 months, de Haan & Nelson, 1997). Interestingly, this study also observed discrepancies between ERP measures and looking times in a preferential looking task. However, contrary to our pattern of results, ERP measures were found to be more sensitive to the stimulus manipulation as compared to behavioural measures, revealing null results in the visual preference task, but larger Nc amplitude to the familiar as compared to novel face.
As previously discussed with regard to the attentional response reflected by the Nc and N900, we propose that the discrepancies between our ERP and behavioural data may be accounted for by one of two possible explanations. First, it may be that 3 months of visual experience is enough for the perceptual narrowing process to produce a segregation of the face category from other non-face top-heavy visual objects at the behavioural level, but the same amount of time may not be sufficient to produce an analogous specialized response to faces at the neural level. This seems unlikely given that a change in behaviour would necessitate a change at the neural level.
A second possible explanation accounting for the discrepancy between our ERP and looking time measures may be related to methodological differences. For example, the behavioural study provided infants with two 20-second simultaneous presentations of a face and top-heavy non-face stimulus, whereas in the ERP study infants were provided with several individual 500-ms presentations of each stimulus. This difference in stimulus duration may have had a significant influence on the observed differential sensitivity of the behavioural and ERP measures. This possibility seems to be supported by the observation that only at one of the latest time windows analyzed is discrimination between the two stimuli in the ERP paradigm visible, and that the behavioural preference emerges sometime after 1200 ms, which is the duration of the recording window used in the ERP paradigm. This may suggest that 500 ms of stimulus exposure, followed by 700 ms during which any possible additional processing was recorded, is not sufficient to allow for the discrimination of the two stimuli in 3-month-olds.
In either case, our ERP data do not provide evidence in support of the claim that, by 3 months, the specific geometry of the face may cause faces to be treated differently from other non-face top-heavy stimuli at the structural encoding stage, thought to be reflected by the infant N290 and/or P400. Rather, the latency and amplitude effects observed at these two components demonstrate that both stimuli elicited similar right hemisphere advantages. Nevertheless, the latency effect in favour of the face, observed at the N700 component, soon after the stimulus offset, suggests that the 3-month-old brain shows signs of sensitivity to the familiarity of this stimulus category.
Finally, evidence from the current study provides some insight into the open question of whether the ability to process faces is one that is acquired through experience. If indeed the development of this ability reflects an experience-dependent process, as several authors have suggested (e.g., Diamond & Carey, 1986; Gauthier & Logothetis, 2000; Mondloch, Le Grand & Maurer, 2003; Nelson, 2001, 2003), then we need to ask how much experience is necessary for this system to become specialized, given the relatively non-specific biases observed at birth. Considered together with data provided by newborns (Macchi Cassia et al., 2004), our behavioural findings suggest that the system is capable of bootstrapping from a minimally specified predisposition toward a broadly defined category of top-heavy visual stimuli. In as little as 3 months, the presence of this general, non-specific attentional bias toward patterns containing more information in the upper portion, including but not limited to faces, is able to produce the emergence of a behavioural response specific to the human face. Nevertheless, our ERP and behavioural data, taken together, seem to suggest that this specific response is not purely perceptual in nature, as reflected by the fact that no differentiation was observed at the earliest components analyzed in the ERP study. Rather, the late-occurring differentiation observed in both the ERP (i.e., around 700 ms) and behavioural measures (i.e., after the first 1200 ms of stimulus presentation) seems to suggest that face-specific responses at 3 months may be related to the acquired salience of this stimulus category, which may reflect the high degree of familiarity and/or the social value that faces have gained during the infants’ first 3 months of life.
This work was supported by a National Institutes of Health grant to Charles A. Nelson (NS32976) and a research grant from the University of Milano-Bicocca (F.A.R. 2004) to Viola Macchi Cassia. We wish to thank Vanessa Vogel for her help in testing subjects and Jim Williams and Qaisar Jawad for technical assistance.
1. Based on the location of five electrodes of the Geodesic Sensor Net (sensors 10, 19, 23, 59, 60), and on our knowledge of the propensity of these locations to be overly sensitive to both EOG and movement artifacts, these electrodes were excluded from the computation of the average reference. Therefore, the end product consisted of 58 average re-referenced waveforms.
2. The ANOVAs on the latency and amplitude of the N290 and the P400 were performed on 14 of the 15 subjects. One subject was removed from these analyses because he had one of the electrodes included in the groupings (channel 41) marked as bad.
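The re-referencing procedure described in the first footnote can be sketched as follows. This is a minimal illustration under stated assumptions, not the software used in the study: the function name, the sensor numbering from 1 to 63, and the voltage values are hypothetical. At each time point, the mean across the retained sensors is subtracted from every retained sensor, and the excluded sensors are dropped.

```python
def average_reference(data, excluded):
    """Re-reference retained channels to their own average.

    data maps a sensor number to a list of voltages, one per time point.
    excluded is the set of sensor numbers left out of the reference.
    """
    kept = {ch: wave for ch, wave in data.items() if ch not in excluded}
    n_samples = len(next(iter(kept.values())))
    # Mean across retained channels at each time point.
    reference = [
        sum(wave[i] for wave in kept.values()) / len(kept)
        for i in range(n_samples)
    ]
    return {
        ch: [v - r for v, r in zip(wave, reference)]
        for ch, wave in kept.items()
    }

# Hypothetical recording: 63 sensors, two time points each.
recording = {ch: [float(ch), 2.0 * ch] for ch in range(1, 64)}
excluded_sensors = {10, 19, 23, 59, 60}  # sensor numbers from the footnote

rereferenced = average_reference(recording, excluded_sensors)
print(len(rereferenced))
```

After re-referencing, the retained waveforms sum to zero at every time point, which is the defining property of an average reference.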