Humans are subject to the composite illusion: two identical top halves of a face are perceived as “different” when they are presented with different bottom halves. This observation suggests that when building a mental representation of a face, the underlying system perceives the whole face, and has difficulty decomposing facial features. We adapted a behavioural task that measures the composite illusion to examine the perception of faces in two nonhuman species. Specifically, we had spider (Ateles geoffroyi) and rhesus monkeys (Macaca mulatta) perform a two-alternative forced-choice, match-to-sample task where only the top half of the sample was relevant to the task. The results of Experiment 1 show that spider monkeys (N = 2) process the faces of familiar species (conspecifics and humans, but not chimpanzees, sheep, or sticks), holistically. The second experiment tested rhesus monkeys (N = 7) with the faces of humans, chimpanzees, gorillas, sheep and sticks. Contrary to prediction, there was no evidence of a composite effect in the human (or familiar primate) condition. Instead, we present evidence of a composite illusion in the chimpanzee condition (an unfamiliar primate). Together, these experiments show that visual expertise does not predict the composite effect across the primate order.
Humans discriminate face stimuli with remarkable speed and accuracy. It has been previously asserted that the underlying system builds mental representations of face stimuli which are holistic in nature, i.e. distinctive facial features are not processed independently, but are instead fused together into a single unit of analysis. One clear prediction that follows from this assertion is global precedence; the identity of a whole face will change the perceived identity of any isolated facial feature (e.g. the left eye). Evidence of holistic processing in humans comes from a number of behavioural experiments where the perceived identity of a facial feature was affected by the presence (or absence) of other facial features (Davidoff & Donnelly, 1990; Donnelly & Davidoff, 1999; Farah, Wilson, Drain, & Tanaka, 1998; Ingvalson & Wenger, 2005; Mermelstein, Banks, & Prinzmetal, 1979; Sergent, 1984; Suzuki & Cavanagh, 1995; Tanaka & Farah, 1993; Tanaka & Sengco, 1997; Thompson, 1980). However, perhaps the most direct measure of holistic processing currently available is what is widely known as the composite effect (Goffaux & Rossion, 2006; Hole, 1994; Le Grand, Mondloch, Maurer, & Brent, 2004; Maurer, Le Grand & Mondloch, 2002; Robbins & McKone, 2007; Rossion, 2008; Young, Hays, & Ellis, 1987).
The composite paradigm explicitly asks subjects to ignore the global identity of a face and to instead attend to a subset of facial features. In the original investigation, for example, subjects were asked to name the top half of a highly familiar face while it was presented with the bottom half taken from a different face (Young, et al., 1987). Performance was compared across two experimental conditions where the only difference was the alignment of the two halves. In the ‘aligned’ condition the task relevant information (the top half of the face) was presented directly above the bottom half, in a typical face format. In the ‘misaligned’ condition, the two halves were laterally offset (i.e. shifted along the common horizontal axis). Naming the familiar features in the top half of a composite face was found to be more difficult in the aligned condition than in the misaligned condition. The relatively poor performance associated with the aligned condition was attributed to the holistic interference generated by the inappropriate integration of the two inconsistent halves. Thus, performance recovered in the misaligned condition because the features in the top half were not tightly bound to the features in the bottom half (also see Goffaux & Rossion, 2006; Hole, 1994; Hole, George & Dunsmore, 1999; Le Grand, et al., 2004; Michel, Rossion, Han, Chung, & Caldara, 2006; McKone, 2008; Robbins & McKone, 2007; Rossion, 2008; Young, et al., 1987; Yovel, Paller, & Levy, 2005).
A common presumption in the human literature is that the face processing system has developed in response to our complex social interactions. Variation between faces is known to carry socially relevant information. Not only do they help identify the people around us, faces also communicate direction of attention (Langton, Watt, & Bruce, 2000), future intentions (Baron-Cohen, 1995) and current emotional state (Ekman, 1982). Effectively, being able to link a particular individual to their past behaviour allows us to modify our response to that individual and to build a relationship over time. All primate societies can be characterised by long-term relationships and thus it might be beneficial for any primate to be able to efficiently recognise faces at the individual level. If face recognition supports social behaviour in humans, then one reasonable expectation is that other nonhuman primates will process face stimuli in a similar way. An outstanding question is whether nonhuman species use the same processing strategies as humans.
The impact of holistic interference on the accuracy of chimpanzees (Pan troglodytes) during a composite task has been previously studied using a match-to-sample (MTS) procedure (Parr, Heintz & Akamagwuna, 2006). In a slight deviation from the human paradigm, every sample face (the stimulus to be matched) was comprised of two halves that were drawn from different individual chimpanzees. These were then presented as either aligned or misaligned composites, or as the top half alone or the bottom half alone. Each subject was then required to select one of two images that best matched the sample. However, these images showed the unaltered faces of those individuals used to make the composite. It was found that subjects more often matched the individual who was represented in the top face half of the composite when it was aligned compared to misaligned. One caveat was that subjects were rewarded for any choice they made, so these results suggest that the identity of the individual presented in the aligned composite was better represented by features of the upper face than lower face. Subjects chose the individual represented by the top and bottom face parts about equally as often when the composite sample was misaligned, suggesting no integration of upper and lower facial features.
Fission/fusion dynamics are a central feature of the chimpanzee social system, which also characterise the societies formed by bonobos, orangutans, and spider monkeys (Smuts, Cheney, Seyfarth, Wrangham, & Struhsaker, 1987; Campbell, Fuentes, Mackinnon, Panger, & Beader, 2006). Amici, Aureli, and Call (2008) found a high correlation between fission-fusion societies and greater cognitive flexibility. If this level of social complexity is assumed to be correlated with the perception of face stimuli, and given that there is some evidence that chimpanzees build holistic representations of faces, then spider monkeys might also be expected to experience holistic interference during a composite task. The primary aim of Experiment 1 was to determine if spider monkeys would be better able to match the top half of face stimuli when presented in the misaligned format compared to the aligned equivalent. Moreover, Experiment 1 also addressed the extent to which holistic processing in spider monkeys is finely tuned to the faces of conspecifics when the subjects are also highly familiar with their human caregivers.
A firmly established phenomenon in the human face processing literature is that subjects experience difficulty recognising the faces of unfamiliar races. Lower recognition performance associated with other-race faces is commonly referred to as the other-race effect (Anthony, Copper, & Mullen, 1992; Chance & Goldstein, 1981; Cross, Cross, & Daly, 1971; Malpass & Kravitz, 1969; Sangrigoli & de Schonen, 2004; Sangrigoli, Pallier, Argenti, Ventureyra, & de Schonen, 2005). The other-race effect has been shown to reduce the markers of holistic processing such as the composite effect (Michel, Caldara, & Rossion, 2006; Michel, Corneille, & Rossion, 2007; Michel, et al., 2006; Tanaka, Keifer, & Bukach, 2004). Visual experience, therefore, intensifies holistic processing. The same pattern of results is evident at the level of species. Experience with the faces of allospecifics has been shown to improve face discrimination in humans and some nonhuman primates (Pascalis, de Haan, & Nelson 2002; Pascalis, Scott, Kelly, Shannon, Nicholson, Coleman, & Nelson, 2005; Sugita, 2008; Parr, Dove & Hopkins, 1998), although no such expertise effect has been consistently demonstrated in old world monkeys (Phelps & Roberts, 1994; Tomonaga, 1994). In the current experiment, a new world monkey was tested with the faces of familiar primate species (conspecifics and humans), an unfamiliar primate species (chimpanzees), an unfamiliar non-primate species (sheep) and an unfamiliar non-face species (sticks). The expectation was that previous visual experience would drive the composite effect in the familiar primate conditions.
The data were collected from two black-handed spider monkeys (Ateles geoffroyi). Both males (Bb and Kg) were approximately 23 years of age at the time of testing and are permanently pair housed in an indoor/outdoor mesh enclosure at Symbio Wildlife Park, Helensburgh, NSW. Although these monkeys were born and raised in captivity, the extent to which they were able to interact with other conspecifics is not known. Normal feeding routines were not interrupted, and water remained freely available during testing. The sample size was limited by the poor availability of subjects. Spider monkeys could only be recruited from a small number of zoological parks, which tended to house a single pair of spider monkeys. Not only did the zoo have to house spider monkeys, but also, their enclosure needed to be conducive to training and research. Aspects of spider monkeys enclosures that were considered impracticable or hazardous included wet moats, mixed species exhibits or enclosures that provided no shelter at ground level. A benefit of testing two individuals from the same zoo is that differences across subjects cannot be attributed to differential testing or housing conditions.
The procedures for Experiment 1 were approved by the Macquarie University Animal Ethics Committee and are identical to those described by Taubert (in press). The subjects were tested voluntarily in their home enclosure at two times during the day; in the morning from 8:00am and in the afternoon from 3:00pm. The subjects had no prior experience with cognitive experiments of any kind, so prior to the onset of testing, both subjects were habituated to sitting on a raised platform (approximately 10 cm from the ground) at the back of their enclosure and reaching through the mesh into a dedicated testing area that was directly behind the enclosure. The testing area was equipped with a custom-made window display that was designed for a simultaneous match-to-sample paradigm. The window display sat in front of a Benq 22″ (55.9 cm) LCD widescreen monitor that was driven by a 2 GHz Intel MacBook. All computerised tasks described below were programmed using Superlab 4 software (www.superlab.com).
The testing apparatus was built so that subjects would be presented with three windows of equal size (10.2 × 15.2 cm). The central window will be hereafter referred to as the ‘sample’. The windows to the left and right of the sample presented the comparison stimuli and each had a closed food tray positioned underneath it. The subjects were trained to open these food trays. During any trial, one of the comparison windows, either the left or right, would depict the target (the stimulus that matched the sample). The opposing window would depict the distractor. All three images (left choice/sample/right choice) were visible for the same amount of time. The stimuli for this experiment consisted of digitised images that were constructed using Adobe® Photoshop CS2 software (www.adobe.com).
A session began when a subject sat on the raised platform. A trial began when an opaque barrier between the subject and the window display was lifted, allowing the subject to view all three windows. If the trial was reinforced, the tray that corresponded to the target stimulus (to the left or right of the sample) contained a small food reward, such as a piece of fresh or dried fruit. The tray directly underneath the distractor was always empty. Opening this tray indicated an incorrect choice, which would immediately terminate the trial. A trial was terminated by abruptly bringing down the barrier between the subject and the window display. Conversely, if the correct tray was chosen the participating monkey was allowed enough time to check the other tray before the barrier was brought down. The average inter-trial-interval (ITI) was 7 s. The subject ID and choice were recorded by the experimenter for every trial completed. A session was terminated when the subject left the raised platform for longer than ten minutes.
Using simple schematic pictures of different objects and animals, both subjects were trained to perform this match-to-sample (MTS) task. Once a high level of proficiency (≥ 80% correct) was achieved, the subjects were moved to a more challenging training phase. During this phase a single session was comprised of 150 trials in total; 50 viewpoint training trials, 50 face matching trials and 50 composite training trials. Trial presentation was fully randomised. The criterion for passing this more specific training phase was also changed: the monkeys needed an average accuracy greater than 70% across three consecutive sessions.
Viewpoint training trials involved matching pictures of animals across changes in viewpoint (see Figure 1). These were included to encourage the subjects to treat the visual stimuli as two-dimensional representations of three-dimensional objects. The stimuli for the viewpoint training trials were photographs taken of 60 plastic toy animals from four different viewpoints. These images were then converted to 256 greyscale, cropped to remove all the background information and placed on a white background. In the last three training sessions prior to the experiment Bb responded accurately to viewpoint training trials 78% of the time (117 out of 150). Kg was 81% accurate (121 out of 150) when responding to viewpoint training trials.
Face matching trials were included to train the subjects to match stimuli at the individual level (see Figure 1). The stimuli used during face matching trials were drawn from a larger stimulus set of human face images that were downloaded from the public domain (256 greyscale). Human faces were used because of the extensive exposure the subjects were known to have to human faces. Accuracy tended to be lower on face matching trials. When the total number of correct responses from the final three sessions were added together they indicated that Bb was 70% accurate (105 out of 150), while Kg was 74% accurate (111 out of 150).
Composite training trials were designed to teach the monkeys that the task relevant information was in the top half of the sample. These trials were essentially a colour matching task. The composite training stimuli depicted coloured oval regions (4 × 6 cm in size). The sample oval was divided into two halves by colour, e.g. if the top half was blue, the bottom half might have been yellow. Half of the composite samples were presented in the aligned format (i.e. the top half was sitting directly above the bottom half), the other half were presented in the laterally misaligned format (i.e. the two halves were shifted relative to each other along the common horizontal axis). The comparison windows depicted oval shapes that were not divided into halves. The target oval was coloured the same as the top half of the sample, while the distractor was a different colour altogether, e.g. if the target was blue, the distractor might have been red (see Figure 1). The total number of correct responses, taken from the final three training sessions, show that both subjects were more than 70% accurate when completing composite training trials (Bb, 110 out of 150 trials correct; Kg, 129 out of 150 trials correct).
After three training sessions were successfully completed, experimental trials were introduced. In the post-training phase, a single session comprised up to 300 trials in total; 100 experimental trials and 200 reinforced trials (randomly selected from the viewpoint training trials, face matching trials and composite training trials). Experimental trials, however, were not reinforced with food rewards. The subjects were tested with five species chosen to represent considerable shifts in phylogenetic distance (conspecifics, common chimpanzees, humans, domestic sheep, and sticks). The sample face, for any species, would appear in one of three different formats (control, aligned or misaligned). Six digital photographs per species were taken of six different individuals, all with neutral facial expressions. The subjects had no pre-experimental familiarity with these images or the identities they represented. Of the six monkey exemplars, four were female, as were three human exemplars, five chimpanzee exemplars and all six sheep. The six photographs that were taken of different sticks all had the same number of leaves and were photographed against a uniform grey background.
The whole samples and comparison stimuli for each experimental exemplar were converted to greyscale and cropped to fit on the standard canvas. These stimuli were not altered in any other way. However, the composite samples for each species had a top and bottom half taken from different exemplars. An aligned sample was constructed by removing the bottom half of the original whole face, from the horizontal midline down. The remaining top half was then rejoined with the bottom half taken from another exemplar of the same species. If necessary, the bottom half was adjusted to align the nose and cheek features. Each experimental exemplar appeared in an aligned composite sample once as the top half and once as the bottom half. The only difference between the aligned and misaligned samples was that the bottom halves of the misaligned samples were offset to the right by a width of approximately half a face (see Figure 2). During the experiment, each exemplar appeared twelve times as a target (six on the left) and 12 times as a distractor (six on the left). Each exemplar appeared as both a target and a distractor an equal number of times to neutralise any exemplar-specific bias, e.g. selecting one image when it was both target and distractor, rather than matching the sample within a trial. The data recorded from the experimental trials in a single session were included in the analysis only if performance during the reinforcing trials was greater than 70%. Data collected from incomplete sessions were included in the analysis, provided the aforementioned criterion was met. Sessions continued, twice daily, until both subjects had given a valid response to each of the 360 experimental trials.
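The trial counts reported above can be cross-checked with a little arithmetic. The sketch below uses only figures stated in the text (five species, six exemplars per species, three sample formats, twelve target appearances per exemplar). It assumes that trials were split evenly across the three formats, which the text does not state outright but which is consistent with the 24 control trials and the N = 48 composite trials per species reported in the results. The function name is hypothetical.

```python
def experiment1_trial_counts(n_species=5, n_exemplars=6, n_formats=3,
                             target_appearances=12):
    """Derive Experiment 1 trial counts from the design described in the text.

    Assumption (not stated explicitly in the source): trials are divided
    evenly across the sample formats (control, aligned, misaligned).
    """
    # Every trial has exactly one target, so trials per species equals
    # exemplars multiplied by target appearances per exemplar.
    per_species = n_exemplars * target_appearances
    per_format = per_species // n_formats
    return {
        "per_species": per_species,
        "total": per_species * n_species,
        "per_format_per_species": per_format,
        # Aligned plus misaligned trials feed each 2 x 2 contingency table.
        "composite_per_species": 2 * per_format,
    }


counts = experiment1_trial_counts()
for name, value in counts.items():
    print(f"{name}: {value}")
```

Under these assumptions the total comes to the 360 experimental trials mentioned in the text.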
In control conditions, when the sample was identical to the target, Bb was generally very accurate (monkey faces, 18/24 correct; human faces, 15/24 correct; chimpanzee faces, 16/24 correct; sheep faces, 21/24 correct; sticks, 17/24 correct). To determine if accuracy was independent of sample format (aligned v misaligned), each correct (and incorrect) response for each level of species was included in one of five separate 2 × 2 contingency tables. The critical values were adjusted for multiple comparisons using the Benjamini and Hochberg rule to control the false discovery rate (see Benjamini & Hochberg, 1995; Koen, Verhoeven, Simonsen, & McIntyre, 2005). These tests revealed significant composite effects only in a subset of face categories (see Figure 3a). Bb found misaligned samples easier to match, suggesting holistic interference, during the conspecific (χ2 (1, N = 48) = 10.2, p = 0.001) and human conditions (χ2 (1, N = 48) = 4.46, p = 0.03). Meanwhile there was no association between accuracy and format during the chimpanzee (χ2 (1, N = 48) = 0.33, p = 0.56), sheep (χ2 (1, N = 48) = 0.08, p = 0.77), or stick trials (χ2 (1, N = 48) = 1.34, p = 0.25).
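For readers who want to see how a 2 × 2 test of this kind works, the following is a minimal sketch in plain Python. It assumes a Pearson χ2 test without continuity correction; the source does not say whether a correction was applied. The cell counts passed to `chi2_2x2` are hypothetical and are not taken from the paper, which reports only the test statistics. The second function converts a χ2 statistic with one degree of freedom into a p value, and is applied here to the statistics reported for Bb.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
    [[a, b], [c, d]], e.g. rows = aligned/misaligned, cols = correct/incorrect."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denom


def p_from_chi2_df1(stat):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom.

    For df = 1 the survival function reduces to erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(stat / 2.0))


# HYPOTHETICAL counts for illustration only (24 aligned, 24 misaligned trials).
example_stat = chi2_2x2(8, 16, 19, 5)
print(f"hypothetical table: chi2 = {example_stat:.2f}, "
      f"p = {p_from_chi2_df1(example_stat):.4f}")

# Chi-square statistics reported in the text for Bb (df = 1, N = 48).
reported = {"conspecific": 10.2, "human": 4.46, "chimpanzee": 0.33,
            "sheep": 0.08, "stick": 1.34}
for condition, stat in reported.items():
    print(f"{condition}: chi2 = {stat}, p = {p_from_chi2_df1(stat):.3f}")
```

The p values recovered this way are unadjusted; the false discovery rate adjustment described in the text is a separate step that this sketch does not implement.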
Kg was less consistent than Bb when matching control samples (monkey faces, 16/24 correct; human faces, 20/24 correct; chimpanzee faces, 10/24 correct; sheep faces, 20/24 correct; sticks, 19/24 correct). The same χ2 tests were run on the trials completed by Kg and they indicate holistic interference in the human condition (χ2 (1, N = 48) = 12.0, p = 0.001). However, the predicted advantage for misaligned trials was not found in the conspecific (χ2 (1, N = 48) = 3.2, p = 0.07), chimpanzee (χ2 (1, N = 48) = 0.09, p = 0.77), sheep (χ2 (1, N = 48) = 0.08, p = 0.77), or stick conditions (χ2 (1, N = 48) = 0.34, p = 0.56).
The number of correct responses in the various misaligned conditions was taken into account because above chance performance would indicate a general ability to perform the task. Raw scores are presented in Figure 3b. In misaligned trials, both subjects performed above chance in the conspecific and human conditions.
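One common way to ask whether a score in a two-choice task exceeds chance is a one-sided binomial test against p = 0.5. The source does not say which test was used to assess above-chance performance, so the sketch below is an assumption about a reasonable approach rather than a description of the authors' analysis. Because the misaligned raw scores appear only in Figure 3b, the example uses two control-condition scores that are reported in the text (Bb, monkey faces, 18/24; Kg, chimpanzee faces, 10/24). The function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p=0.5):
    """Probability of k or more successes in n independent trials,
    each with success probability p (one-sided exact binomial test)."""
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))


# Control scores reported in the text, each out of 24 trials.
for label, correct in [("Bb, monkey faces", 18), ("Kg, chimpanzee faces", 10)]:
    tail = binomial_upper_tail(correct, 24)
    print(f"{label}: {correct}/24, P(X >= {correct} | chance) = {tail:.4f}")
```

A small tail probability indicates that a score this high would be unlikely if the subject were guessing between the two comparison windows.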
The control conditions suggest that spider monkeys can match photographs of individuals in a 2AFC task at a level that is above chance. When the sample was a human (Bb and Kg) or a spider monkey (Bb), the number of correct responses depended on the alignment of the sample, with more correct responses in the misaligned condition. These were considered the ‘familiar’ species because the subjects had interacted with both conspecifics and humans in the past and thus had the opportunity to develop real world expertise for both species’ faces. This observation strongly suggests that spider monkeys build holistic representations of familiar face stimuli that interfere with the recognition of smaller featural components. Importantly, there was no indication of holistic interference for either subject in the unfamiliar primate condition (chimpanzee faces), the unfamiliar non-primate condition (sheep faces), or the non-face condition (stick stimuli). The implication is that holistic processing appears dependent, to some extent, on visual experience.
Performance in the misaligned trials indicates that the spider monkeys were, perhaps, unable to successfully complete the task in the unfamiliar conditions. Attention might have served as a mediating factor here; the subjects had no pre-experimental need to individuate chimpanzees, sheep or sticks stimuli which may have severely limited their capacity to attend to these stimuli. Even so, this does not detract from the observation of a composite effect in the familiar face categories.
Does it then follow that experience alone can predict the misaligned advantage in any primate system? In 2007, Dahl, Logothetis, and Hoffman concluded that holistic processing influenced the eye movements of rhesus monkeys (Macaca mulatta). During a viewing preference task, eye movements were recorded while the subjects viewed composite faces (aligned or misaligned) passively without any pre-experimental training. The assumption was that aligned composite faces would induce an illusion of novelty and result in greater ‘rebound’ (longer looking time). The subjects did, indeed, show a preference for aligned composite faces over misaligned composite faces. However, this conclusion is at odds with available human data, which suggests that the composite effect can operate independently of eye movements (de Heering, Rossion, Turati, & Simion, 2008). It is not clear from the available evidence that a series of fixations on a face reflect the integration of facial features into a global representation. In related research, Hsiao and Cottrell (2008) found that just two fixations, in the centre of the face, are sufficient for accurate face recognition. These initial fixations are more likely to support holistic processing than a long series of fixations because holistic processing is thought to act quickly and involuntarily (see Goffaux & Rossion, 2006; Tsao & Livingstone, 2008). Therefore, all the eye movements that take place in the first few seconds that follow the presentation of a face may correlate with any number of cognitive processes in addition to holistic processing.
Furthermore, while it could be argued that trained tasks may encourage nonhuman subjects to learn abnormal processing strategies, it should be noted that human composite tasks instruct subjects to adopt an abnormal approach to face stimuli (i.e. match the top half and ignore the bottom half of a face). The key observation is that they ignore the bottom half more easily in misaligned trials compared to aligned trials (regardless of the strategy human subjects use to perform this “unrealistic” task). Thus, even though verbal instructions differ dramatically from training procedures, until the relationship between eye movements and holistic processing is better understood, there will be a demand for standardisation across species.
Because the manifestation of the composite effect on the behavioural performance of humans is unequivocal, Experiment 2 was run to collect the necessary behavioural data from a group of captive-born rhesus monkeys. If rhesus monkeys process faces holistically, as the conclusions of Dahl and colleagues (2007) imply, then rhesus monkeys should build holistic representations of aligned composite stimuli, involuntarily. It follows from the results of Experiment 1 that misaligned human faces will be matched more accurately than aligned human faces because these subjects have had substantial experience with human faces. To be consistent with the human literature, and the results of Experiment 1, the composite effect should not be induced by unfamiliar primate faces (chimpanzee and gorilla faces), unfamiliar non-primate faces (sheep) and non-face objects (sticks). Moreover, if the MTS task has trained a feature-based strategy in the monkeys, such as a pixel matching strategy, then performance on all image categories should be the same and no difference between aligned and misaligned composite images is expected.
The data for Experiment 2 were collected from seven rhesus monkeys (Macaca mulatta), two males (Sm and Rk) and five females (On, Oi, Cw, Lm, and Cl). These subjects were born at the Yerkes Primate Centre field station, Lawrenceville, GA. After three to four years, the subjects were relocated to the Yerkes main station in Atlanta, GA, and pair-housed in a large room with several other pairs of rhesus monkeys. At the time they were tested in Experiment 2, these experienced cognitive subjects were all between six and seven years of age. All subjects had prior expertise matching faces in an MTS format but not in the format presented here (see Parr, Heintz & Pradhan, 2008).
The procedure for Experiment 2 was approved by the Institutional Animal Care and Use Committee of Emory University. Testing took place in a dedicated testing room and/or directly in the home cage. The training and experimental trials were presented on a 17″ flat panel touchscreen monitor (Elo secure touch, surface wave monitors, www.elotouch.com).
The details of the match-to-sample (MTS) task for Experiment 2 deviate from those described for Experiment 1. These differences were necessary because the subjects had previously acquired considerable experience responding to a particular two-forced choice MTS procedure, but there is no theoretical reason to suspect that these differences would change the experimental outcome (see Parr, Heintz, & Pradhan, 2008, for details of the procedure). For example, during Experiment 2, the presentation of the comparison stimuli was preceded by an orienting response to the sample stimulus as it appeared on any of the four walls of the computer. Approximately 500 ms after the subject had touched the sample three times in rapid succession, the two comparison stimuli appeared on the wall opposite the sample. At this time the subjects were trained to select the comparison image that matched the sample. Correct responses were rewarded with food (via an automatic feeder) and an ITI of 2 s. The ITI was longer (8 s) if the subject’s response was incorrect.
Because the subjects had already had extensive practice matching stimuli, the training phase that took place before Experiment 2 only involved composite training trials. The visual stimuli for the training trials were taken from Experiment 1. In a single training session the monkeys completed 74 trials in total. The number of sessions it took to perform above chance on the composite training trials varied for each participant (M = 11.57, SEM = 1.25).
The subjects were tested twice daily; once mid morning and again in the afternoon. Once the subjects were trained to treat the top half of the sample as the task relevant information (≥ 75% accurate), they completed two blocks of experimental trials consecutively (200 trials in total; 100 aligned and 100 misaligned). This was then repeated for every stimulus species (human faces, chimpanzee faces, gorilla faces, sheep faces and stick objects; see Figure 2). During Experiment 2, there were 10 exemplars per species. As in Experiment 1, each exemplar appeared as a target and a distractor an equal number of times. For construction of the visual stimuli see the appropriate subsection above. The 200 experimental trials that were associated with any of the five species were presented in a random order and were preceded by a set of composite training trials where the subject performed above chance.
In Experiment 2 subjects were either matching aligned or misaligned samples. A summary of overall performance is presented in Figure 4a. A 2 × 5 repeated measures ANOVA was performed on the data collected during Experiment 2 (see Figure 4b). Regardless of whether the sample was presented as an aligned or misaligned composite, there was evidence of a systematic difference across the levels of species (F4,24 = 4.49, MSE = 46.07, p < 0.01). The subjects were more accurate when matching familiar primate faces (human composite faces) compared with chimpanzee composite faces (t6 = 3.83, p < 0.01; Bonferroni adjusted). There was no evidence that human composite faces were more accurately matched than any other category (gorilla faces, t6 = 1.63, p > 0.1; sheep faces, t6 = 0.20, p > 0.1; sticks, t6 = 1.89, p > 0.1; Bonferroni adjusted). Averaged across species, there was no difference between aligned and misaligned trials (F1,6 = 0.73, MSE = 43.12, p > 0.1). Importantly, the significant interaction between species and format strongly suggested that the relationship between aligned and misaligned trials depended on the species of the sample face (F4,24 = 4.57, MSE = 23.79, p < 0.01).
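The degrees of freedom attached to each F ratio follow directly from the design: two formats, five species and seven subjects, with both factors varied within subjects. The short sketch below shows where they come from, assuming the standard two-way repeated measures partition with no sphericity correction (the source does not mention one). The function name is hypothetical.

```python
def rm_anova_dfs(n_subjects, levels_a, levels_b):
    """Degrees of freedom (effect, error) for a two-way fully
    within-subjects ANOVA, assuming no sphericity correction."""
    df_subjects = n_subjects - 1
    df_a = levels_a - 1
    df_b = levels_b - 1
    df_ab = df_a * df_b
    # Each effect is tested against its own interaction with subjects.
    return {
        "A": (df_a, df_a * df_subjects),
        "B": (df_b, df_b * df_subjects),
        "AxB": (df_ab, df_ab * df_subjects),
    }


# Factor A = format (aligned, misaligned); factor B = species (5 levels).
dfs = rm_anova_dfs(n_subjects=7, levels_a=2, levels_b=5)
for effect, (df_effect, df_error) in dfs.items():
    print(f"{effect}: F({df_effect}, {df_error})")
```

With seven subjects this gives (1, 6) for format and (4, 24) for both species and the interaction, matching the subscripts on the reported F ratios.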
To identify any significant composite effects, a series of planned t tests were run. The reported p values have been Bonferroni adjusted. The only significant advantage for misaligned trials was in the chimpanzee condition (t6 = 3.01, p < 0.05). No other comparisons were significant (human, t6 = 0.28, p > 0.1; gorilla, t6 = 0.68, p > 0.1; sheep, t6 = 1.96, p > 0.05; stick t6 = 0.38, p > 0.1). It follows that, contrary to prediction, there was no evidence of the predicted composite effect for human faces, the only species with which these subjects had had any real world experience. Instead, a composite effect was found in the chimpanzee condition. This result implies that while the rhesus monkeys experienced holistic interference it was not predicted from visual experience.
Experiment 1 provides evidence that nonhuman primates (namely spider monkeys) build holistic representations of face stimuli, particularly the species’ faces for which they have acquired real world expertise. Furthermore, the results of Experiment 2 do nothing to reject the claim that holistic representations are built outside the Hominoidea family. Overall, Experiments 1 and 2 provide evidence of the same behaviour across the primate order (apes, new world and old world) and suggest that holistic processing is a conserved processing strategy. To understand the benefits associated with being able to suppress the identity of facial features, the determinants of holistic processing need to be more closely examined.
One advantage of the present work is that the experimental task was preceded by a training phase that required the subjects to match the top half of the sample, as opposed to measuring their spontaneous response to composite stimuli (Dahl, et al., 2007; Parr, et al., 2006). Since the present training procedure rewarded the subjects for matching the top half, while the bottom half was completely irrelevant, an alternative procedure could have the distractor match the bottom half. In any event, deliberately training the monkeys to ignore the information in the bottom half of the sample is similar to the specific instructions given to human subjects (see Young, et al., 1987).
The ability to discriminate faces, in primates, appears to be dependent on visual experience and extensive practice. A convincing demonstration of the strong relationship between discrimination and experience is perceptual narrowing. Perceptual narrowing, in this context, describes how an initial ability to discriminate the faces of any primate species narrows as a result of selective exposure to human morphology. Perceptual narrowing for faces has been observed (Pascalis, et al., 2002) and reversed (Pascalis, et al., 2005) in human infants. Furthermore, other primate species have performed better during face recognition tasks with familiar primate faces than with unfamiliar primate faces (brown capuchin monkeys, Dufour, Pascalis, & Petit, 2006; chimpanzees, Martin-Malivel & Okada, 2007; cotton-top tamarins, Neiworth, Hassett, & Sylvester, 2007; Japanese monkeys, Sugita, 2008). Therefore, assuming that holistic processing is involved in the discrimination of face stimuli, visual experience may predict the composite effect.
This hypothesis was supported by the results of Experiment 1. The spider monkeys, having been housed in a zoo for a lengthy period, had acquired considerable experience with human faces. Therefore, given their experience of the visual world, it is not surprising that the spider monkeys built holistic representations of human face stimuli. In contrast, the same hypothesis was not supported by the results of Experiment 2, specifically the striking absence of a composite effect for human faces (a familiar primate). Instead, it appears that the rhesus monkeys built holistic representations of chimpanzee faces (see Figure 4b). Although it is possible that this result is a sampling error, there are alternative explanations that warrant further investigation. For example, perhaps chimpanzee faces are similar to rhesus faces, and the holistic strategy was easily transferred. Another interesting result is that the composite effect for chimpanzee faces survived poor performance. In other words, a misaligned advantage was present in a condition where overall performance was low (see Figure 4a). This result converges with the mounting evidence in the human literature that holistic processing can be decoupled from the excellent discrimination of faces, and this holds important implications for future research (see Hole, et al., 1999; Rossion & Boremanse, 2008; McKone & Crookes, 2007; Taubert, 2009).
Admittedly, the results of Experiment 2 would have been more convincing if a conspecific condition had been included; however, this does not detract from the significance of the observation that spider monkeys show a composite effect for human faces, and rhesus monkeys do not. Since visual experience cannot account for this observation, alternative explanations need to be entertained. The simplest explanation is differential development, either at the individual or species level. Individual development could account for the results of Experiments 1 and 2 if it is assumed that experience with human faces differed more between the two samples than within the two samples.
An important consideration, therefore, may be a difference in the meaningful encoding of human faces. The ‘meaningful encoding hypothesis’ assumes that faces are only recognised at the individual level if they belong to the social in-group (McKone & Crookes, 2007; also see Bernstein, Young, & Hugenberg, 2007; Hugenberg & Corneille, 2009; Shriver, Young, Hugenberg, Bernstein, & Lanter, 2008). As a result, the mere presence of human faces in the visual environment may not be sufficient for the development of holistic processing, which would require human faces to be socially relevant. This explanation may account for the inconsistencies of the past where captive populations of nonhuman primates have not always demonstrated face-like processing for human faces (see McKone & Crookes, 2007). Accordingly, an important consideration when comparing the results of Experiments 1 and 2 might be group size. The spider monkeys that were tested in Experiment 1 were living as a single bonded pair with limited experience of other conspecifics, if any. The rhesus monkeys that were tested in Experiment 2, however, were born in a large colony and were housed in a room where they had visual and auditory contact with a large number of conspecifics. Given this difference between the experimental subjects, it is impossible to rule out the role of meaningful encoding. However, meaningful encoding cannot explain why rhesus macaques experienced holistic interference in the chimpanzee face condition.
Turning to a more interesting theoretical account, the differences reported for the human face condition (Experiments 1 and 2) might be explained better by the assumption that different selection pressures, operating in the separate lineages, have changed the behavioural manifestation of holistic processing. Given that faces are salient social signals, one potentially important difference between spider and rhesus monkeys is social organisation. Spider monkeys, like chimpanzees and orangutans, organise themselves into social groups that are subject to both fission and fusion. Whereas these groups are characterised by frequent change, male philopatry and female dispersal, the features of a rhesus monkey society include stronger group cohesion and matrilineal rank-order. How these differences in social organisation would translate into pressure on face recognition is unclear, but the proposal is supported by the observation that the performance of rhesus monkeys in face recognition experiments has often been incompatible with the conclusions drawn from studies of other primate species. For example, captive rhesus monkeys, with substantial contact with humans, have previously failed to discriminate human faces in a novelty preference task (Pascalis & Bachevalier, 1998). A similar study found that cotton-top tamarins, raised in captivity, showed a novelty preference for human faces that was nearly as strong as the novelty preference for tamarin faces (Neiworth, et al., 2007). In recent research, there is evidence that rhesus monkeys do not perform well on face recognition tasks (Parr, et al., 2008) and that juveniles show a differential pattern of attention towards face stimuli compared with chimpanzees or humans (Paukner, Ferrari, & Suomi, 2008). Could differences in social complexity account for the inconsistencies we see in face recognition performance across the primate order?
Although we are still a long way from understanding the origin and development of face recognition, the current results imply that visual experience alone cannot predict the composite effect across the primate order.
This research was supported by the Yerkes Center base grant No. RR-00165 awarded by the Animal Resources Program of the National Institutes of Health and the BBE postgraduate program awarded by Macquarie University. We also thank Bruno Rossion and two anonymous reviewers for their helpful comments.