Surprisingly little is known about the eye movements of chimpanzees, despite the potential contribution of such knowledge to comparative cognition studies. Here, we present the first examination of eye tracking in chimpanzees. We recorded the eye movements of chimpanzees as they viewed naturalistic pictures containing a full-body image of a chimpanzee, a human or another mammal; results were compared with those from humans. We found a striking similarity in viewing patterns between the two species. Both chimpanzees and humans looked at the animal figures for longer than at the background and at the face region for longer than at other parts of the body. The face region was detected at first sight by both species when they were shown pictures of chimpanzees and of humans. However, the eye movements of chimpanzees also exhibited distinct differences from those of humans; the former shifted the fixation location more quickly and more broadly than the latter. In addition, the average duration of fixation on the face region was shorter in chimpanzees than in humans. Overall, our results clearly demonstrate the eye-movement strategies common to the two primate species and also suggest several notable differences manifested during the observation of pictures of scenes and body forms.
Eye tracking enables the direct assessment of eye movements and has been an important method in studies of human and non-human primates, particularly monkey species. Eye movements can potentially reveal a variety of cognitive and emotional processes, from visual-spatial attention to social information processing and motivational change. Several comparative cognitive eye-tracking studies have revealed similarities in the eye movements of monkeys and humans (Keating & Keating 1982; Nahm et al. 1997; Gothard et al. 2004). However, these studies only indirectly compared the eye movements of these species. Very little research has made direct comparisons under similar conditions (but see Kobayashi & Kohshima 2001). Thus, little is currently known about the similarities and differences in eye-movement patterns of non-human and human primates. Furthermore, despite the critical importance of ape species in comparative cognitive science (Matsuzawa et al. 2006), no studies have compared eye-tracking between apes and humans. In the light of these issues, we studied eye-tracking in humans and our most closely related species, the chimpanzee (Pan troglodytes).
Early studies of eye tracking have noted that human gaze control is highly regular; viewers tend to concentrate their fixations on semantically informative regions when shown pictures of scenes or faces (Buswell 1935; Yarbus 1967). For example, when viewing a picture containing human figures, humans were more likely to fixate on human figures compared with the background, on faces compared with other parts of the body, and on the eyes compared with other facial features. Subsequent studies have repeatedly demonstrated that the human gaze is drawn not only to visually salient regions in terms of colour, contrast and edge orientation, but also to interesting or meaningful regions based on the viewer's knowledge (Henderson & Hollingworth 1999a). This active, top–down guidance of picture viewing has also been observed in monkeys and chimpanzees. In eye-tracking studies, monkeys intensively fixated on the eye region when viewing images of the faces of conspecifics (Keating & Keating 1982; Nahm et al. 1997; Gothard et al. 2004), as has been observed in humans. In other studies, chimpanzees allowed to choose various pictures of scenes by pressing buttons or levers preferentially selected pictures containing human figures (Fujita & Matsuzawa 1986; Tanaka 2003). In a study using the same procedure, monkeys exhibited visual preferences for pictures containing the face region compared with those featuring various other body parts, suggesting the importance of the face in body viewing (Fujita 1993; see also Tomonaga 1994). These observations suggest similarities in viewing behaviour among monkeys, apes and humans when looking at pictures containing scenes, bodies and faces. However, information on both similarities and differences in viewing patterns of primates is important more specifically to delineate evolutionary elements of gaze behaviour. Therefore, we made a direct, quantitative comparison of eye movements between two primate species using eye-tracking methodology.
The purpose of our study was to expand on the above-mentioned research by applying a comparative eye-tracking method to chimpanzees and humans. Of particular interest was the direct comparison of scene viewing by the two species. In our experiment, both chimpanzee and human participants were tested using the same experimental conditions and were presented with identical pictures portraying chimpanzees, humans and non-primate mammals. To promote natural gaze behaviour, the pictures consisted of complex scenes, i.e. full-body images of animals against naturalistic backgrounds. We predicted that, in accordance with previous human studies, both chimpanzees and humans would look at the informative regions disproportionately longer compared with other features of the pictures; specifically, they would look at the animal figures longer than at the background and at the face regions longer than at other parts of the body. A second goal of our study was to observe how the two species scan images of the body. The face was of particular interest, because this region is highly informative for these social primates. Studies of non-verbal communication suggest that the face conveys a wealth of information, such as identity, sex and emotion, in both species (Argyle 1967; Van Hooff 1967).
In addition to the traditional index of looking behaviour and looking duration, we also examined the sequence of fixations as participants viewed the pictures. In human eye tracking, the face image within a scene is detected immediately after picture onset, suggesting rapid processing of person information (Fletcher-Watson et al. 2008). The same phenomenon was confirmed in a visual search task (Hershler & Hochstein 2005). We replicated this visual search task in a related study and observed the rapid detection of the faces of conspecifics in chimpanzees (Tomonaga in press). In this related study, the three adult chimpanzees showed quicker response time to faces than to artificial object targets. The present study addressed the same issue using an eye-tracking paradigm under free-viewing conditions. We predicted that first fixations were more likely to be located on the face than on other features of the pictures. We also examined the effect of prior visual experiences on face detection. The chimpanzee and human participants had less (or no) experience of observing non-primate mammals than of viewing their own or closely related species. The goal of this comparison was to determine how patterns of fixation on faces when viewing animals with familiar morphology are generalized when viewing animals with unfamiliar morphology.
Six chimpanzees (P. troglodytes; one male juvenile, two female juveniles and three female adults) and 21 humans (five males, 16 females; all adults) participated in this study. The chimpanzees live within a social group of 14 individuals in an environmentally enriched outdoor compound and an attached indoor residence (Matsuzawa 2006). The chimpanzees have been familiar with humans since birth. The care and use of the chimpanzees adhered to the 2002 version of the Guidelines for the Care and Use of Laboratory Primates by the Primate Research Institute, Kyoto University. The experimental protocol was approved by the Animal Welfare and Care Committee of the same institute. The human participants were graduate and undergraduate students (all Japanese), who participated in the experiment voluntarily. Eleven of the 21 humans had extensive experience with observations of chimpanzees.
The chimpanzee and human participants used exactly the same apparatus to allow for direct comparisons. A corneal reflection technique was used to measure eye movements. We used a table-mounted eye tracker (60Hz; Tobii X120, Tobii Technology AB) with a wide-angle lens (±40° in the semicircle above camera), which allowed for relatively large head movements of participants during tracking and also obviated the necessity to restrain them (figure 1). All participants sat in an experimental booth (2.5m wide×2.5m deep×2.1m high), with the experimenter and the participants separated by transparent acrylic panels. The participants viewed a 17-inch LCD display (1280×1024 pixels), and the eye tracker captured their eye movements through the acrylic panels. The eye tracker and the display were mounted on a movable platform (0.6×0.6×0.4m). The distance between the platform and the participants was adjusted to the point at which the gaze was most accurately recorded (approx. 60cm). One degree of gaze angle corresponded to approximately 1cm on the screen at a 60cm viewing distance.
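The correspondence between gaze angle and on-screen distance stated above follows from simple trigonometry. The following is a minimal sketch of that conversion, not part of the original analysis; the function names are hypothetical and the flat-screen geometry (gaze centred and perpendicular to the display) is an assumption.

```python
import math


def visual_angle_to_cm(angle_deg: float, distance_cm: float) -> float:
    """On-screen extent (cm) subtended by a visual angle at a viewing distance.

    Assumes the extent is centred on the line of sight and the screen is
    perpendicular to it (a simplification).
    """
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def cm_to_visual_angle(size_cm: float, distance_cm: float) -> float:
    """Inverse conversion: visual angle (degrees) subtended by an extent in cm."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


# One degree at the approximate 60 cm viewing distance reported in the text.
one_degree_cm = visual_angle_to_cm(1.0, 60.0)
print(f"{one_degree_cm:.2f} cm")  # prints "1.05 cm"
```

Under these assumptions, one degree at 60cm is close to 1cm, in line with the approximation given in the text.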
Although the eye morphologies of the two species are somewhat different (Kobayashi & Kohshima 2001), the accuracy of gaze data was quite similar. In preliminary recordings, we confirmed that the average error when viewing the screen (the distance between measured and intended gaze points) was less than 0.5° in both species. Although the eye tracker sometimes lost the participants' eyes because of postural changes or eye blinks, these values were similar for both species (chimpanzees: 5.9±1.3%; humans: 7.1±1.5% in 3s presentations (±s.e.m.)). Thus, no special corrections of the raw tracking data were necessary in this study.
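As an illustration of how a tracking-loss figure of this kind can be derived, the sketch below computes the percentage of lost samples in one presentation and a mean with its standard error across participants. It is a hedged example only: the function names, the per-sample validity flags and the example numbers are assumptions for illustration, not the study's data or software.

```python
import math
import statistics
from typing import Sequence, Tuple


def loss_percent(valid_flags: Sequence[bool]) -> float:
    """Percentage of samples in which the tracker lost the eyes."""
    lost = sum(1 for ok in valid_flags if not ok)
    return 100.0 * lost / len(valid_flags)


def mean_sem(values: Sequence[float]) -> Tuple[float, float]:
    """Mean and standard error of the mean (sample s.d. divided by sqrt(n))."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Illustrative only: a 3 s presentation at 60 Hz yields 180 samples;
# here 9 of them are marked as lost.
flags = [True] * 171 + [False] * 9
print(loss_percent(flags))  # prints 5.0

# Illustrative per-participant loss percentages (not the study's values).
print(mean_sem([4.0, 6.0, 8.0]))
```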
Non-primate mammals (hereafter, ‘mammal’), chimpanzees and humans were used as picture models. The pictures consisted of single, foregrounded, full-body images of an individual with a naturalistic, complex background (figure 2 for examples of the pictures shown). We prepared 24 pictures of mammals representing 19 species that are typically seen at zoos (e.g. bear, rhinoceros, giraffe and elephant). We also prepared 24 chimpanzee pictures and 24 human pictures of 12 individuals each (six each of individuals familiar and unfamiliar to all chimpanzee participants and to 11 out of 21 human participants). The pictures were manipulated so that the variance of the size of animal models was small. Pictures were then converted to 1000×800 pixels with surrounding grey frames (1280×1024 pixels in total).
Before presenting the pictures, we conducted habituation training of the participants and calibration of the eye tracker to obtain accurate gaze recordings. Both species followed nearly the same procedure with some exceptions. Below, the procedure for chimpanzees is described first.
In the initial training, chimpanzees were required to face the screen so that their corneal reflections in both eyes were captured by the eye tracker. The trial succeeded if they maintained this posture for 1.5s. After being well habituated to this procedure, a two-point calibration was conducted for each chimpanzee. To attract their attention, a small video clip (200×200 pixels) was presented at each calibration point. Once successfully calibrated, the chimpanzees were required to look at the fixation point (a red square), which appeared at a random position on the screen. The purpose of this training was to evaluate calibration accuracy. The trial succeeded if the participant fixated on the fixation spot for at least 1s within a distance of 3°. The calibration was repeated if it was not initially accurate enough. The chimpanzees were rewarded with a piece of apple for successful trials. The performance criterion to advance to the next training/calibration phase was set to 90 per cent of trials. All chimpanzees finished the entire procedure within two weeks of daily sessions of 10–15min each.
Each trial was initiated when the participant looked at a fixation point that appeared at a random position on the screen. If the chimpanzees held the fixation position for 250ms, a picture was presented. Once the picture appeared, they could freely move their eyes to look at the picture. Each picture was presented for 3s. The chimpanzees were rewarded after the presentation regardless of their viewing behaviours. The purpose of this reward was to maintain the chimpanzees' motivation to participate in the testing. The entire testing of chimpanzees was conducted over a 10-day period. A daily session lasted for 5–10min with presentations of only seven or eight pictures, to keep their interest. These pictures were randomly drawn from each stimulus group. Each chimpanzee viewed 72 pictures in total. Before testing on each day, we evaluated the calibration record of the individual on five fixation spots and conducted a recalibration if necessary.
The procedural differences between species were as follows: (i) no reward was given to humans during the experiment, (ii) humans were verbally instructed to view the pictures as they normally would, and (iii) all training and testing in humans were conducted within a single day and lasted for 20–30min.
We divided each picture into several features (areas of interest; AOI) to quantitatively analyse participants' viewing patterns. Two scene features were defined: background and entire body. For more detailed analysis, entire body was divided into face and other parts of the body. The other parts of the body region was further divided into three regions: torso, arms and legs. For mammal pictures, the arms and legs regions were combined into limbs. Each feature was mapped on a diagram within a frame (figure 3 for examples). To avoid errors in gaze estimation, the frame was drawn slightly larger than the actual outline (approx. 20 pixels on the edge). The recorded eye-tracking samples were added to the AOI region if they were within the frame. If two or more AOIs overlapped, the samples were added to the uppermost AOI.
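The assignment rule described above (an enlarged frame around each feature, with the uppermost AOI taking samples where frames overlap) can be sketched as follows. This is a simplified illustration, not the software used in the study: the AOIs are approximated by rectangles rather than drawn outlines, the class and function names are hypothetical, and any sample falling outside all body AOIs is simply labelled as background.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class RectAOI:
    """Rectangular stand-in for a drawn AOI outline (pixel coordinates)."""
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float, margin: float = 20.0) -> bool:
        # The frame is enlarged by ~20 px on each edge to absorb gaze error.
        return (self.left - margin <= x <= self.right + margin
                and self.top - margin <= y <= self.bottom + margin)


def assign_aoi(x: float, y: float, aois_top_first: Sequence[RectAOI],
               default: str = "background") -> str:
    """Return the name of the uppermost AOI containing the gaze sample.

    `aois_top_first` must be ordered from the uppermost layer downwards, so
    that overlapping frames resolve to the upper AOI.
    """
    for aoi in aois_top_first:
        if aoi.contains(x, y):
            return aoi.name
    return default


# Hypothetical layout: the face frame lies above the torso frame.
aois = [
    RectAOI("face", 400, 100, 500, 200),
    RectAOI("torso", 380, 190, 520, 450),
]
print(assign_aoi(450, 195, aois))  # prints "face" (overlap resolved upwards)
print(assign_aoi(450, 300, aois))  # prints "torso"
print(assign_aoi(50, 50, aois))    # prints "background"
```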
We used five dependent variables indicating attention: gaze time (the sum of fixation durations); proportion of fixations (the proportion of pictures in which any AOI was the target of the fixation); number of fixations; fixation duration; and distance between fixations (saccade size, i.e. the size of the rapid eye movement that shifts the gaze from one fixation to another). A fixation was scored if the gaze remained stationary (within a radius of 50 pixels) for at least 75ms (more than five measurement samples). Otherwise, the recorded sample was defined as part of a saccade. In order to limit analysis to the visual information actually available to the participants, we excluded the samples recorded during the first 200ms, thereby eliminating fixations that followed the offset of the fixation spot. We also excluded samples recorded during saccades.
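To make the fixation criteria concrete, the sketch below implements one possible dispersion-based reading of them: a run of samples counts as a fixation if every sample lies within 50 pixels of the run's centroid and the run lasts at least 75ms, after discarding the first 200ms. The exact algorithm used in the study is not specified in the text, so this is an assumption, as are the function names, the treatment of 75ms as five samples at 60Hz, and the requirement that all samples are valid (no tracking loss).

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple

RADIUS_PX = 50.0     # stationarity radius from the text
MIN_FIX_MS = 75.0    # minimum fixation duration from the text
SKIP_MS = 200.0      # initial period excluded from analysis


@dataclass
class Fixation:
    start_ms: float
    duration_ms: float
    x: float
    y: float


def detect_fixations(samples: Sequence[Tuple[float, float]],
                     hz: int = 60) -> List[Fixation]:
    """Group gaze samples (x, y in pixels) into fixations; the rest are saccades."""
    skip = round(SKIP_MS * hz / 1000)            # 12 samples at 60 Hz
    min_n = math.ceil(MIN_FIX_MS * hz / 1000)    # 5 samples at 60 Hz
    fixations: List[Fixation] = []
    i, n = skip, len(samples)
    while i < n:
        cx, cy = samples[i]
        j = i + 1
        # Grow the window while all samples stay within the radius of its centroid.
        while j < n:
            window = samples[i:j + 1]
            nx = sum(p[0] for p in window) / len(window)
            ny = sum(p[1] for p in window) / len(window)
            if all(math.hypot(x - nx, y - ny) <= RADIUS_PX for x, y in window):
                cx, cy = nx, ny
                j += 1
            else:
                break
        count = j - i
        if count >= min_n:
            fixations.append(Fixation(i * 1000 / hz, count * 1000 / hz, cx, cy))
            i = j
        else:
            i += 1  # too brief to be a fixation: treat the sample as saccadic
    return fixations


def gaze_time_ms(fixations: Sequence[Fixation]) -> float:
    """Gaze time: the sum of fixation durations."""
    return sum(f.duration_ms for f in fixations)


def saccade_sizes_px(fixations: Sequence[Fixation]) -> List[float]:
    """Distance between successive fixations, in pixels."""
    return [math.hypot(b.x - a.x, b.y - a.y)
            for a, b in zip(fixations, fixations[1:])]


# Synthetic trace (illustrative only): 200 ms discarded, two fixations
# separated by a two-sample saccade.
trace = [(0.0, 0.0)] * 12 + [(100.0, 100.0)] * 10 \
    + [(300.0, 300.0)] * 2 + [(600.0, 400.0)] * 8
fix = detect_fixations(trace)
print(len(fix), round(gaze_time_ms(fix)), saccade_sizes_px(fix))
```

In this synthetic trace the two brief intermediate samples are classed as saccadic and contribute to neither gaze time nor fixation count, mirroring the exclusion of saccade samples described above.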
For statistical analyses, we distinguished between within-species and between-species comparisons. For within-species comparisons, we tested for differences in viewing patterns over scene/body features within each species. For between-species comparisons, we tested for interactions between species and scene/body features in viewing patterns. These comparisons were tested using repeated-measures analysis of variance (ANOVA) or paired t-tests in SPSS 13.0. Greenhouse–Geisser's epsilon was used for conservative adjustments to the degrees of freedom when sphericity did not hold. Post hoc comparisons were conducted using Dunnett tests for within-species comparisons and t-tests with Bonferroni correction for between-species comparisons. Alpha was set at p<0.05 for all analyses, although Bonferroni correction adjusted the alpha level for the number of between-species comparisons of scene/body features.
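The Bonferroni adjustment mentioned above divides the nominal alpha by the number of comparisons. A minimal sketch follows; the function names are hypothetical and the p-values are illustrative placeholders, not results from this study (the analyses themselves were run in SPSS).

```python
from typing import Dict


def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Per-comparison alpha after Bonferroni correction."""
    return alpha / n_comparisons


def significant(p_values: Dict[str, float], alpha: float = 0.05) -> Dict[str, bool]:
    """Flag each comparison as significant at the Bonferroni-adjusted level."""
    threshold = bonferroni_alpha(alpha, len(p_values))
    return {name: p < threshold for name, p in p_values.items()}


# Illustrative p-values for three scene/body features (placeholders only).
example = {"face": 0.001, "other parts of the body": 0.02, "background": 0.2}
print(round(bonferroni_alpha(0.05, len(example)), 4))  # prints 0.0167
print(significant(example))
```

With three comparisons the per-test threshold falls to about 0.0167, so a p-value of 0.02 that would pass at the nominal 0.05 level is no longer counted as significant.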
Typical scan paths of chimpanzees and humans illustrate that the two species exhibited similar but distinctive patterns of eye movement (figure 2). Eye-movement patterns were quantified by dividing each picture into several scene/body features, as follows.
Consistent with our prediction, both chimpanzee and human participants looked at the entire body for longer than at the background region when viewing pictures of mammals (t5=6.19, p=0.002; t20=23.1, p<0.001, respectively), chimpanzees (t5=4.51, p=0.006; t20=20.5, p<0.001, respectively) and humans (t5=1.82, p=0.128 (n.s.; see next paragraph for this exception); t20=9.30, p<0.001, respectively; figure 3). Focusing on the entire body, we confirmed a non-random pattern in gaze time on each body feature (face, torso, etc.) by both chimpanzee and human participants when viewing pictures of mammals (F2,10=51.2, p=0.003; F1.2,24=733, p<0.001, respectively), chimpanzees (F2,10=33.6, p<0.001; F1.4,28=584, p<0.001, respectively) and humans (F2,10=13.5, p=0.014; F1.1,22=153, p<0.001, respectively; figure 3). Post hoc tests (Dunnett tests with the face region as a control) revealed that the face was the most extensively inspected region among all body features in both species, which was consistent with our predictions.
The sum of gaze times was not equal in chimpanzees and humans for two reasons. First, chimpanzees made more saccades than humans (see below), and the gaze time did not include these saccades (§2). Second, chimpanzees sometimes glanced outside of the pictures (120ms on average), while humans rarely did so. As an interesting exception, one chimpanzee participant looked at the entire body for a shorter period than at the background region when viewing pictures of humans (707 versus 1649ms), which was opposite to the tendency of the other participants. The other five chimpanzees viewed the entire body for longer than the background (1585±123 versus 562±120ms (mean±s.e.m.); t4=4.26; p=0.013). To examine the effect of model familiarity on gaze time in chimpanzee/human pictures, we tested the interactions between model familiarity and scene features (background, face and other parts of the body) for each species. However, no significant interaction was found either in chimpanzee or human participants (n=6, n=11, respectively; see ‘Stimuli’ for details) when viewing pictures of chimpanzees (F2,10=1.86, p=0.20; F2,20=1.93, p=0.17, respectively) and humans (F2,10=0.79, p=0.48; F2,20=0.28, p=0.75, respectively). Thus, this factor was not further considered in this study.
In the between-species comparisons, a significant interaction was found between species and scene features (entire body and background) when participants were shown pictures of mammals (F1.1,29=19.8, p<0.001), chimpanzees (F1.0,27=9.29, p=0.004) and humans (F1.0,25=5.22, p=0.03; figure 3). Similarly, the interactions between species and body feature (face, torso, etc.) were also significant when subjects were shown pictures of mammals (F1.2,31=13.1, p<0.001), chimpanzees (F1.1,29=31.8, p<0.001) and humans (F1.5,39=10.8, p<0.001; figure 3). Post hoc tests (t-tests with Bonferroni correction) revealed that human participants looked at the face region for longer compared with chimpanzees when viewing any pictures. On the other hand, chimpanzees looked at some parts of the body for longer compared with humans, e.g. the torso and legs in pictures of chimpanzees and the arms and legs in pictures of humans. Although humans looked at the entire body region for longer compared with chimpanzees, this difference can be attributed to their longer viewing of the face region (figure 3).
Since we were primarily interested in general differences in these variables between species, these analyses only involve between-species comparison. To avoid redundancy, all pictures were pooled, and torso, arms and legs (or limbs) regions were combined into other parts of the body. In general, chimpanzees exhibited a greater number of fixations (F1,25=278, p<0.001), shorter fixation durations (F1,25=289, p<0.001) and longer distances between fixations (t25=6.54, p<0.001) compared with humans (figure 4). These results indicate that chimpanzees shifted their fixation location more quickly (i.e. showed more saccades) and more broadly than did humans. There were significant interactions between species and scene/body features in the number of fixations (F1.3,34=7.34, p=0.006) and fixation duration (F1.1,28=17.0, p<0.001; figure 4). Post hoc tests (t-tests with Bonferroni correction) confirmed that, compared with humans, chimpanzees exhibited a greater number of fixations on other parts of the body (t25=5.84, p<0.001) and on the background region (t25=3.15, p=0.004), but not on the face (t25=1.32, p=0.198; figure 4a). In addition, chimpanzees exhibited shorter fixation durations on the face (t25=4.52, p<0.001) and on other parts of the body (t25=2.57, p=0.016), but not on the background (t25=1.92, p=0.065; figure 4b) compared with humans.
We confirmed a non-random pattern in the first fixation on each scene/body feature by both chimpanzee and human participants when shown pictures of mammals (F3,15=19.1, p<0.001; F1.1,29=92.6, p<0.001, respectively), chimpanzees (F1.5,7.7=14.0, p=0.004; F1.2,25=292, p<0.001, respectively) or humans (F1.2,6.1=11.7, p=0.012; F1.0,21=89.2, p<0.001, respectively; figure 5a). Post hoc comparisons (Dunnett tests with face region as a control) revealed that both species were most likely to attend to the face region during the first fixation when shown pictures of chimpanzees and humans (figure 5a). Interestingly, we observed differential first fixation patterns between species when they were shown pictures of mammals. The chimpanzees were less likely than humans to locate their first fixation on the face when shown pictures of mammals (t-tests with Bonferroni correction; t25=3.04, p=0.005; figure 5a).
The high frequency of first fixations on the face was further examined by comparisons with subsequent fixations on this region (figure 5b). Both chimpanzees and humans differentially fixated on the face across time when viewing pictures of mammals (F4,20=10.7, p<0.001; F4,80=56.3, p<0.001, respectively), chimpanzees (F1.3,6.6=6.10, p=0.002; F4,80=39.0, p<0.001, respectively) and humans (F4,20=10.6, p<0.001; F2.4,47=46.5, p<0.001, respectively; figure 5b). Post hoc comparisons (Dunnett tests with first fixation as a control) confirmed that the peak proportions of fixations on the face occurred with the same timing in both species: during second fixations for mammal pictures and during first fixations for chimpanzee and human pictures.
This study provides the first eye-tracking data for chimpanzees viewing pictures. In comparing results for chimpanzees with those for humans, we discovered striking similarities in the gaze patterns of the two species, especially in terms of non-random patterns of scene viewing and the rapid detection of the face within the scene. We also observed several interesting differences between the species. Chimpanzees exhibited more rapid and broader shifts of the fixation location as well as shorter fixation durations in regard to the face region.
When shown pictures, the chimpanzees focused their fixations on informative regions, such as the body or face, in a manner very similar to humans (figures 3 and 4), suggesting their active, voluntary control of gaze. These results were consistent with classic studies of humans (Buswell 1935; Yarbus 1967) and extend the observed paradigm to another species, the chimpanzee. Furthermore, both species repeatedly fixated on these informative regions, rather than scanning the entire area of the picture (figures 3 and 5). In humans, this type of repetitive looking is known to indicate gaze control based on semantic informativeness of the region (Henderson & Hollingworth 1999a).
On the other hand, the two species also exhibited differences in their viewing patterns. The general pattern of chimpanzee eye movements was characterized by fixations that were greater in number, shorter in duration and more widely spread in space compared with those of humans (figure 4). Two possible explanations for these interspecific differences involve the control and the function of eye movements. In terms of the mechanisms controlling eye movements, these differences may be influenced by the difference in high-level cognitive functioning between species. For example, the weaker inhibitory control of saccades in chimpanzees may have led to more frequent saccades and shorter fixations, similar to those observed in human infants and in some clinical patients (Hainline et al. 1984; Karatekin 2007). In addition, the shallower extraction of information at each fixation point may have resulted in shorter fixation durations in chimpanzees, because fixations tend to remain in semantically informative regions longer, as has been shown in studies of humans (Henderson & Hollingworth 1999a). With regard to the functions of eye movements in the natural habitat, chimpanzees and humans may possess different strategies for processing scene information. Given that in humans, close or direct fixation is typically necessary to identify objects in scenes and to perceive their visual details (Henderson & Hollingworth 1999b), the more frequent and wider rotation of eyes that characterizes chimpanzees may represent a strategy for more quickly and widely retaining scene information.
The face region was most intensively viewed by both species compared with other parts of the body (figure 3). This suggests that chimpanzees and humans share a common strategy of body viewing that is influenced by the informative quality of the face. Notably, longer durations of looking at the face occurred even when participants viewed pictures of mammals with various body/face morphologies, highlighting the generality of visual preferences for the face by both species. However, humans viewed the face for longer compared to chimpanzees. This difference was attributed to longer fixation durations on the face in humans, because the number of fixations on this region did not differ between the species (figure 4). The longer duration of fixations on the face in humans was explained in part by humans' longer fixation durations on the entire picture; however, there was a significant interaction between species and each feature of the pictures (figure 4). In social interactions, long fixation on a face is perhaps a more intense signal of threat in non-human primates than in humans (Argyle 1967; Thomsen 1974; Mendelson et al. 1982). Thus, the observed pattern in chimpanzees—repetitive and brief looks at the face—may be explained by a higher likelihood of avoiding direct gaze contact. In addition, interestingly, chimpanzees looked at some parts of the body longer than humans did when viewing pictures of chimpanzees and humans (figure 3). Furthermore, chimpanzees exhibited a larger number of fixations on other parts of the body compared with humans (figure 4). Together, these observations may also indicate that humans retrieve animal information primarily from the face, whereas chimpanzees gain relatively more information from the entire body.
The face region was detected at first sight by both species when they were shown pictures of chimpanzees and of humans (figure 5a,b). This result echoes previous findings in humans (Hershler & Hochstein 2005; Fletcher-Watson et al. 2008) and chimpanzees (Tomonaga in press), and indicates that rapid processing of complex socially relevant visual information also occurs in chimpanzees. By contrast, when shown pictures of mammals, the face was attended to at second rather than at first fixation by both species (figure 5b), perhaps because both chimpanzees and humans responded less sensitively to the relatively unfamiliar body/face morphology of other mammals compared with the familiar morphology of their own or closely related species. Interestingly, when shown pictures of mammals, chimpanzees were less likely than humans to locate their first fixations on the face (figure 5a). Because these chimpanzees have fewer opportunities than humans for observing non-primate mammals, these results further underscore the effects of prior experience on the rapid detection of faces. Our findings may be related to the ‘expertise effect’ for face detection, which has previously been found in humans (Hershler & Hochstein 2005).
Further studies are necessary to address the hypotheses generated from our experiments. This study successfully provided a direct, quantifiable measurement of the eye movements of chimpanzees within a comparative cognitive context. What we could learn using this paradigm is promising: it guides an understanding of what details are informative and attractive for chimpanzees in their environments, as well as their serial processing of information; in other words, where and when their saccades go. Also, by presenting visual stimuli with other sensory cues such as their vocalizations, this paradigm can assess how chimpanzees react to stimuli based on their preconception or motivation, which is analogous to task-dependent gaze behaviour in humans (Yarbus 1967). In conjunction with recently developed non-invasive event-related potential (ERP) measurement in chimpanzees (Ueno et al. 2008), it may open ways to examine when and how information is processed in the chimpanzee brain. Even for studies of more elusive concepts, such as theory of mind, the eye-tracking paradigm may provide a more objective measurement by complementing the traditional looking time paradigm or using, for example, an anticipatory looking paradigm (Southgate et al. 2007).
This research was financially supported by the Japan Society for the Promotion of Science (JSPS) and the Ministry of Education, Culture, Sports, Science and Technology (MEXT) of Japan Grants-in-Aid for Scientific Research (nos. 16002001, 19300091, 20002001) and JSPS/MEXT global COE programmes (D07 and A06). We thank Dr T. Matsuzawa, Dr M. Tanaka, Dr I. Adachi, Mr C. Martin and the staff of the Language and Intelligence Section for their help and invaluable comments. The Higashiyama Zoo and Chimpanzee Sanctuary Uto kindly allowed us to use photographs of their animals. We also thank the Center for Human Evolution Modeling Research at the Primate Research Institute for daily care of the chimpanzees.