The own-race bias (ORB) is a well-known finding wherein people are better able to recognize and discriminate own-race faces, relative to cross-race faces. In 2 experiments, participants viewed Asian and Caucasian faces, in preparation for recognition memory tests, while their eye movements and pupil diameters were continuously monitored. In Experiment 1 (with Caucasian participants), systematic differences emerged in both measures as a function of depicted race: While encoding cross-race faces, participants made fewer (and longer) fixations, they preferentially attended to different sets of features, and their pupils were more dilated, all relative to own-race faces. Also, in both measures, a pattern emerged wherein some participants reduced their apparent encoding effort to cross-race faces over trials. In Experiment 2 (with Asian participants), the authors observed the same patterns, although the ORB favored the opposite set of faces. Taken together, the results suggest that the ORB appears during initial perceptual encoding. Relative to own-race face encoding, cross-race encoding requires greater effort, which may reduce vigilance in some participants.
It is well documented that people are better able to distinguish among own-race faces, relative to faces from different, less familiar races (e.g., Meissner & Brigham, 2001; Slone, Brigham, & Meissner, 2000). This empirical finding in face learning and recognition, the own-race bias (ORB; also called the own-race effect, cross-race effect, and outgroup homogeneity effect) is observed across various tasks and is reliable across cultural and racial groups (Ng & Lindsay, 1994; Teitelbaum & Geiselman, 1997). Theoretical accounts of the ORB have posited both social and cognitive mechanisms. Early explanations, such as effects of social attitudes or magnitudes of physiognomic differences across races, do not adequately explain the ORB (Meissner & Brigham, 2001). For example, studies have consistently failed to find a relationship between racial attitudes and memory for cross-race faces (Slone et al., 2000; Swope, 1994). With respect to potential physiognomic differences, Goldstein (1979) found no differences in the magnitudes of physiognomic variability among Japanese, Black, and Caucasian faces.
Various hypotheses have addressed the potential influence of interracial contact as an engine of perceptual learning (Meissner & Brigham, 2001; Sporer, 2001). Multidimensional scaling (MDS) or “face space” models (Byatt & Rhodes, 2004; Valentine, 1991; Valentine & Endo, 1992) all share common mechanisms suggested by Gibson (1969), who proposed that developing perceptual expertise in any domain involves “an increase in the ability to extract information from the environment, as a result of practice” (p. 3). In the MDS framework, faces are stored in a hypothetical perceptual-cognitive space, with dimensions defined by values of features or learned configurations of features (Rhodes, Brake, Taylor, & Tan, 1989). The dispersion of exemplars in face space reflects prior experience: Perceptual learning optimizes the “attention weights” for different features (Nosofsky, 1986), allowing an observer to better appreciate subtle differences across faces. In MDS terms, greater experience with own-race faces will reduce the spatial density of their representations. In contrast, cross-race faces are more tightly grouped in space, owing to less optimal featural or configural encoding. In the present investigation, we directly assessed whether the ORB might reflect differences in immediate perceptual encoding of faces. We did this by examining eye movements and changes in pupil dilation during own-race and cross-race face learning.
Theoretical accounts of the ORB typically focus on either race-based differences in perceptual encoding or differences in face storage and retrieval. Levin (1996, 2000; MacLin & Malpass, 2001) proposed an explanation for the ORB wherein race is processed as a basic visual feature (cf. Treisman & Gelade, 1980); cross-race faces are processed at a “shallow” level, with more categorical (and less individuating) information. Conversely, own-race face processing is hypothesized to rely more on precise featural information (Anthony, Copper, & Mullen, 1992; Ostrom, Carpenter, Sedikides, & Li, 1993; Valentine & Bruce, 1986). Along similar lines, Sporer (2001) proposed an in-group/out-group model, suggesting that perception of own-race faces automatically initiates a deep-encoding process. This process guides attention toward any facial features that help distinguish the target from similar faces in memory. In contrast, in cross-race face perception, race categorization precedes any deeper processing. For a variety of reasons (e.g., economy of effort, social biases), this automatic categorization may signal that little effort should be dedicated to individuation.
Although the preceding accounts differ in key regards, they are difficult to discriminate from one another empirically, particularly when testing recognition memory. Although it is well known that cross-race faces elicit poor recognition (usually with inflated false alarms), such findings are not particularly revealing with respect to processing. For example, do such recognition differences arise during perceptual encoding (as the in-group/out-group model would predict), or do they arise in storage and retrieval from long-term memory (as MDS would predict)? If cross-race faces all look alike to viewers (Sporer, 2001), will people make a special effort to encode them? Or, alternatively, might they adopt the less strenuous strategy of shallow race encoding (Levin, 1996)? In the present research, we replicated the classic ORB in a discrete recognition-memory procedure: Participants viewed a series of Asian and Caucasian faces, under instructions to memorize them for a later test. While participants were viewing the faces, we continuously monitored their eye movements and changes in pupil dilation. Both measures were considered indices of mental effort during visual information-gathering: By examining them in tandem, we directly assessed differences in perceptual learning for own-race and cross-race faces. To our knowledge, no prior study has simultaneously assessed these indicators of face learning, nor have they been applied in tandem to the ORB (although Blais, Jack, Scheepers, Fiset, and Caldara, 2008, recently examined the ORB using eye movements).
From any theoretical perspective, the selection of information during face encoding should be an effective predictor of later recognition accuracy. Tracking eye movements is an effective means to study information intake, as eye movements to facial features provide an index to the allocation of visual attention (Findlay & Gilchrist, 2003; Yarbus, 1965). Henderson, Williams, and Falk (2005) suggested that eye movements may affect face learning in at least two ways. First, fixations on (or near) specific features help encode those features and their interrelations. Second, the lengths of saccades among features may provide direct information about their relative distances, such that eye movements may themselves be encoded as face-specific information. In either case, given the powerful ORB in recognition memory, it is important to establish whether patterns of information selection differ across own-race and cross-race faces. To evaluate this hypothesis, we measured the frequency and durations of fixations to various areas of interest (AOIs) in faces (Williams & Henderson, 2007), and we calculated several global indices, such as saccade lengths and number of regressions. We then tested whether these visual behaviors differed when participants studied own-race versus cross-race faces. As described later, all eye movement measures told similar stories. As such, we primarily focused on fixations to different AOIs and on a cumulative measure of “distance traveled” by the eyes during encoding.
Analyses of eye movements were conducted in three stages. First, we simply compared indices across sets of Asian and Caucasian faces, averaging all participants together. We also compared eye movement patterns across encoding trials that would lead to eventual hits and misses during recognition. To foreshadow, we found reliable differences in fixation patterns across races and differences as a function of later accuracy. In our second stage of analysis, we selected subgroups of participants and trials, allowing direct comparison. Specifically, we identified participants who were relatively good or poor at cross-race face recognition, and we examined their eye movements to photographs that eventually led to hits or misses. These analyses showed that, irrespective of accuracy in any given trial, better memorizers expended greater effort (i.e., made more extensive eye movements) during encoding. In our third stage of analysis, we again compared relatively good and poor performers, this time examining patterns of eye movements across learning trials. Our findings were surprising: As a group, participants who were poor at cross-race recognition showed a pattern of progressively diminishing efforts in their eye movements, gathering less visual information to cross-race faces as the experiment unfolded in time.
In this article, we propose that eye movements, as a proxy for information gathering, can be used to estimate cognitive effort during encoding. To provide converging evidence for this hypothesis, we also collected continuous measures of participants’ pupil diameters. It has long been reported that when people perform more difficult cognitive operations, their pupils dilate (Porter, Troscianko, & Gilchrist, 2007). As reviewed by Beatty (1982), the pupil response has many attractive qualities as a dependent measure, prompting Kahneman (1973) to adopt it as his primary index of mental processing load in his theory of attention. As Beatty (1982, p. 276) wrote:
Kahneman proposed three criteria for any physiological indicator of processing load: It should be sensitive to within-task variations in task demands produced by changes in task parameters; it should reflect between-task differences in processing load elicited by qualitatively different cognitive operations; finally, it should capture between-individual differences in processing load as individuals of different abilities perform a fixed set of cognitive operations.
Following this description of Kahneman’s (1973) criteria, Beatty (1982) reviewed evidence showing that pupil dilation satisfied each. In the present research, we examined pupil responses in all three manners that Kahneman described, comparing within- and between-subjects changes in pupil diameters across conditions. Taken together with eye movements, this allowed a direct assessment of the hypothesis that the ORB reflects relatively shallow encoding of cross-race faces (e.g., Levin, 1996). Our results suggest tight connections between mental effort and pupil dilation, and convergent evidence that eye movements, at least in the present procedure, can be used to index mental effort during learning. Our results suggest that people expend greater effort while encoding cross-race faces but that some participants selectively reduce such effort after a few trials.
Although pupillometry is a sensitive measure of cognitive processing, pupils also change reflexively, based on visual input. As discussed by Porter et al. (2007), variations in luminance across stimuli, sudden onsets of stimuli, and variations in color can all induce pupil responses. For these reasons, some researchers (e.g., Goldwater, 1972) cautioned against using pupillometry with pictorial stimuli. To overcome these challenges, researchers may create experiments wherein visual materials are identical across conditions, manipulating only the participants’ tasks (e.g., Brown et al., 1999). Alternatively, Porter and Troscianko (2003) found that pupil reflexes could be minimized by using relatively low stimulus contrast, avoiding colored stimuli, and using relatively long exposure times. The experiment by Porter et al. (2007) is particularly germane to the present research. In their study, participants performed visual search on varying displays (e.g., small vs. large set sizes), with natural eye movements. They observed a tight correspondence between reaction time measures of search difficulty and pupil dilation.
In the present research, we applied a combination of methods to minimize the influence of pupils’ visual reflexes. Because our experiments inherently involved presentation of different visual images (Asian and Caucasian faces), we naturally had variations across stimulus displays. However, our materials were well controlled: We used photographs from Ekman and Matsumoto (1993), which were created under constant conditions of lighting and background, with all models wearing identical shirts. We presented all photos in grayscale, for relatively long periods (5 or 10 s). Most important, we conducted analyses across conditions, and across participants within conditions, wherein the same images were viewed. Indeed, because we collected eye fixation patterns, we also conducted pupil dilation analyses wherein participants’ eyes were focused on the same AOIs across conditions. Finally, to be certain that our observations reflected the ORB, rather than properties of the stimulus photographs, we conducted complementary experiments with Caucasian participants (Experiment 1) and Asian participants (Experiment 2). Together, these procedures allowed us to assess whether the ORB might appear during perceptual encoding of faces.
In a meta-analytic review covering 30 years of research, Meissner and Brigham (2001) noted that study time strongly affects the ORB, particularly through increased false alarms to cross-race faces when the study time is brief. In Experiment 1, our goal was to replicate the standard ORB, contrasting recognition memory to Asian and Caucasian faces, as a function of study time. Volunteers viewed a series of faces, all with neutral expressions, with explicit instructions to memorize them for a later test. These materials had previously been found to induce a strong ORB (Kleider & Goldinger, 2006). Other than the implicit (within-subjects) contrast of face races, we included a between-subjects manipulation of exposure time during initial learning, with values of 5 and 10 s. During the study period, eye movements and pupil diameters were continuously monitored. Face memory was later assessed in a standard recognition test. We had several expectations for Experiment 1. First, we expected most participants to actively gather information (move their eyes) throughout a 5-s exposure but that such efforts might diminish over a 10-s exposure. Second, we expected changes in pupillary responses to corroborate the effort indicated by eye movements. Third, we expected these indices of effort to reliably predict recognition accuracy. Fourth, we expected all indices to reflect differences in depicted races.
The initial sample included 46 Arizona State University students, all volunteers who received course credit. All participants reported either normal or corrected-to-normal vision. Data from 6 students were excluded: 2 were Asian, 3 had excessive periods of eye-tracking failure, and 1 failed to complete the recognition test. The final sample included 40 students, with 20 per study-time condition. All were Caucasian and reported no special familiarity with Asian faces.
The stimuli were 52 facial photographs from Ekman and Matsumoto (1993), with 26 Japanese and 26 Caucasian models, each with 13 men and 13 women. The original digitized photographs (860 × 600 pixels) were converted to grayscale, set to equal mean luminance, and embedded in a black background (1024 × 768 pixels) for presentation. The faces were approximately the size of real faces (approximately 20 cm in height), viewed at a distance of 60 cm. They subtended average visual angles of 7.97° horizontal and 10.85° vertical. For each participant, the learning phase included 26 faces, and the recognition phase included all 52. To minimize potential effects of list composition, we created four quasi-random study lists and used them equally (for 10 participants each). All faces were used equally often as “old” or “new.”
A 2 × 2 mixed-model design was used, with the within-subjects factor Race (Caucasian, Asian) and the between-subjects factor Time (5 s, 10 s).
The stimuli were displayed at a resolution of 1024 × 768 pixels on a Tobii 1750 17-in. (43.18-cm) monitor. We monitored eye movements and pupil dilation at 50 Hz using the built-in Tobii Eye Tracker 1750 (Falls Church, VA), with Clear-View software managing the experiment and collecting the data. A chin rest maintained the participant’s viewing position and distance at 60 cm. Both eyes were tracked continuously throughout the experiment.
Participants were first given a description of the experiment. The chin rest was adjusted, the eye-tracking task was explained, and any questions were answered. Participants were informed that the experiment comprised two parts, a learning session and a recognition test. Participants were then asked to settle onto the chin rest. To calibrate the eye tracker, participants fixated a series of blue dots presented on the display monitor. The calibration establishes the mapping between a person’s known gaze position on the display and the eye tracker’s estimate of that position. Participants were recalibrated whenever the criterion defined by the eye tracker was not met. Nearly all (39) participants were successfully calibrated on the first or second try. After calibration, participants viewed a series of faces, and their eye movements and pupil diameters were recorded, with samples taken every 20 ms. The presentation time of each photo was 5 s for 20 participants and 10 s for another 20 participants. In both groups, each trial began with the presentation of a central fixation cross for 1,500 ms, followed by a blank screen (gray, with an RGB value of 150) for 1,000 ms, then the photograph. After the offset of each photograph, a 2,000-ms interstimulus interval was initiated. Pupil diameters were continuously collected during the interstimulus interval, fixation, and blank screens, allowing estimates of their baseline states.
After the study phase, participants played a computer game called Escape for 3 min. Then participants were asked to distinguish among all 52 faces, randomly ordered, pressing keys corresponding to “new” or “old.” For recognition, we used the original, color photographs, reducing the total visual overlap between “old” study and test items. Each photo was presented on the screen until a response was entered or until 10 s elapsed.
We first analyzed recognition to verify that the ORB occurred. Table 1 shows hit and false-alarm rates and four associated indices. For each participant, we calculated signal-detection measures for sensitivity d′ and the intersection bias measure C. As discussed by Feenan and Snodgrass (1990), C is preferable to β for recognition memory, as β may correlate with discrimination. C is centered around zero: Positive values represent a conservative bias; negative values represent a liberal bias. The other two indices were Pr and Br, the sensitivity and bias measures from the “two-high threshold” model of recognition (Snodgrass & Corwin, 1988). Pr is a common accuracy score, representing the difference of hits and false alarms. Br is defined as the probability of responding “old” despite uncertainty, Br = FA/(1 – Pr), and is centered around .5. Values lower than .5 reflect a conservative bias; values above .5 reflect a liberal bias. Although we show d′ and C values in Table 1 (given their familiarity), our analyses focused on Pr and Br, with simple contrasts among hits and false alarms.
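As a minimal sketch of the four indices just described, the following Python function computes d′, C, Pr, and Br from one participant's hit and false-alarm rates. The function name and the example rates are hypothetical, and the sketch assumes that both rates lie strictly between 0 and 1; any correction applied to extreme rates is not covered here.

```python
from statistics import NormalDist


def recognition_indices(hit_rate, fa_rate):
    """Return d', C, Pr, and Br for one participant (illustrative helper).

    Assumes 0 < rate < 1 for both inputs; inv_cdf raises an error otherwise.
    """
    z = NormalDist().inv_cdf             # inverse of the standard normal CDF
    z_hit, z_fa = z(hit_rate), z(fa_rate)
    d_prime = z_hit - z_fa               # signal-detection sensitivity
    c = -0.5 * (z_hit + z_fa)            # criterion: > 0 conservative, < 0 liberal
    pr = hit_rate - fa_rate              # two-high-threshold sensitivity
    br = fa_rate / (1.0 - pr)            # bias: < .5 conservative, > .5 liberal
    return {"d_prime": d_prime, "C": c, "Pr": pr, "Br": br}


# Hypothetical rates, not data from the experiment.
indices = recognition_indices(hit_rate=0.80, fa_rate=0.20)
```

With symmetric rates such as these, both bias measures sit at their neutral points (C at zero, Br at .5), which illustrates why the two are described as centered around those values.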
The recognition data were analyzed in 2 × 2 mixed-model analyses of variance (ANOVAs), with Race as the within-subjects factor and Time as the between-subjects factor. In sensitivity, we observed a robust ORB: Mean Pr scores were higher to Caucasian faces (.75) than to Asian faces (.45), F(1, 39) = 66.12, MSE = 0.07. (Throughout this article, all statistical tests assume α = .05.) A main effect of Time was also observed, F(1, 39) = 39.01, MSE = 0.08, with Pr scores increasing (by .19) when participants had more time to encode the faces. The interaction of Race × Time was not reliable, F(1, 39) = 1.95, ns. Planned contrasts showed a surprising degree of symmetry among hits and false alarms. Although the ORB is often considered a false-alarm effect, we observed higher hit rates to Caucasian faces in both the 5-s (by 6.7%) and 10-s (by 8.5%) conditions, Fs(1, 19) > 15.0, p < .001. As is normally reported, we also found increased false alarms to the Asian faces in both the 5-s (by 13.8%) and 10-s (by 7.9%) conditions, Fs(1, 19) > 18.0, p < .001.
As shown in Table 1, performance on the recognition test was generally unbiased. There was no main effect of Race on response bias, F(1, 39) = 1.02, ns. There was, however, a main effect of Time, F(1, 39) = 9.55, MSE = 0.05, as participants showed a more liberal criterion in the 10-s condition. Despite the apparent trend in Table 1, the Race × Time interaction was not reliable, F(1, 39) = 2.77, ns.
To simplify the analyses of eye fixations, we divided each face into 11 nonoverlapping regions corresponding to principal features (Henderson, Falk, Minut, Dyer, & Mahadevan, 2001; Minut, Mahadevan, Henderson, & Dyer, 2000). These AOIs corresponded to the eyes, nose, cheeks, mouth, chin, forehead, ears, and hair. An example, with overlaid eye movements for one participant, is shown in Figure 1. With these AOIs as a basis, we generated fixation counts per AOI (defined as the eyes remaining in a 30-pixel area for at least 100 ms),¹ the distance traveled (in pixels) by each valid saccade, the number of unique AOIs visited per face, the number of returns to previously visited AOIs (regressions), and the average fixation times per AOI. For analysis, valid saccades were defined as eye movements that ended somewhere within the borders of the face or hair, rather than the background. Only 1.1% of all saccades were rejected as invalid.
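The fixation criterion above (gaze remaining in a 30-pixel area for at least 100 ms) can be illustrated with a simplified dispersion-based detector. This is only a sketch under stated assumptions, not the eye tracker software's actual filter: it reads "30-pixel area" as a bounding box of at most 30 pixels per side, assumes one sample every 20 ms, approximates distance traveled as the summed distance between successive fixation centers, and does not apply the valid-saccade check. All function names and the gaze samples are hypothetical.

```python
from math import hypot


def _fits_in_box(points, max_px):
    """True if all points fall inside a max_px x max_px bounding box."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return max(xs) - min(xs) <= max_px and max(ys) - min(ys) <= max_px


def detect_fixations(samples, max_px=30, min_ms=100, sample_ms=20):
    """Group (x, y) gaze samples into fixations: (center_x, center_y, duration_ms)."""
    min_samples = -(-min_ms // sample_ms)   # ceiling division: 5 samples at 50 Hz
    fixations, start = [], 0
    while start < len(samples):
        end = start + 1
        # Grow the window while the next sample still fits in the box.
        while end < len(samples) and _fits_in_box(samples[start:end + 1], max_px):
            end += 1
        window = samples[start:end]
        if len(window) >= min_samples:
            cx = sum(p[0] for p in window) / len(window)
            cy = sum(p[1] for p in window) / len(window)
            fixations.append((cx, cy, len(window) * sample_ms))
            start = end
        else:
            start += 1                       # too brief: treat as in-flight sample
    return fixations


def distance_traveled(fixations):
    """Sum of straight-line distances (pixels) between successive fixation centers."""
    return sum(hypot(b[0] - a[0], b[1] - a[1])
               for a, b in zip(fixations, fixations[1:]))


# Made-up gaze samples: two stable periods separated by two in-flight samples.
gaze = [(100, 100)] * 6 + [(250, 300)] * 2 + [(400, 500)] * 6
fixations = detect_fixations(gaze)
path_px = distance_traveled(fixations)
```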
With respect to eye movements, our first general question was whether participants showed systematic differences while gathering information from own-race and cross-race faces. Table 2 shows means for total fixations, fixation times, distances traveled, number of unique AOIs, and regressions. To ensure that comparisons were not confounded with later recognition accuracy, we divided the results into trials leading to eventual hits and misses. In comparing the upper and lower halves of Table 2, it is apparent that eye movements were suppressed during study trials leading to eventual misses. This difference was reliable for all measures, Fs(1, 39) > 36.0, p < .0001. Thus, more wide-ranging eye movements during learning were associated with superior later recognition. Unless otherwise stated, the remaining analyses of Experiment 1 focus on study trials leading to eventual hits.²
Having verified that eye movements were related to accuracy, we focus on the upper half of Table 2, the study trials leading to recognition hits. The eye movement measures were analyzed in 2 × 2 repeated measures ANOVAs, with factors Race (Asian, Caucasian) and Time (5 or 10 s). Because the eye movement measures were accumulated during the study exposures, there were large main effects of Time. Given more time, participants naturally made more fixations, moved their eyes farther, etc. The statistical contrasts of 5 versus 10 s were robust for all measures, Fs(1, 39) > 25.00, p < .001; these are not presented in detail, given their obvious nature. Of greater interest, we observed large main effects of Race for all five measures. Participants made more fixations to Caucasian faces, relative to Asian faces, F(1, 39) = 90.35, MSE = 0.02. While viewing Caucasian faces, they also made shorter fixations, F(1, 39) = 25.02, MSE = 15.6, and moved their eyes farther, F(1, 39) = 53.17, MSE = 24.0. Caucasian faces elicited fixations to more unique AOIs, F(1, 39) = 11.89, MSE = 0.02, and fewer regressions, F(1, 39) = 40.68, MSE = 0.02. In separate ANOVAs conducted on the 5- and 10-s groups, all 10 simple effects of Race were reliable, Fs(1, 19) > 12.00, p < .01. Generally, these effects indicate that participants made more rapid, wide-ranging eye movements while viewing Caucasian faces.
In addition to the main effects, we observed reliable interactions of Time × Race for number of fixations, F(1, 39) = 39.19, MSE = 0.03; fixation duration, F(1, 39) = 15.08, MSE = 12.5; and distance traveled, F(1, 39) = 7.20, MSE = 19.9. In every case, the disparities between Caucasian and Asian faces grew larger when participants had more time for encoding.
As noted, our first question was whether people showed differences while gathering information from own-race and cross-race faces. The global analyses of eye movements show that participants were more active in examining Caucasian faces, relative to Asian faces. A related question is whether any qualitative differences emerged in the specific features chosen for examination. To assess this, we examined fixation proportions to each AOI. As shown in Figure 2, participants (in both time conditions) focused their attention on different features, depending on the depicted race. Given Caucasian faces, more fixations were recorded to the eyes and hair; given Asian faces, more attention was given to the nose and mouth. (As reported in Experiment 2, this pattern was reversed in a sample of Asian participants.) Examined separately, both the 5- and 10-s conditions produced reliably different frequency patterns across races, χ²(10) = 19.02 and χ²(10) = 88.30, respectively. Relative to the Asian faces, more fixations were recorded to the left and right eyes for Caucasian faces in both the 5-s condition, t(39) = 6.77 (left eye) versus t(39) = 4.16 (right eye), and the 10-s condition, t(39) = 10.45 (left eye) versus t(39) = 22.06 (right eye), with larger differences in the latter condition. The difference for hair was marginal in the 5-s condition, t(39) = 1.63, p < .06, but was reliable in the 10-s condition, t(39) = 7.40. In complementary fashion, the Asian faces attracted relatively more fixations to the nose, t(39) = 3.99 (5 s) versus t(39) = 9.14 (10 s), and the mouth, t(39) = 2.89 (5 s) versus t(39) = 12.19 (10 s). Thus, eye movements differed across face races, both in sheer quantity and in qualitative patterns.
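The comparison of fixation frequency patterns across races can be sketched as a Pearson chi-square test on a 2 (race) × 11 (AOI) table of fixation counts. The exact form of the test is not spelled out in the text, so this layout is an assumption; it does yield (2 − 1)(11 − 1) = 10 degrees of freedom, matching the reported values. The function name and the counts below are invented for illustration and are not the experimental data.

```python
def chi_square_homogeneity(table):
    """Pearson chi-square statistic and df for a rows x columns table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if both rows shared the same distribution over columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, df


# Made-up fixation counts for 11 AOIs (one row per face race); not real data.
caucasian = [120, 115, 60, 40, 20, 20, 15, 30, 5, 5, 70]
asian = [95, 90, 85, 65, 20, 20, 15, 30, 5, 5, 50]
stat, df = chi_square_homogeneity([caucasian, asian])
```

The statistic would then be compared against the chi-square distribution with the returned degrees of freedom, for example with a statistics package.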
To this point, we have addressed two questions: whether the ORB was observed in recognition and whether eye movements were generally different for own- and cross-race faces. Both answers were yes. The remaining analyses of Experiment 1 were focused on indices of cognitive effort: Did participants dedicate similar levels of effort to encoding own-race and cross-race faces? We addressed this question first via eye movements, then via changes in pupil dilation.
In a pilot version of this experiment (He, 2005), we found an unexpected pattern, most notably in the 10-s encoding condition. Although all participants moved their eyes vigorously for the first few seconds of encoding each face, many seemed to taper off as the exposure time stretched on. Although this seems natural, we also found that some participants gave up sooner than others, despite the brief experiment (26 learning trials). We later determined that such withdrawal of effort mainly occurred to cross-race faces. To assess this behavior in Experiment 1, we focused on the distance-traveled measure. Figure 3 shows mean distances traveled by the eyes, across all trials of the 10-s condition (including trials leading to hits and misses). Four functions are shown, denoting Caucasian and Asian faces, divided as a function of participant groups. Specifically, we split the participants into two groups of 10, based on their recognition sensitivity scores to Asian faces (high scoring, d′ = 2.51; low scoring, d′ = 1.29).
As Figure 3 shows, all participants moved their eyes to similar degrees while viewing Caucasian faces. (Indeed, in a parallel analysis, we divided participants according to their sensitivity only to the Caucasian faces, with high- and low-scoring d′ values of 3.19 and 2.44, respectively. Despite using their performance to Caucasian faces as the selection factor, we still found no differences in the groups’ eye movements to those faces.) By contrast, eye movements to Asian faces showed clear changes over trials, as a function of participants’ later recognition performance. For the high-scoring group, eye movements were statistically equivalent to Asian and Caucasian faces, although participants’ recognition results still showed the ORB, with d′ values of 2.51 to Asian faces and 2.97 to Caucasian faces, F(1, 9) = 12.20, p < .01. For the low-scoring group, eye movements were equivalent to the Asian and Caucasian faces for the first few trials, but these participants appeared to selectively withdraw effort to the Asian faces as the experiment wore on. We analyzed these data in a 2 × 2 × 13 ANOVA, with factors Group (high, low), Race, and Trial Number. As suggested by Figure 3, the critical three-way interaction was robust, F(1, 19) = 77.18, MSE = 23.6. At the level of individual learning trials, the behavior of the low-scoring group to Asian faces appeared as a brief period of active eye movements, tailing off with several seconds remaining in the 10-s learning interval. The weaker memorizers apparently decreased their encoding effort over time. As discussed next, the analysis of pupillary changes corroborated this observation.
As noted earlier, pupil diameters were measured between trials (and while the fixation cross was shown) to establish baseline estimates, then during photograph viewing for comparison. The pupil data were analyzed by first removing missing observations owing to blinks or signal loss, filling those gaps by linear interpolation. This resulted in less than 4% of data repair for all participants. Another 0.7% of observations were replaced, in the same manner, for values falling more than 2.5 standard deviations from their 10 immediate neighbors. For each person, we then selected the better eye (i.e., with fewer corrected observations) for analysis. We first report an overall analysis of the experiment, then an analysis to complement the “effort reduction” findings just described. The overall results are shown in Figures 4A and 4B. In each figure, pupil diameters are shown in the baseline period (centered at zero), followed by the period of photograph viewing, with separate functions for Asian and Caucasian faces.
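The two cleaning steps just described (gap filling by linear interpolation, then replacement of outlying samples) might look roughly like the following sketch. Several details are assumptions rather than the procedure as run: missing samples are coded as `None`, the "10 immediate neighbors" are taken as the 5 samples on either side, gaps at the edges of a trace copy the nearest valid value, and the function names and example diameters are hypothetical.

```python
from statistics import mean, stdev


def interpolate_gaps(trace):
    """Fill None entries by linear interpolation between the nearest valid samples."""
    valid = [i for i, v in enumerate(trace) if v is not None]
    if not valid:
        raise ValueError("trace contains no valid samples")
    filled = list(trace)
    for i, v in enumerate(trace):
        if v is not None:
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:                     # gap at the start: copy first valid value
            filled[i] = trace[right]
        elif right is None:                  # gap at the end: copy last valid value
            filled[i] = trace[left]
        else:
            frac = (i - left) / (right - left)
            filled[i] = trace[left] + frac * (trace[right] - trace[left])
    return filled


def replace_outliers(trace, n_side=5, n_sd=2.5):
    """Flag samples more than n_sd SDs from their neighbors, then interpolate them."""
    flagged = list(trace)
    for i, v in enumerate(trace):
        neighbors = trace[max(0, i - n_side):i] + trace[i + 1:i + 1 + n_side]
        if len(neighbors) < 2:               # SD is undefined for fewer than 2 values
            continue
        if abs(v - mean(neighbors)) > n_sd * stdev(neighbors):
            flagged[i] = None
    return interpolate_gaps(flagged)


# Made-up pupil diameters (mm): one two-sample blink and one spurious spike.
raw = [4.0, 4.1, None, None, 4.0, 4.1, 9.0, 3.9, 4.0, 4.1, 3.9, 4.0]
filled = interpolate_gaps(raw)
cleaned = replace_outliers(filled)
```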
For analysis, the pupil data were averaged into a series of 200-ms time windows for each trial. For each of the 5- and 10-s conditions, we conducted two repeated measures (Race × Time Window) ANOVAs: one focused on the baseline period and one focused on the period of photograph viewing. In the baseline periods, there were no reliable main effects and no interactions, all Fs(1, 19) < 2.5, ns. In the postonset period of the 5-s condition (see Figure 4A), we observed a main effect of Race, F(1, 19) = 39.50, MSE = 3.1, with greater dilation elicited by the Asian faces. An effect of Time Window was observed, F(1, 19) = 116.13, MSE = 3.1, as dilation increased over time. The Race × Time Window interaction was also reliable, F(1, 19) = 9.75, MSE = 3.1, mainly reflecting the first 1.5 s postonset, as the Race effect emerged.
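As a rough illustration of the windowing step, the sketch below averages a 50-Hz pupil trace into 200-ms windows (10 samples each) and expresses the viewing period as change from the mean baseline diameter. Subtracting the baseline mean is an assumption about how dilation was centered, and the function names and example values are hypothetical.

```python
from statistics import mean

SAMPLES_PER_WINDOW = 10   # 200 ms at 50 Hz (one sample every 20 ms)


def window_means(trace, size=SAMPLES_PER_WINDOW):
    """Average consecutive, non-overlapping windows; a trailing partial window is dropped."""
    n_full = len(trace) // size
    return [mean(trace[k * size:(k + 1) * size]) for k in range(n_full)]


def dilation_from_baseline(baseline_trace, viewing_trace):
    """Windowed viewing-period diameters expressed as change from the mean baseline."""
    baseline = mean(baseline_trace)
    return [w - baseline for w in window_means(viewing_trace)]


# Made-up diameters (mm): 1 s of baseline, then 600 ms of photograph viewing.
baseline_trace = [4.0] * 50
viewing_trace = [4.0] * 10 + [4.5] * 10 + [5.0] * 10
dilation = dilation_from_baseline(baseline_trace, viewing_trace)
```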
In the 10-s condition (see Figure 4B), a different pattern was observed. Once again, there were reliable effects of Race, F(1, 19) = 77.03, MSE = 3.4, and Time Window, F(1, 19) = 120.09, MSE = 3.4, and a reliable interaction, F(1, 19) = 41.51, MSE = 3.4. However, the interaction was now driven by two inflections: We observed a divergence between Asian and Caucasian faces early in viewing, followed by a convergence toward the end of viewing. As shown in the next analysis, this convergence was largely driven by the lower-scoring participants.
As shown in Figure 3, when participants were grouped according to cross-race face recognition accuracy, the low-scoring group appeared to give up on the Asian faces about halfway through the experiment. This was observed only in the 10-s exposure condition and appeared as a reduction of eye movements to the Asian faces, particularly in the later trials. Because pupil dilation is a well-known indicator of cognitive effort (Beatty, 1982; Porter et al., 2007), we asked whether a similar pattern would appear when pupil responses were examined between the same groups. As before, we focused on the 10-s condition, a period that would more likely encourage truncated efforts before time expired. The results, collapsed across all trials, are shown in Figure 5. As shown, the pattern from eye movements was also found in pupil dilation: When viewing Caucasian faces, the high- and low-scoring groups had nearly identical profiles of pupil changes. (As before, this was also statistically true when the groups were rearranged based on performance to the Caucasian faces, although the patterns were less similar.) When viewing the Asian faces, however, the groups again diverged from approximately 5 s postonset until the end of the encoding period. We analyzed these results in a repeated measures ANOVA with factors Group (low, high), Race, and Time Window. There was a main effect of Time Window, F(1, 19) = 140.17, MSE = 5.9, , as dilations generally increased over time. The main effect of Race was also reliable, F(1, 19) = 91.65, MSE = 5.9, , as Asian faces elicited more dilation. The main effect of Group was marginal, F(1, 19) = 4.26, MSE = 5.9, p < .07, , with more dilation in the high-scoring group. However, this trend was limited to the Asian faces. As a result, the Group × Race interaction was reliable, F(1, 19) = 38.11, MSE = 5.9, . Because the groups diverged over time, the three-way Group × Race × Time Window interaction was also reliable, F(1, 19) = 29.21, MSE = 5.9, .
As was observed in the eye movement data, the pupil-diameter results suggest that low-scoring participants initially put effort into encoding Asian faces but did not maintain such effort for the full period. This was not a general trait of the participants; their average pupil dilation to Caucasian faces was almost identical to that of the high-scoring group. In eye movements, we found that the low-scoring group initially behaved like the high-scoring group but decreased its apparent effort over learning trials. To assess this possibility in the pupil data, we generated group averages of dilation (i.e., changes from baseline) across all 10 s of photograph viewing, considering only the Asian faces. The results (shown in Figure 6) were analyzed in a repeated measures ANOVA with factors Group and Trial. There was a main effect of Group, F(1, 19) = 10.39, MSE = 1.1, , but no main effect of Trial. Of primary interest, the Group × Trial interaction was reliable, F(1, 19) = 25.40, MSE = 1.1, , as the group difference in average dilation increased in later trials of the experiment.
With respect to the analyses of pupil dilation, one potential concern is that differences between the low- and high-scoring groups may reflect not only cognitive effort but also the particular facial features that were attended, as pupil dilation is affected by low-level visual features (Goldwater, 1972). Because our participant groups moved their eyes differently across faces (especially over time), there may be imbalances in the examined features. To address this possibility, we conducted two final analyses. In the first, we tested whether Race effects on pupil dilation would remain, once potential variance owing to eye movements was removed. For this analysis, we reset the fixation criterion such that fixations were counted as the eyes remaining in a 10-pixel area for at least 40 ms (as opposed to a 30-pixel area for 100 ms). This increased the sensitivity of analysis such that small movements within AOIs were classified as separate fixations. The pupil data from each group were tested in a stepwise regression, with forced entry of variables. The effect of Race on pupil diameter (collapsed across all 10 s) was assessed after removing variance from number of fixations, unique AOIs, distance traveled, and eccentricity from center. In each case, the Race effect remained robust, but the percentage of explained variance in the high-scoring group, R2 = .51, F(1, 9) = 27.55, p < .001, was approximately double the percentage in the low-scoring group, R2 = .26, F(1, 9) = 10.46, p < .01. This suggests that group differences in pupil dilation were not driven completely by differences in eye movements.
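To make the effect of the stricter fixation criterion concrete, the following sketch implements a simple dispersion-based fixation detector. It is an illustration under stated assumptions, not the authors' software: the "N-pixel area" is interpreted as a bounding box whose width and height may not exceed N pixels, the sample interval is hypothetical, and the function name is invented for this example.

```python
def detect_fixations(gaze, sample_ms, max_extent_px, min_duration_ms):
    """Return (start, end) sample indices (inclusive) of detected fixations.

    gaze is a list of (x, y) positions in pixels, one per sample.
    A fixation is a run of samples whose bounding box stays within
    max_extent_px in both dimensions for at least min_duration_ms.
    """
    fixations = []
    start, n = 0, len(gaze)
    while start < n:
        min_x = max_x = gaze[start][0]
        min_y = max_y = gaze[start][1]
        end = start
        # Grow the window while the bounding box stays small enough
        while end + 1 < n:
            x, y = gaze[end + 1]
            new_min_x, new_max_x = min(min_x, x), max(max_x, x)
            new_min_y, new_max_y = min(min_y, y), max(max_y, y)
            if (new_max_x - new_min_x > max_extent_px
                    or new_max_y - new_min_y > max_extent_px):
                break
            min_x, max_x, min_y, max_y = new_min_x, new_max_x, new_min_y, new_max_y
            end += 1
        if (end - start + 1) * sample_ms >= min_duration_ms:
            fixations.append((start, end))
            start = end + 1
        else:
            start += 1          # too brief: slide forward one sample
    return fixations

# Hypothetical gaze trace at 20 ms per sample: a small shift within one feature
gaze = [(100, 100), (102, 101), (101, 103),
        (120, 100), (121, 102), (119, 101), (120, 100), (122, 101)]
strict = detect_fixations(gaze, sample_ms=20, max_extent_px=10, min_duration_ms=40)
lenient = detect_fixations(gaze, sample_ms=20, max_extent_px=30, min_duration_ms=100)
```

With these made-up data, the lenient criterion (30 pixels, 100 ms) treats the whole trace as one fixation, whereas the strict criterion (10 pixels, 40 ms) splits it into two, which is the increase in sensitivity described above.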
In addition to the foregoing statistical correction for eye movements, we conducted a more direct, empirical comparison. We selected three AOIs (left eye, right eye, and mouth) that were reliably fixated for at least 1 s in every 10-s trial. By coordinating the eye movement and pupil data, we collected pupil diameters based on two conditions. First, the participant’s gaze had to remain in an AOI for at least 400 ms, continuously (Beatty, 1982). Second, we only considered pupil diameters beginning 200 ms after the selected AOI was fixated. In this manner, we compared pupil diameters across participants, knowing that the eyes were holding steady on the same visual information. The results are summarized in Table 3.
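The two selection conditions can be sketched as a filter over time-aligned gaze and pupil samples. The data layout (one AOI label and one pupil value per sample), the sample interval, and the function name are assumptions made for illustration only.

```python
from itertools import groupby

def aoi_locked_pupil(aoi_labels, pupil, sample_ms, target_aois,
                     min_dwell_ms=400, skip_ms=200):
    """Collect pupil samples recorded during steady gaze on selected AOIs.

    A run of consecutive samples in the same AOI qualifies only if it lasts
    at least min_dwell_ms; samples from the first skip_ms of a qualifying
    run are discarded.
    """
    skip = -(-skip_ms // sample_ms)          # ceiling division
    collected = {aoi: [] for aoi in target_aois}
    index = 0
    for label, run in groupby(aoi_labels):
        length = len(list(run))
        if label in collected and length * sample_ms >= min_dwell_ms:
            collected[label].extend(pupil[index + skip:index + length])
        index += length
    return collected

# Hypothetical trial at 100 ms per sample
labels = ["left_eye"] * 5 + ["nose"] * 2 + ["mouth"] * 3 + ["left_eye"] * 4
pupil = [float(i) for i in range(len(labels))]
selected = aoi_locked_pupil(labels, pupil, sample_ms=100,
                            target_aois=["left_eye", "right_eye", "mouth"])
```

In this example the 300-ms visit to the mouth is too brief to qualify, and both visits to the left eye contribute only the samples recorded after the first 200 ms.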
These fine-grained analyses were highly consistent with the overall patterns. For the left eye, high-scoring participants had greater overall dilation, relative to low-scoring participants, F(1, 19) = 7.17, MSE = 0.5. There was also a Group × Trial interaction, F(1, 19) = 9.01, MSE = 0.5, reflecting the greater disparity between groups in later trials. For the right eye, high-scoring participants again had greater overall dilation, relative to low-scoring participants, F(1, 19) = 6.84, MSE = 0.4, with the same interaction, F(1, 19) = 10.35, MSE = 0.4. Finally, for the mouth, the findings were again replicated: High-scoring participants had greater dilation, F(1, 19) = 29.02, MSE = 0.4, and another Group × Trial interaction, F(1, 19) = 26.75, MSE = 0.4, reflected the greater disparity between groups in later trials.
Although Experiment 1 produced many data, the results are easily summarized. First, the classic ORB was replicated, with superior recognition of own-race faces, relative to cross-race faces (Byatt & Rhodes, 2004; Kleider & Goldinger, 2006). Second, during learning, there were clear differences in eye movement patterns across own- and cross-race faces. Considering the entire sample, in both the 5- and 10-s conditions, we observed less information gathering when participants viewed Asian faces, relative to Caucasian faces, with fewer fixations, longer fixations, and more regressions. There were also qualitative differences: Participants focused more on the eyes when studying Caucasian faces and more on the nose and mouth when studying Asian faces.3 Henderson et al. (2005; Althoff & Cohen, 1999) suggested that eye movements are themselves encoded in face learning. The present results may suggest that greater perceptual expertise (in this case, own-race faces) supports more effective learning patterns. Third, when participants examined Asian faces, their pupils showed greater dilation, relative to the examination of Caucasian faces. This suggests that cross-race face encoding required greater effort than own-race face encoding (Beatty, 1982; Porter et al., 2007).
The results from both eye movements and pupillometry suggest that effort during perceptual encoding predicts the magnitude of the ORB. Participants in the 10-s condition were divided into groups, according to their acumen in cross-race face recognition. We then examined one global measure of eye movements, total distance traveled by the eyes across stimulus faces, on a trial-by-trial basis. Among the better memorizers, information gathering was equivalent to Asian and Caucasian faces and was steady throughout the experiment. The weaker cross-race memorizers showed a different pattern: Although their eye movements to Caucasian faces were equivalent to those of the high-performing group, their eye movements to Asian faces steadily declined over the (relatively brief) course of the experiment (see Figure 3). After seeing four or five Asian faces, these participants apparently stopped expending full effort during encoding. This decline of effort was corroborated by pupil dilation: As the experiment wore on, lower-scoring participants showed progressively less dilation while encoding Asian faces.
Given these findings in eye movements and pupil dilation, it is natural to ask whether the low-scoring participants’ apparent withdrawal of effort had any impact on their face learning. To address this, at least in preliminary fashion, we calculated two hit rates for each participant in the 10-s condition. One hit rate corresponded to the first five Asian faces encountered by the participant; the other corresponded to the last five. In the high-scoring group, these hit rates were nearly identical (.86 and .88, respectively). In the low-scoring group, hits to the first five Asian faces (.82) greatly exceeded hits to the last five Asian faces (.68), t(9) = 2.85, p < .03. Despite the relatively modest sample size, with only 50 observations per cell, this result suggests that changes in encoding effort affected later face recognition.
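The early-versus-late comparison can be sketched as follows. The per-face hit coding (1 = hit, 0 = miss, in study order), the example data, and the function names are hypothetical; the paired t statistic is the standard formula, with df equal to the number of participants minus one (9 for a group of 10).

```python
from math import sqrt
from statistics import mean, stdev

def early_late_hit_rates(hits_in_study_order, n=5):
    """Hit rates for the first n and last n cross-race faces studied."""
    return mean(hits_in_study_order[:n]), mean(hits_in_study_order[-n:])

def paired_t(first, second):
    """Paired-samples t statistic for two equal-length lists of scores."""
    diffs = [a - b for a, b in zip(first, second)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Hypothetical participant: recognition outcome for each face, in study order
outcomes = [1, 1, 1, 1, 0, 1, 0, 1, 0, 0]
early, late = early_late_hit_rates(outcomes)

# Hypothetical per-participant rates for a small group
t_value = paired_t([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
```

Each participant contributes one early and one late hit rate, and the t test is computed over those pairs within a group.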
The results of Experiment 1 strongly suggest that the ORB arises during face encoding (Levin, 1996; Lindsay, Jack, & Christian, 1991) and is not purely from retrieval errors. We replicated the typical ORB and found that participants’ eye movements and pupil dilations were sensitive to race differences across faces during encoding. We also found race-based differences in the patterns of attended facial features. Perhaps most important, by contrasting good versus poor cross-race face memorizers, we confirmed that eye movements and pupil dilations during encoding predicted later recognition accuracy, including the novel observation that some participants gradually withdrew effort when studying Asian faces. One potential concern with Experiment 1, however, is its exclusive sampling of Caucasian participants. Many investigations of the ORB (e.g., Byatt & Rhodes, 2004; Kleider & Goldinger, 2006) are nonsymmetric in design, including only one race as participants, with little impact on the interpretation of results. In the present case, a more conservative stance seems necessary. Because eye movements and pupil dilation are so intimately tied to the stimulus photographs, it is difficult to confidently attribute the results of Experiment 1 to the ORB. Although the contrasts of low- and high-performing groups help rule out stimulus-based accounts, stronger evidence is desirable.
In Experiment 2, we conducted a partial replication of Experiment 1, using only the 10-s condition, with an entirely Asian sample. Eighteen participants were recruited (as paid volunteers); all were from Asian nations (15 from mainland China or Taiwan), and none had lived in the United States longer than 1 year (M = 6.8 months). The participants completed the same task described in Experiment 1, using the 10-s exposure duration. If the preceding results were due to the ORB, we should observe similar results in Experiment 2, now in favor of the Asian faces.
The sample included 18 Arizona State University students, all natives of Asian countries (11 from mainland China, 4 from Taiwan, 1 from Japan, and 1 from Thailand), recruited entirely by word of mouth, for example among students in the Chinese Undergraduate Student Association at the university. The participants had all resided in Arizona for under 1 year (range: 1.75–11.25 months, with average time of 6.8 months). None reported any special prior exposure to Caucasian faces.
The stimuli were identical to those of Experiment 1.
The design and procedure were identical to the 10-s condition of Experiment 1.
The recognition data, shown in Table 4, were analyzed in separate one-way ANOVAs, with Race as the within-subjects factor. In sensitivity, we observed a robust ORB, with higher mean Pr scores to Asian (own-race) faces, F(1, 17) = 29.44, MSE = 0.03. As in Experiment 1, there were relatively symmetric effects in hits and false alarms, with more hits to Asian faces, F(1, 17) = 11.12, MSE = 0.05, and more false alarms to Caucasian faces, F(1, 17) = 15.07, MSE = 0.05. As shown in Table 4, performance was generally unbiased, with no effect of race, F(1, 17) = 0.12, ns.
The analyses of eye movements followed the same protocol as Experiment 1, with the same AOIs and dependent measures. As before, valid saccades were defined as eye movements that terminated somewhere in the face or hair, rather than the background. Only 0.9% of saccades were rejected as invalid. Table 5 summarizes the eye movement results, shown as a function of depicted race and eventual recognition accuracy. Eye movements were relatively suppressed during study trials leading to misses, with reliable effects of Accuracy across all measures, Fs(1, 17) > 16.5, p < .01, except regressions. As in Experiment 1, wider ranging eye movements during learning were associated with superior later recognition.
Our analyses focused on trials leading to hits (see upper half of Table 5). Although these study trials all produced successful recognition, differences were still observed in eye movements as a function of depicted race. In one-way ANOVAs, we observed reliable Race effects in nearly all measures, with a marginal effect in regressions. Relative to Caucasian faces, the participants made more fixations, F(1, 17) = 59.27, MSE = 0.03, and shorter fixations, F(1, 17) = 38.05, MSE = 21.9, and moved their eyes farther, F(1, 17) = 113.41, MSE = 42.5, while viewing Asian faces. Own-race faces also elicited fixations to more unique AOIs, F(1, 17) = 7.67, MSE = 0.02, and marginally fewer regressions, F(1, 17) = 4.39, MSE = 0.02, p < .06.
The global analyses of eye movements show that the Asian participants were generally more active in examining Asian faces, relative to Caucasian faces. As before, we next examined whether any qualitative differences emerged, in terms of specific features fixated. We examined the fixations to each AOI (including only those study trials leading to hits). Average fixation proportions, as a function of AOI and race, are shown in Figure 7. As shown, participants focused their attention on different features, depending upon race. Given Asian faces, more fixations were recorded to the eyes and hair; given Caucasian faces, more attention was given to the nose and mouth, χ²(9) = 41.69. These results mirror those of Experiment 1. Relative to Caucasian faces, more fixations were recorded to both eyes for Asian faces, t(17) = 7.10 (left eye) versus t(17) = 5.98 (right eye), and to the hair, t(17) = 5.75. The Caucasian faces attracted more fixations to the nose, t(17) = 7.09, and the mouth, t(17) = 6.24. Thus, eye movements differed, both in sheer quantity and in qualitative patterns, now in an own-race pattern that favored the Asian faces.
As in Experiment 1, we next examined the distance-traveled measure across learning trials. We divided the participants into two groups (N = 9), based on their recognition scores to Caucasian faces (high scoring, d′ = 2.34; low scoring, d′ = 0.74). The groups did not differ in recognition of Asian faces, with d′ values of 2.29 and 2.05, respectively. Figure 8 shows mean distances traveled by the eyes, including trials leading to hits and misses. Four functions are shown, denoting Caucasian and Asian faces, divided as a function of participant groups. As shown, all participants moved their eyes to similar degrees while viewing Asian faces.4 By contrast, eye movements to Caucasian faces showed clear changes over trials, as a function of group. In the high-scoring group, eye movements were statistically equivalent to Asian and Caucasian faces. The low-scoring group performed in a manner very similar to their counterparts in Experiment 1: Eye movements were equivalent to Asian and Caucasian faces for the first few trials but gradually tapered off to the Caucasian faces. In a 2 × 2 × 13 ANOVA (with factors Group, Race, and Trial Number), the critical three-way interaction was robust, F(1, 17) = 50.79, MSE = 31.6. Once again, lower cross-race recognition was associated with decreasing effort in the learning phase.
As in Experiment 1, pupil diameters were measured between trials (and while the fixation cross was shown) to establish baselines, then during face viewing for comparison. We removed missing observations owing to blinks or signal loss, filling gaps by linear interpolation, which resulted in 3.7% data repair. Another 0.8% of observations were replaced for values falling more than 2.5 standard deviations from their 10 immediate neighbors. For each participant, we again selected the better eye for analysis. In an omnibus ANOVA, we observed greater dilation to Caucasian faces, relative to Asian faces, F(1, 17) = 40.29, MSE = 35.85, suggesting that cross-race faces required greater effort during encoding.
In eye movements, it appeared that low-scoring participants exerted diminishing effort to the encoding of cross-race faces. The pupil dilation data corroborated this suggestion. Average pupil dilation, collapsed across trials, is shown in Figure 9. When viewing Asian faces, the high- and low-scoring groups produced nearly identical profiles of pupil changes. When viewing the Caucasian faces, the groups again diverged from approximately 6 s postonset until the end of the encoding period. A repeated measures ANOVA (with factors Group, Race, and Time Window) showed a main effect of Time Window, F(1, 17) = 169.05, MSE = 6.7, as dilations generally increased over time. The main effect of Race was also reliable, F(1, 17) = 78.50, MSE = 6.7, as Caucasian faces elicited more dilation. The main effect of Group was reliable, F(1, 17) = 11.91, MSE = 6.7, with more dilation in the high-scoring group. The Group × Race interaction was reliable, F(1, 17) = 53.88, MSE = 6.7, as was the three-way interaction of Group × Race × Time Window, F(1, 17) = 15.60, MSE = 6.7.
Figure 10 shows average trial-by-trial dilation, across all 10 s of photograph viewing. Average dilation to Asian faces was equivalent across groups and was quite steady across trials. By contrast, dilation to Caucasian faces declined across trials, primarily in the low-scoring group. We observed a large main effect of Race, F(1, 17) = 259.80, MSE = 37.05, with greater dilation to Caucasian (cross-race) faces. Neither the main effect of Group nor the main effect of Trial was reliable, although Trial was marginal, F(1, 17) = 2.59. The Group × Race interaction was reliable, F(1, 17) = 6.11, MSE = 37.05, as the high-scoring group had a greater mean Race effect. The Group × Trial interaction was also reliable, F(1, 17) = 31.19, MSE = 37.05, reflecting the larger Trial effect in the low-scoring group. The interaction of Race × Trial was also reliable, F(1, 17) = 18.35, MSE = 37.05, reflecting the larger Trial effect in the Caucasian encoding trials. Finally, the three-way interaction was reliable, F(1, 17) = 49.14, MSE = 37.05, as changes across Caucasian encoding trials were larger in the low-scoring group.
We next conducted a 2 × 13 (Group × Trial) ANOVA on the Asian learning trials. Neither main effect nor the interaction was reliable (both Fs < 1). As suggested by the lower two functions (with triangular symbols) in Figure 10, there were no appreciable differences in pupil dilation across groups or trials. We conducted a parallel ANOVA on the Caucasian learning trials, finding main effects of Group, F(1, 17) = 15.16, MSE = 39.14, and Trial, F(1, 17) = 25.75, MSE = 39.14. These effects reflect greater dilation in the high-scoring group and an overall decrease in dilation across trials. A Group × Trial interaction, F(1, 17) = 136.80, MSE = 39.14, reflected the increasing disparity across groups in later trials. We again examined the pupil data in a stepwise regression, testing the Race effect (per group) after removing variance from eye movement variables. Race effects remained robust, but the percentage of explained variance in the high-scoring group, R² = .45, F(1, 8) = 29.03, p < .001, was high, relative to the low-scoring group, R² = .25, F(1, 8) = 13.19, p < .01, again suggesting that group differences in pupil diameters were not completely driven by differences in eye movements.
Finally, as in Experiment 1, we isolated pupil data from cross-race trials, focusing on periods of gaze to three AOIs (left eye, right eye, and mouth) that were reliably fixated for at least 1 s per trial. As before, we stipulated that the participant’s gaze had to remain in an AOI for at least 400 ms, and we only considered pupil diameters beginning 200 ms after fixation. In this manner, we compared pupil diameters across participants, knowing that their eyes were holding steady on the same visual information. The results are summarized in Table 6. We analyzed data from each AOI using 2 × 13 (Group × Trial) ANOVAs. Results from the left eye followed the overall results: High-scoring participants had greater overall dilation, relative to low-scoring participants, F(1, 17) = 21.44, MSE = 0.4, and a Group × Trial interaction, F(1, 17) = 14.15, MSE = 0.4, reflected the greater disparity between groups in later trials. Similar patterns were observed for the right eye and mouth, with reliable main effects of Group (both Fs > 12, p < .01) and Group × Trial interactions (both Fs > 20, p < .01).
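For readers unfamiliar with the sensitivity and bias measures, the sketch below computes them from raw response counts. It assumes the conventional two-high-threshold definitions (Pr as hit rate minus false-alarm rate, and the bias index Br as the false-alarm rate divided by 1 − Pr); whether the present analyses used exactly these formulas, or applied any correction for extreme rates, is an assumption. The function name and counts are hypothetical.

```python
def pr_and_br(hits, misses, false_alarms, correct_rejections):
    """Two-high-threshold sensitivity (Pr) and bias (Br) from response counts."""
    hit_rate = hits / (hits + misses)
    fa_rate = false_alarms / (false_alarms + correct_rejections)
    pr = hit_rate - fa_rate
    # Br is undefined when discrimination is perfect (Pr = 1)
    br = fa_rate / (1 - pr) if pr < 1 else float("nan")
    return pr, br

# Hypothetical participant: 10 old faces and 10 new faces of one race
pr, br = pr_and_br(hits=8, misses=2, false_alarms=2, correct_rejections=8)
```

Under these definitions, a Br value near .5 indicates neutral responding, with larger values indicating a more liberal tendency to respond "old."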
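The grouping step can be illustrated as a median split on d′. This is a sketch under assumptions: d′ is computed with the standard signal detection formula, z(hit rate) − z(false-alarm rate), and the participant labels, rates, and function names are hypothetical.

```python
from statistics import NormalDist

def d_prime(hit_rate, fa_rate):
    """Signal detection sensitivity: z(hit rate) - z(false-alarm rate).

    Rates of exactly 0 or 1 have no finite z score and raise an error here;
    in practice they are usually adjusted first.
    """
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

def median_split(scores):
    """Split participants into low- and high-scoring halves by score.

    scores maps participant id -> score. With an odd number of
    participants, the extra person falls in the high group.
    """
    ranked = sorted(scores, key=scores.get)
    half = len(ranked) // 2
    return ranked[:half], ranked[half:]

# Hypothetical cross-race hit and false-alarm rates for four participants
rates = {"p1": (0.60, 0.40), "p2": (0.90, 0.10),
         "p3": (0.70, 0.30), "p4": (0.95, 0.15)}
scores = {pid: d_prime(h, fa) for pid, (h, fa) in rates.items()}
low_group, high_group = median_split(scores)
```

With 18 participants, the same split yields two groups of nine, as in the analysis above.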
Experiment 2 was intended to verify and complement Experiment 1, ensuring that its findings—particularly the selection of features during face learning—were truly related to the ORB, rather than idiosyncratic properties of the stimulus materials. In a sample of Asian students, the results of Experiment 1 were reproduced, only in manners now favoring Asian (i.e., own-race) faces. We observed the ORB, different qualitative and quantitative patterns of eye movements, and differences in pupil dilation, all contrasting own-race and cross-race faces. Moreover, the effort-reduction pattern from Experiment 1 (in both eye movements and pupil dilation) was replicated in Experiment 2. Participants were again divided into groups, according to cross-race face recognition scores. Among the better memorizers, distance traveled was equivalent to Asian and Caucasian faces and was steady throughout the experiment. Among the weaker memorizers, eye movements to Caucasian faces steadily declined across trials. After seeing four or five cross-race faces, these participants apparently withdrew encoding effort. As before, this observation in eye movements was mirrored by pupil dilation: As the experiment wore on, lower scoring participants showed progressively less dilation while encoding Caucasian faces.
As noted earlier, Blais et al. (2008) recently examined eye movements during face learning (and also during recognition and categorization), directly comparing patterns of face scanning across Western Caucasian and East Asian observers. As in the present research, they also used photographs of both Asian and Caucasian faces. Blais et al. found that participants from the different cultural backgrounds showed different characteristic manners of examining faces, with Western Caucasian participants focusing mainly on the eyes and East Asian participants focusing on the nose and mouth. This was true regardless of the depicted races. As shown in Figure 2 and Figure 7 in the present article, it is apparent that we did not replicate this pattern. Instead, we found that all participants favored the eyes during own-race encoding and favored the nose and mouth during cross-race encoding. At present, it is not entirely clear why the results differ across studies. Both studies used similar participant groups, photographs of the same approximate size, and the same intentional face-learning task. There were several differences across studies, however, that might explain the discrepant results. First, Blais et al. presented faces only for 5-s study periods; we tested Asian participants only with 10-s exposures. To assess whether this difference was responsible, we examined fixation proportions from Experiment 2, limiting the analyses to the first 5 s of learning. The results were not substantially different from those shown in Figure 7: We still found approximately 15% more fixations to the eyes, relative to central features.
As another minor difference, Blais et al. (2008) presented faces in four quadrants of the screen, so participants could not predict where they would appear. In the present research, all faces were presented centrally; participants’ eyes were fixed on the nose upon stimulus onset. It is unclear whether this procedural difference would produce different outcomes across cultural groups. The most likely source of the disparate results was the stimulus photographs. In the Blais et al. study, the faces presented for learning varied in emotional expressions (neutral, happy, angry, or disgusted). It is possible that the cultural differences in face examination were related to societal norms regarding eye contact during emotional displays.
The present investigation focused on the own-race bias (Slone et al., 2000) in face recognition, with special emphasis on the information gathering that people exhibit while learning own-race and cross-race faces. At issue is a central question: Does the recognition ORB reflect race-based differences in face processing during encoding, at least in part? If no differences in visual behavior were observed, it would suggest a retrieval account of the ORB. By examining both eye movements and pupillary responses during face encoding, we assessed the relative cognitive demands imposed by different faces and related encoding effort to memory. We used photographs of Caucasian and Asian men and women (Ekman & Matsumoto, 1993), materials that previously elicited the ORB (Kleider & Goldinger, 2006) and are standardized in appearance, which is important when testing pupil responses. In Experiment 1, we replicated the ORB, with better discrimination of Caucasian faces, relative to Asian faces. In Experiment 2, we tested a sample of Asian participants, all relative newcomers to the United States, finding another ORB, in the complementary direction. Notably, in both experiments, the ORB appeared only as reduced sensitivity, whereas many similar experiments show differences in bias (Meissner & Brigham, 2001), with more liberal responding to cross-race faces.
Having observed the ORB in recognition, our primary goal was to examine information-gathering behavior during learning. Considering first eye movements, widespread differences emerged between own- and cross-race face processing. In quantitative terms, when participants studied own-race faces, their eye fixations were brief and plentiful. Relative to cross-race trials, own-race trials elicited more fixations to facial features, briefer gaze times per fixation, more fixations to unique features, and fewer regressions. All these findings were reflected in an index of total distance traveled by the eyes during encoding, which we used for most analyses. The differences in eye movements were not an artifact of recognition accuracy: The same patterns were observed in subsets of learning trials leading only to eventual hits. In qualitative terms, participants favored different features across races (see Figure 2 and Figure 7). When examining own-race faces, they paid greater attention to the eyes and hair (see Williams & Henderson, 2007, for similar profiles of selected features in face learning). When examining cross-race faces, they spent more time examining the nose and mouth (cf. Blais et al., 2008).
As suggested by Henderson et al. (2005; Falk, Henderson, Hollingworth, Mahadevan, & Dyer, 2000), feature sampling is an important aspect of face learning, even in tasks that encourage holistic processing. Williams and Henderson (2007) noted that relations among features may be learned by foveal analysis and that eye movements may be informative (and thereby encoded) in their own right, as a means to index relations among features (Walker-Smith, Gale, & Findlay, 1977). In the present research, participants fixated on fewer features in cross-race faces, fixations were relatively long, and there were systematic differences in the selected features. People may tacitly assume that different features (e.g., eyes) vary in diagnostic value, as a function of race (McClelland & Chappell, 1998).
Previous research suggests that own-race and cross-race faces differ in degrees of holistic processing, possibly reflecting levels of perceptual expertise (Sporer, 2001). Diamond and Carey (1986) suggested that more experienced viewers rely on configural properties of faces, whereas less experienced viewers rely on separate features (see also Lindsay et al., 1991; Schwarzer, Huber, & Dummler, 2005). Similar suggestions arise with respect to familiarity. Hancock, Burton, and Bruce (1996; Megreya & Burton, 2006) argued that familiar faces allow more holistic encoding, relative to novel faces. Such familiarity effects are broadly consistent with cross-race comparisons. Michel, Caldara, and Rossion (2006; Michel, Rossion, Han, Chung, & Caldara, 2006) reported that own-race faces are encoded holistically; cross-race faces require more featural encoding. This processing difference also appears in brain responses to faces. Golby, Gabrieli, Chiao, and Eberhardt (2001) found that the activity in the fusiform face area is weaker when people view cross-race faces, relative to own-race faces.
Considered alone, the observed differences in eye movements to own- and cross-race faces are difficult to interpret. When viewing own-race faces, participants gave an impression of working harder, moving their eyes more and examining a wider range of features. However, when eye movements are considered in concert with pupillometry, the opposite interpretation is suggested. In both experiments, pupil dilation was greater with cross-race faces. As dilation is a well-known indicator of cognitive effort (Beatty, 1982; Granholm, Asarnow, Sarkin, & Dykes, 1996; Kahneman, 1973), this suggests that cross-race face encoding required more work by participants. Taken together, the results suggest that more difficult encoding leads to less vigorous eye movements, a counterintuitive finding that requires further investigation.
Although the present results support an encoding-based view of the ORB, they are neutral with respect to the role of memory storage and retrieval. Indeed, by many theories, perceptual encoding and later retrieval are inextricably linked. As noted in the introduction, a popular account of the ORB combines a multidimensional scaling framework with exemplar memory models (Bruce, Burton, & Dench, 1994; Lee, Byatt, & Rhodes, 2000; Valentine & Endo, 1992). In this approach, greater perceptual expertise (in any domain) tends to optimize attention allocation to features that differentiate among exemplars. Thus, increasing expertise will “spread” exemplars in perceptual space, allowing finer discrimination in classification and retrieval from memory. In the present context, cross-race faces would be densely clustered in face space, leading to poor performance. Indeed, Byatt and Rhodes (2004) created MDS representations of participants’ face spaces, finding that cross-race faces were densely clustered, relative to own-race faces. After replicating the ORB, Byatt and Rhodes applied Nosofsky’s (1986) generalized context model (GCM) to their data. The GCM relates perceptual and memorial performance to psychological distance among stored exemplars; it produced excellent quantitative fits to Byatt and Rhodes’s data.
The application of Nosofsky’s (1986) GCM to face recognition is particularly interesting when considered in tandem with eye tracking. A central assumption of GCM is that, with experience, people learn to focus attention on perceptual dimensions (or features) that will maximize the discrimination among stored exemplars. Such attentional optimization is critical, for example, in ALCOVE (Kruschke, 1992), a connectionist implementation of GCM. This assumption was tested by Rehder and Hoffman (2005a, 2005b), who examined participants’ eye fixations to different stimulus features in category learning. Across a range of different category structures and stimuli, they found that people follow a gradient-descent pattern: Early in learning, people fixate all stimulus dimensions equivalently; with experience, they gradually focus on features that predict category membership, with impressive sensitivity to statistical diagnosticity. (In Rehder and Hoffman, 2005b, participants never achieved an optimal allocation strategy, but the authors theorized that optimality would have been realized, given sufficient experience.) In the present research, participants sampled features with different priorities across races. Perhaps with sufficient exposure to cross-race faces, such differences in fixations (and in apparent cognitive effort) would eventually vanish.
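The role of attention weights in this account can be illustrated with a minimal sketch of the GCM similarity rule, in which similarity decays exponentially with attention-weighted distance between exemplars. The function name, the parameter values (c, r), and the two-dimensional feature coordinates are hypothetical and chosen only for illustration; they are not estimates from the present data or from Byatt and Rhodes (2004), and the exponential decay function is one common simplification of the model.

```python
import math

def gcm_similarity(x, y, weights, c=4.0, r=2.0):
    """Similarity of two exemplars: exp(-c * attention-weighted Minkowski distance).

    weights: attention weights over dimensions (assumed to sum to 1)
    c: sensitivity parameter; r: distance metric (1 = city-block, 2 = Euclidean)
    """
    distance = sum(w * abs(a - b) ** r for w, a, b in zip(weights, x, y)) ** (1.0 / r)
    return math.exp(-c * distance)

# Hypothetical coordinates in a two-dimensional "face space" (invented values).
face_a = (0.50, 0.20)
face_b = (0.52, 0.60)  # differs from face_a mainly on the second dimension

equal_weights = (0.5, 0.5)  # attention spread evenly across dimensions
tuned_weights = (0.1, 0.9)  # attention shifted toward the diagnostic dimension

sim_equal = gcm_similarity(face_a, face_b, equal_weights)
sim_tuned = gcm_similarity(face_a, face_b, tuned_weights)

# Lower similarity means the two faces are easier to tell apart.
print(f"equal attention: {sim_equal:.3f}")
print(f"tuned attention: {sim_tuned:.3f}")
```

Under these assumptions, shifting attention toward the dimension on which the two faces differ lowers their similarity, which is the sense in which expertise "spreads" exemplars in perceptual space.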
In this study, one of our goals was to replicate a finding that we observed in a preliminary study (He, 2005). That study was very similar to Experiment 1, producing all the main results reported here, including the ORB, differences in eye movements, and differences in pupil responses as a function of depicted race. We also found a pattern of reduced effort in eye movements among the participants with inferior recognition of cross-race faces. In Experiments 1 and 2, we replicated this pattern: In high-scoring groups, eye movements and pupil dilations were relatively constant across learning trials, regardless of race. By contrast, in low-scoring groups, both indices were equivalent to own- and cross-race faces for the first four or five learning trials. From that point, eye movements and pupil responses to cross-race faces gradually tapered off, regressing toward those used for own-race faces. These differences in effort were reflected in eventual recognition performance.
As noted earlier, when studying the first five Asian faces of Experiment 1, the high- and low-scoring groups appeared equivalent, with similar eye movements, pupil responses, and eventual hit rates (.86 and .82, respectively). We therefore cannot suggest that one group was merely more dedicated or was inherently superior from the outset. Instead, participants in the low-scoring group became low scoring as a result of their own diminishing efforts. What might account for this behavior? We might liken their waning effort to “learned helplessness” (cf. Seligman & Maier, 1967), such that low-scoring participants may have lacked meta-cognitive confidence, perhaps underestimating their own learning of cross-race faces. After several trials of confusable faces, their motivation may have diminished. However, the exact opposite hypothesis might also apply: Perhaps low-scoring participants experienced overconfidence while studying cross-race faces and thus began to conserve energy. Although this seems unlikely, it would account for the results. A third possibility relates to overall cognitive effort and motivation: As evidenced by pupil dilation, participants expended greater effort in cross-race learning trials, but low-scoring participants did not maintain effort throughout the 10-s encoding trial. According to Granholm et al. (1996), the pupil dilation response reverses when people reach cognitive overload. Perhaps the underlying difference between groups was their sensitivity to sustained effort.
Although we cannot rule out any of the preceding interpretations, we conducted another experiment (with Caucasian volunteers) that seems to contradict the overconfidence hypothesis. Specifically, we presented the same test materials as in Experiment 1, but we doubled the learning trials, adding highly salient faces with mixed emotional expressions. We expected that given a longer study phase with emotional faces, participants would find the neutral faces less distinctive and interesting. We thus predicted that dedication to studying neutral Asian faces would wane further, relative to Experiment 1. This was indeed the result. We replicated the effort reduction in the low-scoring group, and it also reliably emerged (albeit to a lesser degree) in the high-scoring group. As before, the effect was shown in convergent measures of eye movements and pupil dilations. Given the challenging nature of this experiment, it seems unlikely that we accidentally increased participants’ confidence. Instead, by making the key encoding trials feel relatively difficult, we apparently discouraged participants from exerting full effort during learning.
Clearly, if participants selectively withdraw effort from encoding cross-race faces, the ORB will be increased. We do not imagine, however, that the classic ORB is entirely due to reticent participants. Even among the low scorers, pupil dilation was greater for cross-race faces than for own-race faces, indicating considerable effort. Moreover, few prior experiments used exposure durations of 10 s, as in the present study. Nonetheless, our results do suggest a potential vicious cycle: Typical ORB experiments afford little opportunity for participants to adjust their perceptual weights or to improve their performance. Participants see many cross-race faces, without any feedback, and the faces typically appear highly confusable (Byatt & Rhodes, 2004; Johnson & Fredrickson, 2005; Sporer, 2001). This may reflect shallow, categorical encoding, as suggested by Levin (1996, 2000), or it may reflect densely clustered faces in a multidimensional space (Valentine & Endo, 1992). In either case, perception, attention, and memory would be naturally intertwined, as a feedback system: Shallow perceptual encoding leads to overlapping traces in memory. Poorly differentiated memory traces cannot support optimization of attention to relevant facial cues, which continues the trend toward shallow perceptual encoding. Participants may be implicitly aware of this cycle, which may discourage their efforts as the experiment continues. In future investigations, we hope to better understand the relations among attention, memory, and motivation in cross-race face learning.
Support was provided by National Institutes of Health Grant R01-DC04535-08 to Stephen D. Goldinger. We thank John Henderson for valuable comments on a previous version of this article.
1. The reported settings for determining fixations were the recommended defaults for the Tobii eye tracker, when pictorial stimuli are used. We also conducted analyses using stricter criteria (e.g., requiring the eye to stay in a 15-pixel area for 100 ms), with several combinations of area and dwell time. Because such changes had no effect on qualitative or statistical results, we report analyses based on the default settings.
2. In several of the analyses reported in this article, we tested behavior (e.g., eye movements) at the trial-by-trial level for individual participants. This necessarily meant that by excluding study trials leading to misses, we had missing data in those ANOVAs. In all cases, we had a minimum of five observations per cell and robust effect sizes. We therefore report the ANOVAs, based exclusively on study trials leading to hits, despite the missing data, because these trials offer the most straightforward interpretations. To be cautious, however, we also conducted parallel ANOVAs including all study trials, regardless of later recognition accuracy. In every case, the patterns of results, both numerically and statistically, were identical to those reported in the Results sections.
3. Note that in this experiment face recognition is equated with picture recognition, which may not perfectly reflect face-recognition processes (Megreya & Burton, 2006). Although this approach to testing the ORB is common (e.g., Byatt & Rhodes, 2004), it may not be ideal. Nonetheless, in the present study, our primary data came from the study sessions, which are unaffected by the nature of the test session.
4. As in Experiment 1, we conducted a parallel test, dividing participants according to their own-race recognition scores. Despite using performance to Asian faces as the selection factor, we again found no differences in eye movements to those faces.
Stephen D. Goldinger, Department of Psychology, Arizona State University.
Yi He, Department of Psychology, Yale University.
Megan H. Papesh, Department of Psychology, Arizona State University.