By necessity, we must be able to generalize across many natural sources of image variation in order to recognize a face, including changes in distance and viewpoint as we move around each other and changes in lighting as we move in and out of different environments. We rarely view faces from directly in front (known as the “full face” or 0° view). Most commonly, our visual experience of faces falls within a range of viewpoints rotated away from 0° by up to 45° to the left or right (about the vertical axis; yaw), above or below (about the horizontal axis; pitch), and clockwise or anticlockwise (about the depth axis; roll). Within this range of variation in viewpoint, face recognition is remarkably good, even for unfamiliar faces (Favelle et al., 2011; Hill et al., 1997; Schyns and Bülthoff, 1994; Liu and Chaudhuri, 2002; Stephan and Caine, 2007). However, face recognition noticeably deteriorates for rotations greater than 45°. Face recognition performance for a yaw-rotated profile face view is poorer than that for both full face and yaw-rotated three-quarter (i.e., 45°) face views (Hill et al., 1997; Liu and Chaudhuri, 2002; McKone, 2008). Similarly, roll and pitch rotations greater than 45° typically result in poorer face recognition performance than that for upright full faces (Favelle et al., 2007; Favelle et al., 2011; Martini et al., 2006; Valentine and Bruce, 1988). Why does face recognition become more difficult outside of this limited viewpoint range (0°–45°)?
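The three rotation axes named above can be made concrete with a short sketch. This is an illustration only and is not taken from the studies cited. It assumes a coordinate frame with x horizontal, y vertical, and z along the depth axis toward the viewer. The function name `rotate` and the order in which the rotations are applied (yaw, then pitch, then roll) are also assumptions.

```python
import math

def rotate(point, yaw_deg=0.0, pitch_deg=0.0, roll_deg=0.0):
    """Rotate a 3D point (x, y, z) by yaw, then pitch, then roll.

    Assumed frame (hypothetical, for illustration): x is horizontal,
    y is vertical, z is the depth axis pointing toward the viewer.
    """
    x, y, z = point

    # Yaw: rotation about the vertical (y) axis.
    a = math.radians(yaw_deg)
    x, z = x * math.cos(a) + z * math.sin(a), -x * math.sin(a) + z * math.cos(a)

    # Pitch: rotation about the horizontal (x) axis.
    b = math.radians(pitch_deg)
    y, z = y * math.cos(b) - z * math.sin(b), y * math.sin(b) + z * math.cos(b)

    # Roll: rotation about the depth (z) axis, i.e., in the picture plane.
    c = math.radians(roll_deg)
    x, y = x * math.cos(c) - y * math.sin(c), x * math.sin(c) + y * math.cos(c)

    return (x, y, z)
```

Under these assumptions, a 180° roll maps a point at the top of the head, (0, 1, 0), to approximately (0, -1, 0), which corresponds to picture-plane inversion. A 90° yaw turns a point facing the viewer, (0, 0, 1), toward the horizontal axis, as in a profile view.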
Upright faces are recognized not as a collection of individual features, but as a “whole” percept. According to Rossion (2009), this holistic processing generates a “simultaneous perception of the multiple features of an individual face, which are integrated into a single global representation” (p. 305). Holistic processing is applied to all of the available cues to the face (both local ones arising from discrete features and those based on the distances and spatial relationships between features), which are fused together into a single perceptual representation. If faces are both perceived and represented holistically, then not only should the perception of a given facial feature depend on the perception of the face as a whole, but slight differences between faces should be discriminated very quickly and efficiently (at least under normal viewing conditions). While “configural processing” has often been used interchangeably with holistic processing (e.g., McKone, 2008), or as an umbrella term that includes holistic processing as a sub-type (e.g., Maurer et al., 2002), “configural information” is typically used to refer to the spacing or distance between the discrete features of a face (e.g., between the two eyes or between mouth and nose). The idea is that faces can be recognized based on differences in either configural (e.g., interocular distances) or featural (e.g., mouth surface reflectance or shape) information. For present purposes we shall use “holistic” to refer to the perceptual process and “configural information” to refer to the spatial distances and relationships between nameable facial features (e.g., Rossion, 2008).
A pillar of the face perception literature, the face inversion effect (FIE) is the observation that the inversion (180° roll or picture-plane rotation) of faces dramatically impairs recognition compared to upright faces, and that this impairment is disproportionately larger for faces than for objects (Yin, 1969). Because the inversion manipulation preserves the low-level visual properties of the face present in the upright stimulus, the FIE can be attributed to high-level/cognitive processes used differentially for upright and inverted faces. Accordingly, inversion has been widely used as a control condition in behavioral studies (Valentine, 1988; Rossion and Gauthier, 2002; Rossion, 2008; Tanaka and Gordon, 2011; also see McKone et al., submitted).
It is now generally accepted that turning a face upside-down disrupts face-specific holistic processing. For example, if one creates a composite face by aligning the photographs of the top and bottom halves of two different faces, the obligatory holistic processing of the new “whole” face will impair naming accuracy and increase reaction times (RTs) for each half face, compared to when these half faces are misaligned (the “Composite effect”; Young et al., 1987). Similarly, studies have found that the memory for a facial feature is more accurate when it is subsequently presented in the context of the whole studied face than when it is presented on its own (the “Part-whole effect”; Tanaka and Farah, 1993). However, while both the Composite and Part-whole effects are strong for upright faces, they disappear when the face stimuli are inverted (but see McKone et al., submitted).
Whether inversion produces a qualitative or a quantitative reduction in holistic face processing is currently a hotly debated topic (Richler et al., 2011; Rossion and Boremanse, 2008; Rossion, 2009). For example, based on their recent findings, Richler et al. (2011) claim that: (i) both upright and inverted faces are processed holistically; and (ii) the well-known FIE performance decrement arises because holistic processing is less efficient/successful for inverted faces (due to our limited experience with inverted faces). Other researchers argue that what is “lost” in an inverted face is the sensitivity to configural information and that featural information remains relatively unaffected (e.g., Carey, 1992; Freire et al., 2000; Maurer et al., 2002). However, when featural and configural information are equated for discriminability in upright faces, inversion appears to disrupt sensitivity to both types of information in a similar manner (McKone and Yovel, 2009; McKone and Robbins, 2011; Riesenhuber et al., 2004; Yovel and Kanwisher, 2004). These findings suggest that configural information may not always have a special status in face perception/recognition. In fact, according to Rossion (2009) all aspects of the face are “configural” when the face is being processed holistically. That is, the face has to be processed holistically to make the best use of both the available featural and configural information.
As noted above, the FIE is typically explained in terms of a disruption to holistic processing. It is possible to explain the poorer recognition for face views rotated more than 45° in pitch (from the full-face view) in a similar manner. During picture-plane rotations, Rossion and Boremanse (2008) found a non-linear drop in the holistic processing of faces (as measured with the composite face illusion) for roll rotations of 90° or more. They argued that the poor performance for faces at these unusual views (including the FIE) is based on the inability to match the incoming visual stimulus to an experience-derived holistic internal representation (i.e., a template) that is centered on the full-face view. Consistent with the idea that visual experience plays a key role in processing upright faces, Laguesse et al. (2012) found that adults trained to individuate a set of inverted faces showed a reduced FIE on a set of novel faces. If it is the case that holistic processing is reduced/impaired for views in the picture-plane with which we have less experience, this should also be the case for less experienced views rotated in the pitch and yaw axes.
To date there has been little investigation of holistic processing in faces rotated in pitch and yaw (as opposed to the extensive investigation of the effect of roll/picture-plane rotation on holistic processing; see Rossion and Boremanse, 2008). One exception is a study by McKone (2008), who examined performance with composite faces (made by aligning the half faces of two different individuals) rotated in yaw. She found that while identification of the individual face halves was poorer at profile views than at full-face or three-quarter (45°) views, holistic processing was insensitive to view changes in yaw (as measured by both the “composite face” and “peripheral inversion” tasks; see McKone, 2004). These findings suggest that yaw viewpoint effects are driven by a disruption to featural processing. That is, profile views provide poor information about face parts but, despite the occlusion of half of the face, do provide adequate holistic information. In apparent conflict with the predictions of Rossion and Boremanse’s (2008) experience-only template theory outlined above, natural view frequency was found to have no effect on holistic processing in this study.
No studies have investigated holistic processing in pitch rotated views of faces, presumably because of the difficulty in applying the composite face task typically used to tap into this information. Favelle et al. (2011) used a scrambled/blurred paradigm to isolate the configural and featural information contained in faces following yaw, pitch, and roll rotations (up to 75° from the full-face view). They found that performance in a sequential face matching task based on configural information was best following roll camera rotations, poorer for yaw camera rotations, and poorer still for pitch camera rotations. While performance in this same task based on featural information was much poorer, it also showed similar patterns of viewpoint dependent decline in pitch and yaw, and no decline in roll camera rotations.
Two notable points arise from these findings. First, while both configural and featural information are utilized in recognizing faces across different views (at least views rotated up to 75°), configural information appears to be more useful. Second, pitch rotation disrupts configural information to a greater degree than yaw or roll rotation. Compared to rotations about other axes, pitch camera rotations result in a greater foreshortening and occlusion of features as well as a general reduction in the amount of available “face” information. Thus, Favelle et al.’s (2011) findings of a greater cost to face recognition following pitch camera rotations may be due to participants having to rely more heavily on parts- or object-based processing (as opposed to configural information and more face-specific, holistic processing) than in yaw. The aim of the experiment reported here is to investigate the idea that these viewpoint axis effects can be explained by differences in the degree to which parts/object-based or holistic processing is engaged.
McKone (2008) found evidence for holistic processing in views of faces at 0°, 45°, and 90° of yaw rotation. Her study investigated face identification ability at different views (i.e., learning and testing at the same view), with the results suggesting that the functional role of holistic processing is to support reliable face identification across different images. Here we are considering the contribution of holistic processing to the transfer of learning across views (i.e., learning a face at one view and testing at another). Because of the difficulty in using a composite task for pitch rotated views, the current study used the FIE as an indicator of the disruption to holistic processing in faces (Rossion, 2008). Specifically, we examined the FIE for matching unfamiliar, undistorted whole faces rotated in either yaw or pitch (see Figure 1). Views which contain sufficient “face” information to support holistic processing should generate a FIE. Thus, we expect to find FIEs for all yaw viewpoints (McKone, 2008; Favelle et al., 2011; Hills et al., 2012). However, based on previously observed face recognition difficulties with upright 75° pitch-up and -down rotated views, we may find little evidence of FIEs for these uncommon viewpoints. This would demonstrate (for the first time) that we are unable to access any holistic information contained in these particular pitch rotated face images.
Figure 1. Example of a set of the face stimuli. Views are taken from rotations of 0°, 15°, 45°, and 75° in the yaw axis (top row), pitch axis above horizontal (middle row), and pitch axis below horizontal (bottom row).