FFA Activity Is Correlated with V1 Activity
Instead, here, we found that at least 4 lower-order image dimensions produced systematic variation in FFA responses. Moreover, this response variation was highly correlated with that in V1. Those V1–FFA correlations were as follows: r = 0.956 for size, 0.999 for position, 0.957 for contrast gain, and 0.957 for rotation in depth. Accordingly, all these results in FFA were closely fit by a model based on V1 function. The model assumed only a response to local contrast corrected by the CMF; it did not include any additional face-selective component.
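The V1–FFA comparison above can be illustrated with a minimal sketch. It assumes that the reported r values are Pearson coefficients computed across stimulus conditions; the response values, the variable names, and the helper functions `normalize` and `pearson_r` are hypothetical and are not taken from the study.

```python
import math

def normalize(responses):
    """Rescale a response curve to the 0-1 range (amplitude normalization)."""
    lo, hi = min(responses), max(responses)
    return [(r - lo) / (hi - lo) for r in responses]

def pearson_r(x, y):
    """Pearson correlation between two equal-length response curves."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical percent-signal-change values for five levels of one stimulus
# dimension (e.g., face size). These are NOT data from the study.
v1_response = [0.4, 0.9, 1.5, 2.2, 2.8]
ffa_response = [0.3, 0.5, 0.9, 1.2, 1.6]

r = pearson_r(normalize(v1_response), normalize(ffa_response))
print(f"V1-FFA correlation: r = {r:.3f}")
```

Because Pearson's r is unchanged by linear rescaling, amplitude normalization affects only the plotted slopes, not the correlation itself. A high r is therefore compatible with a shallower raw gain function in one area than in the other.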
In the hierarchy of primate visual cortical areas, V1 is the information bottleneck (Van Essen et al. 1992; Distler et al. 1993); most or all of the input to higher-tier cortical areas (presumably including FFA) passes through V1. For simplicity, we measured lower-level visual cortical activity in V1. Although we did not systematically analyze data from other visual areas (e.g., V2), our activity maps (e.g., Supplementary Fig. 3) revealed that many additional lower-level visual cortical areas responded similarly to V1 and FFA. This eased the burden of proof for our lower-level influences in FFA, and it eased certain technical constraints. For instance, if our “V1” ROI encroached slightly into neighboring V2, this presumably had little effect on our data; both V1 and V2 are lower-level visual cortical areas, functionally similar at the level of fMRI.
Facial Size
Variations in face size (e.g., in degrees of visual angle) strongly affected responses in FFA. In fact, size was the strongest lower-level influence that we found in FFA, given the experimental ranges tested here. This size-driven response variation was relatively surprising because several previous fMRI (Grill-Spector et al. 1999; Andrews and Ewbank 2004; Sawamura et al. 2005) and single-unit (Rolls and Baylis 1986; Ito et al. 1995) studies have reported size invariance in inferior temporal cortex. Instead, our results are more consistent with psychophysical studies (Kolers et al. 1985; Fiser et al. 2001) showing significantly longer reaction times when sequentially matching faces of dissimilar size compared with faces of the same size. A recent fMRI study using fast event-related adaptation also showed a significant size effect in right FFA (Xu et al. 2009).
What accounts for this apparent discrepancy between the present data and some previous reports? One factor may be differences in the stimulus ranges tested. Specifically, some previous studies tested relatively limited ranges of stimulus size (e.g., 2- or 4-fold in diameter). By comparison, here, we tested a 26-fold range of face sizes. These differences in sampling range could account for the different conclusions in those studies: Analogous 2- or 4-fold size variation in our own data also produced statistically insignificant differences (i.e., size invariance). Here, the wider range of stimulus sizes revealed the size tuning function in FFA; this would not have been uncovered by tests using more limited ranges.
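The sampling-range argument can be made concrete with a short calculation. This is a sketch under a simplifying assumption that is not established by the text: that the response grows linearly with the logarithm of face size across the whole tested range. The function name `fraction_of_full_swing` is hypothetical.

```python
import math

FULL_RANGE_FOLD = 26.0  # size range tested in the present study

def fraction_of_full_swing(fold_range, full_fold=FULL_RANGE_FOLD):
    """Share of the full response change expected within a narrower size range.

    ASSUMPTION: the response is linear in log(size), so the response change
    is proportional to the log of the fold range.
    """
    return math.log(fold_range) / math.log(full_fold)

for fold in (2, 4, 26):
    share = fraction_of_full_swing(fold)
    print(f"{fold:>2}-fold range covers {share:.2f} of the full response change")
```

Under that assumption, a 2-fold range captures roughly a fifth of the response change seen across the 26-fold range, and a 4-fold range roughly two fifths. A change of that size could plausibly fall within measurement noise, which would look like size invariance.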
Foveal Bias in FFA
It has been reported (Levy et al. 2001; Hasson et al. 2003) that FFA responds more to foveal stimuli, relative to adjacent area PPA, which responds relatively more to peripheral stimuli. However, that information alone is ambiguous. Relative to classically retinotopic visual cortical areas such as V1, does this FFA/PPA comparison reflect a foveal bias in FFA, or a peripheral bias in PPA, or both?
The present data suggest that FFA is not biased for foveal stimuli relative to V1 because FFA and V1 share a common CMF. However, FFA could be biased for foveal stimuli relative to PPA if we assume that PPA has a peripheral bias. Evidence for such a peripheral bias can be seen in Supplementary Figure 4a: The size gain function shows a steeper slope in PPA compared with that in FFA and V1, when all responses exceeded baseline. Thus, in PPA, our evidence is broadly compatible with previous reports (Levy et al. 2001; Hasson et al. 2003; Schwarzlose et al. 2008).
Facial Position
Given these visual field variations, and the CMF influence in FFA (e.g., ), it would be surprising if otherwise-equal faces did not produce decreased activity at progressively greater visual field eccentricities. Although the converse conclusion (position invariance) has also been reported in posterior fusiform gyrus (pFs) (analogous to FFA) (Grill-Spector et al. 1999), the stimuli in that study varied over a narrower visual field range (5.6°) compared with the stimuli used here (17.6°). Thus, again, testing a smaller range of stimulus variation may explain a conclusion of stimulus invariance. In fact, the CMF influence predicts that one could deliberately create an even more dramatic “position invariance” by comparing only the responses to faces that are positioned far apart in the visual field, equidistant from the center of gaze, along the vertical meridian (e.g., ).
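The prediction about equidistant faces follows from the fact that a CMF depends only on eccentricity. The sketch below assumes an inverse-linear CMF, M = a / (E + e2), with illustrative parameter values; neither the functional form nor the values are taken from the study, and `cmf_mm_per_deg` and `relative_drive` are hypothetical names.

```python
def cmf_mm_per_deg(position_deg, a=17.3, e2=0.75):
    """Linear cortical magnification (mm of cortex per degree of visual angle).

    ASSUMPTION: inverse-linear form M = a / (E + e2), where E is the absolute
    eccentricity. The parameter values are illustrative only.
    """
    return a / (abs(position_deg) + e2)

def relative_drive(position_deg):
    """Cortical area per unit image area, relative to the fovea.

    Areal magnification scales with the square of the linear CMF.
    """
    return (cmf_mm_per_deg(position_deg) / cmf_mm_per_deg(0.0)) ** 2

# Signed positions along the vertical meridian (negative = lower field).
for position in (0.0, 2.0, 8.0, -8.0):
    print(f"position {position:+5.1f} deg: relative drive {relative_drive(position):.3f}")
```

In this sketch, faces at +8° and -8° receive identical predicted drive even though they are 16° apart. A comparison restricted to such a pair would therefore show no position effect, whereas a comparison between 0° and 8° would.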
Schwarzlose et al. (2008) reported that foveal stimuli produced slightly larger responses compared with peripheral stimuli in FFA even when stimulus size was corrected by a CMF. If the CMF completely explained the variations in FFA response, then the CMF-corrected stimuli should have produced equal responses in FFA. However, the CMF in Schwarzlose et al. (2008) was extrapolated from values in the literature; concurrent measurements were not analyzed from V1 to confirm the CMF values in that specific subject pool. If there is a discrepancy between our data and those of Schwarzlose et al. (2008), this raises an interesting question: Is size versus position information encoded along different (vs. equal) “CMF” functions? From single-unit recordings in macaque V1, either conclusion is possible (e.g., Van Essen et al. 1984, but see Hubel and Wiesel 1974).
Facial Contrast Level (Gain)
The contrast gain function imposes significant constraints on the computation of face/object processing. Most importantly, does contrast invariance exist in FFA? One fMRI study (Avidan et al. 2002) concluded that there was an “increasing tendency toward contrast invariance” in LO (and pFs, the FFA equivalent) relative to V1. However, when those earlier data are replotted on a conventional logarithmic scale, they also show a near-linear increase in pFs (FFA) similar to that presented here. The similarity between contrast gain functions in these 2 studies is notable, considering the many technical differences between them. For instance, the earlier fMRI study was based on on-responses to luminance variations of line drawings on a constant-luminance background, rather than the equal-luminance contrast variations in the gray-level faces tested here. Another fMRI study also reported contrast-varying responses in nearby region “LO” (Murray and He 2006). Overall, these results suggest that contrast invariance cannot be assumed in FFA, nor likely in other ventral stream areas.
Facial Viewpoint (Rotation in Depth)
Several changes occur as the head rotates in depth (e.g., ). The size and averaged eccentricity of the face increase and decrease during the head rotation, like the waxing and waning of the bright side of the moon during the lunar cycle. In our head stimuli, the averaged local contrast also covaried with these size/eccentricity variations because local contrast was concentrated on the face. Thus, all 3 stimulus variables (size, eccentricity, and contrast) predicted maximum responses to frontal views of a face and minimum responses to the back of the head in both FFA and V1. This prediction was formalized in our lower-order model.
Our results closely matched the model predictions in both FFA and V1 (). The model even accounted for the slightly decreased FFA response to a sphere compared with the back of the head. Again, a face-selective component was not required to account for the FFA activity variation.
Several fMRI studies (Grill-Spector et al. 1999; Fang et al. 2007; Xu et al. 2009) have tested the effects of rotation in depth in FFA, using sparser sampling compared with the present study. Xu et al. (2009) showed that FFA was sensitive to rotation angles as small as 20°. One study (Tong et al. 2000) also tested responses to the back of the head including hair. In general, those studies showed FFA results similar to ours: Overall, activity decreased as viewpoint diverged progressively from frontal views. However, previous studies did not sample rotation in as much detail in FFA, nor did they compare V1 responses with those in FFA. The V1 measurement especially shaped our ultimate conclusion.
Conceivably, head stimuli with hair might yield a different result because hair adds a fine-grained contrast. However, hairless faces (as used here) are commonly used in studies of face perception because such stimuli avoid confounding cues due to hair (e.g., gender, race, culture, and age).
The Overall Role of Lower-Level Influences in FFA
Why would FFA show such a strong lower-level influence in these experiments? First (and simplest), many previous studies have not compared activity in FFA with that occurring in V1; thus, some close V1–FFA correlations may have remained undetected.
Second, our experimental task was deliberately designed to minimize attention to higher-order facial characteristics (e.g., identity and gender) by requiring subjects to attend to a competing lower-level feature (dot detection). Thus, lower-level influences may have been relatively uncovered in FFA compared with other possible tasks that elicit higher-order influences (e.g., Grill-Spector et al. 2004). In any event, FFA has been historically defined and localized based on its sensory (face) selectivity (Puce et al. 1995; Kanwisher et al. 1997; Halgren et al. 1999), not on its higher-order properties (but see Gauthier et al. 2000a).
Third, our measurements spanned a greater stimulus range compared with previous studies, for all 4 stimulus dimensions tested. Such extended test ranges increased our statistical power to uncover variations in the response functions, which could have gone undetected with a more restricted test range.
A fourth possible explanation arises from the position of FFA in the cortical visual hierarchy. Based on monkey data (Van Essen et al. 1992; Distler et al. 1993; Nakamura et al. 1993; Rajimehr et al. 2009), information could presumably get from human V1 to FFA via as few as 1 or 2 intervening areas (e.g., V1 > V4 > FFA, or V1 > V4 > TEO [temporo-occipital] > FFA). This suggests that FFA occupies a middle (not an upper) tier in the visual cortical hierarchy (e.g., higher than V1 but lower than anterior TE). Thus, FFA should show some residual generalized influence from lower-tier areas.
Lack of Stimulus Invariance
A common view is that FFA responds selectively to faces as a distinct category (reviewed in Kanwisher and Yovel 2006). However, faces vary infinitely in detail: Does FFA respond invariantly to all faces, despite this variation in individual face images? Here, we documented that FFA activity varies a great deal in response to 4 important face parameters: size, position, contrast, and viewpoint.
The relative strength of each lower-level parameter cannot be easily reduced to a single number because the strength of that variation depends on the range of variation tested. For instance, in our measurements, variations of face size amounted to more than half of the FFA response to the most effective face; in that case, the lower-level influence dominated the response to the test faces relative to the uniform gray baseline. By comparison, the influence of face position was weaker in our position data. However, if we had been technically able to test faces at a wider range of visual field positions, presumably this would have produced a correspondingly larger influence of position in FFA in accord with a CMF-like function.
In a further experiment (e.g., ), a combination of 2 parameters was influential enough to wholly reverse the category selectivity of FFA from faces to objects. At least in that case, the lower-order influences were even stronger when combined. Our data ( and ) also suggest that a similar (though smaller) “preference” for the “blob” (nonface) stimuli could have been achieved by manipulating only one parameter (size).
This emphasizes that face selectivity in FFA is parameter dependent, not absolute. This has implications at the level of a population code: Cortical neurons at higher levels cannot simply respond according to a face/nonface threshold because the response range to faces overlaps the response range to nonface objects. At most activity levels, some object-driven activity will be above a given threshold and some face-driven activity will be below it.
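The threshold argument can be checked with a toy example. The response amplitudes below are hypothetical values in arbitrary units, chosen only so that the face and nonface ranges overlap; they are not measurements from this study, and `misclassified` is an illustrative helper.

```python
def misclassified(threshold, face_responses, object_responses):
    """Count the errors of the rule 'call it a face if response > threshold'."""
    missed_faces = sum(r <= threshold for r in face_responses)
    false_faces = sum(r > threshold for r in object_responses)
    return missed_faces + false_faces

# Hypothetical amplitudes (arbitrary units), NOT measured values. Each list
# runs from a weak stimulus (e.g., small or low contrast) to a strong one.
faces = [0.4, 0.9, 1.4, 1.9]
objects = [0.2, 0.6, 1.0, 1.5]

# The error count can only change at an observed response value, so testing
# those values (plus one below the minimum) covers every possible threshold.
candidates = sorted(set(faces + objects + [0.0]))
errors = {t: misclassified(t, faces, objects) for t in candidates}
fewest_errors = min(errors.values())
print(f"fewest errors over all thresholds: {fewest_errors} of {len(faces) + len(objects)}")
```

With overlapping ranges, every candidate threshold misclassifies at least one stimulus. A downstream readout would therefore need information beyond response amplitude, such as the stimulus size or contrast, to separate faces from nonfaces reliably.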
Implications for Neural Models of Face Processing
Limitations
To minimize parameter explosion across our many stimulus conditions, we tested only a single dependent measure of the BOLD response. Given that constraint, we focused on the amplitude of the classic on-response rather than fMRI adaptation, nonlinear classifiers, or other measurements. This allowed more direct comparisons between our results and the on-responses in the single-unit literature and in the original fMRI reports.
Using different approaches, other studies may come to different conclusions. For instance, single-unit techniques may reveal functional distinctions that cannot be distinguished using fMRI. Evolutionary differences between humans and macaques may also temper the current conclusions. Variations in the nature of the attention task might change the shape of the response curves (e.g., Murray and He 2006; Li et al. 2008; Castelo-Branco et al. 2009). Further fMRI analyses (e.g., based on adaptation, multivoxel pattern analysis, and/or event-related approaches) may yield additional insights compared with the direct on-responses measured here.
As described above, the relative strength of each stimulus dimension also depends crucially on the range of stimulus parameters tested.
Though V1 and FFA activity correlated highly in all the amplitude-normalized comparisons (e.g., ), the slope of the gain functions was sometimes higher in V1 compared with FFA, when based on raw fMRI signal levels (e.g., ). Multiple unknown factors could underlie this difference in slope, including 1) larger receptive fields in FFA, 2) the significant residual response to nonface stimuli in FFA (e.g., ; Grill-Spector et al. 1999; Gauthier et al. 2000; Tong et al. 2000; Tsao et al. 2003, 2006; Caldara et al. 2006; Yue et al. 2006; Tootell et al. 2008), and 3) known differences in the physiology and anatomy of V1 relative to extrastriate cortex (e.g., a higher cell packing density and denser vasculature [Perry and Cowey 1985], a specialized laminar structure, and high spontaneous and driven single-unit activity). It is possible that a lower slope in FFA could indicate “an increasing tendency toward invariance,” in LO or FFA, relative to V1, as described by Avidan et al. (2002). However, only a strict invariance (not an “increasing tendency toward” invariance) for a given variable will aid the computation of faces/objects by allowing the computation to ignore that variable; whether the slope is higher or lower is relatively moot for the computation. The fact that we were able to reverse FFA selectivity for faces versus objects by manipulating only lower-level properties () strongly suggests that any “tendency toward invariance” remains incomplete at the level of FFA.
Conclusions
These results emphasize that FFA activity can be strongly affected by lower-level stimulus parameters. Receptive field properties similar to those known in V1 can account for essentially all the response variation we found in FFA. Given the strength of these effects, it is possible that previous controversies about function in FFA were inadvertently complicated by lower-level stimulus differences. At least, our results reemphasize the importance of specifying and standardizing stimulus parameters, and of acquiring control measurements in V1, in studies of higher-order function.
These results do not rule out the presence of an apparently face-selective component in the FFA response. Although we were able to systematically reverse the normal category selectivity in FFA by manipulating lower-level parameters, relatively atypical parameters were used for those nonoptimal faces. Moreover, the face-driven activity in FFA was always much higher than that in adjacent area PPA, as expected. Thus, overall, our results are consistent with some face selectivity in FFA in addition to the sensitivity to lower-level features emphasized here.