In the present study, we investigated whether attention to faces results in sensory gain modulation. Participants were cued to attend to faces or scenes in superimposed face-scene images while face discriminability was parametrically manipulated across images. The face-sensitive N170 event-related potential component was used as a measure of early face processing. Attention to faces modulated N170 amplitude, but only when faces were not highly discriminable. Additionally, directing attention to faces modulated later processing (~230–300 msec) for all discriminability levels. These results demonstrate that attention to faces can modulate perceptual processing of faces at multiple stages of processing, including early sensory levels. Critically, the early attentional benefit is present only when the “face signal” (i.e., the perceptual quality of the face) in the environment is suboptimal.
A typical scene is cluttered, containing far more information than the visual system’s limited processing capacity can handle at one time. It is under these highly cluttered conditions that our attention system is most necessary for guiding behavior by selecting a behaviorally relevant subset of the available information for further processing. Of the numerous studies that have investigated the neurobiological mechanisms of selection, the majority have focused on top-down attention to locations in space. These studies suggest that selection results from processing biased in favor of stimuli occurring within relevant versus irrelevant locations of the visual scene (see Mangun, 1995). A key finding in this research is that selection of relevant locations occurs during early visual processing, via an increase in the sensory gain of attended relative to unattended channels (Hillyard & Mangun, 1987; Hillyard, Vogel, & Luck, 1998). This increase in sensory gain may result from enhancement of the sensory signal, suppression of task-irrelevant external noise, or a combination of the two (Hopf et al., 2006; Luck et al., 1994). More recent studies have aimed to determine if attention operates similarly when directed to complex objects such as faces (e.g., Downing, Liu, & Kanwisher, 2001).
Faces are a biologically relevant stimulus category with immense social import that may have distinct processing requirements from other stimulus categories (Farah, 1996; Farah, Wilson, Drain, & Tanaka, 1995). Yet, functional MRI (fMRI) studies have found a great deal of correspondence between the neural activity patterns seen during spatial attention tasks and tasks requiring attention to faces. Several studies have now established that focusing attention on a location increases activity in visual areas that code for that particular location (Mangun, 1995). Similarly, face-sensitive perceptual modules within occipito-temporal cortex have been reported to demonstrate greater activity when attention is directed to faces versus nonface objects (Lepsien & Nobre, 2007; O’Craven, Downing, & Kanwisher, 1999; Serences, Schwarzbach, Courtney, Golay, & Yantis, 2004; Wojciulik, Kanwisher, & Driver, 1998). While compelling, these studies are bound by the temporal limitations of fMRI, and consequently cannot distinguish between attention-induced modulation of early sensory processing and later modulation due to reentrant signals from attentional control regions.
The millisecond temporal resolution of event-related potential (ERP) and event-related magnetic field (ERMF) methods allows us to better assess the level of processing at which attentional modulations occur. It is well-established that early sensory processing, as indexed by the P1 and N1 ERP components, is modulated by spatial attention (Hillyard & Anllo-Vento, 1998). In addition, electrophysiological investigations in nonhuman primates have revealed that attentional effects can occur during feedforward stages of sensoriperceptual analysis during spatial attention tasks (e.g., Luck, Chelazzi, Hillyard, & Desimone, 1997). Reynolds and colleagues recorded from V4 neurons as they presented macaque monkeys with gratings of various contrasts (Reynolds, Pasternak, & Desimone, 2000). When the grating appeared within the attended region of space, the cell’s firing rate increased. Interestingly, this gain was observed early in time (before 200 msec) only when the grating was at a suboptimal contrast. At the cell’s optimal or “preferred” contrast, attention only increased the gain during later (after 200 msec) neuronal responses (Reynolds et al., 2000; see also Martinez-Trujillo & Treue, 2002; Ekstrom, Roelfsema, Arsenault, Bonmassar, & Vanduffel, 2008; but see Williford & Maunsell, 2006). The authors note that the absence of early gain enhancement for the optimal contrast grating was not due to an overall saturation of the neuronal response, but instead may have been due to a saturation of the neuronal response for that particular stimulus (Reynolds et al., 2000). Consistent with findings in monkeys, evidence from human behavioral studies illustrates a strong interaction between the perceptual quality of a stimulus and the effects of spatial attention. Hawkins, Shafto, and Richardson (1988) reported that the attention-related improvement in target detection sensitivity is greater for low-luminance targets than for high-luminance targets. 
Thus, the magnitude of attention’s influence on early sensoriperceptual processing appears to be more robust when the signal of interest is low versus high (e.g., non-preferred category, low luminance). A related view that task demands or the amount of perceptual information present during task performance may influence the degree of early attentional selection has been convincingly demonstrated by Lavie and colleagues (Lavie, 1995; Lavie & Tsal, 1994; see also Handy & Mangun, 2000).
The temporal resolution of ERP and ERMF also enables the examination of attentional effects on early sensory face processing. Both methods display characteristic neural signals approximately 170 msec following the presentation of a face (Bentin, Allison, Puce, Perez, & McCarthy, 1996; S. T. Lu et al., 1991; Watanabe, Kakigi, Koyama, & Kirino, 1999), which have been localized to similar regions of visual cortex (Deffke et al., 2007). These event-related components (N170 for ERP, M170 for ERMF) are sensitive to physical manipulations (e.g., inversion, scrambling of internal features) of face stimuli (George, Evans, Fiori, Davidoff, & Renault, 1996; Halgren, Raij, Marinkovic, Jousmaki, & Hari, 2000; Liu, Higuchi, Marantz, & Kanwisher, 2000; Rossion et al., 2000) but largely insensitive to higher-order influences such as familiarity (Bentin & Deouell, 2000; Eimer, 2000b; Schweinberger, Pickering, Jentzsch, Burton, & Kaufmann, 2002). Consequently, these components are widely thought to index an early phase of face processing – in particular, the feedforward ascending phase of processing (Bentin et al., 1996; Bentin & Deouell, 2000; Carmel & Bentin, 2002; Liu et al., 2000; but see Reiss & Hoffman, 2007 for evidence that local lateral or reciprocal influences modulate N170 amplitude). Inquiries into the impact of face-directed attention on these early components have thus far yielded conflicting results; despite a few demonstrations of M170/N170 modulations (Downing et al., 2001; Eimer, 2000a), the majority of studies have produced null results (Carmel & Bentin, 2002; Cauquil, Edmonds, & Taylor, 2000; Furey et al., 2006; Lueschow et al., 2004).
One view that has been invoked to explain the absence of early selection effects for faces (e.g., Cauquil et al., 2000; Lavie, Ro, & Russell, 2003) is that faces enjoy a privileged status and are fully processed automatically (Farah et al., 1995). A complementary view is that while face processing may be impervious to attentional influences during early stages of sensoriperceptual analysis, attention may influence later face processing. This perspective garners support from a recent report that face selection first appears later in the processing stream, approximately 250–300 milliseconds following face presentation, and may correspond with previously observed fMRI effects of attention within face-sensitive fusiform gyrus (Furey et al., 2006). Neither an automatic face processing (Cauquil et al., 2000) view nor a late face selection view (Furey et al., 2006) offers a sufficient explanation for empirical demonstrations of M170/N170 modulations (Downing et al., 2001; Eimer, 2000a). Our study aims to reconcile these discrepant results and clarify the role of attention in face processing.
In this study, we explored the possibility that attention’s influence on face processing may mirror mechanisms of spatial attention; that is, the attention-related signal gain during early sensoriperceptual analysis may be contingent on physical characteristics of the attended and unattended stimuli themselves. To investigate this hypothesis, we parametrically manipulated the discriminability of faces to test the prediction that attention to faces results in early modulations that are most pronounced when the signal of the attended face is weak. We presented participants with superimposed images of faces and scenes (Downing et al., 2001; Furey et al., 2006; O’Craven et al., 1999) during ERP recording and directed their attention to the face or the scene. The superimposed images occupied the same spatial extent, minimizing confounds due to spatially directed attention. At the same time, we independently manipulated the “face signal” in the face-scene overlays by varying the discriminability of the face. Our prediction was that when face discriminability was low (i.e., low face signal), early selection mechanisms would modulate early face processing, resulting in larger amplitude N170 when participants directed their attention to the face relative to the scene. When face discriminability was high (i.e., high face signal), we predicted that the impact of selection on early face processing would be obscured by the high face signal, resulting in minimal change in the N170 amplitude when attention was directed to the face relative to the scene.
Sixteen volunteers (7 female; 14 right-handed) ranging in age from 19 to 33 years (mean age = 24 years) participated in this experiment. All participants had normal or corrected-to-normal vision. The University of Pennsylvania Institutional Review Board approved this study, and informed consent was obtained from all participants.
The stimuli used in this experiment were overlays consisting of one face and one scene image. A total of 528 face images (equal numbers of male and female) and 528 scene images (equal numbers of indoor and outdoor) were used to create the overlays1. All faces were judged to be emotionally neutral. Scenes were chosen to avoid including any images of people or animals. All images were converted to grayscale and cropped using an oval template, so that peripheral information such as hair and ears was removed from all face images. Each image was presented only once during the course of the experiment.
Behavioral pre-testing (n = 8) confirmed that the gender of each face could be determined with 100% agreement, and that there was 100% agreement on whether a scene was an indoor or outdoor scene.
All images were luminance-adjusted to a mean luminance of approximately 220 cd/m². Faces and scenes were superimposed, and the relative discriminability of the face and the scene was adjusted by manipulating the opacity of each layer in the overlay image. Each pixel in the resulting flattened overlay image was a weighted average of the corresponding pixels in the two original face and scene layers. Three stimulus types were created: in 1/3 of all overlays, the face was at 70% discriminability and the scene was at 30% discriminability (high face discriminability stimulus); in 1/3 of the overlays, the face and scene were both at 50% discriminability (medium face discriminability stimulus); and in the rest of the overlays, the face was at 30% discriminability and the scene was at 70% discriminability (low face discriminability stimulus). Figure 1 displays examples of the 3 stimulus types used in the experiment. Male/outdoor, male/indoor, female/outdoor, and female/indoor overlays were distributed equiprobably across stimulus types. Adobe Photoshop was used for all image processing.
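The opacity manipulation amounts to a per-pixel weighted average of the two layers. The following Python sketch illustrates the blend at the three face-opacity levels used; the function name and the toy 2x2 grayscale patches are illustrative assumptions, not the original stimulus-generation code (which used Adobe Photoshop):

```python
import numpy as np

def blend_overlay(face, scene, face_alpha):
    """Weighted-average blend of a face layer and a scene layer.

    Each pixel of the flattened overlay is a weighted average of the
    corresponding face and scene pixels; face_alpha is the opacity of
    the face layer (0.7, 0.5, or 0.3 in the experiment).
    """
    face = np.asarray(face, dtype=float)
    scene = np.asarray(scene, dtype=float)
    return face_alpha * face + (1.0 - face_alpha) * scene

# Hypothetical 2x2 grayscale patches for illustration
face_patch = np.array([[200.0, 100.0], [50.0, 150.0]])
scene_patch = np.array([[100.0, 200.0], [150.0, 50.0]])

high = blend_overlay(face_patch, scene_patch, 0.7)    # high face discriminability
medium = blend_overlay(face_patch, scene_patch, 0.5)  # medium face discriminability
low = blend_overlay(face_patch, scene_patch, 0.3)     # low face discriminability
```

Note that the weights for the two layers sum to 1, so the mean luminance of the flattened overlay is preserved across the three stimulus types.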
In a dimly lit, sound-attenuated booth, participants sat in front of a computer monitor at a distance of approximately 70 cm. The experiment was divided into 6 blocks, each of which lasted approximately 3 minutes. At the beginning of each block, an instruction screen indicated whether participants were to attend to faces or to scenes. Participants were then presented with a series of face-scene overlay images. On blocks in which participants were instructed to attend to faces (face blocks), their task was to respond to each face-scene overlay with a button press indicating whether the face in the overlay was male or female. On blocks in which participants were instructed to attend to scenes (scene blocks), they were to respond indicating whether the scene in the face-scene overlay was an indoor or outdoor scene. The experiment was preceded by two brief practice blocks – one face block and one scene block.
On all trials, the overlay was centrally presented for 500 msec and followed by a 1300-1700 msec intertrial interval during which a central fixation cross was presented. Participants received general feedback about their performance at the end of each experimental block, but no indication was given about whether their response on a given trial was correct or incorrect.
The two manipulations of interest were (1) attention instruction (attend to faces or attend to scenes) that appeared prior to each experimental block, and (2) discriminability of the face in the overlay (high face discriminability, medium face discriminability, and low face discriminability overlay stimuli). All three stimulus types were equally likely to occur in scene blocks and face blocks. Random presentation of the three stimulus types ensured that participants’ top-down attention was deployed equivalently at the start of each trial, and prevented shifts of attention in anticipation of particular stimulus types (see Discussion). Thus, there were three stimulus types and two attention conditions in the experiment.
Electroencephalographic (EEG) activity was recorded from a custom cap with Ag-AgCl electrodes distributed over 64 scalp locations in a modified 10–20 montage. EEG was referenced to an electrode placed on the left mastoid. Electrooculogram (EOG) was recorded from electrodes placed at the outer canthi of both eyes and above and below the left eye to assess horizontal and vertical eye movement, respectively. All channels were amplified using a pair of SynAmps (Neuroscan, El Paso, TX) amplifiers at a band-pass of 0.1–100 Hz and digitized with a 500 Hz sampling rate. Electrode impedances were kept below 5 kΩ.
Prior to segmentation, all channels were re-referenced offline to an average of all scalp electrodes. Next, EEG and EOG were epoch-averaged to a period beginning 100 msec before stimulus onset to 700 msec following stimulus onset. Following baseline correction, trials containing eye movement artifact larger than 100μV or associated with incorrect behavioral responses were removed from analysis. Data averaging was performed after sorting by stimulus type (high face discriminability, medium face discriminability, and low face discriminability) and attention condition (attend to faces or attend to scenes). Averages were filtered using a band-pass from 0.5 to 20 Hz (24 dB/octave), and were exported to a spreadsheet for further statistical analyses.
Mean amplitude and peak latency values for ERP components were entered into separate repeated-measures analyses of variance (ANOVAs) to determine the effect of face discriminability and attention on component amplitude and latency. Topographic maps were created using the EEGLAB toolbox (Delorme & Makeig, 2004) in MATLAB v 7.1 (Mathworks, Natick, MA).
Response time (RT) and accuracy measures were collected from all participants. Attention condition and face discriminability were entered as factors in separate repeated- measures ANOVAs for RT and accuracy.
Four participants were removed from all analyses due to excessive eye movements (greater than 1.5% of trials rejected). The remaining 12 participants (4 female; 20 to 33 years of age) demonstrated negligible eye movements (average percentage of trials rejected due to eye movement = 0.4%).
A negative deflection of the ERP signal was seen across bilateral posterior temporal electrodes (PO5, PO7, PO6, PO8, P5, P7, P6, and P8) approximately 188 msec following stimulus onset. A grand average topographic map indicated a focus in right parieto-occipital electrodes (PO6 and PO8; see Figure 2A). Both the timing and topographic distribution of this component corresponded with numerous previous reports of the N170 (e.g., Bentin et al., 1996; Bentin & Deouell, 2000; Itier & Taylor, 2004). Due to slight inter-subject variability in peak N170 electrode, we used an electrode of interest (EOI) approach to analyze the N170 data (analogous to sensor of interest approach; see Downing et al., 2001; Liu et al., 2000). For each subject, the electrode that evinced the largest amplitude N170 across conditions was selected for analysis2. Mean N170 amplitude was calculated over a 40 msec time window that was centered at the peak N170 amplitude across conditions (188 msec). N170 amplitude measures are reported in Table 1.
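The electrode-of-interest selection and windowed mean-amplitude measurement described above can be sketched as follows; the array layout, function name, and toy data are illustrative assumptions rather than the actual analysis code:

```python
import numpy as np

def n170_mean_amplitude(erp, srate=500, tmin=-0.1, peak_s=0.188, win_s=0.040):
    """Select the electrode of interest (the one with the largest, i.e. most
    negative, N170 across conditions) and return its mean amplitude in a
    40-msec window centered on the across-condition peak (188 msec).

    erp : array of shape (n_electrodes, n_samples), condition-averaged
          ERP in microvolts, epoch starting at tmin seconds
    """
    center = int(round((peak_s - tmin) * srate))  # sample index of 188 msec
    half = int(round(win_s * srate / 2))          # +/- 20 msec
    window = slice(center - half, center + half)
    eoi = int(np.argmin(erp[:, window].mean(axis=1)))  # most negative electrode
    return eoi, float(erp[eoi, window].mean())

# Toy data: two electrodes, 400 samples; electrode 1 carries a -5 uV N170
erp = np.zeros((2, 400))
erp[1, 134:154] = -5.0
eoi, amp = n170_mean_amplitude(erp)
```

Performing the selection on the average across conditions, rather than per condition, avoids biasing the electrode choice toward any one attention condition or discriminability level.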
N170 amplitude was strongly affected by attention condition and face discriminability (Figure 3A and 3B). Consistent with previous research demonstrating the sensitivity of the N170 to attention (Eimer, 2000a), participants displayed an increased amplitude N170 to the overlay stimuli during face blocks, when they were attending to the face, relative to scene blocks, when they were attending to the scene [F(1,11) = 13.47, p < 0.005]. As face discriminability increased, N170 amplitude increased substantially [F(2,22) = 18.01, p < 0.001]. Most critically, there was also a significant interaction between attention condition and face discriminability [F(2,22) = 5.192, p = 0.01]. Planned pairwise comparisons to explore this interaction indicated that increased N170 during attention to faces was observed most robustly in the low face discriminability condition (two-tailed t test; p < 0.005), less robustly but still significantly in the medium face discriminability condition (p < 0.05), and absent in the high face discriminability condition (p > 0.94; see Figure 4). To further explore this interaction, we compared the N170 attention effect (attend to face - attend to scene) for high and low face discriminability stimuli, and found that the effect of attention on N170 amplitude was significantly greater for low relative to high face discriminability stimuli (two-tailed paired t test; p < 0.05).
An ANOVA for peak N170 latency revealed that both face discriminability and attention condition significantly affected peak N170 latency. The main effect of face discriminability [low face discriminability, M = 190 msec, SD = 9 msec; medium face discriminability, M = 187 msec, SD = 10 msec; high face discriminability, M = 185 msec, SD = 10 msec; F(2,22) = 20.34, p < 0.001] indicated that high face discriminability resulted in shorter N170 latency, while low face discriminability resulted in longer N170 latency. The main effect of attention condition indicated that attending to the face in the overlay decreased N170 latency relative to attending to the scene [attend to face, M = 186 msec, SD = 9 msec; attend to scene, M = 189 msec, SD = 11 msec; F(1,11) = 6.19, p < 0.05]. The interaction between face discriminability and attention condition was not significant for N170 latency (p > 0.67). These data are consistent with previous reports demonstrating N170 latency modulations as a function of attention (Gazzaley, Cooney, McEvoy, Knight, & D’Esposito, 2005).
A late negativity (LN) was observed in occipital and temporal electrodes at approximately 292 msec that also appeared to be modulated by attention to faces. The scalp distribution of the LN peaked in occipital electrodes; however, the effect of attention during this time range was maximal in PO8 (compare Figures 2B and 2C). The LN attention effect showed inter-subject variability similar to that of the N170. All subsequent analyses of the LN therefore focused on the electrodes chosen for the N170 EOI analyses3. An ANOVA for mean amplitude of the LN (measured over the interval from 272 to 312 msec) confirmed that this component was sensitive to attention to faces, with greater amplitude when participants were attending to the face in the overlay than when they were attending to the scene [F(1,11) = 47.80, p < 0.001; Figure 3A]. Face discriminability had the opposite effect on the LN from its effect on N170 amplitude, with greatest amplitude for the low face discriminability stimuli and decreasing amplitude as face discriminability increased [F(2,22) = 7.52, p < 0.005; see Figure 3B]. Unlike N170 amplitude, the effect of attention on the LN did not differ with face discriminability [F(2,22) = 1.45, p > 0.2; see Figure 4]. In both its topographic distribution and its sensitivity to face-directed attention, the LN resembled the late face-sensitive response associated with feedback face processing in a previous ERMF study (Furey et al., 2006).
Attention and face discriminability significantly impacted the latency of the LN. LN latency was decreased during attention to faces (M = 286 msec, SD = 17 msec) relative to attention to scenes [M = 297 msec, SD = 26 msec; F(1,11) = 10.42, p < 0.01], and decreased as face discriminability increased [low face discriminability, M = 297 msec, SD = 21 msec; medium face discriminability, M = 289 msec, SD = 25 msec; high face discriminability, M = 288 msec, SD = 21 msec; F(2,22) = 5.85, p < 0.01]. There was no significant interaction between face discriminability and attention condition observed for component latency (p > 0.5).
Our behavioral task (male/female judgments during attention to faces, indoor/outdoor judgments during attention to scenes) was orthogonal to our electrophysiological measures of interest, and was designed to ensure that participants attended to the appropriate stimulus domain. Importantly, behavioral measures indexed participants’ response to the attended item in the overlay stimulus, while the N170 indexed participants’ response to the face in the overlay stimulus, whether it was attended or unattended. Despite this incongruence between behavioral and electrophysiological measures, we report a detailed analysis of our behavioral data below for completeness (see Table 2 for accuracy and RT data for all 6 experimental conditions), and demonstrate that our ERP data was not influenced by behavioral differences between conditions.
Overall accuracy (88.5% correct) and RT (649 msec) measures indicated that participants successfully focused their attention on the appropriate image in the overlay. Attention condition (attend to faces or attend to scenes) significantly affected behavioral measures. There was a significant main effect of attention on RT [F(1,11) = 14.9, p < 0.01], with shorter RT latencies when participants attended to faces relative to when they attended to scenes. Conversely, accuracy was significantly higher when participants attended to scenes [F(1,11) = 37.9, p < 0.001]. There was no main effect of face discriminability on either behavioral measure (Fs < 2, ps > 0.17). Unsurprisingly, we found an interaction between attention condition and face discriminability for both measures; accuracy [F(2,22) = 6.4, p < 0.01] and RT [F(2,22) = 71.8, p < 0.001] improved as the discriminability of the attended stimulus was increased.
To confirm that our N170 effects did not correspond with behavioral differences between conditions, we calculated the correlation between participants’ behavioral attention effect (attend to face – attend to scene) and their N170 attention effect (attend to face – attend to scene) in each face discriminability condition for both RT and accuracy. Both correlations were non-significant [Pearson’s correlations, two-tailed; r(36) = −0.117, p > 0.49 for accuracy; r(36) = 0.032, p > 0.74 for RT], suggesting that the N170 attention effect was not driven by behavioral differences between the attend to face and attend to scene conditions4.
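The analysis above reduces to correlating two sets of difference scores across participant-by-condition data points. A minimal sketch using np.corrcoef is shown below; the function name and the perfectly correlated toy values are hypothetical, and the reported p-values would additionally require a significance test on r:

```python
import numpy as np

def attention_effect_correlation(behav_face, behav_scene,
                                 n170_face, n170_scene):
    """Pearson correlation between the behavioral attention effect
    (attend to face minus attend to scene) and the N170 attention
    effect, computed over participant-by-condition data points.
    """
    behav_effect = np.asarray(behav_face) - np.asarray(behav_scene)
    n170_effect = np.asarray(n170_face) - np.asarray(n170_scene)
    return float(np.corrcoef(behav_effect, n170_effect)[0, 1])

# Hypothetical toy data (4 data points, constructed to correlate perfectly)
r = attention_effect_correlation([1.0, 2.0, 3.0, 4.0],
                                 [0.0, 0.0, 0.0, 0.0],
                                 [2.0, 4.0, 6.0, 8.0],
                                 [0.0, 0.0, 0.0, 0.0])
```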
In the current study we investigated the hypothesis that attentional selection effects may be influenced by face discriminability during attention to faces. That is, if attention increases sensory gain to distinguish signal from noise (Hawkins et al., 1990; Hillyard et al., 1998; Luck et al., 1994), its beneficial effects may be minimal when the signal-to-noise ratio is very high, but greater when the signal-to-noise ratio is lower. While this hypothesis has found support in studies of spatial attention (Ekstrom et al., 2008; Hawkins et al., 1988; Martinez-Trujillo & Treue, 2002; Reynolds et al., 2000), this is the first study to investigate it in the context of attention to faces. We independently manipulated attention to faces and face discriminability, and found that early selection (evidenced by N170 modulation) was present for low and medium – but not high – discriminability faces, while selection at a later stage (evidenced by LN modulation) was comparable for all levels of discriminability. These results demonstrate that selection of faces can operate on early sensory processes (Downing et al., 2001; Eimer, 2000a), as well as later processing, and further illustrate that the degree of early selection is contingent on bottom-up properties such as face discriminability. Below we consider alternative interpretations of our results and discuss our findings in the context of prominent theories of attention.
A plausible account for the absence of N170 modulation for high discriminability faces in our study is that N170 response was saturated, obscuring any effect of attention. In their study of spatial attention, Reynolds and colleagues (2000) ruled out a similar interpretation of their results by presenting gratings at suboptimal orientations. This ensured that the overall neuronal response was not saturated; the absence of attentional modulation to high contrast stimuli could therefore be attributed to a ceiling in the response to that particular type of stimulus. Although the nature of our data precludes us from distinguishing between an overall saturation of the N170 response and a ceiling effect of the N170 to high discriminability faces, our behavioral task required basic-level categorizations of faces and scenes, which previous work has shown to elicit less robust sensory ERP responses than tasks requiring subordinate level judgments (Tanaka, Luu, Weisbrod, & Kiefer, 1999). We therefore argue that our results are unlikely to be due to a saturation of the N170. Nonetheless, we do not consider a saturation account to be incompatible with our hypothesis; either a ceiling or saturation interpretation implies that attention preferentially increases sensory gain for suboptimal but not optimal faces.
A second alternative explanation we address is the idea that differential N170 modulation across stimulus types was driven by different degrees of top-down attention. While we did not examine parietal and prefrontal sources of attentional control (Serences et al., 2004; Yantis & Serences, 2003), our experiment was designed to rule out differential contributions from top-down sources across stimulus type. Discriminability was randomized within blocks so that participants were unable to predict the discriminability of the face or scene on the upcoming trial. Furthermore, any differential strategic allocation of top-down attention in response to stimulus presentation would not be indexed by early sensory components, as has been shown consistently in cueing studies of spatial attention (Mangun, 1995; Mangun & Hillyard, 1991). We confirmed that N170 differences could not be due to stimulus-induced shifts of spatial attention by examining P1 amplitude. P1 amplitude was not influenced by any of our experimental factors – in particular, stimulus type had no effect on P1 amplitude (p > 0.36), suggesting that spatial attention was not strategically varied across stimulus types.
In addition to demonstrating attentional modulation of early face processing, our results provide a possible alternative explanation for previous data suggesting that the M170/N170 is insensitive to attention. Our high face discriminability condition replicated previous work wherein early face responses were not modulated by attention (Carmel & Bentin, 2002; Cauquil et al., 2000; Lueschow et al., 2004), but our medium and low discriminability conditions corroborated evidence that these responses are influenced by attention (Downing et al., 2001; Eimer, 2000a). As such, the current results are incompatible with claims that face processing is impervious to attentional modulation during early stages of sensoriperceptual analysis, but indicate that stimulus factors may have obscured early attentional effects in some previous studies. As our task differed from previous investigations, we acknowledge the possibility that task factors may have combined with our stimulus manipulation to illuminate the effects of attention on the N170. However, we stress that our effects cannot be due to the task itself, as we successfully replicated previous null effects of attention in our high face discriminability condition. Other cognitive and perceptual factors may similarly modulate the effects of face-directed attention on sensory processing.
One perceptual factor that has been shown to influence early sensory processing is perceptual load (Handy & Mangun, 2000). Perceptual load theory is an often-cited and elegant reconciliation between early and late theories of attentional selection that posits that early selection occurs when visual capacity is taxed or when processing demands are high, and that late selection occurs when visual information does not reach capacity or when processing demands are low (Lavie, 1995; Lavie & Tsal, 1994). The present results are consistent with the spirit of this line of inquiry, as perceptual properties determined the degree of early selection. Interestingly, in the present study, early selection was modulated by discriminability, which has been empirically distinguished from perceptual load manipulations in the context of spatial attention tasks (Lavie & De Fockert, 2003). Our results complement the results from perceptual load studies and extend them by illustrating that the degree of early selection may be determined more broadly by the signal-to-noise ratio of the information in the environment. This may be achieved by changes in the quantity or quality of relevant or irrelevant information.
A recent theoretical account by Dosher and Lu has outlined complementary mechanisms by which selective attention improves the signal-to-noise ratio of visual input and specified the stimulus contexts in which these mechanisms are likely to operate (Dosher & Lu, 2000b; Z.-L. Lu & Dosher, 1998). Their model states that under conditions of minimal external noise, selection improves behavioral sensitivity via ‘signal enhancement’ – by enhancing processing of the relevant stimuli. When high levels of noise are present, their model proposes that selection promotes behavioral sensitivity through a ‘noise exclusion’ mechanism, which suppresses the processing of irrelevant information (Z.-L. Lu & Dosher, 1998). Empirical work targeting the specific conditions under which each of these mechanisms operates indicated that external noise exclusion is the primary mechanism by which selective attention facilitates behavioral sensitivity during spatial (Dosher & Lu, 2000a) and object-based tasks (Han, Dosher, & Lu, 2003). In the context of the current study, a noise exclusion view would predict that when a face represents a high level of noise (i.e., when participants attend to the scene in a high face discriminability trial), attentional mechanisms should exclude or suppress face processing to improve scene discrimination performance. Noise exclusion would presumably be reflected in reduced neural activity tied to face processing during the attend to scene condition relative to the attend to face condition. Yet, there were no differences across attention conditions in the N170 during high face discriminability trials. While at first blush, this null result might be viewed as evidence against the noise exclusion mechanism, it is important to note that the N170 is a face-sensitive but not a face-selective component. That is, categories of stimuli other than faces evoke an N170, albeit to a lesser degree than faces (Itier & Taylor, 2004). 
We note that this study was not intended to formally test the noise exclusion hypothesis in the context of nonspatial attention, but that this may be a fruitful direction for future studies. Using a method in which input-specific neural activity can be confirmed, such as single-unit recording within perceptual cortices, may help delineate the role of noise exclusion and signal enhancement mechanisms in the neural profile of early selection.
Although the models outlined above contrast early versus late selection and signal enhancement versus noise exclusion, respectively, the influential biased competition model of attention (Desimone & Duncan, 1995) addresses both of these concepts. Specifically, the biased competition model proposes that selection is the result of the biasing of competitive interactions in capacity-limited visual cortex. Bias signals result in the relative enhancement of attended inputs, but only when there is competition for processing resources. This idea is consistent with our observation of relative enhancement of early processing only when the face signal was suboptimal. Further, the biased competition model predicts enhancement of attended inputs at multiple levels of the processing stream (Desimone & Duncan, 1995), which corresponds to our observation of selection at the level of both the N170 and the LN.
Based on the timing of the LN, its sensitivity to face-directed attention, its sensitivity to face discriminability, and prior documentation of event-related components with similar properties (Furey et al., 2006; Lueschow et al., 2004), we speculate that the LN indexes feedback face processing from higher visual areas. This may explain why LN amplitude increased as face discriminability decreased: object recognition mechanisms may require increased feedback processing to distinguish stimuli of poor quality (Bar, 2003). That sensory information and attention did not interact at the level of the LN suggests that attentional selection facilitates multiple types of information in the visual processing stream. Future studies may benefit from explicitly investigating the interrelationships between selection at early and later levels of processing as a function of perceptual, cognitive, and response-level manipulations.
In sum, our findings parallel the results from behavioral and single-unit studies of spatial attention, suggesting that the degree of early selection is determined by the signal quality of perceptual information in the environment. These results complement a growing body of literature demonstrating that activity in fronto-parietal sources and perceptual sites exhibits similar properties in spatial-, object-, and feature-based attention (Corbetta, Miezin, Dobmeyer, Shulman, & Petersen, 1990; Muller et al., 2006; Roelfsema, Lamme, & Spekreijse, 1998; Serences et al., 2004; Slagter, Kok, Mol, & Kenemans, 2005; Valdes-Sosa, Bobes, Rodriguez, & Pinilla, 1998), as well as computational work implying common mechanisms for attention across a variety of domains (Tsotsos et al., 1995).
The authors thank Pauline Baniqued, Anish Mehta, and Wen Liu for their assistance in creating the overlay stimuli used in the experiment. We also thank Tom Busey, Allen Osman, Zev Rosen, Ling Wong, and two anonymous reviewers for their helpful comments. K.K.S. is supported by National Institutes of Health Grant T32 MH017168.
The authors declare no conflict of interest.
1Face and scene images were obtained with permission from: the Productive Aging Laboratory Face Database (Minear & Park, 2004); the Psychological Image Collection at Stirling (http://pics.psych.stir.ac.uk/); the FG-NET Frank Wallhoff Facial Expressions and Emotion Database (http://www.mmk.ei.tum.de/~waf/fgnet/feedtum.html; Technische Universität München, 2006); the VALID database (http://ee.ucd.ie/validdb; Fox, O’Mullane, & Reilly, 2005); the Face-Place Face Database Project (http://www.face-place.org; copyright 2007, Michael J. Tarr, funding provided by NSF award 0339122); the Georgia Tech Face Database (ftp://ftp.ee.gatech.edu/pub/users/hayes/facedb/); the CMU Pose, Illumination, and Expression Database (Sim, Baker, & Bsat, 2003); and the McGill Calibrated Colour Image Database (Olmos & Kingdom, 2004; http://tabby.vision.mcgill.ca).
2Analyses using data from PO8 and PO6 yielded qualitatively similar results.
3Supplementary analyses confirmed that LN data from occipital electrodes (the topographic focus of the LN) and PO8 (the focus of the LN attention effect) yielded statistically similar results.
4We further confirmed that tradeoffs between speed and accuracy during male/female versus indoor/outdoor judgments were not responsible for our ERP effects. Specifically, we compared the ERP results of the three subjects with the smallest speed and accuracy differences across attention conditions (attend to scene minus attend to face; mean accuracy difference = 3.5%, mean RT difference = 11 msec) to those of the three subjects with the largest differences (mean accuracy difference = 8.0%, mean RT difference = 90 msec). The two groups showed qualitatively similar ERP results, with no N170 modulation in the high face discriminability condition and a substantial N170 modulation in the low face discriminability condition.
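The extreme-groups logic of this control analysis can be sketched as follows. All subject identifiers and numbers below are invented for illustration (they are not the study’s data), and the combined ranking criterion is an assumption, since the footnote specifies only “smallest” versus “largest” speed and accuracy differences.

```python
# Hypothetical records: (subject id, accuracy difference in %, RT difference
# in msec), each computed as attend-to-scene minus attend-to-face.
subjects = [
    ("s01", 2.9, 8), ("s02", 3.4, 10), ("s03", 4.2, 15),
    ("s04", 5.5, 40), ("s05", 6.1, 55), ("s06", 7.2, 70),
    ("s07", 7.8, 85), ("s08", 8.3, 95),
]

def zscores(values):
    """Standardize a list of values (population standard deviation)."""
    mean = sum(values) / len(values)
    sd = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - mean) / sd for v in values]

# Rank subjects by a combined index of behavioral difference: the sum of
# standardized accuracy and RT differences (an illustrative criterion).
acc_z = zscores([s[1] for s in subjects])
rt_z = zscores([s[2] for s in subjects])
ranked = sorted(zip(subjects, acc_z, rt_z), key=lambda t: t[1] + t[2])

smallest = [t[0][0] for t in ranked[:3]]   # minimal-tradeoff group
largest = [t[0][0] for t in ranked[-3:]]   # maximal-tradeoff group
print(smallest, largest)
```

ERP effects computed separately within each extreme group can then be compared; if both groups show the same N170 pattern, a speed-accuracy tradeoff is unlikely to account for the effect.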