The biased competition theory of selective attention has been an influential neural theory of attention, motivating numerous animal and human studies of visual attention and visual representation. There is now neural evidence in favor of all three of its most basic principles: that representation in the visual system is competitive; that both top-down and bottom-up biasing mechanisms influence the ongoing competition; and that competition is integrated across brain systems. We review the evidence in favor of these three principles, and in particular, findings related to six more specific neural predictions derived from these original principles.
Suppose that you are looking for a face in a crowd. Two basic phenomena occur while processing that scene. First, not all faces can be processed at the same time, that is, there is limited processing capacity. Second, while processing a particular face, one is able to filter out the unwanted information in the scene, that is, there is selectivity. The biased competition theory of selective attention rests on three general principles that conceptualize these basic observations further (Duncan, 1996). First, of the many brain systems that represent visual information (sensory and motor, cortical and subcortical), most are competitive. Within each system, a gain of representation for a particular visual object will be at the expense of other objects’ representations. Such competitive interactions among multiple objects (such as the faces in a crowd) occur automatically and operate in parallel across the visual field. Second, competition is controlled within and across brain systems. If one looks for a particular object (e.g. a friend’s face), units matching the internal ‘template’ of that object will be pre-activated and therefore gain an advantage by receiving an increased processing weight. Thus, such top-down mechanisms introduce bias signals that help resolve the ongoing competition. The competition among multiple objects can also be biased by bottom-up mechanisms that separate figures from their background, or constitute objects by principles of perceptual organization. And third, the competition between systems is integrated. As a visual object gains dominance in representation within one system (e.g. visual cortex), it will tend to gain similar dominance in other systems (e.g. higher order frontal and parietal areas). An example is given by representations of visual space. All units that represent a certain location in multiple spatial maps will be activated together, when the object at that location gains dominance in the system.
In the following, we review the literature on biased competition theory as it relates to these three general principles and the basic neural tenets (Desimone, 1998) that build on these theoretical principles. In particular, we review the literature as it pertains to six more specific predictions regarding visual processing in the cortex, each of which is discussed in its own section. The first two predictions relate to Duncan’s first general principle that visual information is processed in a competitive manner. The next three relate to the principle of control: competition can be controlled, or biased in favor of a particular object. The final prediction is related to Duncan’s third principle that competition is integrated across systems, such that when an object’s representation gains dominance in one system, it will gain dominance throughout the cortex. These six tenets and Duncan’s three more general principles provide a useful framework in which to integrate a diversity of findings from single-cell recording, human fMRI, human behavioral data, and micro-stimulation studies.
The first and most fundamental prediction of biased competition theory is that objects compete for neural representation in visual cortex. A large body of evidence from both single-cell physiology and neuroimaging suggests that multiple stimuli present at the same time within a neuron’s receptive field (RF) are not processed independently, but interact with each other in a mutually suppressive way. In single-cell physiology studies (Britten & Heuer, 1999; Luck, Chelazzi, Hillyard & Desimone, 1997; Miller, Gochin, & Gross, 1993; Recanzone, Wurtz & Schwarz, 1997; Reynolds, Chelazzi & Desimone, 1999; Snowden, Treue, Erickson & Andersen, 1991; Rolls & Tovee, 1995), neural responses to a single visual stimulus presented alone in a RF were compared to the responses evoked by that stimulus when a second one was presented simultaneously within the same RF. The responses to the paired stimuli were found to be smaller than the sum of the responses evoked by each stimulus individually and turned out to be a weighted average of the individual responses (Reynolds et al., 1999). These suppressive interactions among multiple stimuli present simultaneously in the visual field are consistent with the idea that these stimuli are competing for representation by single neurons in visual cortex. Suppressive interactions among multiple stimuli have been found in several visual areas in the monkey brain, including V2, V4, MT, MST, and IT (Miller et al., 1993; Recanzone et al., 1997; Reynolds et al., 1999; Snowden et al., 1991).
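The weighted-average description of pair responses can be made concrete with a minimal sketch. The function name, the default equal weights, and the firing rates below are illustrative assumptions, not values or code from the cited studies.

```python
def pair_response(r_a: float, r_b: float, w_a: float = 1.0, w_b: float = 1.0) -> float:
    """Response to two stimuli in one RF, modeled as a weighted average
    of the responses evoked by each stimulus presented alone."""
    if w_a < 0 or w_b < 0 or (w_a + w_b) == 0:
        raise ValueError("weights must be non-negative and not both zero")
    return (w_a * r_a + w_b * r_b) / (w_a + w_b)


# Hypothetical firing rates (spikes/s) for a preferred and a poor stimulus.
preferred_alone = 40.0
poor_alone = 10.0

# With equal weights the pair response is the plain average of the two.
pair = pair_response(preferred_alone, poor_alone)
print(pair)
```

Under this sketch the pair response always lies between the two individual responses and therefore below their sum, which is the signature of mutual suppression described above.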
In the human brain, evidence for neural competition has been found using an fMRI paradigm (Kastner, De Weerd, Desimone & Ungerleider, 1998; Kastner, De Weerd, Pinsk, Elizondo, Desimone & Ungerleider, 2001; Beck & Kastner, 2005, 2007) in which four colorful and patterned visual stimuli, which optimally activate ventral visual cortex, were presented in four nearby locations to the periphery of the visual field, while subjects maintained fixation. Critically, these stimuli were presented under two different presentation conditions, sequential and simultaneous. In the sequential presentation condition, each stimulus was presented alone in one of the four locations, one after the other. In the simultaneous presentation condition, the same four stimuli appeared simultaneously in the four locations. Thus, integrated over time, the physical stimulation parameters were identical in each of the four locations in the two presentation conditions. However, suppressive (competitive) interactions among stimuli within RFs could take place only in the simultaneous, not in the sequential presentation condition.
The four peripheral stimuli were irrelevant to the subjects’ task. Instead, subjects were asked to count the occurrences of targets in a stream of letters and symbols presented at fixation. This demanding task not only ensured subjects’ fixation, but also meant that attention was engaged at fixation and not drawn to the peripheral stimuli. Thus, competitive interactions, as indexed by the difference in response evoked by sequentially and simultaneously presented stimuli, could be assessed under well-controlled attentional conditions and in the absence of directed attention to the peripheral stimuli.
Consistent activations evoked by the visual presentations of the colorful pattern stimuli as compared to blank periods were found in areas V1, V2/VP, V4, TEO, V3A, and MT, which were determined on the basis of retinotopic mapping. Consistent with the predictions from biased competition and the single-cell literature, simultaneous presentations evoked weaker responses than sequential presentations in all activated visual areas. The response differences were smallest in V1 (Fig. 1A) and increased in magnitude towards ventral extrastriate areas V4 (Fig. 1A) and TEO, and dorsal extrastriate areas V3A and MT. The suppressive interactions studied with the sequential/simultaneous paradigm have been interpreted as a neural correlate for competition among multiple objects in the human visual cortex. It is important to note that the suppressive interactions across visual cortex occurred automatically and in the absence of attentional allocation to the stimuli. Thus, neural competition appears to be a constantly ongoing process in the representation of natural visual scenes.
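One way to quantify the difference between the two presentation conditions is a normalized index, sketched below. The index definition, the function name, and the response values are assumptions chosen only to mimic the qualitative pattern described above (small differences in V1, larger ones in V4); they are not measured data.

```python
def suppression_index(seq: float, sim: float) -> float:
    """Normalized difference between responses to sequential and simultaneous
    presentations. Positive values mean the simultaneous display evoked the
    weaker response."""
    if seq + sim == 0:
        raise ValueError("responses must not sum to zero")
    return (seq - sim) / (seq + sim)


# Hypothetical mean signal changes (%) as (sequential, simultaneous) pairs.
responses = {"V1": (1.00, 0.95), "V4": (1.00, 0.60)}

indices = {area: suppression_index(seq, sim) for area, (seq, sim) in responses.items()}
for area, value in indices.items():
    print(f"{area}: {value:.3f}")
```

Normalizing by the summed response makes the index comparable across areas whose overall signal amplitudes differ.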
Another critical prediction for the neural implementation of biased competition theory is that the competitive interactions should be strongest when the stimuli activate the same local region of cortex. If stimuli are competing for representation by a particular neuron, then the competitive interactions should be most apparent when the stimuli fall within the RF of that same neuron. The fMRI experiments described above provided the first evidence in support of this hypothesis (Kastner et al., 1998, 2001; Beck & Kastner, 2005, 2007). Each of the four stimuli presented in the periphery subtended approximately 2° of visual angle, and the entire peripheral display subtended 4 × 4° and was presented at 6–10° in the periphery of the visual field. As mentioned above, the difference between the sequentially and simultaneously presented stimuli, indicating the degree of competition among stimuli, increased in magnitude with each subsequent area (Fig. 1A) consistent with the idea that the competitive interactions among the four stimuli scaled with the increasing RF sizes across visual cortex. In V1 and V2, where small RFs would encompass only a small portion of the 4 × 4° display, very little evidence of competition was found, whereas competition was greatest in V4, TEO, V3A, and MT, where RFs are large enough to encompass all four stimuli. Moreover, this pattern of results was seen independent of the particular stimulus type, both for complex images (Kastner et al., 1998, 2001) and also for more simple Gabor patches (Beck & Kastner, 2005, 2007).
Two other predictions follow directly from the idea that competitive interactions are scaled to RF size across cortex. First, the strength of competitive interactions among multiple stimuli will be modulated as a function of the spatial dimension of the array in a given visual area. Thus, changing the display size of multiple stimuli arrays should induce strong competition in all visual areas that encompass the display. For instance, a 2 × 2° display will induce competitive interactions in an early visual area with small RFs, while a larger display of 4×4° size or greater will induce much weaker neural competition in early visual cortex. Second, the degree of competitive interactions will also change as a function of the spatial separation of the competing stimuli in the array. According to the RF hypothesis, the magnitude of the suppressive interactions should be inversely related to the degree of spatial separation among the stimuli.
These predictions were directly tested in a second study, in which 2×2° displays instead of 4×4° displays were used and the spatial separation among the stimuli was parametrically modulated from 0.5° to 7°. In agreement with the first prediction, suppressive interactions were twice as strong in early visual areas V1 and V2 with the 2×2° display as compared to those induced with the 4×4° display. In fact, the degree of competition induced with the smaller display size was similar in areas V2, V4 and TEO, where the RF sizes were sufficiently large to encompass the entire display (Kastner et al., 2001). In agreement with the second prediction, separating the stimuli by 4° abolished suppressive interactions in V2, reduced them in V4, but did not affect them in TEO. Separating the stimuli by 6° led to a further reduction of suppression effects in V4, but again had no effect in TEO. This is shown for a single subject in Figure 2. Simultaneously presented stimuli induced strong suppressive interactions in area V4 with a 2 × 2° display, but not with a 7 × 7° display, whereas the differences in display size did not affect the activity evoked in area TEO. These results confirmed the hypothesis that competitive interactions occur at the level of the RF. Further, by systematically varying the spatial separation among the stimuli and measuring the magnitude of suppressive interactions, average RF sizes at an eccentricity of about 5° were estimated to be less than 2° in V1, in the range of 2 – 4° in V2, about 6° in V4, and larger than 6° but still confined to a quadrant in TEO. These numbers may underestimate RF sizes in the human visual cortex due to additional suppressive influences from beyond the RF, which cannot be distinguished from interactions within RFs in this experimental paradigm. 
It was striking, however, that these estimates of RF sizes in human visual cortex, as determined on the basis of hemodynamic responses, are similar to those measured in the homologous visual areas of monkeys, as defined at the level of single cells (e.g. Desimone & Ungerleider, 1989).
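The logic linking spatial separation to RF size can be illustrated with a deliberately simple one-dimensional toy model. Assume RFs of a fixed width whose positions are uniformly distributed; among the RFs that cover one stimulus, the fraction that also cover a second stimulus at a given separation is max(0, 1 - separation / width). The RF widths below are hypothetical values loosely guided by the ranges quoted above, and the model is not the estimation procedure used in the study.

```python
def shared_rf_fraction(separation_deg: float, rf_width_deg: float) -> float:
    """Toy 1-D model: fraction of RFs covering one stimulus that also cover a
    second stimulus located separation_deg away."""
    if rf_width_deg <= 0:
        raise ValueError("RF width must be positive")
    if separation_deg < 0:
        raise ValueError("separation must be non-negative")
    return max(0.0, 1.0 - separation_deg / rf_width_deg)


# Hypothetical RF widths in degrees (assumptions for illustration only).
rf_widths = {"V2": 3.0, "V4": 6.0, "TEO": 20.0}

for separation in (0.5, 4.0, 6.0):
    row = {area: round(shared_rf_fraction(separation, w), 2) for area, w in rf_widths.items()}
    print(separation, row)
```

The sketch captures only the ordering across areas: at a separation of 4° no V2-sized RF can contain both stimuli, some V4-sized RFs can, and most TEO-sized RFs can. It ignores suppressive influences from beyond the RF and the two-dimensional layout of the display.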
The above results, showing that the magnitude of the competitive interactions varies with the likelihood that the stimuli will fall within the same RF, also speak against alternative explanations that seek to explain the differential activity evoked by sequential and simultaneous presentations without appealing to spatial interactions among the stimuli. If the differential activity were due to non-spatial differences between the presentation conditions (e.g., there are more visual onsets in the sequential condition than in the simultaneous condition), then we would expect to see the differences regardless of visual area or the distance between stimuli. Although it is difficult to extrapolate from single-cell data to the BOLD (Blood Oxygenation Level Dependent) signal, which not only reflects responses at the population level but also depends on the subtleties of the hemodynamic response, the fMRI data presented thus far, as well as the data discussed in the subsequent sections, all agree qualitatively with the data at the single-cell level: multiple stimuli presented simultaneously in the visual field are not processed independently but instead interact with each other in a competitive way.
Thus far, we have discussed evidence supporting the view that visual stimuli compete for neural representation in multiple areas of visual cortex. Moreover, there is evidence both from single-cell and fMRI studies suggesting that this competition is occurring at the level of the RF. When multiple stimuli are presented in nearby locations, and thus are likely to fall within the same RF, the result is suppression of the neural response, suggesting a weakened representation of individual items. Importantly, these competitive interactions occur automatically and represent therefore a default state for the visual system to process information. Because our visual world is typically cluttered with items that necessarily fall within neighboring locations on cortex, it is likely that the visual system is often in a state of suppression, with less than optimal neural representations of individual objects. We next address ways in which this ongoing competition among multiple stimuli can be resolved.
The second tenet of biased competition theory is that the competition can be controlled by introducing biases that favor the processing of a particular stimulus at the expense of competing stimuli. Moreover, biased competition theory posits multiple biasing mechanisms. Competition can be biased via top-down control mechanisms or via bottom-up stimulus-driven mechanisms. These two mechanisms are discussed separately in the following sections, although in everyday life they are likely to interact. Both our goals and the properties of the stimulus interact to determine what subset of the visual world will be selected for further processing.
We would also like to note that in his formulation of biased competition theory, Duncan (1996) originally suggested that attention was an emergent property of the various mechanisms (top-down or bottom-up) that converge to select an item for further processing. However, in accordance with many other authors, we reserve the term attention for the selection that occurs due to top-down directed biases, and use the more general term selection when talking about the result of either or both top-down and bottom-up biases.
The term top-down is used to refer to biases that are generated by the cognitive demands of the task, and not by the competing stimuli themselves. Top-down biases are thought to exert their influence on visual cortex, at least initially, via feedback mechanisms from frontoparietal cortex. The top-down mechanism that has been the subject of most studies thus far, and that we will focus on in this review, is spatially directed attention to a location or feature of a stimulus. However, one should keep in mind that other top-down mechanisms related to memory processes, or emotional and motivational behavior, to name just a few, can introduce top-down biases as well. Again, both single-cell physiology and fMRI evidence support the second tenet of biased competition theory.
Single-cell recording studies have shown that top-down spatially directed attention can bias the competition among multiple stimuli in favor of the attended stimulus by modulating competitive interactions. When a monkey directed attention to one of two competing stimuli within a RF, the responses in extrastriate areas V2, V4 and MT to the pair of stimuli were heavily weighted in favor of the attended stimulus; that is, responses to the pair of stimuli were similar to responses evoked by the attended stimulus presented alone (Luck, Chelazzi, Hillyard & Desimone, 1997; Reynolds et al., 1999; Recanzone & Wurtz, 2000). In other words, attention counteracted the suppressive influence of the competing stimulus. The attentional effects were less pronounced when the second stimulus was presented outside the RF, which presents another indication that competition for processing resources within visual cortical areas takes place most strongly at the level of the RF. These findings imply that attention may resolve the competition among multiple stimuli by counteracting the suppressive influences of nearby stimuli, thereby enhancing information processing at the attended location. This may be an important mechanism by which attention filters out information from nearby distracters (Desimone & Duncan, 1995).
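The idea that attention weights the pair response toward the attended stimulus can be sketched as a weighted average in which attention multiplies the weight of the attended stimulus. The function name, the gain value of 4, and the firing rates are hypothetical and serve only to illustrate the direction of the effect.

```python
def attended_pair_response(r_attended: float, r_ignored: float,
                           attention_gain: float = 1.0) -> float:
    """Pair response as a weighted average in which the attended stimulus's
    weight is multiplied by attention_gain (1.0 means no attentional bias)."""
    if attention_gain <= 0:
        raise ValueError("attention_gain must be positive")
    return (attention_gain * r_attended + r_ignored) / (attention_gain + 1.0)


# Hypothetical responses (spikes/s) to each stimulus presented alone.
preferred_alone = 40.0
poor_alone = 10.0

unbiased = attended_pair_response(preferred_alone, poor_alone)
attend_preferred = attended_pair_response(preferred_alone, poor_alone, attention_gain=4.0)
attend_poor = attended_pair_response(poor_alone, preferred_alone, attention_gain=4.0)
print(unbiased, attend_preferred, attend_poor)
```

As the gain grows, the pair response approaches the response to the attended stimulus alone, whichever stimulus that is. This mirrors the observation that attention counteracts the suppressive influence of the competing stimulus.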
A similar mechanism appears to operate in the human visual cortex. Kastner et al. (1998) studied the effects of spatially directed attention on multiple competing visual stimuli in a variation of the paradigm, described in the last section. In addition to the two different presentation conditions, sequential and simultaneous, two different attentional conditions were tested, attended and unattended. During the unattended condition, attention was directed away from the peripheral visual display by having subjects count letters at fixation. In the attended condition, subjects were instructed to attend covertly to the peripheral stimulus location in the display closest to fixation and to count the occurrences of a target stimulus. Directing attention to this location enhanced activity to sequentially and to simultaneously presented stimuli in extrastriate areas V2/VP, V4, TEO, V3A, and MT, with increasing effects from early to later stages of visual processing. However, in accordance with the monkey physiology data showing that attention reduced suppressive interactions among stimuli, directed attention led to greater increases of fMRI signals to simultaneously presented stimuli than to sequentially presented stimuli in areas V4 (Fig. 1B) and TEO. The magnitude of the attentional effect scaled with the magnitude of the suppressive interactions among stimuli, with the strongest reduction of suppression occurring in ventral extrastriate areas V4 and TEO, suggesting that the attention effects scaled with RF size similar to the competition effects (see also Bles et al., 2006). These findings support the idea that directed attention enhances information processing of stimuli at the attended location by counteracting suppression induced by nearby stimuli. This may be an important mechanism by which unwanted information is filtered out from nearby distracters.
Importantly, these data and the biased competition framework suggest that areas at intermediate levels of visual processing such as V4 and TEO are important sites for the filtering of unwanted information by counteracting competitive interactions among stimuli at the level of the RF. This notion is further supported by studies in a patient with an isolated V4 lesion and in monkeys with lesions of areas V4 and TEO (Gallant, Shoup & Mazer, 2000; DeWeerd et al., 1999). In these studies, subjects performed an orientation discrimination of a grating stimulus in the absence and in the presence of surrounding distracter stimuli. Significant performance deficits were observed in the distracter-present, but not in the distracter-absent condition, suggesting a deficit specifically in the efficacy of the filtering of distracter information. This interpretation was further corroborated by recordings from TE in the same lesioned monkeys (Buffalo et al., 2005). In an intact visual quadrant, TE cells responded in a similar way to an attended target in distractor-present displays as they did to an attended target in distractor-absent displays, suggesting that filtering occurred at or before area TE. However, TE responded very differently to targets in the distractor-absent and distractor-present displays in a lesioned quadrant, suggesting that without V4 and TEO, distractors were not being filtered. Interestingly though, when the target and distractors were separated by a greater distance, such that they no longer fell within a typical V4 RF, both behavioral performance and attentional selectivity were restored in TE. These data suggest that the level at which filtering occurs may depend on where the stimuli first compete; as the distance between targets and distractors increases, filtering is pushed to later areas with larger RFs.
Finally, this filter mechanism is compatible with the descriptive notion that directed attention to a stimulus may cause the RF to shrink around the attended stimulus, thereby leaving the unattended stimuli in nearby locations outside the RF (Moran & Desimone, 1985). Given that the magnitude of suppressive interactions scaled with RF size in the fMRI studies (Kastner et al., 2001), the RF sizes in V4 and TEO were estimated during directed attention to the display. The reduced suppressive interactions in V4 and TEO during directed attention were similar in magnitude to the suppressive interactions obtained in area V2 in the unattended condition (Kastner & Pinsk, 2004). Hence, directed attention can be described as causing a constriction of RFs in V4 and TEO from 4–8 degrees to about 2 degrees, thereby presumably enhancing spatial resolution. This interpretation is compatible with behavioral studies showing that spatial attention improves acuity (Yeshurun & Carrasco, 1998) and is also supported by recent physiology findings of dynamic RF properties during the allocation of visual attention in macaque area MT (Womelsdorf et al., 2006; see also Hopf et al., 2006). We should like to emphasize, however, that the concept of a shrinking RF is simply a convenient description of the data. Presumably, this apparent shrinking is the result of attention changing the weighting of the input to the cells, such that those cells that correspond to the attended stimulus are weighted more strongly than those that correspond to the unattended stimulus. In other words, the underlying mechanism of such “shrinking” is the biasing of competition in favor of the attended stimulus.
We have proposed that the filtering of unwanted information may be achieved via top-down biasing signals on visual cortex. However, thus far we have only described activity patterns thought to be the result of that bias. More direct evidence of signals reflecting the top-down bias per se comes from studies in which the effects of top-down directed attention were assessed in the absence of visual stimulation; that is, there is evidence that attentional biasing signals not only manifest themselves in the modulation of visually-driven activity, but also in the absence of any visual stimulation whatsoever. Single-cell recording studies have shown that spontaneous (baseline) firing rates were 30–40% higher for neurons in areas V2 and V4 when the animal was cued to attend covertly to a location within the neuron’s RF before the stimulus was presented there (Luck et al., 1997). To study such ‘baseline increases’ in the human brain a third experimental condition was added to the design that was used to investigate competitive interactions and their modulation by spatial attention, as described above (Kastner, Pinsk, De Weerd, Desimone & Ungerleider, 1999a). In addition to the two visual presentation conditions, sequential and simultaneous and the two attentional conditions, unattended and attended, an expectation period preceding the attended presentations was introduced, during which subjects were required to direct attention covertly to the target location and instructed to expect the occurrences of the stimulus presentations. In this way, the effects of attention in the presence (ATT in Fig. 3A) and absence (EXP in Fig. 3A) of visual stimulation could be studied. In the visual system, as illustrated for area V4 in Fig. 3A, the fMRI signals increased during the expectation period (textured epochs), before any stimuli were present on the screen. 
This increase of baseline activity was followed by a further increase of activity evoked by the onset of the stimulus presentations (gray shaded epochs). The baseline increase was found in all visual areas with a representation of the attended location, indicating that it was topographically specific. It was strongest in V4, but was also seen in early visual areas, including the lateral geniculate nucleus (LGN; O’Connor et al., 2002). In the framework of biased competition theory, baseline increases can be thought of as a direct measure of the increased processing weight that a location receives during the allocation of attention.
Increases in baseline activity have also been found to depend on the expected task difficulty. Ress and colleagues (2000) showed that increases in baseline activity in V1 were stronger when subjects expected a visual pattern that was difficult to discriminate compared to a pattern that was easy to discriminate. In areas that preferentially process a particular stimulus feature such as V4/TEO and MT (e.g. color or motion), increases in baseline activity were initially shown to be stronger during the expectation of a preferred as compared to a non-preferred stimulus feature (Chawla, Rees, & Friston, 1999; Shulman et al., 1999). However, these results were not confirmed in more recent studies (McMains et al., 2007). Rather, it was shown that baseline increases did not differ in color- and motion-selective areas during the expectation of the preferentially processed color or motion stimuli at a peripheral target location. Thus, the baseline signals appeared to reflect a spatial, but not a feature bias, thereby making it unlikely that the biasing signals in visual cortex obtained during expectation periods reflect a memory template (if at all, only for spatial location, but not for other stimulus properties). Importantly, the baseline increases did not sum up linearly with the visually-evoked signals, thereby ruling out the possibility that attentional modulation of visually-driven activity can be explained by an additive model of sustained baseline increases.
The baseline increases found in human visual cortex may be subserved by increases in spontaneous firing rate similar to those found in single-cell physiology studies (Luck et al., 1997), but summed over large populations of neurons. The increases evoked by directing attention to a target location in anticipation of a behaviorally relevant stimulus are likely to reflect a top-down feedback bias in favor of the attended location in the human visual system.
Competition can be resolved not only by top-down mechanisms such as spatially directed attention, but also by bottom-up stimulus-driven signals. Unlike top-down biases, bottom-up biases have their source in the visual stimulus itself. For instance, as discussed in the next section, competition may be biased in favor of a visually salient item that contrasts with its background. Many bottom-up factors are likely to be generated in visual cortex itself. However, this is not necessarily the case. For instance, processing may be biased in favor of an emotionally salient item via connections with the amygdala. The critical aspect of a bottom-up bias is that it is something about the stimulus itself that induces the bias, as opposed to being imposed by the goals of the observer. Thus, bottom-up biases will affect processing in visual cortex even in the absence of top-down mechanisms directed to the stimuli in question.
In physiology studies, Reynolds and Desimone (2003) found that the responses of V4 neurons to a pair of stimuli presented together in a neuron’s RF were dominated by the more salient stimulus, suggesting that competition may have been biased in favor of the salient stimulus. In particular, as has been shown previously, the response of a V4 neuron to a reference grating decreased when a second (“probe”) grating of equal contrast was placed in the RF of the cell. However, as the contrast of the probe grating was decreased, and thus the relative salience of the reference grating increased, suppressive interactions among the stimuli were reduced. In other words, the response of the cell to the pair of stimuli resembled more and more the response to the reference grating alone, consistent with the idea that competition was being biased in favor of the more salient reference grating.
We found a similar effect in human visual cortex using color-orientation pop-out displays (Beck & Kastner, 2005) in a variation of the sequential/simultaneous fMRI paradigm, described above. Instead of the complex images used in the previous designs, four Gabor patches (wavelength, 0.47°; standard deviation of gaussian envelope, 0.73°; each approximately 2 × 2° in size) of different colors and orientations were presented in the upper right quadrant of the visual field. In addition to the sequential and simultaneous presentation conditions, the four stimuli appeared in two display contexts: pop-out displays, in which a single item differed from the others in color and orientation, and heterogeneous displays, in which all items differed from each other in both dimensions. We predicted that similar to the way in which top-down attention can counteract competitive interactions among multiple stimuli, bottom-up signals related to pop-out can bias competition in favor of the salient stimulus, resulting in reduced competitive interactions among stimuli appearing in the context of pop-out relative to heterogeneous displays. In accordance with previous findings, we found robust suppressive interactions among multiple stimuli in areas V2/VP and V4 when the stimuli were presented in the context of heterogeneous displays (Fig. 4a). However, this suppression was eliminated when the same stimuli were presented in the context of pop-out displays, consistent with the prediction that visual salience can bias competitive interactions among multiple stimuli in intermediate processing areas (Fig. 4b).
As in previous studies, we did not find evidence of competitive interactions in V1, presumably due to the small RF sizes in that area. However, an effect of display context was evident in this early visual area as well. Specifically, simultaneously presented pop-out displays evoked more activity than any of the other three conditions (Fig. 4). These results are consistent with single-cell physiology studies showing that neural correlates of pop-out can be found as early as in area V1 (Knierim & Van Essen, 1992; Nothdurft, Gallant, & Van Essen, 1999; Kastner, Nothdurft & Pigarev, 1999b). In particular, an oriented line surrounded by orthogonally oriented lines (i.e. a pop-out stimulus) evoked greater activity than the same stimulus embedded in a uniform field of lines or a random field of lines (i.e. a heterogeneous stimulus). Moreover, computational models suggest that such contrast, or pop-out related signals, can be computed in V1 (Li, 1999). Taken together, these data indicate that V1 may be the source of the signal that biases neural competition in extrastriate cortex.
In short, both single-cell and fMRI experiments corroborate the predictions of biased competition theory that there are at least two ways in which competitive interactions can be biased in favor of a particular stimulus: via top-down factors such as spatially directed attention or via bottom-up stimulus-driven factors such as stimulus salience. Both mechanisms, however, constitute a spatial bias that results in a single object dominating the response of the neuron. Bias competition theory also predicts other bottom-up effects of the stimulus on competition that may not result in a bias per se. More specifically, it predicts that competitive interactions should be modulated by perceptual grouping.
As originally noted by the Gestalt psychologists (e.g., Rubin, 1915; Wertheimer, 1923), visual stimuli in cluttered scenes may be perceptually grouped according to their similarity, proximity, common fate and other stimulus properties (Palmer, 1992; Palmer & Rock, 1994), linking elements of a scene that are likely to belong together and thereby segmenting the scene into a more limited number of object-based perceptual units. Desimone and Duncan (1995) predicted that competition should occur between these perceptual groups, but not among multiple items within a perceptual group. This prediction was derived, in part, from behavioral experiments on visual search. Bundesen & Pedersen (1983) had subjects search for a target letter, defined by its color, in a field of colored letters. When the colored letters were distributed randomly around the screen, Bundesen & Pedersen found that reaction times (RTs) to detect the target increased with increasing number of distractors, as is typical of serial search results. However, when the distractors were organized into groups on the basis of color, adding distractors to the display had little effect on search times as long as those distractors grouped with existing distractors, suggesting little competition among distractors in the same group. In contrast, the number of perceptual groups in the display had a large effect on the speed with which subjects could find the target (Bundesen & Pedersen, 1983), suggesting competition between groups. More generally, such a prediction is a natural extension of Duncan’s (1996) overriding principle that competition occurs between objects, and not features within that object. If one considers grouping as one of a number of segmentation processes that define an object, then it seems likely that little competition will occur among members of a perceptual group.
Evidence in favor of this prediction was found in a study that used another variant of the sequential/simultaneous fMRI paradigm with the same oriented Gabor patches used in the pop-out study (Beck & Kastner, 2007). We compared suppressive interactions among four identical items (homogeneous display) to those induced by four heterogeneous stimuli that differed in both color and orientation (heterogeneous display). The idea of this manipulation was that, in cluttered scenes with multiple items, identical or similar items that are present in nearby locations tend to form perceptual groups by the Gestalt principle of similarity. Therefore, we predicted that competitive interactions should be minimal with identical stimuli in the display (homogeneous condition) as compared to different stimuli (heterogeneous condition). In accordance with previous data, simultaneous presentation of four heterogeneous visual stimuli evoked significantly less activity in areas V2, VP and V4 than the same stimuli presented sequentially, consistent with the idea that the stimuli compete for neural representation. However, when the four stimuli were identical, the suppression was considerably reduced relative to the heterogeneous condition. This reduction was most evident in V4, but a similar pattern was found in V2 and VP. Such a result suggests, in accordance with our prediction, that competition is sensitive to the context in which the stimuli are presented: heterogeneous displays evoked more competition than homogeneous displays, in which the items were more likely to form a perceptual group on the basis of similarity. Thus, our results can be explained in terms of perceptual grouping and the competitive interactions expected to occur between items in that group: stimuli that form a better perceptual group evoke fewer competitive interactions. The relationship between grouping and competition can be interpreted in two ways. 
Competition may be influenced by grouping mechanisms from elsewhere in the cortex. These mechanisms could boost the activity related to the set of stimuli as it enters V4, effectively counteracting any competition that may have occurred between stimuli. Such a perspective is consistent with effects of grouping and figure-ground segmentation found in early visual cortex (Lamme, 1995; Kastner et al., 1999b; Nothdurft et al., 1999; Kapadia et al., 1995; Zhou, Friedman, & von der Heydt, 2000). Alternatively, this similarity grouping may be a consequence of the competitive interactions. As mentioned, the response of V4 neurons to a pair of stimuli is best described as a weighted average of the responses to the two stimuli when presented alone (Luck et al., 1997; Reynolds et al., 1999). If the two stimuli that comprise the pair are identical, then the weighted-average model would predict that the response to the pair should be indistinguishable from the response to each of the individual stimuli (Reynolds et al., 1999). Thus, we may not need to appeal to additional grouping mechanisms to explain our data. Instead, the reduced competition present in the homogeneous displays, relative to the heterogeneous displays, may simply be the result of the averaging procedure performed by the neuron. If less competition is evoked by similar items, then there is no need to select or filter any one of them, and instead the items are processed as a group.
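The weighted-average account can be made concrete with a small numerical sketch. The function name, the firing rates and the weights below are hypothetical values chosen only for illustration; they are not taken from Reynolds et al. (1999) or any other study cited here.

```python
def pair_response(r1, r2, w1=0.5, w2=0.5):
    """Response to a stimulus pair as a weighted average of the
    responses evoked by each stimulus when presented alone."""
    total = w1 + w2
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return (w1 * r1 + w2 * r2) / total


# Hypothetical single-stimulus firing rates (spikes/s), illustration only.
preferred, poor = 40.0, 10.0

# Heterogeneous pair: the pair response falls between the two
# single-stimulus responses, i.e. the preferred stimulus is suppressed.
heterogeneous_pair = pair_response(preferred, poor)

# Homogeneous pair: averaging two identical responses returns the
# single-stimulus response, so no suppression is predicted.
homogeneous_pair = pair_response(preferred, preferred)

# An assumed attentional bias toward the preferred stimulus, modeled
# here simply as a larger weight, moves the pair response toward it.
attended_preferred = pair_response(preferred, poor, w1=0.8, w2=0.2)
```

Under these assumptions the homogeneous pair evokes the same response as either item alone, which is the sense in which averaging by itself could produce the reduced suppression seen with identical stimuli.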
Further research is needed to discover whether there are other grouping or segmentation processes that modulate competitive interactions among stimuli. If competition occurs between objects, and not features within the object, then any display manipulation that tends to make a set of stimuli appear more like a single object should influence competitive interactions among those stimuli. McMains and Kastner (2007) have additional evidence in favor of this view. They have shown a similar reduction in competition among items that form the vertices of an illusory object relative to the same items rotated such that no illusory object is perceived.
In summary, both top-down directed attention and bottom-up, stimulus-driven, or context-dependent factors can influence competition. Figure 5 compares the sensory suppression index [SSI = (R_SEQ − R_SIM)/(R_SEQ + R_SIM); R, response computed as mean signal change; SEQ, sequential presentation condition; SIM, simultaneous presentation condition] for spatially directed attention, pop-out and stimulus similarity (Kastner et al., 1998; Beck & Kastner, 2005; Beck & Kastner, 2007). The SSI quantifies the degree of suppression (i.e. competition) induced by a display, such that positive SSIs indicate suppressive effects, an SSI of zero indicates no suppression, and negative SSIs indicate a reversal of the suppression, in which simultaneous displays actually evoke more activity than sequential displays. Solid black symbols refer to SSIs obtained by Kastner et al. (1998); open symbols refer to SSIs obtained by Beck and Kastner (2005); and grey symbols refer to SSIs obtained by Beck and Kastner (2007). In all three studies, competitive interactions among four heterogeneous stimuli were probed across visual cortex when top-down attention was directed away from the display (horizontal axis) and in the presence of either a top-down (attention) or a bottom-up manipulation (pop-out or stimulus similarity; vertical axis). The SSIs from all experiments fall below the dashed line, indicating that all three manipulations resulted in reduced competition among the four stimuli relative to the unattended heterogeneous condition. However, as indicated by their distance from the dashed line, directed attention and the homogeneity of the stimuli showed similar reductions of competitive interactions, whereas the pop-out manipulation showed the largest drop in SSI. 
Moreover, while the data for both the attended and homogeneous stimuli fall above zero on the vertical axis, indicating that some suppression remained and was thus not fully resolved by these manipulations, the data from the pop-out experiment fall on or below zero, indicating that competitive interactions were eliminated for these displays. Finally, it is also interesting to note that not only were competitive interactions in the unattended heterogeneous condition the largest in V4, as indicated by the fact that the circles from each experiment appear the furthest to the right of the graph in Figure 5, but the modulations induced by the biasing mechanisms were also the largest in V4, as indicated by the fact that the circles fall the furthest from the dashed line in Figure 5. Such a result is consistent with the notion that V4 might be an important site for competition to be resolved.
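The sensory suppression index defined above is simple to compute. The following minimal sketch assumes responses expressed as mean percent signal change; the function name and the example values are hypothetical and are not data from the studies summarized in Figure 5.

```python
def sensory_suppression_index(r_seq, r_sim):
    """SSI = (R_SEQ - R_SIM) / (R_SEQ + R_SIM).

    r_seq: mean signal change for sequential presentation
    r_sim: mean signal change for simultaneous presentation
    """
    denom = r_seq + r_sim
    if denom == 0:
        raise ValueError("SSI is undefined when R_SEQ + R_SIM is zero")
    return (r_seq - r_sim) / denom


# Hypothetical mean signal changes (%), chosen only for illustration.
suppressed = sensory_suppression_index(1.5, 0.5)      # SIM < SEQ: positive SSI
no_suppression = sensory_suppression_index(1.0, 1.0)  # SIM == SEQ: SSI of zero
reversed_ssi = sensory_suppression_index(0.5, 1.5)    # SIM > SEQ: negative SSI
```

Because the difference is normalized by the summed response, the index is bounded between -1 and 1 for non-negative responses, which allows areas with different overall response magnitudes to be compared on one scale.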
Finally, we would again like to emphasize that we have discussed the top-down and bottom-up influences on competition separately because we believe they depend on qualitatively different mechanisms. However, in everyday life these biases are likely to interact. Future research will have to investigate how and where, and under what circumstances, these mechanisms interact.
Although the top-down biases studied thus far have primarily been spatial, biased competition theory also posits that visual cortex activity can be biased in favor of a relevant feature in parallel across the visual field. In this view, search for a particular object does not need to occur by serially moving spatially directed attention across the visual field; instead, selection can be biased in favor of a particular feature simultaneously across the visual field. For example, returning to the example of looking for a friend in a crowd, if we know that our friend is wearing a red shirt, we could bias our search in favor of the color red and thus search only among those individuals wearing red, filtering out the remaining people. Such a view is reminiscent of guided search (Wolfe, Cave & Franzel, 1989), in which simple features, such as color, can be used to select a subset of the visual array in which to search serially. Moreover, similar mechanisms have been proposed to explain feature-based attention and the spatial attention effects described above (Boynton, 2005).
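The two-stage logic of this red-shirt example can be sketched as a toy procedure. This is an illustration of the general idea only, not an implementation of the guided search model of Wolfe, Cave and Franzel (1989); the `Item` class, the `guided_search` function and the example crowd are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    color: str
    is_target: bool


def guided_search(items, cued_color):
    """Return (index of the target or None, number of items inspected)."""
    # Parallel stage: a feature bias selects every item that shares the
    # cued color, regardless of its location in the display.
    candidates = [i for i, item in enumerate(items) if item.color == cued_color]
    # Serial stage: candidates are inspected one at a time until the
    # target is found.
    for inspected, idx in enumerate(candidates, start=1):
        if items[idx].is_target:
            return idx, inspected
    return None, len(candidates)


# A hypothetical crowd of five people; the friend wears red.
crowd = [
    Item("blue", False),
    Item("red", False),
    Item("green", False),
    Item("red", True),
    Item("blue", False),
]
found_at, inspected = guided_search(crowd, "red")
```

In this toy case only the two red items are inspected serially, so the cost of the serial stage depends on the size of the feature-selected subset rather than on the size of the whole display.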
Recent physiological data provide evidence for the guided search model and the spatial and parallel feature-based biases predicted by biased competition theory (Bichot, Rossi & Desimone, 2005). Bichot and colleagues recorded from V4 neurons while monkeys searched for either a target shape or target color in an array of elements composed of a variety of colored shapes. Critically, the monkey was free to move his eyes and thus the researchers could record from a neuron whose RF contained an item that either was or was not the target of the next saccade. Items that were not the target of a subsequent saccade were assumed not to be the focus of spatially directed attention. Importantly, the response to a preferred stimulus (e.g. a red item) in a neuron’s RF was enhanced when it matched the cue (e.g. indicating a red target), even when the monkey had not yet found it (i.e. it was not the target of a subsequent saccade). In fact, the response to a stimulus similar (e.g. magenta) to the preferred stimulus (e.g. red) in the neuron’s RF was enhanced when it matched the cue. Moreover, these results were present not only in the form of enhanced spiking rate but also in the form of increased synchrony between the spikes and local field potentials (LFP) in the gamma range. The fact that these enhancements and increased synchronization occurred at locations that presumably were not spatially attended supports the idea that activity in visual cortex can be biased in favor of a target feature in parallel across the visual field. However, it should be noted that the same data also provided evidence for a spatially directed bias. V4 cells showed enhanced spiking activity when the stimulus within the RF was the target of a subsequent saccade (i.e. was spatially attended) relative to when the stimulus was not the target of a subsequent saccade. 
In other words, these data provide support for the guided search model in which both parallel and serial search processes work together to select a target, and for biased competition theory, which posits both parallel and serial biasing mechanisms.
Thus far, we have discussed ways in which competition can be biased or modulated in visual cortex. However, biased competition theory also postulates that the source of the top-down bias is outside of visual cortex, and more specifically it postulates that the sources will overlap with structures involved in attention and working memory. This prediction rests on the concept of an “attentional template” in which the sought-after object is held in working memory and used as a basis for selecting the target in a field of distractors. Indeed, the changes in baseline activity in the absence of visual stimulation described in section 3.1.2 are compatible with such a notion. But what specific structures are implicated in these processes?
Initially, biased competition theory predicted that the main source of the bias would be prefrontal cortex (PFC), the main reason being that PFC neurons show stimulus-specific delay activity in working memory tasks (see Funahashi, 2006 for review), therefore making them ideal candidates for the neural substrate of an attentional template. Moreover, lesions or deactivations of PFC impair performance on working memory tasks (see Fuster, 1997 and Curtis, 2006 for review). Likewise, fMRI studies in humans have demonstrated elevated delay activity in spatial and object working memory tasks in PFC (Courtney et al., 1997; Sakai, Rowe & Passingham, 2002; D’Esposito, Cooney, Gazzaley, Gibbs & Postle, 2006; Kastner et al., 2007). More importantly, from the perspective of providing a feedback signal to visual cortex, PFC has been shown to have reciprocal connections with almost all extrastriate visual cortex (Barbas, 1988; Barbas & Pandya, 1989; Ungerleider, Gaffan & Pelak, 1989; Webster, Bachevalier & Ungerleider, 1994) and cooling PFC has been shown to disrupt delay activity in inferior temporal (IT) cortex (Fuster, Bauer & Jervey, 1985).
Although PFC was initially proposed as the source of the top-down bias, it has since been shown that parietal cortex also has properties that make it a likely candidate for a source of the biasing signal that modulates activity in visual cortex. For instance, fMRI studies have revealed that the intraparietal sulcus (IPS) and posterior parietal cortex (PPC) are also activated in the delay period of a working memory task (Curtis, Rao & D’Esposito, 2004; Jonides et al., 1998). The PPC has also been heavily implicated in spatially directed attention (Kastner & Ungerleider, 2000), which has already been shown to modulate competitive interactions in visual cortex (Reynolds et al., 1999; Recanzone & Wurtz, 2000; Kastner et al., 1998). In fact, a network of regions consisting of the superior parietal lobule (SPL), the frontal eye fields (FEF), and the supplementary eye fields (SEF) was found to be activated by a variety of visuospatial attention tasks (for a meta-analysis, see Kastner & Ungerleider, 2000; Pessoa et al., 2003). Using the same fMRI design that served to study baseline increases, as described above, Kastner and colleagues (1999a) found evidence to support the idea that these same regions may generate the top-down biasing signals. Areas within the FEF, SEF, the SPL and the IPS were activated when subjects attended to the peripherally presented stimuli and monitored the location nearest fixation for a target stimulus compared to a control condition in which the subjects ignored the same stimuli and instead attended a central fixation dot (Fig. 6A). More importantly, these same areas were activated during the expectation period in which visual stimulation was absent (Fig. 6B), making them likely candidates for generating the baseline shifts seen in visual cortex, described in the previous section. Moreover, as illustrated by the timecourse of fMRI signals in FEF (Fig. 3B), the increase in activity during the expectation period was greater in the frontoparietal regions than in visual cortex and, unlike in visual cortex, there was no further increase in evoked activity during the presentation of the attended visual stimulus. Such a pattern of results suggests that the frontoparietal activations played a role in spatially directing attention to the target location or maintaining target identity in working memory. In other words, in keeping with the predictions of biased competition theory, these activations are more likely to reflect the source of the biasing signal rather than the recipient of the bias. This notion has been strongly supported by physiology studies in monkeys, in which subthreshold stimulation of eye movement representations within FEF resulted in spatial attention-like modulation of activity in retinotopically corresponding sites in V4 (Moore & Armstrong, 2003).
Finally, the neurological syndrome of visuospatial neglect, and the related phenomenon of visual extinction, are of particular relevance to biased competition theory and the possible source of the top-down bias. Patients with visuospatial neglect, which typically follows unilateral lesions of the right parietal lobe, show impairments in directing spatial attention to the contralesional side of space, and in severe cases may completely disregard contralesional space, eating from one side of the plate or applying make-up to one side of the face (Vallar & Perani, 1986; Bisiach & Vallar, 1988; Heilman et al., 1993; Rafal, 1994). In less severe cases, patients may exhibit visual extinction, in which they are able to orient to an object in their contralesional field when it is presented alone. However, when there is a competing object in the opposite visual field, they will report only the object in the intact visual field and deny the presence of an object in the impaired field. Such findings have been characterized as an attentional bias towards the intact hemifield in the presence of competing distractors (Kinsbourne, 1993), and implicate the parietal cortex as a source of the attentional bias. Behavioral testing of RM, a patient with Balint’s syndrome who has bilateral lesions of the parietal lobe, corroborated this hypothesis. In the presence of distractors, RM was impaired at discriminating the orientation of a target grating or discriminating morphed faces. Moreover, this impairment increased with the increasing salience of the distractors, a pattern not shown by control subjects (Friedman-Hill, Robertson, Desimone & Ungerleider, 2003). Such a result is again consistent with the idea that the parietal cortex participates in the filtering of unwanted information, and therefore represents a prime candidate for the source of top-down biasing signals.
In summary, data gathered from a variety of methods suggest that regions in both the frontal and parietal cortex are likely candidates for the source of the biasing signal that, according to biased competition theory, resolves competition in visual cortex.
The third tenet of biased competition theory is perhaps the one that is least supported by empirical evidence. This is Duncan’s (1996, 2006) supposition that selection of a target object emerges through the integration of many separate competitive systems, such that when an object gains dominance in one system (e.g. visual cortex), it will tend to gain similar dominance in other systems (e.g. higher order frontal and parietal areas). Everling, Duncan and colleagues (2002, 2006) provide some evidence in favor of this hypothesis. They trained monkeys to detect a target object (a picture of a fish) in a target location (indicated by a precue) among distractor objects (hamburger and bear) presented at either the cued or uncued location. Unlike similar experiments directed at visual cortex (Moran & Desimone, 1985; Luck et al., 1997; Reynolds et al., 1999; Chelazzi, Duncan, Miller & Desimone, 1998), however, Everling et al. isolated neurons in prefrontal cortex (PFC). Like the previous experiments directed at visual cortex, Everling et al. found evidence of attentional selectivity, with an enhancement of the attended object relative to an unattended object, but for the PFC neurons this selectivity was more complete. Unlike in visual cortex, where attentional filtering was spatially local (i.e. attentional filtering only occurred when the stimuli fell within the same RF), in the PFC, attention had a global filtering effect throughout the visual field. Such a result is consistent with the idea that competitive interactions and biases are occurring throughout the system, with the PFC reflecting stronger and more global effects of selective attention. However, more research is needed, in which recordings are taken at multiple sites (e.g. PFC and IT), to determine how, or even whether, competition is biased throughout the cortex.
Another example of the integration principle has recently been found in the spatial domain. Attention effects in visual cortex have been shown to be highly spatially specific. For example, directing attention to a particular region of space enhances responses only in cortical areas with a representation of the attended location (e.g. Brefczynski & DeYoe, 1998). Thus, a seemingly straightforward hypothesis deduced from the integration principle of biased competition theory is that all spatial maps in the brain that are engaged during the spatial allocation of attention will integrate their information to generate biasing signals that selectively enhance responses of neurons representing the attended location. The recent finding of spatial maps in higher-order frontal and parietal cortex that are revealed in cognitive tasks such as a memory-guided saccade task (Schluppeck et al., 2006; Kastner et al., 2007) has enabled us to test this idea directly (Szczepanski & Kastner, 2006). Subjects directed attention to a peripheral target location in one of the four visual quadrants. It was found that topographically organized areas in parietal cortex (IPS 1–4 located along the intraparietal sulcus and an area in the SPL) and in frontal cortex (the FEF and a region in the inferior precentral cortex) yielded spatially specific signals in this task, similar to those found typically in visual cortex during the spatial allocation of attention. These studies provide the first evidence that spatially specific signals representing the attended location gain dominance throughout lower-order and higher-order spatial representations of the cortex.
Finally, the idea that when an object gains dominance in one system it will tend to gain dominance in other systems is related to the concept of object-based attention. There is a large behavioral literature showing that attention is not simply directed to locations or features of an object, but to the whole object itself. Neuroimaging studies have further supported this notion by showing that the effects of attention appear to spread to the unattended features of the attended object (O’Craven et al., 1999; McMains et al., 2007). For instance, Kanwisher and colleagues (O’Craven et al., 1999) showed that when subjects were asked to attend to a face that was moving, increased activity was not only found in the fusiform face area (FFA) but also in MT, which is sensitive to motion. Similarly, when subjects were asked to attend to either the motion or the color of a patch of colored moving dots, similar attention-related increases in activity were seen in V4, TEO and MT (McMains et al., 2007). In other words, despite the fact that subjects were only asked to attend to and make judgments about a single feature, attention-related effects were found throughout the system including in regions that preferentially process the unattended features of the object.
The biased competition theory of selective attention, as proposed by Desimone & Duncan (1995), has had a tremendous influence on the field of visual attention. Moreover, evidence now exists in favor of all three of its most basic principles. The first principle of competition now seems well established; multiple stimuli presented simultaneously in the visual field compete for representation in visual cortex by mutually suppressing neural responses. Moreover, evidence suggests that this competition is greatest at the level of the RF. There is also now an increasing body of evidence in favor of the second principle of control, suggesting that competition can be biased by both top-down and bottom-up factors. The finding that the stimulus-driven factor of stimulus similarity also affects competition is particularly interesting, as it opens a new avenue of investigation into influences on competition. Unlike the experiments involving directed attention and stimulus salience (i.e. pop-out), which ask whether competition can be biased in favor of a particular stimulus, the stimulus similarity experiment is really asking what the units of competition are. What types of stimuli compete with each other? How do grouping and segmentation processes and other processes involved in the representation of an object influence the ongoing competition? Also, thus far, the bottom-up and top-down manipulations of competition have been studied in isolation. How might stimulus-driven and top-down factors interact to resolve competition? Of the three basic principles of biased competition theory put forth by Duncan, the third, that competition is integrated across systems, is probably the least investigated. However, the few studies that address this issue are in general agreement with this principle. Attended stimuli appear to be preferentially processed in multiple regions throughout the brain. 
However, further research is needed to determine the extent of the regions involved and the generality of this finding. It would be important, for instance, to determine whether other biasing mechanisms, such as stimulus salience, have similar effects in multiple brain regions. In short, despite more than a decade of work inspired by biased competition theory, the framework continues to provide a rich source of research questions related to how we select and represent visual information.
We thank Michael Arcaro for help with manuscript preparation. Our work is supported by grants from NIMH (RO1 MH64043, P50 MH-62196), NEI (RO1 EY017699) and NSF (BCS –0633281) to SK, and from NIMH (RO3 MH082012) to DB.
Diane M. Beck, Department of Psychology and Beckman Institute, University of Illinois, Urbana-Champaign, 405 North Mathews Ave., Urbana, IL 61801.
Sabine Kastner, Department of Psychology, Princeton Neuroscience Institute, and Center for the Study of Brain, Mind & Behavior, Princeton University, Princeton, NJ 08540.