Multiple stimuli that are present simultaneously in the visual field compete for neural representation. At the same time, however, multiple stimuli in cluttered scenes also undergo perceptual organization according to certain rules originally defined by the Gestalt psychologists, such as similarity or proximity, thereby segmenting scenes into candidate objects. How can these two seemingly orthogonal neural processes that occur early in the visual processing stream be reconciled? One possibility is that competition occurs among perceptual groups, rather than at the level of elements within a group. We probed this idea using fMRI by assessing competitive interactions across visual cortex in displays containing varying degrees of perceptual organization, or perceptual grouping (Grp). In strong Grp displays, elements were arranged such that either an illusory figure or a group of collinear elements was present, while in weak Grp displays the same elements were arranged randomly. Competitive interactions among stimuli were overcome throughout early visual cortex and V4 when elements were grouped, regardless of Grp type. Our findings suggest that context-dependent grouping mechanisms and competitive interactions are linked to provide a bottom-up bias towards candidate objects in cluttered scenes.
In typical visual scenes, many visual elements are present that tend to segment into foreground and background regions and group together in order to form candidate objects. Principles of perceptual organization that mediate such perceptual segmentation were first described by the Gestalt psychologists (Wertheimer, 1923; Rubin, 1958), who proposed that visual stimuli may be perceptually grouped according to several rules such as similarity, proximity, or common fate (Palmer, 1999). For instance, stimuli that are of the same color or shape will tend to group together. Perceptual grouping and image segmentation are thought to be fundamental problems that the visual system must solve, and many of these perceptual organization processes are generally thought to occur early in the visual processing stream (Driver et al., 2001).
However, it has also been shown that due to the limited processing capacity of the visual system, multiple stimuli present at the same time in the visual field are not processed independently, but rather compete for neural representation (Desimone and Duncan, 1995; Beck and Kastner, in press). Neural correlates of competitive interactions have been found in visual cortex in both single-cell physiology and functional brain imaging studies (Miller et al., 1993; Kastner et al., 1998; Reynolds et al., 1999; Kastner et al., 2001; Beck and Kastner, 2005; Beck and Kastner, 2007). Neuroimaging studies in humans have found that multiple simultaneously presented stimuli evoke a smaller response than the same stimuli presented sequentially (Kastner et al., 1998; Kastner et al., 2001; Beck and Kastner, 2005; Beck and Kastner, 2007). These studies indicate that stimuli present simultaneously in the visual field interact in mutually suppressive ways. Competitive interactions among multiple stimuli have been found to occur automatically and outside the focus of attention (Kastner et al., 1998; Reynolds et al., 1999), thereby constituting a ‘default state’ when viewing natural scenes. As a result of neural competition, the representation of an individual stimulus is weakened when presented among other stimuli. A second fundamental problem the visual system must solve is how neural competition among multiple stimuli can be overcome. According to the biased competition theory of attention, the ongoing competition among stimuli can be influenced or ‘biased’ either by top-down processes that reflect current behavioral goals or by bottom-up stimulus-driven factors (Desimone and Duncan, 1995). If attention is directed to one of multiple stimuli present at the same time in the visual field, suppressive influences of nearby distractors are counteracted (Recanzone et al., 1997; Kastner et al., 1998; Reynolds et al., 1999).
Neural competition can also be influenced by bottom-up stimulus-driven factors such as visual salience (Reynolds et al., 1999; Beck and Kastner, 2005). Thus far, most studies have only considered competitive interactions that occur among individual stimuli that appear as separate objects in visual displays. However, as discussed above, objects are generally composed of many elements.
It seems paradoxical that visual stimuli compete for neural representation on the one hand, while at the same time mechanisms operate to group stimuli together. How can these two seemingly orthogonal, yet fundamental processes of early vision be reconciled? One possibility is that perceptual grouping and competition occur mostly independently of each other, such that grouped and ungrouped elements compete similarly. Alternatively, the two processes might depend on each other, such that once stimuli are grouped together via perceptual organization mechanisms, they compete as a unit. Biased competition theory proposes that competition occurs mainly between objects and not among elements that belong to a common object (Desimone and Duncan, 1995). Consider the Kanizsa illusion (Kanizsa, 1976), where four circular ‘pacman’ elements (i.e. inducers) are aligned to form an illusory square (Fig. 1A). If competition occurs mainly between objects and not among elements that belong to a common object, then the four inducers presented in the context of an illusory figure should compete with each other less than when the inducers are rotated outward, thereby disrupting the perceptual grouping among the inducers.
We probed the interaction of perceptual organization and competition by testing the influences of visual displays with varying degrees of perceptual organization on competitive interactions among multiple stimuli for two different perceptual organization principles, illusory contour formation and collinearity. We found that, when stimuli were grouped, they competed with each other less than when the same stimuli were presented without any context of grouping. These results begin to define what the units of competition and ultimately of attentional selection are, thereby implementing fundamental early grouping processes into the framework of a biased competition account.
Thirteen subjects (5 females, age 22–37, normal or corrected-to-normal visual acuity) gave informed written consent for participation in the study, which was approved by the Institutional Review Panel of Princeton University. Ten subjects participated in experiment 1, while twelve subjects, including nine subjects from experiment 1, participated in experiment 2. In addition, six of the twelve subjects participated in a control study for experiment 1.
The stimuli were generated using Matlab software (Mathworks, Natick, MA) and the Psychophysics Toolbox (Brainard, 1997; Pelli, 1997) and were projected onto a projection screen located at the end of the scanner bore using a PowerLite 7250 liquid crystal display projector (Epson; Long Beach, CA). In Experiment 1 (Exp 1), the visual stimuli consisted of four Varin (Varin, 1971) pacman inducers (width of 1.75°) presented in four nearby locations (separated by .5°) in the upper right quadrant of the visual field (Fig. 1C, D). The entire stimulus display encompassed 4 × 4° and was centered approximately 9.5° from fixation. Color and angle of inducers were the same for a given display, but both dimensions differed from display to display (Fig. 1A, B; color: red (7.1 cd/m2), green (38.58 cd/m2), yellow (45.5 cd/m2), purple (8.87 cd/m2), or cyan (39.4 cd/m2); angle: 50°, 70°, 90°, 110°, or 130°). In a control study, each of the inducers differed in color and luminance for a given display, but had the same angle (Fig. 3A). Stimuli were presented on a dark gray (1.7 cd/m2) background.
Stimuli were shown under two presentation conditions, sequential (SEQ) and simultaneous (SIM; Fig. 1C, D), following a previously established research protocol (Kastner et al., 1998; Kastner et al., 2001; Beck and Kastner, 2005; Beck and Kastner, 2007). In the SEQ condition, each inducer was presented alone in one of the four locations, in random order, for 250 ms. In the SIM condition, the four inducers were presented in the same locations and for the same duration, but together. The SIM stimulus array appeared at a randomly chosen onset of 0, 250, 500, or 750 ms after the beginning of a 1 s presentation period. Stimuli were presented in blocks lasting 15 s, each thus consisting of 15 different stimulus display presentations.
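The timing of one 1 s presentation period in each condition can be sketched as follows. This is a minimal illustration in Python, not the authors' Matlab/Psychophysics Toolbox code; the function names and the event format `(onset_ms, duration_ms, locations)` are hypothetical. It also checks that, summed over a period, each location is stimulated for the same total time in both conditions.

```python
import random

LOCATIONS = [0, 1, 2, 3]   # indices of the four stimulus locations
STIM_MS = 250              # duration of each stimulus presentation (from the text)
SIM_ONSETS_MS = [0, 250, 500, 750]  # possible SIM onsets within the 1 s period

def seq_period(rng):
    """SEQ: each stimulus alone, one location at a time, in random order."""
    order = LOCATIONS[:]
    rng.shuffle(order)
    return [(i * STIM_MS, STIM_MS, [loc]) for i, loc in enumerate(order)]

def sim_period(rng):
    """SIM: all four stimuli together, at one randomly chosen onset."""
    return [(rng.choice(SIM_ONSETS_MS), STIM_MS, LOCATIONS[:])]

def stimulated_ms_per_location(events):
    """Total stimulation time per location within one period."""
    total = {loc: 0 for loc in LOCATIONS}
    for _onset, duration, locs in events:
        for loc in locs:
            total[loc] += duration
    return total

rng = random.Random(0)  # fixed seed only so the sketch is reproducible
seq_block = [seq_period(rng) for _ in range(15)]  # 15 periods per 15 s block
sim_block = [sim_period(rng) for _ in range(15)]
```

In both conditions every location receives 250 ms of stimulation per period; only the temporal overlap between locations differs.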
In addition to the two presentation conditions, two display type conditions were probed that varied the degree of perceptual organization in a given display. In the ‘strong perceptual group’ condition (Fig. 1A, StrongGrp), the inducers were rotated inward so that an illusory figure was present. In the ‘no perceptual group’ condition (Fig. 1B, NoGrp), the four inducers were randomly rotated outward either 225°, 200°, 180°, 160°, or 135° from their positions in the StrongGrp condition. For both perceptual organization conditions, the same sequence of color and inducer angles was shown within each run.
In Experiment 2 (Exp 2), all parameters except for the stimulus display that was used were the same. The stimulus display (Fig. 4A, B) consisted of 16 oriented colored (red, green, yellow, purple, or cyan) gabors (width of .5°, 3 cycles per degree) presented in the upper right quadrant of the visual field (average of .5° between gabors). Each gabor stimulus was jittered in space (up to .25° in all directions) and orientation, and the entire display was 4 × 4°. Stimuli, which had an average luminance of 27.9 cd/m2, were presented on a black background (.14 cd/m2). In the SEQ condition, the 16 gabor stimuli were divided into quadrants, each containing 4 gabor stimuli (Fig. 4A, gray dashed lines). In the SIM condition, all 16 stimuli in the display were presented together. In the StrongGrp condition (Fig. 4B), four of the stimuli were aligned to form a collinear contour with .2° between each stimulus (thus incorporating the Gestalt principles of grouping by similarity and proximity). The alignment could occur in one of four locations (either the second or third row, or second or third column). The color and orientation of the 12 gabor stimuli that were not part of the collinear contour were held constant for both perceptual organization conditions.
For both Exp 1 and Exp 2, presentation (SEQ or SIM) and perceptual organization (NoGrp or StrongGrp) conditions were combined in a 2×2 factorial design. Within a scanning run, each condition was presented once, interleaved with blocks of blank periods of 15 s each, for a total run length of 135 s. Presentation conditions were presented in an ABBA block order (SEQ-SIM-SIM-SEQ) with perceptual organization condition counterbalanced across the 12 runs (Kastner et al., 1998).
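The run structure can be laid out as a simple block schedule. The sketch below is illustrative Python, with a hypothetical `build_run` helper; it assumes that each run begins and ends with a blank block, which the text does not state explicitly but which is consistent with four 15 s stimulus blocks and a 135 s run. The particular grouping order passed in is one arbitrary example of the counterbalancing.

```python
BLOCK_S = 15  # duration of every stimulus and blank block, in seconds

def build_run(grouping_order):
    """Return ((onset_s, label) pairs, run length in s) for one run.

    grouping_order gives the perceptual organization condition of each of
    the four stimulus blocks; presentation order is fixed to ABBA.
    """
    presentation = ["SEQ", "SIM", "SIM", "SEQ"]
    schedule, t = [], 0
    for pres, grp in zip(presentation, grouping_order):
        schedule.append((t, "blank"))        # assumed: a blank precedes each block
        t += BLOCK_S
        schedule.append((t, f"{pres} {grp}"))
        t += BLOCK_S
    schedule.append((t, "blank"))            # assumed: the run ends with a blank
    return schedule, t + BLOCK_S

# One example ordering; each of the four conditions occurs exactly once.
schedule, run_length = build_run(["NoGrp", "StrongGrp", "NoGrp", "StrongGrp"])
```

With this layout the four stimulus blocks and five blank blocks sum to the stated run length of 135 s.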
Throughout the course of both experiments, subjects monitored a rapid serial visual presentation (RSVP) stream, which consisted of digits and keyboard symbols (23456789&%$#), for the appearance of a target letter (A, B, or C). Each symbol (average height .47°, width .43°) was present for 200 ms with targets occurring on average every 2.3 s. In the control study for Exp 1, given that subjects had previous experience with the RSVP task, its difficulty was increased by having subjects search for a target letter (S or K) among all other letters. Each letter was presented for 175 ms or less to keep subjects' performance around 75% (Table 1). Given that subjects were engaged in a demanding task at fixation, eye tracking was not performed.
Data were collected with a 3-T Siemens Allegra scanner (Allegra, Siemens, Erlangen, Germany). In an initial scan session, high-resolution structural images were acquired for each subject for the purpose of three-dimensional cortical surface reconstruction (MPRAGE sequence, TR = 2.5 s, TE = 4.38 ms, flip angle 8°, 1 mm3 resolution), and retinotopic mapping was performed. For Exp 1, a standard birdcage head coil was used; for Exp 2 and the control study, a 4-channel visual surface coil (Nova Medical, Wilmington, MA) was used. Data acquisition and analysis were otherwise identical for all experiments. Functional images were taken with a gradient echo, echo planar sequence (TR = 2.5 s, TE = 40 ms, flip angle = 80°, 128 × 128 matrix). Twenty-five contiguous, axial slices (thickness 2 mm, 1 mm gap, in-plane resolution 2 × 2 mm) covering occipital cortex were acquired. A high-resolution anatomical scan was collected (MPRAGE, same parameters as above) for anatomical comparison. An in-plane magnetic field map image was acquired to perform echo planar imaging undistortion (TR = .5 s, TE = 5.23 or 7.69 ms, flip angle 55°, 2 mm slices, in-plane resolution 2 × 2 mm).
Data were analyzed using AFNI (including SUMA) (Cox and Hyde, 1996) (http://afni.nimh.nih.gov/afni/, http://afni.nimh.nih.gov/afni/suma), Matlab (Mathworks, Natick, MA), and Freesurfer (Dale et al., 1999; Fischl et al., 1999). The functional images were motion corrected, undistorted using the field map scan, and spatially smoothed in-plane with a Gaussian filter of 2 mm. The first six images of each scan were excluded from analysis. Statistical analyses were performed using multiple regression in the framework of the general linear model (Friston et al., 1995) with AFNI. Time series were modeled by using a regressor that contrasted blocks of peripheral visual presentations (regardless of presentation or perceptual organization condition) versus blank periods, convolved with a Gaussian model of the hemodynamic response. Additional regressors were used to factor out within-run linear drifts, quadratic drifts, and head movement artifacts. Statistical maps comparing peripheral stimulation blocks versus blank periods were thresholded at p < .0001 or less (uncorrected for multiple comparisons) such that the comparison revealed only voxels activated by the peripheral stimuli.
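The structure of the regression model described above can be sketched as a design matrix. This is plain illustrative Python, not the AFNI analysis itself: the Gaussian kernel parameters (`PEAK_S`, `SD_S`) are hypothetical values chosen for illustration, the blank-first block order is an assumption, and the head-movement regressors and the exclusion of the first six images are omitted for brevity.

```python
import math

TR = 2.5        # seconds per volume (from the text)
N_VOL = 54      # 135 s run / 2.5 s per volume
BLOCK_VOL = 6   # 15 s block / 2.5 s per volume

# Boxcar: 1 during peripheral stimulation blocks, 0 during blank blocks.
# Assumes blocks alternate blank, stimulus, blank, ... starting with a blank.
boxcar = [1.0 if (v // BLOCK_VOL) % 2 == 1 else 0.0 for v in range(N_VOL)]

# Gaussian response kernel sampled at the TR; parameters are hypothetical.
PEAK_S, SD_S = 5.0, 2.0
kernel = [math.exp(-0.5 * ((k * TR - PEAK_S) / SD_S) ** 2) for k in range(7)]
norm = sum(kernel)
kernel = [w / norm for w in kernel]  # unit area, so a sustained block approaches 1

def convolve(signal, weights):
    """Causal discrete convolution, truncated to the length of the signal."""
    return [
        sum(weights[k] * signal[t - k] for k in range(len(weights)) if t - k >= 0)
        for t in range(len(signal))
    ]

stim = convolve(boxcar, kernel)

# Nuisance columns: constant, linear drift, quadratic drift (time scaled to [-1, 1]).
half = (N_VOL - 1) / 2
time = [(v - half) / half for v in range(N_VOL)]
design = [[stim[v], 1.0, time[v], time[v] ** 2] for v in range(N_VOL)]
```

Fitting each voxel's time series to these columns by least squares would give a weight for the stimulation regressor, whose test against zero corresponds to the stimulation-versus-blank contrast described in the text.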
Activated voxels were assigned to regions of interest (ROI) in visual cortex, as defined below. Time series of fMRI intensities were extracted from each ROI from unsmoothed data and were normalized to the last two time points of the average fixation block of a run. For each subject, mean signals were computed by averaging across the peak 3 time points (7.5–12 s) of the average time series, for each condition and visual area. These values were further quantified by defining a sensory suppression index (SSI = (R_SEQ − R_SIM) / (R_SEQ + R_SIM), where R is the response computed as mean signal change during the respective presentation condition) for the two perceptual organization conditions for each experiment. All reported t-tests are two-tailed.
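As a worked illustration of the index, the following Python sketch computes the SSI from a pair of mean responses. The function name and the response values are hypothetical and chosen only to show how the sign of the index behaves; they are not data from this study.

```python
def sensory_suppression_index(r_seq: float, r_sim: float) -> float:
    """SSI = (R_SEQ - R_SIM) / (R_SEQ + R_SIM)."""
    total = r_seq + r_sim
    if total == 0:
        raise ValueError("R_SEQ + R_SIM must be non-zero")
    return (r_seq - r_sim) / total

# Hypothetical mean signal changes (%), for illustration only.
ssi_suppressed = sensory_suppression_index(r_seq=1.2, r_sim=0.8)  # SEQ > SIM
ssi_released = sensory_suppression_index(r_seq=1.2, r_sim=1.2)    # SEQ == SIM
```

A positive value indicates a weaker response to simultaneous than to sequential presentation, and a value of zero indicates no difference between the two presentation conditions.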
Retinotopic mapping was performed for each subject using flickering checkerboard stimuli (Swisher et al., 2007) and standard phase encoding techniques (Sereno et al., 1995; Schneider et al., 2004) in order to identify retinotopically organized areas (V1, V2, V3, V4, V3a, V7 (Tootell et al., 1998; Kastner et al., 2001; Wade et al., 2002)). In addition, human MT+ was identified based on its characteristic anatomical location (Watson et al., 1993).
In Experiment 1, the stimulus display consisted of four Varin (Varin, 1971) ‘pacman’ illusory contour inducers presented either sequentially (SEQ: Fig. 1C) or simultaneously to the periphery of the visual field (SIM: Fig. 1D). Integrated over time, the physical stimulation parameters in each of the four locations were identical in the two presentation conditions. However, as shown previously (Kastner et al., 1998; Beck and Kastner, 2005), competitive interactions among the stimuli could only take place in the SIM, but not in the SEQ presentation condition, and were indexed by the difference between the two presentation conditions.
In the ‘strong perceptual group’ (StrongGrp) condition, the inducers were aligned such that an illusory figure (e.g. a square) was formed in the SIM condition (Fig. 1A). Thus, when the four inducers were rotated inward they were linked together to form a single foreground object (Palmer, 1999). In the ‘no perceptual group’ (NoGrp) display condition, the inducers were randomly rotated outward, so that no figure was present and the four elements appeared as four single foreground objects with different orientations when simultaneously presented (Fig. 1B). Both perceptual organization conditions were presented SIM and SEQ to assess the degree of competitive interactions among the stimuli in the display under the different perceptual organization conditions. Importantly, the stimuli were matched across perceptual organization conditions, such that the main difference between the two simultaneous conditions was the presence or absence of a strong perceptual group. Throughout the presentation of the visual displays, subjects monitored a rapid serial visual presentation (RSVP) stream at fixation. This demanding fixation task prevented subjects from attending to the peripheral stimulus displays, thereby allowing the interaction of perceptual organization and competition to be investigated outside the focus of attention.
The stimulus array evoked robust activity throughout visual cortex, including early visual areas V1, V2, V3, areas of the ventral stream (V4), and areas of the dorsal stream (V3a, MT, and V7). In previous studies, competitive interactions have been investigated mainly in the ventral stream due to its important role in object processing. Recently, areas along the dorsal stream have also been found to represent object information (Konen and Kastner, 2008), and consequently we investigated activity in these regions. However, while competitive interactions were present in the dorsal areas for both the illusory contour and collinearity experiments (NoGrp SEQ vs. SIM all t>2.3, p <.05), there was no clear effect of perceptual organization in areas within the dorsal stream. Therefore, we focused the subsequent analysis on ventral visual cortex. Notably, the lateral occipital complex (LOC) was not robustly activated. While this area has been implicated in the processing of illusory contours (Mendola et al., 1999), it shows a preference for stimuli presented at the fovea (Sayres and Grill-Spector, 2008). Our stimuli, which were presented in the more peripheral parts of the visual field, were therefore not ideal for activating LOC.
As previously shown (Kastner et al., 1998; Kastner et al., 2001), activity evoked by the stimulus display in the NoGrp SIM condition was smaller than that evoked by the NoGrp SEQ condition throughout visual cortex (Fig. 2: V1, V2, V3, V4: all t>3.4, p<.01), particularly in intermediate areas such as V4, reflecting the suppressive interactions that occur mainly at the level of the receptive field (RF) when multiple objects compete for neural representation. For the StrongGrp display condition, we hypothesized that the four inducers when arranged to form an illusory figure, that is, a single foreground object, would not compete to the same degree for neural representation. Thus, we expected a smaller difference in responses evoked by the SEQ and SIM presentation conditions in the StrongGrp as compared to the NoGrp display condition. A release from competition is reflected as an interaction between the perceptual organization (StrongGrp vs. NoGrp) and presentation conditions (SEQ vs. SIM), and is driven by the greater activity evoked by the SIM StrongGrp compared to the SIM NoGrp condition. A significant interaction was observed for early visual cortex (V1, V2, V3: all F(1,9)>6.79, p<.05) and in area V4 (F(1,9)=5.1, p<.05). Finally, activity evoked by the two SEQ conditions did not differ significantly in any of the areas (all t<1.96, p>.05). This finding suggests that the small changes in low-level features (inducers rotated inward versus outward) between the two perceptual organization conditions were not the source of the difference in activity obtained between the two SIM conditions.
In order to quantify the differences in responses evoked by SIM and SEQ presentations further, a sensory suppression index (SSI) was calculated. The index permits a comparison of the degree of competition effects both across different visual areas and perceptual organization conditions. Positive values indicate stronger responses evoked by SEQ than by SIM presentations (reflecting the mutual suppression that occurs during the simultaneous presentation), negative values indicate the opposite, and values around 0 indicate the absence of response differences (or no difference in the amount of competition elicited by the two presentation conditions). Strikingly, the StrongGrp sensory suppression indexes (SSI_StrongGrp) were greatly reduced compared to the SSI_NoGrp, suggesting that the competition was resolved once an illusory object had been formed (Fig. 2B: V1, V2, V3, V4: all t>2.68, p<.05). In fact, the SSI_StrongGrp values were not different from zero in V1 and V2 (all t<1.72, p>.05). Consistent with previous reports (Kastner et al., 1998), SSIs for the NoGrp conditions appeared to increase from V1 to V4 (Fig. 2B), with significantly larger SSI_NoGrp values in V4 than V1 (t=3.12, p<.05).
The above results are consistent with the interpretation that the formation of an illusory figure can counteract competitive interactions in visual cortex. However, we considered an alternative interpretation. When the inducers were rotated inward and an illusory figure was formed, the inducers were filled-in behind the illusory figure due to visual interpolation (Palmer, 1999), thereby resulting in a homogeneous group of four identical circular stimuli. In contrast, no such visual interpolation occurred in the SIM NoGrp condition, resulting in a heterogeneous group. It has previously been shown that homogeneous displays where stimuli share the same color, shape, and orientation evoke less competition in visual cortex than do heterogeneous displays where stimuli differ in color and orientation (Beck and Kastner, 2007). Thus, it is possible that the influence of illusory contour formation on neural competition in visual cortex was driven by either the formation of a single foreground object, or a homogeneous background, or by a mixture of both. In order to address this issue we performed a control study in which the same experiment was performed except that in each stimulus display, the four inducers varied both in terms of color and luminance, thereby avoiding the presentation of a homogeneous display (Fig. 3A). Similar to the main experiment, a significant release from competition was found when a group was present, as demonstrated by the significant interaction between perceptual organization and presentation condition in areas V4 (F(1,5)=7.26, p<.05) and V3 (F(1,5)=18.06, p<.01), with a trend in V2 (F(1,5)=3.62, p=.12) (Fig. 3B). In addition, the SSI_StrongGrp values were significantly reduced compared to the SSI_NoGrp values in V2, V3, and V4 (all t>2.75, p<.05, Fig. 3C). Importantly, there was again no difference between the two SEQ conditions (all t<.87, p>.05), suggesting that the two perceptual organization conditions were well matched in terms of low-level features.
These results suggest that the presence of an illusory foreground figure itself is sufficient to counteract competitive interactions in visual cortex. However, with a heterogeneous background, competition was not overcome to the same degree as in the main experiment, suggesting that both the formation of a single foreground object and the creation of a uniform background contributed to the large reduction in competition observed in the main experiment. In addition, there was less competition in the StrongGrp than in the NoGrp display condition, even though the perceptual organization present in the StrongGrp condition resulted in five stimuli (each inducer plus the foreground figure) compared to four stimuli in the NoGrp condition. This suggests that it is not just the number of stimuli that determines the degree of competition, but rather the specific context in which they are presented.
To test the generality of our hypothesis that many types of perceptual organization principles may influence or even determine the magnitude of competitive interactions among multiple stimuli, we conducted a second experiment that investigated a different principle of perceptual organization – collinearity. The same SEQ/SIM paradigm as in the first experiment was used but with an entirely different display that consisted of 16 colored oriented gabor stimuli (Fig. 4A, B). In the SEQ condition, the stimulus array was subdivided into quadrants, with one quadrant of the stimulus display (four gabor stimuli, Fig. 4A dashed lines) being presented at a time, while all 16 gabor stimuli were presented together for the SIM condition. In the NoGrp condition, all gabor stimuli were randomly oriented (Fig. 4A), while in the StrongGrp condition, four of the gabor stimuli were aligned and placed close together to form a vertical or horizontal collinear contour (Fig. 4B). Thus, in the StrongGrp condition, the four aligned gabor stimuli were part of a perceptual group embedded in the remaining individual stimuli, while in the NoGrp condition, no contour was present and the gabors appeared as 16 isolated stimuli.
We predicted that suppressive interactions among the stimuli would be partially overcome by the perceptual organization achieved as a result of collinear alignment. In support of our hypothesis, a significant interaction between perceptual organization and presentation condition, reflecting smaller response differences between the StrongGrp presentation conditions, or a release from competition, was observed throughout visual cortex (Fig. 4C, V1, V3, V4: all F(1,11)>5.36, p<.05, V2: F(1,11)=3.94, p=.07). In addition, the SSI_StrongGrp values were significantly smaller than the SSI_NoGrp values for areas V1, V2, V3, and V4 (all t>2.28, p<.05), reflecting a reduction in competition when a perceptual group was present in the display (Fig. 4D). To verify that the two perceptual organization conditions were well matched in terms of low-level features, we compared activity for the two SEQ conditions. Activity did not differ in any of the areas investigated (all t<1.69, p>.05).
The current study was designed to investigate the interaction of perceptual organization and competition outside the focus of attention by presenting the stimuli in the far periphery while subjects were engaged in a demanding task at fixation. However, it is possible that stimulus arrays with a perceptual group capture attention more strongly than arrays without a perceptual group, resulting in a redeployment of attention to the periphery (Yantis, 2000). To consider this possibility, behavioral performance was investigated for the two SIM conditions (SIM NoGrp versus SIM StrongGrp) across the three experiments. We reasoned that, if attention was captured by the StrongGrp displays, the resulting interference would be reflected in slower reaction times and poorer accuracy. However, there were no differences in accuracy (Table 1: all t<.76, p>.48) or reaction times (Table 1: all t<.88, p>.4) obtained for these conditions in any of the three experiments. As a more stringent test of a redeployment of attention, behavioral performance was investigated for the collinearity and illusory contour control experiments (for which the data were available) for trials in which the stimulus array appeared in the periphery at the exact same time that an RSVP target appeared at fixation. Such co-occurrences happened on approximately 32% of the trials. Again, there was no significant difference between the two simultaneous conditions in terms of reaction time (collinearity: SIM StrongGrp: 540 ms, SIM NoGrp: 533 ms, t=.61, p=.55; illusory contour control: SIM StrongGrp: 546 ms, SIM NoGrp: 562 ms, t=.92, p=.4) or accuracy (collinearity: SIM StrongGrp: 93%, SIM NoGrp: 85%, t=2.0, p=.07; illusory contour control: SIM StrongGrp: 71%, SIM NoGrp: 78%, t=.97, p=.38). These behavioral results suggest that the observed influences of perceptual organization on competition were not due to attentional modulation resulting from attentional capture by the perceptual group.
We investigated influences of perceptual organization principles on competitive interactions among stimuli in visual cortex. When the context of the visual displays was manipulated so that perceptual grouping among stimuli could occur, competitive interactions were reduced as compared to when the same stimuli were not grouped. Perceptual organization counteracted suppressive interactions in early visual cortex and area V4. This basic finding was demonstrated for two different display configurations and perceptual organization principles, illusory contour formation and grouping by collinearity and proximity, thereby generalizing the influences of perceptual organization on competitive interactions to different principles of perceptual organization. Importantly, the influences of perceptual organization on neural competition in visual cortex occurred outside the focus of attention, suggesting that the underlying neural mechanisms operated in a highly automatic, bottom-up fashion.
To create the NoGrp and StrongGrp display conditions, small changes in the stimulus arrays were necessary. In the illusory contour experiment, the inducers were either rotated inward or outward, while in the collinearity experiment, the gabor stimuli were either all randomly oriented and randomly spaced, or four of them were aligned and placed closer together. However, we argue that the observed reduction in competitive interactions among the stimuli in the StrongGrp condition resulted from the formation of a perceptual group or object, not from these changes in low-level features. Importantly, while these low-level changes were also present in the SEQ conditions, no difference between the StrongGrp and NoGrp conditions was observed there, suggesting that differences in low-level features do not account for the differences obtained during the SIM conditions. In addition, all three experiments converged on the same primary result of an interaction between perceptual organization and neural competition, even though they used different displays. It is unlikely that small changes in the different displays would all result in greater activity for the SIM StrongGrp compared to the SIM NoGrp conditions.
As noted previously (Beck and Kastner, 2005; Kastner et al., 1998; Kastner et al., 2001), there were several differences between the SEQ and SIM presentation conditions, in addition to the level of competition they induced. For instance, the visual presentation period of the SEQ condition extended over 1 s, while the presentation period for the SIM condition was 250 ms. In addition, the SEQ condition contained four visual onsets, while the SIM condition contained only one. However, if the stimulus duration, number of onsets, or any other inherent low-level differences between the SEQ and SIM conditions were solely driving the observed difference in activation for the two conditions, then the difference between conditions should be constant across different stimulus configurations. Previously, Kastner and colleagues (Kastner et al., 2001) have found that within a visual area, the difference between the SEQ and SIM conditions decreased with increasing spatial separation. In the current experiment investigating perceptual grouping, the difference between the SEQ and SIM condition was reduced for stimulus arrays that appeared in the context of a perceptual group, while the number of onsets and the stimulus duration were held constant. In fact, there was no difference between the StrongGrp SEQ and SIM presentations in V1 and V2 for the illusory contour experiment, even though presentation factors such as the number of onsets differed.
The current study investigated competitive interactions among elements that grouped together to form an object. The amount of competition evoked by a stimulus array depended upon the amount of perceptual grouping among the elements, such that perceptually grouped elements induced less competition. The process by which elements group together is an important step in image segmentation and is often viewed as a fundamental problem the visual system must solve. Our study is the first to investigate the interaction of these two fundamental processes in vision: grouping elements together to define candidate objects, and competition, which defines the units of selective attention. Our results are consistent with the hypothesis from biased competition theory that competition occurs among objects and not among the elements that form an object (Desimone and Duncan, 1995). They are also consistent with behavioral studies demonstrating that it is easier to identify two properties of one object than properties of two different objects, even when the two objects overlap (Duncan, 1984), or that objects interfere less with performance when they are grouped by similarity (Bundesen and Pedersen, 1983; Duncan and Humphreys, 1989). In addition, two studies investigating bottom-up, stimulus-driven influences on competition found that identical stimuli competed with each other less than stimuli that differed (Reynolds et al., 1999; Beck and Kastner, 2007). While these studies were limited to investigating competition among objects because the stimuli did not appear within a context other than a uniform background, the results are consistent with the current proposal that the context in which objects or elements appear can influence the degree of competition.
We predict, based on the current findings, that the amount of competition present in a stimulus array will depend on the perceptual grouping among elements for a large variety of Gestalt principles such as good continuation, shared motion, figure-ground segmentation, and contour features such as local concavities and T-junctions (Wertheimer, 1923; Grossberg et al., 1994; Palmer, 1994; Driver et al., 2001), which need to be tested in future studies.
How might mechanisms for perceptual organization and competitive processes interact at the neural level? There are two main possibilities. First, competition in intermediate visual areas, such as V4, may be influenced by perceptual organization processes that occur in early visual cortex. Neural correlates of illusory contour formation have been found in visual cortex as early as areas V1 and V2 (von der Heydt and Peterhans, 1989; Sheth et al., 1996; Lee and Nguyen, 2001; Maertens and Pollmann, 2005, 2007; Montaser-Kouhsari et al., 2007). In addition to illusory contour formation, other Gestalt principles of grouping such as collinear alignment (Kapadia et al., 1995; Ito and Gilbert, 1999; Altmann et al., 2003), grouping by proximity (Han et al., 2005), and figure-ground segmentation (Driver et al., 1992; Lamme, 1995; Zhou et al., 2000) are thought to rely on mechanisms of early visual cortex. These mechanisms could boost the activity related to the set of stimuli as it enters intermediate visual areas such as area V4, effectively counteracting competition that may have occurred between stimuli. Intermediate visual areas with larger RFs might read out the bottom-up perceptual biases resulting from perceptual organization computed in early visual cortex and integrate them with parallel competitive processes. A second possibility is that perceptual grouping and competition may rely on the same set of neural mechanisms implemented at intermediate processing stages. It has been demonstrated that neural competition is greater at the level of the RF, thus depending on the RF architecture of extrastriate cortex (Kastner et al., 2001). The finding that some perceptual organization principles such as shape formation (Behrmann and Kimchi, 2003) appear to rely on intermediate visual areas suggests that the same regions might play a key role in both competition and perceptual grouping.
These two possibilities are not mutually exclusive, as intermediate areas might integrate perceptual organization information through recurrent processing. For instance, cells in V2 and V4 have been found to integrate visual information regarding figure-ground segmentation, which is thought to play a role in illusory contour formation, from far beyond their classic receptive fields, and feedback from area V5/MT has been found to influence figure-ground segmentation processes in early visual cortex (Hupe et al., 1998). Regardless of the underlying neural mechanisms, our findings suggest that a variety of perceptual organization principles will contribute to the bottom-up saliency of elements in a scene, resulting in varying levels of competition in intermediate visual areas.
While the current study was not designed to investigate the mechanisms underlying illusory contour formation or collinear alignment, our data provide insight into the automaticity of these perceptual grouping processes. The current results suggest that the perceptual organization principles underlying illusory contour formation and grouping by collinear alignment and proximity operate automatically outside the focus of attention. These results are consistent with other evidence suggesting that illusory contour formation and grouping by collinearity can occur independently of attention (Davis and Driver, 1994; Vuilleumier and Landis, 1998). For instance, responses have been found in V1 cells of anaesthetized monkeys that reflect illusory contour formation (Grosof et al., 1993) and collinear alignment (Polat et al., 1998). However, there is also evidence to the contrary, suggesting that illusory contour formation and grouping by collinearity and proximity require attention (Wasserstein et al., 1987; Freeman et al., 2001; Han et al., 2005). The current study provides important evidence that neural correlates of illusory contour formation and grouping by collinear alignment and proximity in human visual cortex occur in a highly automatic fashion.
Our findings suggest that image segmentation and perceptual grouping principles interact to provide biases in favor of potential figures. This may be a neural mechanism underlying the ‘object superiority’ effect, defined as the perceptual advantage objects receive over unorganized elements (Weisstein and Harris, 1974; Kovacs and Julesz, 1993; Driver and Baylis, 1996; Arrington et al., 2000; Kimchi et al., 2007). For instance, subjects have better memory performance for the contour of a figure than for the contour of a ground region (Driver and Baylis, 1996). Recently, Yeshurun and colleagues found that subjects performed better on a Vernier acuity task when it was performed at a location that previously contained a perceptually grouped object than when the task was presented at a location containing ungrouped background elements (Yeshurun et al., 2008). They interpreted these results as attentional capture by the perceptual group (Kimchi et al., 2007; Yeshurun et al., 2008). Thus far, behavioral studies investigating the object superiority effect have been unable to distinguish between attentional capture by a perceptual group and a purely automatic bottom-up bias in favor of the perceptual group that does not result in the redeployment of the attentional spotlight. The current results suggest that the behavioral advantage observed for perceptually grouped stimuli would be better explained by a bottom-up bias that does not involve the attention network. These biases may provide a mechanism by which the visual system can ‘flag’ potentially interesting locations in parallel throughout the visual field, forming candidate objects that are likely targets for attentional selection.
This study was supported by grants from NIH to S.K. (2RO1 MH64043, 1RO1 EY017699, 2P50 MH–62196) and to S.M. (F32 EY017502).