What is the role of attention in multiple-object tracking? Does attention enhance target representations, suppress distractor representations, or both? It is difficult to ask this question in a purely behavioral paradigm without altering the very attentional allocation one is trying to measure. In the present study, we used event-related potentials to examine the early visual evoked responses to task-irrelevant probes without requiring an additional detection task. Subjects tracked two targets among four moving distractors and four stationary distractors. Brief probes were flashed on targets, moving distractors, stationary distractors, or empty space. We obtained a significant enhancement of the visually evoked P1 and N1 components (~100–150 msec) for probes on targets, relative to distractors. Furthermore, good trackers showed larger differences between target and distractor probes than did poor trackers. These results provide evidence of early attentional enhancement of tracked target items and also provide a novel approach to measuring attentional allocation during tracking.
One of the more dramatic demonstrations of attention to multiple foci is the multiple-object tracking (MOT) task (Pylyshyn & Storm, 1988). The subject is presented with an array of identical objects and is told to follow a subset of target objects as all of the items move independently for several seconds or minutes. Intuitively, this is a challenging task; yet most people can track three to five objects under typical conditions. Our goal in this study was to determine how spatial attention is allocated during this task. In particular, we sought to establish a hierarchy of the allocation of attention to various elements of the display (i.e., targets, distractors, and background) so that we might begin to characterize the mechanisms by which attention facilitates tracking.
Spatial attention is thought to act through a combination of mechanisms that both enhance the processing of relevant information and suppress the processing of irrelevant information (e.g., Posner & Dehaene, 1994). These two mechanisms are generally distinguished by comparing the processing of attended and unattended information with that of an attention-neutral baseline condition. Attended stimuli typically show enhancement relative to baseline, whereas unattended stimuli show suppression. The preferred technique for assessing the role of spatial attention during tracking tasks has been the dot-probe method (Alvarez & Scholl, 2005; Feria, 2008; Flombaum, Scholl, & Pylyshyn, 2008; Pylyshyn, 2006; Pylyshyn, Haladjian, King, & Reilly, 2008), which has been widely used to infer attentional distribution in visual search tasks (Cave & Zimmerman, 1997; Cepeda, Cave, Bichot, & Kim, 1998; Klein, 1988). In this technique, subjects must detect small, low-contrast probe dots presented at various locations while simultaneously performing the MOT task. The assumption is that probes should be detected most readily at attended locations and should be more likely to be missed when presented at unattended locations.
Using the dot-probe technique, Pylyshyn (2006; Pylyshyn et al., 2008) compared detection performance for probes on targets and distractors with performance in a neutral baseline condition in which probes were presented in empty space within the display. He found that detection was highest for empty space probes and that target probes were detected more frequently than distractor probes. Pylyshyn attributed this unexpected superiority for empty space to a low-level masking effect for probes on objects. To control for this masking effect, he also asked subjects to detect probes in the display without the requirement to track targets and found that they were much better at detecting probes in empty space than at detecting probes on moving items. Using performance on this task to reinterpret probe detection in the tracking task, he concluded that probe performance was equivalent for targets and empty space but was impaired for probes on distractors. This pattern of results suggests that the primary role of spatial attention during MOT is to suppress distractors. Surprisingly, though, it suggests that tracked targets are not enhanced by attention, which contrasts strongly with results in the spatial attention literature, in which a combination of enhancement and suppression attention effects has typically been observed (Hillyard, Vogel, & Luck, 1998; Hopf et al., 2006; Luck, 1995; Moran & Desimone, 1985). One way to interpret these data would be to conclude that attentional enhancement is simply not involved in the tracking of moving targets. However, absence of evidence is not evidence of absence. The aim of this study was to test the alternative hypothesis that this lack of evidence for attentional enhancement of targets during tracking is a consequence of how attentional allocation in MOT has been measured.
The absence of evidence for an attentional enhancement of tracked targets may suggest that the attentional mechanisms that facilitate tracking are distinct from those involved in spatial attention. However, we argue that the dot-probe approach is not ideal for assessing the spatial distribution of attention in MOT—particularly, target enhancement. Accurate probe detection relies on the subject’s awareness of the probe, which requires complete processing of the probe to the level of report. Considering that in most previous demonstrations, target enhancement in spatial attention tasks has been shown to occur at fairly early (~100 msec) perceptual stages of processing (Hillyard et al., 1998; Luck, 1995), the dot-probe approach may not be sufficiently sensitive to detect enhancements that occur at such an early stage. Furthermore, the dot-probe technique itself may influence the distribution of attention in MOT. Subjects are in a dual-task situation in which attentional resources must be shared between tracking and probe detection. They cannot ignore distractors and empty space entirely, because task-relevant probes will be presented at these locations. Thus, detection performance for dot probes may tell us more about the strategies subjects use to achieve both tasks simultaneously than it does about attention distribution in the primary task (MOT).
In the present study, subjects had a single task: tracking targets. We presented probes at various locations, but instead of asking the subjects to detect them, we measured the electrophysiological response to these task-irrelevant probes. We measured the P1 and N1 components of the event-related potential (ERP). These are early (~75–150 msec) visual-evoked responses that reflect initial perceptual processing in extrastriate cortical areas (Heinze, Mangun, et al., 1994; Hillyard et al., 1998). Both components have repeatedly been shown to be acutely sensitive to the allocation of spatial attention, even when the evoking stimulus is task irrelevant (Heinze, Luck, Mangun, & Hillyard, 1990; Vogel, Luck, & Shapiro, 1998). Moreover, the P1 and N1 attention effects have been shown to be sensitive to both enhancement of attended information and suppression of unattended information. In particular, Luck (1995; Luck et al., 1994) found that the P1 to items at unattended locations was suppressed, relative to neutral conditions. Conversely, the N1 to items at attended locations was enhanced, relative to neutral conditions. Together, these previous results indicate that the P1 and N1 responses to task-irrelevant probes provide an ideal index for measuring both attentional enhancement and suppression in MOT at an early perceptual stage. If target positions are attentionally enhanced, we should expect larger P1/N1 responses to probes on targets than to probes on distractors or empty space. If distractors are suppressed, we should expect a decreased P1 response to distractors, relative to empty space.
As Pylyshyn (2006) noted, finding an appropriate neutral baseline condition is a difficult problem for the dot-probe technique. It may be easier to detect empty space probes, because they are not masked by item contours. Therefore, we also measured the ERP response to probes presented within stationary objects placed at random positions within each quadrant of a display (see also Pylyshyn et al., 2008). Aside from not moving, these objects were identical in appearance to the moving items, so that stationary probes would be equally subject to contour masking (see Footnote 1). Thus, we had two neutral baseline conditions: empty space and stationary objects.
The subjects maintained central fixation while tracking two targets among four moving distractors and four stationary objects for 6.33 sec (see Figure 1). At the end of the trial, all movement ceased, one object became red, and the subject judged whether or not it was a target. During the tracking period of each trial, eight task-irrelevant white square probes were briefly flashed at variable intervals. These probes could appear randomly on a target, on a distractor, in empty space, or on a stationary object.
Thirty-one participants (19 female; age range, 18–31 years) from the Eugene, Oregon community completed the experiment for monetary compensation. Three participants were excluded because of excessive eye movements (see below), leaving a total of 28 subjects in the sample.
Each subject completed 12 blocks of 30 trials each (360 total trials). Each trial included two of each type of probe—target, distractor, stationary object, and empty space—for a total of 720 probes per type. All the items were empty boxes subtending approximately 0.5° of visual angle. Items moved along random trajectories at a constant velocity of 1°/sec. Motion was constrained within an invisible 17° × 17° box centered on the screen. Items were allowed to collide and reflected from each other at their angle of incidence with no momentum exchange.
At the start of each trial, all the items were stationary. Two of the 10 items were red, designating them as targets. After 333 msec, the targets turned black and began to move, along with four of the eight distractors. During the trial, white probes appeared at varying intervals, with a minimum interprobe interval of 633 msec and a duration of 100 msec. After 6,333 msec, all motion ceased, one item became red, and the subject responded as to whether or not this item was a target. The red item was equally likely to be a target or a moving distractor.
Electroencephalographic (EEG) activity was recorded from 20 tin electrodes mounted in an elastic cap (Electrocap International). In addition to the standard International 10/20 System sites, five additional sites were used: OL and OR, positioned midway between O1 and T5 on the left hemisphere and O2 and T6 on the right; POz, located on the midline between Pz and O1–O2; and PO3 and PO4, located halfway between POz and T5 on the left and POz and T6 on the right. All the sites were recorded with a left-mastoid reference, and the data were rereferenced offline to the algebraic average of the left and right mastoids. The horizontal electrooculogram (EOG) was recorded from electrodes placed approximately 1 cm to the left and right of the external canthus of each eye to measure horizontal eye movements. In order to detect blinks and vertical eye movements, the vertical EOG was recorded from an electrode mounted beneath the left eye and referenced to the right mastoid. Probe events containing artifacts (ocular, movement, or amplifier saturation) were discarded. Subjects with artifact rejection rates in excess of 25% were excluded from the sample. Three subjects were excluded from further analysis using this criterion. EEG and EOG were amplified with an SA Instrumentation amplifier with a band-pass of 0.01–80 Hz and were digitized at 250 Hz in LabView 6.1 running on a Macintosh.
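The offline rereferencing step can be illustrated with a minimal sketch. It assumes that the right-mastoid channel, like every other site, was recorded against the left mastoid, in which case subtracting half of the right-mastoid signal from each channel yields an averaged-mastoid reference. The function and variable names are hypothetical and are not taken from the analysis software used in the study.

```python
def rereference_to_average_mastoids(channel_uv, right_mastoid_uv):
    """Re-reference one channel from the left mastoid to the mastoid average.

    Both inputs are sequences of voltages (microvolts) recorded against
    the left mastoid (LM). Because
        (ch - LM) - 0.5 * (RM - LM) == ch - (LM + RM) / 2,
    subtracting half of the right-mastoid (RM) signal is sufficient.
    """
    if len(channel_uv) != len(right_mastoid_uv):
        raise ValueError("channel and mastoid must have the same length")
    return [ch - 0.5 * rm for ch, rm in zip(channel_uv, right_mastoid_uv)]


# Hypothetical three-sample example; the values are illustrative only.
channel = [4.0, 6.0, -2.0]
right_mastoid = [2.0, -2.0, 0.0]
print(rereference_to_average_mastoids(channel, right_mastoid))  # [3.0, 7.0, -2.0]
```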
Tracking performance was quite good (mean percent correct, 88%; SD = 8%). We transformed accuracy to effective tracking capacity, m = n(2p − 1), where n is the number of targets (here, 2) and p is the proportion of correct responses (Scholl, 2001). Mean m was 1.52 objects (out of a maximum possible score of 2), with substantial intersubject variability (SD = 0.3).
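As a worked illustration of the capacity transformation, the short sketch below applies m = n(2p − 1) to the group mean accuracy. The function name is hypothetical. Because the transformation is linear in p, applying it to the mean accuracy gives the same value as averaging the individual m scores.

```python
def effective_tracking_capacity(n_targets, proportion_correct):
    """Return m = n(2p - 1), the effective number of objects tracked."""
    if not 0.0 <= proportion_correct <= 1.0:
        raise ValueError("proportion_correct must be between 0 and 1")
    return n_targets * (2.0 * proportion_correct - 1.0)


# Two targets and 88% accuracy, as reported in the text.
print(round(effective_tracking_capacity(2, 0.88), 2))  # 1.52
```

Chance performance (p = .5) maps to m = 0, and perfect performance (p = 1) maps to m = n.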
Figure 2A shows ERPs time-locked to probe onset across the four probe conditions. The two early spatial attention-sensitive components of interest can be clearly seen. The initial positive wave (P1) displays a narrowly focused scalp distribution, maximal over occipital electrodes. This is followed by the more broadly distributed negative wave (N1), which is maximal at central electrodes. For further analysis, we defined P1 amplitude as the mean amplitude from 100 to 150 msec following probe onset at an occipital pair of electrodes (OL/OR). We similarly defined N1 as the mean amplitude from 125 to 185 msec following probe onset at central electrode sites (Cz, C3, and C4). As can be seen in Figure 2B, both of these components were strongly modulated by probe type, yielding a significant effect of probe type on amplitude [P1, F(3,81) = 9.93, p < .001; N1, F(3,81) = 23.44, p < .001].
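The mean-amplitude measure can be sketched as follows. The 250-Hz sampling rate and the two measurement windows come from the text; the epoch start time (100 msec before probe onset), the inclusive window bounds, and the synthetic waveform are assumptions made purely for illustration.

```python
def mean_amplitude(epoch_uv, srate_hz, epoch_start_ms, win_start_ms, win_end_ms):
    """Mean voltage of the samples whose latency lies in the window.

    epoch_uv holds one averaged waveform (microvolts); sample i has
    latency epoch_start_ms + i * (1000 / srate_hz) relative to probe onset.
    Both window bounds are treated as inclusive.
    """
    step_ms = 1000.0 / srate_hz
    in_window = [
        v for i, v in enumerate(epoch_uv)
        if win_start_ms <= epoch_start_ms + i * step_ms <= win_end_ms
    ]
    if not in_window:
        raise ValueError("no samples fall inside the window")
    return sum(in_window) / len(in_window)


# Synthetic epoch: 100 samples at 250 Hz, starting 100 msec before onset
# (assumed), with a 2-microvolt plateau between 100 and 150 msec.
epoch = [2.0 if 100 <= -100 + 4 * i <= 150 else 0.0 for i in range(100)]
p1 = mean_amplitude(epoch, 250, -100, 100, 150)  # P1 window from the text
n1 = mean_amplitude(epoch, 250, -100, 125, 185)  # N1 window from the text
print(p1, round(n1, 2))  # 2.0 0.8
```

Note that the two windows overlap between 125 and 150 msec, so activity in that interval contributes to both measures, although the measures are taken at different electrode sites.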
For both components, amplitude was highest for target probes, followed by distractors and empty space, and was lowest for stationary objects. Subsequent paired t tests revealed significant differences between target probes and all the other probe types [P1, t(27) = 3.36, 4.65, 3.01; N1, t(27) = 4.13, 6.42, 6.89; all ps < .007]. Furthermore, N1 amplitude to distractor probes was greater than that to either of the baseline probe types [stationary object, t(27) = 3.01, p < .006; empty space, t(27) = 3.23, p < .004]. However, although P1 amplitude to distractor probes was greater than that to stationary objects [t(27) = 3.33, p < .004], it was not reliably different from responses to empty space [t(27) = 0.75].
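For readers unfamiliar with the paired comparisons reported above, the sketch below computes a paired t statistic from per-subject amplitudes in two probe conditions. The function name and the three-subject data are hypothetical; with the 28 subjects of the present sample, the degrees of freedom would be 27, as in the reported t(27) values.

```python
import math
import statistics


def paired_t(condition_a, condition_b):
    """Return (t, df) for a paired t test on per-subject values."""
    if len(condition_a) != len(condition_b):
        raise ValueError("both conditions need one value per subject")
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two subjects are required")
    # Standard error of the mean difference (undefined if all diffs are equal).
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1


# Hypothetical amplitudes (microvolts) for three subjects.
target = [3.0, 4.0, 5.0]
distractor = [2.0, 2.0, 2.0]
t, df = paired_t(target, distractor)
print(round(t, 3), df)  # 3.464 2
```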
Are these electrophysiological effects simply correlated with attentional allocation, or are they related to performance? To answer this question, we took advantage of the interindividual variance in tracking and attempted to predict P1/N1 amplitude on the basis of tracking performance. We performed a median split of the ERP data based on the subjects’ tracking performance and analyzed ERP amplitude as a function of group (i.e., good trackers vs. poor trackers) and probe type. N1 amplitude was highly sensitive to tracking performance. As can be seen in Figure 3A, the primary difference between the two groups was in the relative amplitudes to targets and distractor probes, with good trackers showing a much larger difference between these two conditions than did poor trackers (see Figure 3B). We found a significant interaction between group and target versus distractor probes [F(1,26) = 6.24, p = .019]. Importantly, we looked at correlations across all subjects to verify that this effect was not an artifact of the median-split procedure. Before doing so, we calculated the reliability of each measure, using a split-half correlation procedure. The reliability for these measures was as follows: behavioral performance, r = .83; average N1 response, r = .89; response to target probes, r = .67; and difference between target and distractor responses, r = .65. Figure 3C shows the correlation between the target–distractor difference in N1 amplitude and tracking capacity (m), which was statistically significant (r = .43, p = .024; when corrected for attenuation, r = .59). However, it was not the case that good trackers simply had larger N1 amplitudes for all probes: Neither overall N1 amplitude irrespective of probe placement (r = .08) nor target amplitude alone (r = .17) was significantly correlated with tracking ability.
Similarly, the difference in amplitude between target probes and the two baseline probe types was not significantly correlated with tracking performance (r = .09 and .19 for empty space and stationary object, respectively), suggesting that the treatment of background space is the same for all subjects irrespective of tracking ability. In sum, these results indicate that there was less attentional differentiation between moving distractors and targets for poor trackers than for their more skillful counterparts.
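The attenuation-corrected correlation reported above can be reproduced with a short sketch. It assumes the standard Spearman correction, in which the observed correlation is divided by the square root of the product of the two reliabilities; the text does not state which formula was applied, so this is an illustration rather than a description of the authors' procedure. The function name is hypothetical.

```python
import math


def correct_for_attenuation(r_xy, reliability_x, reliability_y):
    """Spearman's correction: r_xy / sqrt(r_xx * r_yy)."""
    if reliability_x <= 0 or reliability_y <= 0:
        raise ValueError("reliabilities must be positive")
    return r_xy / math.sqrt(reliability_x * reliability_y)


# Observed r = .43; split-half reliabilities of .83 (behavioral performance)
# and .65 (target-distractor N1 difference), as reported in the text.
print(round(correct_for_attenuation(0.43, 0.83, 0.65), 2))  # 0.59
```

Under this assumption, the reported reliabilities and observed correlation yield the corrected value of .59 given in the text.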
What is the role of spatial attention during MOT? On the basis of results from the dot-probe paradigm, Pylyshyn (2006; Pylyshyn et al., 2008) suggested that whereas attention suppresses distractors, tracked targets are not enhanced by attention. On this distractor suppression model, we would expect equivalent ERP responses for probes on targets and on the background. However, we observed a substantially different hierarchy of attentional allocation: Targets showed the greatest response, with weaker responses to distractors, and the weakest responses to background or stationary objects. Thus, our results provide strong evidence in favor of attentional enhancements of targets during tracking. However, we found no evidence that distractors are suppressed below the level of the background, at least when measured at this early level of perceptual processing.
Previous work using spatial attention manipulations has indicated that the P1 component is indeed sensitive to the suppression of information at unattended locations (e.g., Luck et al., 1994). Thus, the absence of a suppression effect in the present study is unlikely to have been due to a lack of sensitivity to suppression mechanisms. Nonetheless, these results certainly do not rule out the possibility of distractor suppression at all levels. Indeed, the behavioral evidence consistent with distractor suppression during MOT has been replicated in a number of studies and appears to be a robust and reliable effect (Flombaum et al., 2008; Pylyshyn, 2006; Pylyshyn et al., 2008). How can we integrate the present results favoring target enhancement with those in the previous literature favoring distractor suppression? One possibility is that, whereas the P1/N1 response reflects attention at early perceptual stages of processing, the behavioral measures reflect distractor suppression at later postperceptual stages. If this formulation is correct, we would expect that postperceptual ERP components (e.g., N400, P3) should show distractor suppression effects (for a related line of reasoning, see Vogel et al., 1998). Another possibility is that distractor suppression reflects a strategy subjects adopt to deal with the dual-task demands of tracking targets while detecting probes. Although we cannot distinguish between these alternatives with our present data set, this is a fruitful topic for further research. One caveat to the distractor suppression interpretation of existing MOT dot-probe studies is that the designation of enhancement or suppression is always made relative to the empty space baseline, and these studies typically show that probe detection in the absence of a tracking task is higher for empty space than for moving objects (Pylyshyn, 2006; Pylyshyn et al., 2008). 
One finding that is very clear and consistent with the present results is that probes on target locations are always reported at a much higher rate than are distractor probes.
During an attentional tracking task, we observed modulations of the visual-evoked P1 and N1 components that closely resemble those observed in standard spatial attention tasks (Heinze, Luck, et al., 1994; Mangun & Hillyard, 1991). Although the attentional modulations of these components may be similar, it is certainly plausible that distinct mechanisms facilitate MOT and conventional spatial attention tasks. In particular, whereas spatial attention tasks generally require attention to be focused on a cued location in anticipation of a single upcoming target, MOT would appear to require object-based attention (Alvarez & Scholl, 2005; Drew & Vogel, 2008; Scholl, Pylyshyn, & Feldman, 2001; vanMarle & Scholl, 2003). Nonetheless, both location- and object-based attention appear to produce similar modulations of the perceptual response to task-irrelevant probes. For example, Martinez et al. (2006) used a task-irrelevant probe ERP technique while subjects performed a variation of the Egly, Driver, and Rafal (1994) object-based attention task and found that the P1 and N1 were enhanced for probes presented at the attended portion of an object. Importantly, they also found that P1 and N1 were larger for probes on the unattended portion of the attended object than they were for probes on an unattended object that was equally distant from the attended region, indicating that the benefits of attentional allocation extended throughout the object.
With a novel method of assessing spatial attention during MOT, our present results also help us to understand why individuals differ in tracking ability. We found that the difference between good and poor trackers was not the overall amplitude of the response to probes at the attended location, nor was it the treatment of nonmoving stimuli. The key difference in our data was the relative amounts of attention allocated to targets and distractors. We found that tracking performance improved as the difference in amplitude between probes on targets and distractors increased. One straightforward interpretation of this result is that poor trackers were more likely than good trackers to inadvertently track one or more distractors, leading to a smaller average difference between target and distractor responses. Although we did not find direct evidence that poor trackers paid significantly more attention to distractors than did good trackers, it is possible that we failed to see such a relationship due to the fairly large number of distractors in the display. That is, given that there were four moving distractors, if a subject inadvertently began to track a particular distractor, we had only a one in four chance of probing that particular item on that trial. Future experiments will be necessary to more clearly determine whether these subjects directly allocate more attention to distractors. Nonetheless, the present results indicate that behavioral tracking performance is related to the relative amounts of attention allocated to targets and distractors. Thus, the present results are similar to those in our recent work examining the relationships between working memory capacity and the ability to prevent salient but irrelevant information from being stored in memory (Vogel, McCollough, & Machizawa, 2005).
More generally, the present results add to the growing body of evidence that the ability to selectively prevent irrelevant information from being attended is an important correlate for success in both visual working memory and MOT (Kane & Engle, 2003; McNab & Klingberg, 2008; Vogel et al., 2005).
This research was supported by NSF Grant BCS-0617681, awarded to E.K.V., and NIH Grant MH-65576, awarded to T.S.H. and E.K.V.
1. Although contours for stationary distractors may not be identical to those for moving items, due to motion-defined contours, our results indicate that probes in empty space elicited a smaller electrophysiological response than did probes on distractors or targets.
Trafton Drew, University of Oregon, Eugene, Oregon.
Andrew W. McCollough, University of Oregon, Eugene, Oregon.
Todd S. Horowitz, Brigham and Women’s Hospital, Cambridge, Massachusetts and Harvard Medical School, Boston, Massachusetts.
Edward K. Vogel, University of Oregon, Eugene, Oregon.