Descriptions of how attention modulates neuronal responses suggest that the strength of its effects depends on stimulus conditions. Attention to an isolated stimulus in the receptive field of an individual neuron typically produces a moderate enhancement of the cell's response, but neuronal responses are often strongly modulated when attention is shifted between multiple stimuli that lie within the receptive field. However, previous reports have not compared these stimulus effects under equivalent conditions, so differences in task difficulty could have been responsible for much of the difference. Consequently, the quantitative effects of stimulus conditions have remained unknown, and it has not been possible to address the question of whether the differences that have been observed could be explained by a single mechanism. We measured the attentional modulation of the responses of 70 single neurons in area MT of two rhesus monkeys using a task design that kept attention stable across different stimulus configurations. We found that attentional modulation was indeed much stronger when more than one stimulus was within the receptive field. Nevertheless, the broad range of attentional modulations seen across the different conditions could be readily explained by a single mechanism. The neurophysiological data from all stimulus conditions were well fit by a model in which attention acts through a response normalization mechanism (Lee and Maunsell, 2009). Collectively, these results validate previous impressions of the effects of stimulus configuration on attentional modulation, and add support to the hypothesis that attentional modulation depends on a response normalization mechanism.
When individual stimuli are presented in a neuron's receptive field, the primary effect of attention is to increase responses to all stimuli proportionally, as if it adjusts the cell's overall sensitivity (McAdams and Maunsell, 1999; Treue and Martinez Trujillo, 1999; McAdams and Maunsell, 2000; Cook and Maunsell, 2004). However, attention has more complicated effects with two stimuli in the receptive field. As first described by Moran and Desimone (1985), when a preferred stimulus and a non-preferred stimulus both appear in the receptive field of a neuron in area V4, the response is strong when attention is directed toward the preferred stimulus and weak when attention is directed toward the non-preferred stimulus. Thus, the effect of attention to one of two receptive field stimuli is not an overall increase in a neuron's sensitivity. This distinctive effect of shifting attention between two stimuli within a neuron's receptive field has been replicated in subsequent studies of V4 and other areas (Treue and Maunsell, 1996; Luck et al., 1997; Reynolds et al., 1999; Ghose and Maunsell, 2008; Ghose, 2009).
Recent modeling studies suggest that the different effects of attention seen with one and two stimuli in the receptive field might be explained if attentional modulation depends on a response normalization mechanism (Ghose, 2009; Lee and Maunsell, 2009; Reynolds and Heeger, 2009). Normalization explains nonlinear aspects of neuronal activity by positing that increases in the strength or number of inputs to a cell also increase inhibition to the cell. Recent recordings from the macaque middle temporal visual area (MT) provided direct support for the idea that attentional modulation uses the normalization mechanism that mediates stimulus interactions whenever two stimuli appear within a receptive field. Across MT neurons, the strength of stimulus interactions for each cell (with attention held constant) is correlated with the strength of attentional modulation (with the stimuli held constant), such that neurons showing no stimulus interactions show no attentional modulation (Lee and Maunsell, 2009).
A critical test for the normalization model of attention is whether it can explain the dramatic differences in the amount of attentional modulation seen with one or two stimuli present in a neuron's receptive field. Although the potential for normalization models to explain this difference has been shown in simulations (Lee and Maunsell, 2009), it has not been tested directly with neurophysiological data. Previous measurements of attentional modulation in these conditions cannot be used, because no experiment has measured these effects using methods that kept attention constant across conditions. Greater task difficulty increases the strength of attentional modulation (Spitzer et al., 1988; Boudreau et al., 2006; Chen et al., 2008), so the greater attentional load caused by two stimuli in a neuron's receptive field could account for much of the difference that has been seen.
Because the normalization model is potentially of great importance for understanding the mechanisms underlying attention, we have measured how attention modulates neuronal responses to individual and paired stimuli under conditions that keep attentional effort constant.
All procedures related to animal subjects were approved by the relevant Institutional Animal Care and Use Committees. Data were collected from two male rhesus monkeys (Macaca mulatta) that weighed 6 and 8 kg. A scleral search coil and a head post were implanted under general anesthesia. After recovery, each animal was trained to do a speed change detection task (Figure 1A). The animal was required to hold its gaze within 0.75° (monkey P) or 1.0° (monkey M) from the center of a small fixation target while series of drifting Gabor stimuli were flashed at three locations: two within the receptive field of the neuron being recorded and one at a symmetric location on the opposite side of the fixation point from the receptive field. All three series were centered at the same eccentricity from the fixation point and the Gabors were identical except for their drift direction. The two stimulus locations in the receptive field were separated by at least 5 times the SD of the Gabors (mean Gabor SD 0.48°, SD of Gabor SD 0.06°, Gabor SD range 0.35° to 0.60°, mean separation of Gabor centers 4.2°, SD 1.3°, range 1.9° to 8.1°). Because receptive fields in MT are large, they could readily accommodate two stimuli (receptive field sizes in MT range from about 4° to 16° over the eccentricities sampled; Desimone and Ungerleider, 1986). The stimuli were presented on a gray background (42 cd/m²), which had the same mean luminance as the Gabors, on a gamma-corrected video monitor (1024 × 768 pixels, 85 Hz refresh rate).
On each trial the animal was cued to attend to the stimuli in one of the three locations and to respond when a Gabor with a different drift speed appeared there (the target), ignoring any speed changes at uncued locations (distractors), which occurred with the same probability as changes at the cued location. The animal indicated its response by making a saccade directly to the target location within 600 ms of its appearance. Correct responses were rewarded with a drop of juice or water. The target location was cued either by a yellow annulus at the beginning of each trial or by instruction trials. Instruction trials consisted of a series of Gabor stimuli that appeared in only one location. Two instruction trials were inserted each time the cued location changed.
Gabors were presented synchronously in all three locations for 200 ms, with successive stimuli separated by periods with pseudorandom durations of 94-247 ms (monkey P) or 141-294 ms (monkey M). During each presentation, the Gabor at any location was equally likely to drift in one of three directions (see below), or, one fourth of the time, not appear at all (blank). The blank stimuli were critical, because they caused the receptive field to contain (unpredictably) zero, one, or two stimuli on a given presentation. Because the animals were unlikely to be able to adjust their attention in response to the number of stimuli within the duration of the brief presentations, this allowed us to measure neuronal responses with different numbers of stimuli in the receptive field under equivalent attentional conditions. Had single and paired stimuli been presented separately on different trials, it is likely that attention would have differed between conditions because focusing attention on one of two adjacent stimuli requires more effort than attending to a single stimulus in relative isolation. Greater task difficulty causes greater attentional modulation of neuronal responses (Spitzer et al., 1988; Boudreau et al., 2006; Chen et al., 2008), which would have precluded quantitative comparisons between conditions. The ability of a single attentional factor to explain the behavior of neurons across stimulus conditions (see Results) suggests that attention was indeed stationary across conditions. The blank stimuli may have also helped control the animal's attention in another way. Because we used a single reference speed for each cell, the animal might have done the task by comparing the speed of the stimulus at the target location with the speeds at other locations. Frequent blank stimuli undermine this strategy, and may have encouraged the animal to focus attention on the target location.
The strong attentional modulation of neuronal responses associated with cueing different locations that we observed (see Results) suggests that the animals focused most of their attention on the location of the target.
Although stimuli were pseudorandomly selected for each presentation, they were constrained so that all possible combinations of the four stimuli (preferred, intermediate, null and blank) in the two receptive field locations occurred equally often. The task design led to the receptive field being stimulated with 9 pairings of stimuli moving in the same or different directions (3 directions in each of two locations), 6 single stimuli (3 directions in either of two locations) and one blank condition (Figure 1B). Each of these 16 stimuli might be presented during trials on which the animal's attention had been directed to any one of the three locations, resulting in 48 different task conditions.
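The condition count described above can be checked with a short enumeration. This is only an illustrative sketch: the labels and the numbering of attended locations (0 for the location outside the receptive field, 1 and 2 for the two inside it) are naming choices made here, not code from the study.

```python
from itertools import product

# Four possible stimuli at each receptive-field location (labels follow the text).
STIMULI = ("blank", "preferred", "intermediate", "null")
# Attended location: 0 = outside the receptive field, 1 and 2 = inside it.
ATTEND_LOCATIONS = (0, 1, 2)

# Every combination of stimuli at location 1 and location 2: 4 x 4 = 16.
rf_configs = list(product(STIMULI, repeat=2))


def n_stimuli(config):
    """Number of Gabors actually present in the receptive field."""
    return sum(s != "blank" for s in config)


pairs = [c for c in rf_configs if n_stimuli(c) == 2]
singles = [c for c in rf_configs if n_stimuli(c) == 1]
blanks = [c for c in rf_configs if n_stimuli(c) == 0]

# Each receptive-field configuration crossed with each attended location.
conditions = list(product(ATTEND_LOCATIONS, rf_configs))

print(len(pairs), len(singles), len(blanks), len(conditions))
```

The four counts correspond to the 9 pairings, 6 single stimuli, one blank condition and 48 task conditions given in the text.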
The timing of the target appearance in each trial was selected from an exponential distribution (flat hazard function for speed change) to encourage the animal to maintain constant vigilance throughout each trial. However, trials were truncated at 5 seconds if the target had not appeared (~10% of trials), in which case the animal was rewarded for maintaining fixation up to that time. The speed change was adjusted for each stimulus configuration to keep the animal challenged. The median behavioral performance across all target locations was 94% correct (hits / (hits + misses), range 76%-99%).
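An exponential waiting time gives a flat hazard because the probability that the target appears in the next instant does not depend on how long the trial has already run. The sketch below illustrates this with a 5 s truncation. The mean of the distribution is not stated in the text; the value used here is a hypothetical choice that makes the truncated fraction exactly 10%, and the sketch ignores trials that ended early for other reasons.

```python
import math
import random

TRUNCATION_S = 5.0  # trial cutoff, from the text
# Hypothetical mean (not given in the text): chosen so that P(T > 5 s) = 0.10.
MEAN_S = TRUNCATION_S / math.log(10.0)


def draw_target_time(rng, mean_s=MEAN_S, cutoff_s=TRUNCATION_S):
    """Return (time_s, target_appeared) for one simulated trial."""
    t = rng.expovariate(1.0 / mean_s)  # exponential waiting time
    if t >= cutoff_s:
        return cutoff_s, False  # truncated trial: no target shown
    return t, True


def truncated_fraction(mean_s=MEAN_S, cutoff_s=TRUNCATION_S):
    """Analytic probability that the target time exceeds the cutoff."""
    return math.exp(-cutoff_s / mean_s)


def simulated_truncated_fraction(n_trials, seed=0):
    """Fraction of simulated trials that reach the cutoff without a target."""
    rng = random.Random(seed)
    n_truncated = sum(not draw_target_time(rng)[1] for _ in range(n_trials))
    return n_truncated / n_trials
```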
After training was complete, a recording chamber was implanted to allow a posterior approach to MT (axis ~22° from horizontal in a parasagittal plane). Recordings were made using glass-insulated Pt-Ir microelectrodes (~1 MΩ at 1 kHz). A guide tube and grid system (Crist et al., 1988) was used to penetrate the dura. Extracellular signals were amplified and filtered, and action potentials from individual neurons were isolated with a window discriminator. Spike times were recorded with 1 ms resolution.
Once spikes from a single unit were isolated, a hand-controlled visual stimulus was used to estimate the location of the receptive field. We then used computer-controlled presentation of Gabor stimuli to measure tuning for direction (12 directions), spatial frequency (10 frequencies), and temporal frequency (10 frequencies) and to quantitatively map the receptive field (using 3 eccentricities and 8 polar angles) while the animal did a fixation task. The direction that produced the strongest response was taken as the preferred direction, the opposite direction was taken as the “null” direction and a direction 90° from preferred was used as the “intermediate” direction. The quantitative mapping of the receptive field was done with the preferred Gabor stimulus (preferred direction, spatial frequency, and temporal frequency).
For the main task the spatial and temporal frequencies were selected based on the corresponding tuning measurements, although we sometimes used a suboptimal temporal frequency to limit the difficulty of the task. The temporal frequency was rounded to a value that produced an integral number of cycles of drift during each stimulus presentation, so that the Gabors started and ended with odd spatial symmetry, thereby ensuring that the spatiotemporal integral of the luminance of each stimulus was the same as the background. The Gabors were achromatic and were presented at nominal 100% contrast. The two stimulus locations in the receptive field were chosen to give approximately equal responses, and the third stimulus was at the same eccentricity but opposite to their midpoint, with respect to the fixation point.
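The rounding step can be illustrated as follows. With a 200 ms presentation, an integral number of drift cycles restricts the temporal frequency to multiples of 5 Hz. The sketch below is an assumption about how such rounding might be done (nearest whole cycle, minimum of one cycle); the function names are hypothetical. It also checks numerically that a sinusoidal carrier averages to zero over the presentation only when the cycle count is whole, and it covers only this temporal part of the argument.

```python
import math

STIM_DURATION_S = 0.200  # presentation duration, from the text


def round_temporal_frequency(tf_hz, duration_s=STIM_DURATION_S):
    """Nearest temporal frequency giving a whole number of drift cycles.

    Assumption: round to the nearest cycle count, with at least one cycle.
    """
    cycles = max(1, math.floor(tf_hz * duration_s + 0.5))
    return cycles / duration_s


def mean_carrier(tf_hz, phase_rad=0.0, duration_s=STIM_DURATION_S, n=1000):
    """Time-averaged carrier value at one point in space.

    phase_rad stands in for the spatial phase at that point; a value of
    zero for the average means no net luminance change from the background.
    """
    total = 0.0
    for k in range(n):
        t = duration_s * k / n
        total += math.sin(phase_rad - 2.0 * math.pi * tf_hz * t)
    return total / n


rounded = round_temporal_frequency(8.0)  # 1.6 cycles rounds up to 2 cycles
print(rounded, mean_carrier(rounded, 0.7), mean_carrier(7.0))
```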
Cells were included in the analysis if they were held for at least 12 repetitions (mean 33 repetitions) of each combination of receptive field stimuli and attentional state (48 conditions; Figure 1B). The response for each condition was taken as the average rate of firing in a period 50 to 250 ms after stimulus onset. Target stimuli and stimuli presented with a distractor were excluded from analysis, as were stimuli that appeared after the target. Additionally, the first one or two stimulus presentations in each trial (first 500 ms) were excluded from analysis to reduce variance arising from stronger responses to the start of a stimulus series. Instruction trials were excluded from data analysis.
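As a minimal sketch of the response measure, the function below counts spikes in the 50 to 250 ms window after stimulus onset and converts the count to spikes per second. The function names and the half-open window convention are assumptions made for illustration; the exclusion rules described above are assumed to have been applied before this step.

```python
def window_rate(spike_times_ms, onset_ms, start_ms=50, stop_ms=250):
    """Firing rate (spikes/s) in [onset + start, onset + stop).

    spike_times_ms: spike times in ms (the text reports 1 ms resolution).
    The half-open interval is an assumption of this sketch.
    """
    lo = onset_ms + start_ms
    hi = onset_ms + stop_ms
    n_spikes = sum(lo <= t < hi for t in spike_times_ms)
    return n_spikes / ((stop_ms - start_ms) / 1000.0)


def condition_response(spike_times_ms, onsets_ms):
    """Average rate across all retained presentations of one condition."""
    rates = [window_rate(spike_times_ms, onset) for onset in onsets_ms]
    return sum(rates) / len(rates)


# Hypothetical spike train with two presentations of the same condition.
spikes = [1040, 1060, 1100, 1249, 1250, 1300, 2070, 2200]
print(condition_response(spikes, [1000, 2000]))
```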
For fitting models we used the weighted least-squares method for parameter estimation and used the square of the correlation coefficient between the estimated firing rate from each model and the firing rate of a given neuron (i.e., variance explained by the model) as an index of goodness of fit. For the estimation of the attentional normalization model, only variances of the responses to paired stimuli were used as weights for the fit.
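The two quantities named above can be written out as follows. This is a sketch under an explicit assumption: the text says response variances were used as weights but does not give the exact weighting, so inverse-variance weighting (a common convention) is used here. The function names are hypothetical.

```python
def weighted_sse(observed, predicted, variances):
    """Weighted sum of squared errors.

    Assumption: each squared residual is divided by the response
    variance of that condition (inverse-variance weighting).
    """
    return sum((o - p) ** 2 / v
               for o, p, v in zip(observed, predicted, variances))


def r_squared(observed, predicted):
    """Square of the Pearson correlation between data and model rates."""
    n = len(observed)
    mo = sum(observed) / n
    mp = sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)
```

A fitting routine would minimize `weighted_sse` over the model parameters and then report `r_squared` for the best-fitting parameters as the variance explained.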
Complete data sets were collected from 70 well-isolated units in area MT of the two monkeys. The median eccentricity of receptive field centers was 11° (range 4° to 20°). The median drift speed for the Gabors was 7.5 deg/s (range 1.6 deg/s to 23.3 deg/s).
Responses recorded from a representative neuron are presented in Figure 2. Each histogram shows responses of the cell to one stimulus condition, in the same 4×4 arrangement shown in Figure 1B (see insets). Thus, responses in the upper row were recorded when no stimulus appeared at location 1 and responses in the left column were recorded when no stimulus appeared at location 2. There are three plots for each stimulus condition: one with attention directed to each of the three stimulus locations. Black traces are responses recorded when the animal was attending to the stimulus location far from the receptive field (attend location 0). The colored traces show responses to the same stimulus condition when the animal attended to one of the locations inside the receptive field (red for location 1 and green for location 2).
When a single stimulus appeared in the receptive field (top row and left column), the neuron's responses varied depending on its direction, with strong, sustained responses to the preferred direction and only transient responses to the onset (and sometimes offset) of the null direction. Responses to single stimuli were comparable at the two receptive field locations, although not identical. Paired stimuli in the receptive field evoked responses that were typically intermediate between the responses to each single stimulus of the pair (compare responses to paired stimuli with corresponding responses in the top row and left column). For most pairs, responses were approximately the average of the responses to the two single stimuli, as has been shown before (Recanzone et al., 1997; Britten and Heuer, 1999).
The red and green plots in Figure 2 show the effect of directing attention toward each of the two stimulus locations in the receptive field. When a single stimulus was in the receptive field, attention to the stimulated location caused a modest increase in response (red traces in the left column; green traces in the top row; ~15% response increase for the preferred direction). Much stronger modulation was seen with some pairs of stimuli in the receptive field. The strongest modulation occurred when one stimulus had the preferred direction and the other had the null direction. In that configuration, attention to the preferred direction almost doubled the response seen with attention to the null stimulus. Attention to either stimulus moved the response of the neuron toward the response it would have had if that attended stimulus appeared alone. Note, however, that attending to one stimulus was not the same as removing the other stimulus (see Ghose and Maunsell, 2008). This effect of shifting attention between a preferred and non-preferred stimulus within the receptive field is well established in MT and other areas (Moran and Desimone, 1985; Treue and Maunsell, 1996; Luck et al., 1997; Reynolds et al., 1999). Strong attentional modulation was also seen when a null stimulus was paired with an intermediate stimulus.
Because the stimulus presentations were brief, it is unlikely that the animal could have adjusted its attentional effort between single and paired stimulus conditions (see also Williford and Maunsell, 2006). Thus, these data represent a measurement of attentional modulation with single and paired stimuli in the receptive field under directly comparable conditions. They confirm suggestions from earlier reports that attentional modulations with two stimuli are much stronger than those for single stimuli even when task difficulty is kept constant between conditions, and rule out the possibility that the greater modulation with two stimuli arises from either greater attentional effort or comparisons that were not commensurate (because previous measurements have typically compared attention to preferred versus attention to non-preferred with two stimuli in the receptive field, but compared attention to preferred versus attention to a distant, neutral stimulus with one stimulus in the receptive field).
The responses from the example neuron in Figure 2 were representative of the entire sample of neurons. Figure 3 shows average responses for all 70 neurons in the same format as Figure 2. With attention directed away from the receptive field (black traces), responses to each pair of stimuli approximate an average of the response to each stimulus alone: the response to a pair of preferred and null stimuli is about halfway between the response to each alone, while the response to two preferred stimuli is similar to the response to either preferred stimulus by itself. Consistent with the example neuron, attentional modulation observed when a single stimulus was in the receptive field was moderate, while the modulation when a pair of preferred and non-preferred stimuli was in the receptive field was strong. Attention increased responses to a single preferred stimulus by 9% (median, 3 – 14% interquartile range, black versus red curves in left column, bottom row and black versus green curves in right column, top row), but shifting attention between the preferred and the null stimuli in the receptive field modulated responses by 59% (median, 41 – 96% interquartile range, red versus green curves in second column, bottom row and right column, second row). Attentional modulation was stronger for paired stimuli even when measured as the difference between responses with attention directed outside the receptive field and responses with attention directed to the preferred stimulus of the pair in the receptive field (median 28%, 17 – 45% interquartile range, black versus red or green curves in second column, bottom row and right column, second row).
Figure 3 shows that there was typically little attentional modulation in the earliest portion of neuronal responses. To examine the change in attentional modulation with time, we calculated attentional modulation separately for two periods, one for the initial, transient response (50-125 ms from stimulus onset) and the other for the sustained response (125-250 ms from stimulus onset). Attentional modulation was minimal during the first period. The medians and interquartile ranges for attentional modulation were 4% (0-10%) for a single preferred stimulus, 18% (9-36%) for shifting attention between the preferred and the null stimuli in the receptive field, and 10% (4-12%) for shifting attention between the preferred stimulus paired with a null stimulus and a stimulus outside the receptive field. The corresponding values for the second period of the response were 12% (4-21%), 111% (63-179%) and 47% (23-75%). Because we saw no difference in the form of attentional modulation between these periods, we report values for the entire response period (50-250 ms from stimulus onset).
Figure 3 also shows that while attention to one of two receptive field stimuli shifts the response toward the response to the attended stimulus when it is presented alone, the average effect of attention to one stimulus is not the same as eliminating the unattended stimulus. We calculated the difference in responses to individual stimuli when attention was directed outside the receptive field, and compared the difference with the modulation produced by shifting attention between the two stimuli when they appear together. If attention has the same effect as removing the unattended stimulus from the receptive field, then the ratio of the two changes will be one. We included all possible pairings (including null, intermediate, and preferred stimuli) and did a regression analysis for each neuron. The median slope across the 70 MT neurons was 0.44 (interquartile range of 0.3 to 0.6). Thus, attentional modulations had less than half the effect of removing the unattended stimulus. A similar result has been described for neurons in V4 (Ghose and Maunsell, 2008).
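The per-neuron regression can be sketched as follows. Several details are assumptions of this sketch rather than statements from the text: the regression includes an intercept, the predictor is the difference between the single-stimulus responses, and the outcome is the difference between the paired response with attention on location 1 and with attention on location 2. All numbers are made up for illustration.

```python
def ols_slope(x, y):
    """Slope of the least-squares line y = a + b*x (intercept included)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def attention_vs_removal_slope(single1, single2, paired):
    """Regress the attentional shift on the single-stimulus difference.

    single1[d], single2[d]: response to direction d alone at location 1
        or 2, with attention outside the receptive field.
    paired[(d1, d2)]: (response attending location 1,
                       response attending location 2).
    A slope of 1 would mean attention acts like removing the
    unattended stimulus.
    """
    x, y = [], []
    for (d1, d2), (attend1, attend2) in paired.items():
        x.append(single1[d1] - single2[d2])
        y.append(attend1 - attend2)
    return ols_slope(x, y)


# Hypothetical neuron: attention recovers 44% of the single-stimulus difference.
single1 = {"pref": 60.0, "int": 30.0, "null": 10.0}
single2 = {"pref": 50.0, "int": 28.0, "null": 12.0}
paired = {}
for d1, r1 in single1.items():
    for d2, r2 in single2.items():
        mean_resp = (r1 + r2) / 2.0
        shift = 0.22 * (r1 - r2)
        paired[(d1, d2)] = (mean_resp + shift, mean_resp - shift)

demo_slope = attention_vs_removal_slope(single1, single2, paired)
print(round(demo_slope, 2))
```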
The stronger attentional modulation seen with two stimuli in the receptive field can be explained by attention working through a normalization mechanism. In attentional normalization, the strength of attentional modulation depends on the strength of the normalization signal that a cell receives. Because two stimuli in the receptive field induce stronger normalization than a single stimulus, attention can modulate responses more with paired stimuli than with a single stimulus, even if the attention signal itself is constant across conditions. The model of attentional normalization we use has been presented previously (Lee and Maunsell, 2009) and we will not describe its properties in detail here. A critical aspect of the model is that attention does not act primarily on the excitatory inputs to a neuron, but instead works through a normalization mechanism. This configuration is supported by neurophysiological recordings showing that attentional modulation of the responses of an individual neuron is correlated with the strength of the stimulus normalization mechanisms in that neuron (measured with attention held constant; Lee and Maunsell, 2009). The model is described by Equations 1 and 2,
where R is the response of the neuron either to a single stimulus or paired stimuli in the receptive field, N is the normalization related to each of the two stimuli, I is the direct input that comes exclusively from each stimulus, and u is a power term that allows non-linear summation between inputs. In each normalization term (Eq. 2), s is a constant that provides a baseline signal when no stimulus is present (spontaneous activity), α determines how rapidly normalization varies with stimulus contrast, β is the attention term which varies with attention conditions (1 when the stimulus is not attended and typically greater than 1 when a stimulus is attended), and c is the contrast of the stimulus, which in the current experiment is either 1 or 0 depending on whether a stimulus is present in a given receptive field location. While c and β for each N will depend on the contrast of the stimulus and whether it is attended, α and s are the same for both Ns.
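The display equations themselves do not appear in this text. One form that is consistent with the symbol definitions above (a weighted average of the inputs under a power u, and a normalization term that equals s at zero contrast and saturates with contrast at a rate set by α and β) is given below. It should be read as a reconstruction under those assumptions, to be checked against Lee and Maunsell (2009), not as a verified transcription.

```latex
\begin{align}
R_{1,2} &= \left( \frac{N_1 I_1^{\,u} + N_2 I_2^{\,u}}{N_1 + N_2} \right)^{1/u} \tag{1} \\
N_i &= (1 - s)\left(1 - e^{-\alpha \beta_i c_i}\right) + s, \qquad i \in \{1, 2\} \tag{2}
\end{align}
```

With c_i = 0 the exponential equals 1, so N_i = s whatever the value of β_i, which matches the statement that s provides the baseline signal when no stimulus is present.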
We estimated the direct inputs from responses to individual stimuli using the relationship between the response (R) and the direct inputs (I) given by Equation 1. For this, we set the contrast of the first or the second stimulus to 0, yielding Equations 3 and 4, where the direct inputs are defined by responses to individual stimuli and the spontaneous activity of the neuron.
In these equations, m is the spontaneous activity of each neuron and other parameters are the same as in other equations. Because this relation was obtained from individual responses with no attention, the parameters used for estimating the direct inputs were independent of attention (β = 1).
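Equations 3 and 4 are likewise absent from this text. Assuming Equation 1 is a normalization-weighted average of the inputs raised to the power u, and that a zero-contrast location contributes an input equal to the spontaneous activity m with normalization s, the single-stimulus relation can be inverted as sketched below. Here R_1 and R_2 denote the responses to each stimulus presented alone, and N_1 and N_2 are evaluated at unit contrast with β = 1. This is a derivation under those stated assumptions, not a transcription of the original equations.

```latex
\begin{align}
R_1^{\,u} = \frac{N_1 I_1^{\,u} + s\, m^{u}}{N_1 + s}
\quad &\Longrightarrow \quad
I_1 = \left( \frac{(N_1 + s)\, R_1^{\,u} - s\, m^{u}}{N_1} \right)^{1/u} \tag{3} \\
R_2^{\,u} = \frac{s\, m^{u} + N_2 I_2^{\,u}}{s + N_2}
\quad &\Longrightarrow \quad
I_2 = \left( \frac{(N_2 + s)\, R_2^{\,u} - s\, m^{u}}{N_2} \right)^{1/u} \tag{4}
\end{align}
```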
Equation 1 has the form of a weighted average, which is similar to the original response normalization model used for a plaid stimulus (Carandini et al., 1997). The weight given to each of the inputs (I1 and I2) is determined by the strength of the corresponding normalization term (N1 and N2). Presenting a single stimulus in the receptive field is equivalent to making the contrast of the other stimulus 0, leaving its input equal to the spontaneous activity of the neuron (Eqs. 3 and 4). The normalization term for the 0 contrast stimulus, N(0), will be s. In this condition, attention-based changes in the value of N associated with the single stimulus will not have a strong effect on the response (because the input from the 0 contrast stimulus will have little weight). In contrast, when two stimuli with unit contrast are in the receptive field, each input will have the same normalization term, N(1). In this case, modulation of N for one input by attention will have a stronger effect on the response, either increasing or decreasing the response depending on the relative size of the excitatory inputs, I. Therefore, the model predicts that the size of attentional modulation with two stimuli in the receptive field should be greater than attentional modulation with a single stimulus in the receptive field, even if the strength of the attention signal, β, remains the same across conditions.
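The argument in this paragraph can be checked numerically. The sketch below assumes a specific functional form that matches the verbal description (a normalization term N = (1 − s)(1 − exp(−αβc)) + s, and a response that is the N-weighted average of the inputs under a power u). That form and every parameter value are assumptions chosen for illustration, not fitted values from the study.

```python
import math

# Hypothetical parameters, chosen only for illustration.
ALPHA, S, U = 1.0, 0.1, 1.0
BETA_ATTENDED = 2.0                       # beta is 1 when unattended
I_PREF, I_NULL, SPONT = 60.0, 10.0, 5.0   # direct inputs and spontaneous rate


def normalization(c, beta, alpha=ALPHA, s=S):
    """Assumed normalization term; equals s when contrast c is 0."""
    return (1.0 - s) * (1.0 - math.exp(-alpha * beta * c)) + s


def response(i1, i2, n1, n2, u=U):
    """Assumed response: normalization-weighted average of the two inputs."""
    return ((n1 * i1 ** u + n2 * i2 ** u) / (n1 + n2)) ** (1.0 / u)


def modulation(r_a, r_b):
    """Fractional change of r_a relative to r_b."""
    return r_a / r_b - 1.0


# Single preferred stimulus at location 1; location 2 is empty (c = 0).
single_away = response(I_PREF, SPONT, normalization(1, 1.0), normalization(0, 1.0))
single_attended = response(I_PREF, SPONT,
                           normalization(1, BETA_ATTENDED), normalization(0, 1.0))
# Attention directed to the empty location changes nothing, because N(0) = s.
single_attend_empty = response(I_PREF, SPONT,
                               normalization(1, 1.0), normalization(0, BETA_ATTENDED))

# Preferred at location 1 and null at location 2, attention on one or the other.
pair_attend_pref = response(I_PREF, I_NULL,
                            normalization(1, BETA_ATTENDED), normalization(1, 1.0))
pair_attend_null = response(I_PREF, I_NULL,
                            normalization(1, 1.0), normalization(1, BETA_ATTENDED))

single_mod = modulation(single_attended, single_away)
pair_mod = modulation(pair_attend_pref, pair_attend_null)
print(f"single: {single_mod:.1%}  pair: {pair_mod:.1%}")
```

With the same β in both cases, the paired-stimulus modulation comes out several times larger than the single-stimulus modulation, which is the qualitative prediction described in the text.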
We tested the attentional normalization model by fitting each neuron's data from the different stimulus and attention conditions simultaneously using a weighted least-squares fit. The model has four free parameters (α, β, s, and u) and the number of data points used for the fit was 41 (the complete 48 stimulus and attention conditions less the 6 single stimuli and the spontaneous activity with attention directed away from the receptive field). When the attention of the animal was directed to the stimulus outside the receptive field, we fixed the attention term β to 1, and when the attention was directed to either of the stimuli in the receptive field, we let β take a single value for attention to either receptive field location.
Figure 4 shows the performance of the model for the example cell in Figure 2. In testing the model, we do not consider response dynamics (see Discussion), and instead reduce each stimulus condition to a rate of firing, which is represented by a bar. In each bar plot, the black, red and green bars show the neuron's rate of firing in different attention conditions averaged between 50 ms and 250 ms from the onset of the stimulus. The gray bars show the model responses for each stimulus and attention condition. For this cell, the model explained 96% of the variance of mean responses across conditions, capturing all the major features of the attentional modulation. For example, the model predicted a weak attentional modulation for single stimuli, a stronger modulation for paired stimuli, and essentially no effect when the attention was directed to the location in the receptive field where no stimulus appeared (a single stimulus at location 1 and attention to location 2, or vice versa: black versus green fit in left column or black versus red fit in top row).
Overall, the normalization model performed extremely well in explaining the attentional modulation observed in the responses of MT neurons to single and paired stimuli. Figure 5 shows average neuronal responses and average model responses across the 70 neurons. The model explained a median of 93.5% of the variance (interquartile range 90 – 96%, total range 49 – 99%). The predicted effect of attention on the response to a single preferred stimulus was a 7% increase (median, 3 – 9% interquartile range), and the predicted modulation for shifting attention between a preferred and a null stimulus in the receptive field was a 51% change (median, 27 – 124% interquartile range). The predicted attentional modulation for shifting attention to a preferred stimulus paired with a null stimulus in the receptive field from a stimulus outside the receptive field was 20% (median, 11 – 35% interquartile range). Notably, the model explained the difference in the size of attentional modulation between a single stimulus in the receptive field and paired stimuli in the receptive field without assuming different mechanisms for the two conditions. The different degrees of attentional modulation observed in these different stimulus conditions can be explained by a single process as embodied in the normalization mechanism.
We have directly compared the amount of attentional modulation that occurs with different numbers of stimuli in a neuron's receptive field. Although earlier studies with two stimuli in a receptive field showed stronger modulations than measurements made using a single receptive field stimulus (Moran and Desimone, 1985; Treue and Maunsell, 1996; Luck et al., 1997; Reynolds et al., 1999), attentional effort was not controlled, so it was uncertain whether much or all of the difference might have depended on the greater task demands associated with a task using two closely-spaced stimuli. Further uncertainty arose because the two-stimulus measurements generally compared attention to preferred stimuli and attention to non-preferred stimuli, while the one-stimulus measurements compared attention to preferred stimuli and attention to neutral stimuli (those placed far outside the receptive field). Because the animal could not predict how many stimuli would appear during the current measurements, attentional effort was the same for each stimulus presentation. Keeping each stimulus presentation brief (200 ms) made it unlikely that the subjects had time to adjust their attention in response to the stimuli that appeared (Nakayama and Mackeben, 1989; Motter, 1994; Williford and Maunsell, 2006), and the success of the normalization model in accounting for the effects of attention in both stimulus conditions using a single value for the attention term suggests that variations of attentional effort between these conditions were minimal.
The normalization model was able to explain both the modest attentional modulation seen with a single stimulus in the receptive field and also the strong modulation that occurs when attention is shifted between preferred and non-preferred stimuli that both lie within the receptive field. Thus a single mechanism can explain the large range of effects that attention has been seen to have on individual neurons. Because it implements a form of divisive normalization, the model also readily explains the gain changes for tuning curves, such as for orientation (McAdams and Maunsell, 1999) or direction (Treue and Martinez Trujillo, 1999), when attention is directed to individual stimuli inside a receptive field (Lee and Maunsell, 2009). When two stimuli are present in the receptive field, the normalization mechanism effectively boosts the strength of the modulation by performing a type of weighted average that can have an effect like shifting the center of weight for a receptive field toward the more attended stimulus (Connor et al., 1996; Connor et al., 1997; Womelsdorf et al., 2006).
While the normalization model that we used succeeded in explaining different amounts of attentional modulation in different stimulus conditions, it is not intended to be a complete description of how attention is implemented in neuronal circuits. For example, it cannot explain the ability of attention to modulate spontaneous activity (see Lee and Maunsell, 2009). Such changes are so small a component of the overall variance of neuronal responses that it is difficult to justify an additional parameter for them. Additionally, the model does not explain the dynamics of attentional modulation. For the neurons in this study, attentional modulation was minimal during the transient response and grew during the sustained component (Figures 2 & 3). In future experiments it will be interesting to explore whether this phenomenon can be explained by a delay in normalization relative to direct input, as might occur if normalization were mediated by long-distance horizontal or feedback connections. Finally, the current form of the normalization model considers only the classical receptive field. The original response normalization model proposed that neurons whose receptive fields overlap with a visual stimulus contribute to the normalization signal (Heeger, 1992; Heeger et al., 1996). Similarly, the attention normalization model assumes that the effects of attention are limited to the receptive field. It therefore cannot explain the suppression of neuronal responses by attention to regions immediately outside the receptive field (Tootell et al., 1998; Vanduffel et al., 2000; Pinsk et al., 2004; Sundberg et al., 2009).
The normalization model is consistent with the ideas of biased competition (Desimone and Duncan, 1995) and feature similarity (Treue and Martinez Trujillo, 1999; Martinez-Trujillo and Treue, 2004), which have been put forth to describe effects of attention. As originally presented, those models were not quantitative (equation based), but rather described how neurons behave under different stimulus conditions. The normalization model extends these and subsequent contributions by providing a descriptive mechanism that can provide accurate quantitative predictions of how neurons will respond over broad ranges of stimulus and attention conditions.
Recently, others have also proposed that attention may act through normalization mechanisms, and have presented detailed normalization models (Ghose and Maunsell, 2008; Boynton, 2009; Ghose, 2009; Reynolds and Heeger, 2009). These models are more elaborate than the one considered here, including, for example, terms that allow for variation in the spatial extent of attention. Because those models have similar form and more free parameters, they undoubtedly would perform at least as well in fitting the data we report here. While our data strongly support the idea that attentional modulations are closely associated with normalization mechanisms, further data would be needed to argue for a particular form of a normalization model. In particular, experiments that both controlled and measured the spatial extent of subjects' attention would be needed to determine whether a more elaborate normalization model is justified.
Results from other neurophysiological studies have suggested that normalization might be important for attentional modulation of neuronal responses (Qiu et al., 2007). A study (Roberts et al., 2007) showed that the effect of attention on neuronal responses in primary visual cortex could be explained by a mechanism similar to the one described here. They reported that attention changes the length tuning of neurons only with a longer preferred stimulus, and suggested that this might be explained by the covariations between the increasing length of the stimulus and the increasing size of the neuronal pool that contributes to a cell's response through lateral connections or feedback interactions (e.g., a normalization pool). They found that attention had greater effects for longer stimuli, which is consistent with the hypothesis of attentional normalization (Lee and Maunsell, 2009; Reynolds and Heeger, 2009).
By supporting the idea that attentional modulation depends on a normalization mechanism, the current results have implications for understanding how attention modifies sensory signals. The normalization model readily explains how attention can modify the gain of neuronal responses, predicting that when a single stimulus is in the receptive field, attention will increase the response by a given proportion, whether the stimulus is preferred or non-preferred. Notably, the normalization model suggests that differences in the amount of attentional modulation shown by different neurons might reflect differences in normalization mechanisms rather than factors specific to attention. It is commonly observed that there is considerable variance in how much attention modulates different neurons within areas, and that attentional modulations are stronger in later stages of visual cortex (see Maunsell and Cook, 2002). The significance of these differences is not understood; however, it has been seen that neurons that show little normalization also show little attentional modulation (Lee and Maunsell, 2009). It is possible that the different degrees of attentional modulation seen within and between cortical areas may have more to do with changes in normalization (for example, changes associated with larger receptive field sizes in later visual areas; Schwartz and Simoncelli, 2001) than with differences in the strength of the attentional signals received by different neurons.
Finally, it should be noted that while the current results address the effects of directing attention to different spatial locations (a form of top-down or endogenous attention), they do not address the effects of bottom-up (exogenous) attention. We had to use briefly presented stimuli to ensure that top-down attention did not vary between single and paired stimulus presentations. The Gabor stimuli caused no change in overall luminance when they appeared, but the abrupt increase in contrast at their onset was an exogenous cue that could have attracted some attention to each stimulus. Because stimuli were always flashed, this exogenous cueing would have been a fairly constant factor across all our measurements, but it might have affected the relative attentional modulation for single and paired stimuli. The onset of a single stimulus in the receptive field might have brought more attention to that stimulus than the onset of the same stimulus when another stimulus appeared simultaneously beside it (assuming attention drawn to the stimulus by its onset is divided when another stimulus appears at the same time). The performance of the model provides little insight into exogenous attention because systematic differences in the responses to single and paired stimuli caused by exogenous attention would always be present, and could be captured by terms affecting how individual responses were summed. Because exogenous attention is transient (Nakayama and Mackeben, 1989; Bisley and Goldberg, 2006), its contribution could in principle be removed by looking long after stimulus onset. Unfortunately, this approach is impractical because there can be no assurance that endogenous attention will remain stationary over long stimulus presentations. Exploring differences within the brief stimulus presentations that we used would be similarly problematic, because it is unclear whether the paucity of attentional modulation during the transient is owing to exogenous cueing or to other factors, such as differences in the dynamics of direct inputs and normalization. Thus, while the current data provide clear results about the effects of endogenous attention, whether bottom-up attention involves additional mechanisms remains to be determined.
We thank Incheol Kang and Amy Ni for helpful comments on earlier versions of this manuscript, and Vivian Imamura, Dennis Murray, and Tori Williford for technical assistance. Supported by NIH R01EY05911 and the Howard Hughes Medical Institute.