The goal of our experiments was to study the spatial structure and temporal dynamics of the modulatory surround influence in border ownership (BOS) selective neurons. We have shown in previous studies that displays of rectangular figures produce strong BOS modulation in many neurons of area V2. Building on these results, we devised tests in which similar figures were broken up into fragments that could be presented in isolation and in combinations. This method allowed us to measure the influences of the individual fragments and their interactions.
The solid figures that we have mainly used in the past cannot be decomposed into fragments without introducing conspicuous new edges that would likely influence the responses themselves, whereas outline figures, which are often used in fragmented figure tests by psychologists, tend to be less effective in producing side-of-figure selective responses (Fangtu Qiu and Rudiger von der Heydt, unpublished results). We have therefore designed displays with ‘Cornsweet edges’ in which luminance (or color) varies across the contour in the form of a step function, as in regular solid figures, but transitions smoothly (exponentially) to the background color on both sides. Examples of such figures are shown in and . We used odd-symmetrical edges with a positive luminance/color excursion on one side and an equally large negative excursion on the other side. (In the case of color variation, positive means a deviation from the background color, a neutral gray, towards the neuron’s preferred color in 3-dimensional linear color space, and negative means a deviation in the opposite direction.) We generally used luminance variation of neutral gray if a cell was not color selective, and combined luminance-color variation in color selective cells. It was important that the transition to the background color had the same shape on both sides of the edges because this made the assignment of the local edge ambiguous (asymmetric edges with a sharp transition on one side and a gradient on the other tend to bias perception towards assignment to the side of the gradient; von der Heydt and Pierson, 2006; Palmer and Ghose, 2008).
Figures composed of Cornsweet edges (in the following called ‘Cornsweet figures’) appear either brighter or darker than the background, depending on the polarity of the edges (, top). They also appear tinted with color if the variation has a chromatic component. As in our previous studies with solid figures we always tested both signs of contrast polarity and compared the responses to the right-hand edge of a bright figure with the responses to the left-hand edge of a dark figure, that is, two displays that were identical within and around the CRF (see examples in ). The tests also included the corresponding conditions in which the colors were flipped, producing the opposite sign of local contrast in the CRF ( bottom).
For each neuron we generated the basic pair of Cornsweet figures based on the cell’s preferred orientation and color, and placed each figure so that one edge was centered on the CRF, as shown in (for a cell with vertical orientation preference). In the following we refer to this edge as the ‘center edge’. From each of the basic figures a number of fragmented figures were generated by variably eliminating edges and corners. The edge fragments tapered off to the background along the edge with a profile of a half cycle of cosine-squared so as to avoid introducing sharp terminations. Each figure was decomposed into eight fragments, four edges and four corners. According to their location relative to the CRF, we refer to these as ‘Near corners’, ‘Near edges’, ‘Far corners’, and ‘Far edges’ ().
Naming conventions. We also use the abbreviations NC, NE, FC and FE.
In the main test, the center edge was present in each stimulus presentation, and the presence of the other seven fragments was varied factorially, generating 2^7 = 128 combinations for each figure. Thus, the main test consisted of 128 fragmented versions of each of the basic figures (128 versions of the bright figure on the left and 128 of the dark figure on the right) plus the corresponding displays with reversed local contrast, for a total of 512 displays.
In addition, displays without the center edge were tested for control (). This set consisted of displays of all seven surround fragments on either side ( top) and displays of each of the 14 surround fragments in isolation ( below), again with both contrast polarities, for a total of 32 displays.
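The display counts given above can be checked by enumerating the conditions. The following is a minimal sketch, not the authors' stimulus code; the dictionary layout and the names `SIDES`, `POLARITIES`, `main_test_displays` and `control_displays` are hypothetical.

```python
from itertools import product

SIDES = ("left", "right")   # figure location relative to the CRF (hypothetical labels)
POLARITIES = (-1, 1)        # local contrast polarity
N_SURROUND = 7              # surround fragments per figure

def main_test_displays():
    """Center edge present; the seven surround fragments are varied factorially."""
    return [
        {"side": side, "C": c, "center": 1, "fragments": frags}
        for side in SIDES
        for c in POLARITIES
        for frags in product((0, 1), repeat=N_SURROUND)
    ]

def control_displays():
    """Center edge absent; all seven fragments together, or one fragment alone."""
    patterns = [(1,) * N_SURROUND]
    patterns += [tuple(int(i == k) for i in range(N_SURROUND))
                 for k in range(N_SURROUND)]
    return [
        {"side": side, "C": c, "center": 0, "fragments": frags}
        for side in SIDES
        for c in POLARITIES
        for frags in patterns
    ]

main = main_test_displays()
control = control_displays()
print(len(main), len(control))
```

Under these assumptions the main test has 2 sides × 2 polarities × 2^7 patterns, and the control set has 2 sides × 2 polarities × (1 + 7) patterns.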
In a complete test run, the 512 displays of the main test and 2 sets of the 32 control displays were presented in random order at a rate of two per second (six presentations per fixation).
We analyzed the data of each neuron with two regression models, a simple linear model for the control set (Equation 1) and a 2nd order model for the main test (Equation 2). In these models, $R$ represents the responses, defined as the spike count between 100 and 350 ms after stimulus onset, $R_0$ is the baseline spike count of the neuron (the mean across all intervals from 50 ms before to 30 ms after stimulus onset), $R_e$ is the mean center edge response, $C$ is the contrast polarity variable ($C = -1, 1$), and $F_i$ are the fragment variables ($F_i = 0$ if absent, $F_i = 1$ if present). The first sum in Equation 2 represents the linear effects of the surround fragments, and the second sum their interactions with contrast polarity. The last term represents the interactions between fragments. Note that only the products for pairs of fragments on the same side are non-zero, because fragments were present only on one side at a time. With all $F_i = 0$, the value of $R$ is the estimated response to the center edge alone (which depends only on the contrast polarity). Thus, the coefficients $d_i$ estimate how much the presence of surround fragments enhanced or reduced the center edge responses. If a neuron was tested with different sizes or shapes of figures, separate regressions were performed on each set of data.
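The two regression equations are not legible in this text. As a reading aid, the following shows one form that is consistent with the verbal description of the terms. Only $R$, $R_0$, $R_e$, $C$, $F_i$ and $d_i$ are named in the text; the coefficient symbols $a_i$, $c$, $e_i$ and $f_{ij}$, and the exact form of the linear control model, are assumptions made for illustration.

```latex
% Assumed reconstruction for illustration, not the authors' typeset equations.
\begin{align}
  % Control set (no center edge): simple linear model
  R &= R_0 + \sum_{i=1}^{14} a_i F_i \tag{1} \\
  % Main test (center edge present): 2nd order model
  R &= R_0 + R_e + c\,C
       + \sum_{i=1}^{14} d_i F_i
       + \sum_{i=1}^{14} e_i\, C F_i
       + \sum_{i<j} f_{ij}\, F_i F_j \tag{2}
\end{align}
```

In this form, the products $F_i F_j$ for fragments on opposite sides are always zero, so only the same-side pairs contribute to the last sum.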
To better understand the design, consider that for each contrast polarity a total of 15 fragments were tested, the center edge and 14 surround fragments. Each fragment is associated with a binary parameter (0=absent, 1=present). Thus, the number of all possible fragment combinations would be 2^15. However, only a subset of these combinations was tested. There were two restrictions: (1) only fragments of one figure at a time were presented; thus only seven surround fragments and the center edge were tested for either figure location; (2) complete factorial sets of surround fragment combinations were tested only in the presence of the center edge; in the absence of the center edge (the control condition), a reduced set was tested: the single fragments and the combinations of seven surround fragments on either side. These restrictions enabled us to use a factorial design to measure the effects of the surround fragments in modulating the CRF response, and also the interactions between the fragments in modulating this response. Without restrictions this would not have been feasible.
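To illustrate why the restricted factorial design still allows main effects and pairwise interactions to be estimated, the sketch below builds a design matrix for the 512 main-test displays and fits it by ordinary least squares. It is a sketch under stated assumptions: the column ordering, the function name `design_row`, and the synthetic coefficients are hypothetical, and the intercept column stands in for the sum of baseline and mean center edge response.

```python
from itertools import combinations, product
import numpy as np

def design_row(side, c, frags):
    """Regressors for one main-test display (center edge always present).

    side  : 0 or 1, the side on which the figure fragments appear
    c     : contrast polarity, -1 or +1
    frags : 7 zeros/ones for the surround fragments on that side
    """
    f = np.zeros(14)
    f[7 * side: 7 * side + 7] = frags        # fragments on the other side stay 0
    # Pairwise products only within a side; cross-side products are always 0.
    pairs = [f[i] * f[j]
             for s in (0, 1)
             for i, j in combinations(range(7 * s, 7 * s + 7), 2)]
    return np.concatenate(([1.0, c], f, c * f, pairs))

rows = [design_row(s, c, fr)
        for s in (0, 1) for c in (-1, 1) for fr in product((0, 1), repeat=7)]
X = np.array(rows)                           # 512 displays x 72 regressors

rng = np.random.default_rng(0)
true_coef = rng.normal(size=X.shape[1])      # made-up coefficients
y = X @ true_coef                            # noise-free synthetic responses
coef, *_ = np.linalg.lstsq(X, y, rcond=None) # ordinary least-squares fit
```

The 72 columns are the intercept, the polarity term, 14 fragment terms, 14 fragment-by-polarity terms, and 2 × 21 same-side pair terms. Because the seven fragments on each side are fully crossed, these columns are linearly independent and the fit is unique.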
The 1st order effects of the surround fragments were estimated from the control data set (Equation 1). In this set, we tested each single fragment on either side in the absence of the center edge. According to the concept of a CRF that is surrounded by regions that are purely modulatory, these stimuli should produce no responses at all. We will test this assumption with the model of Equation 1.
The influences of the surround fragments in the presence of the center edge were estimated by the model of Equation 2. Under the assumption that the 1st order effects are zero, the main effects in this model are estimates of 2nd order effects: the modulatory influences of the surround fragments on the center edge response. The model of Equation 2 also estimates the interactions between fragments in modulating the center edge response. In the CRF-surround concept, these are 3rd order effects.
Neurons were recorded from areas V1 and V2. The receptive field eccentricities ranged between 0.7 and 6.5 (median 1.77) deg visual angle. Only orientation selective cells were studied. After determining the preferred orientation with a suitable bar stimulus and mapping the CRF, the standard BOS test with solid squares was performed (, for two square sizes, 3 deg and 8 deg). Cells that did not show a significant side-of-figure effect (p<0.05) in this test were usually not studied further.
Comparison of border ownership signals for Cornsweet figures and solid figures
We first wanted to see if Cornsweet figures were effective in producing BOS signals. Solid figures differ in color from the background over large regions, whereas in Cornsweet figures, the color differences are confined to a narrow seam along the contours. We compared BOS modulation by Cornsweet figures and by solid figures in each neuron. A scatter plot of the main effects of side for the two kinds of figures (from separate ANOVAs) shows that the effects were highly correlated (, Pearson r = 0.85, N=171). Cornsweet figures were 86% as effective as solid figures (as measured by the slope of the line that minimizes the squared perpendicular deviations). This result shows that BOS modulation depends more on the contours than on regions of different color or luminance.
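The slope that minimizes squared perpendicular deviations is what is usually called orthogonal (total least squares) regression. A minimal sketch follows; the function name is hypothetical, and fitting the line through the centroid of the data (rather than through the origin) is an assumption, since the text does not say which was used.

```python
import numpy as np

def perpendicular_fit_slope(x, y):
    """Slope of the line through the centroid that minimizes the sum of
    squared perpendicular distances (assumes the line is not vertical)."""
    pts = np.column_stack((x, y)).astype(float)
    pts -= pts.mean(axis=0)                  # center the data on the centroid
    # The first right-singular vector is the direction of maximal variance,
    # which is the direction of the orthogonal-regression line.
    _, _, vt = np.linalg.svd(pts, full_matrices=False)
    dx, dy = vt[0]
    return dy / dx
```

Unlike ordinary regression of y on x, this estimate treats the two variables symmetrically, which is appropriate when both effect measures carry comparable measurement noise.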
Fig. 4 Comparison of the border ownership effects obtained with solid figures and Cornsweet figures with the same geometry. A 3-factor analysis of variance was performed on square-root transformed spike counts (see Methods), and the border ownership effect was …
Spatial aspects of context integration
Fragmented figure tests were completed in 99 neurons from two animals (31 from M23 and 68 from M24), in some cases with multiple sizes and shapes, for a total of 141 tests. Neurons were included in the following analysis if the effect of side-of-figure was significant at p<0.01 (2-factor ANOVA, factors: side-of-figure, local contrast). To be sure that the surround fragments were outside the CRF, we also checked that, in the control condition without center edge, the ‘near corners’ did not produce any responses. One neuron (2 tests) was excluded for this reason. A total of 100 tests in 66 neurons were included (20 from M23 and 46 from M24; 43 neurons were tested with one figure size, 23 with multiple figure sizes and shapes). In 80% of the tests, each condition was tested two or more times; in the others, at least one quarter of the conditions were repeated. Eight of the neurons were recorded in area V1 (15 tests) and 57 in V2 (84 tests). One neuron could not be assigned to an area with certainty. Our sample consists mainly of V2 neurons, because BOS selectivity is much less common in V1 than in V2 (Zhou et al., 2000).
The modulatory effects of the surround in two example neurons, each tested with squares of two sizes, 3 and 8 deg, are shown in . We plotted the linear estimates in the form of maps that show the locations of the figure fragments relative to the CRF, and the strengths of the effects coded by color (red for facilitatory, blue for suppressive). The estimates for the two contrast polarities were averaged. As can be seen from Equation 2, these linear estimates quantify how much, in spikes per second, a fragment increased or decreased the responses relative to the center edge response. The insets show the maps of the CRF obtained with bar stimuli; the gray values represent response strength.
Fig. 5 Maps of the surround effects for two example neurons. The small and large C-shapes represent maps generated from tests with 3 deg and 5 deg squares, respectively. Top, Effects of the 14 fragments in modulating the center edge response (main test). Bottom …
The maps for the control condition (no center edge) are shown at the bottom of the figure. Here, the colors represent the mean responses to each single fragment relative to the activity before stimulus onset.
Inspecting plots of the fragment effects for all cells we made the following observations: (1) Figure fragments on both the preferred and the non-preferred sides modulated the responses. (2) On each side, several fragments seemed to have an effect; those on the preferred side tended to have positive effects, those on the non-preferred side tended to have negative effects. However, many of the effects of individual fragments were not statistically significant. (3) The control condition responses did not seem to be correlated with the effects in the main test. The control responses tended to be similar on the two sides, whereas the modulatory effects tended to go in opposite directions. Specifically on the non-preferred side, control responses could be positive, while the modulatory effects were negative. (4) Most cells showed positive and negative modulatory effects, but some showed only negative effects. Otherwise, the results were quite similar across cells.
The population means of the linear estimates of fragment effects are shown in . Here, we have averaged the effects of symmetrically located fragments, plotting only one mean for near corners, near edges, and far corners, on each side (no difference is expected between the population means for fragments in symmetrical locations, because the receptive field orientation varied from neuron to neuron, and, on average, only the location of features relative to the receptive fields should matter). The two contrast polarities were also averaged. Zero on the ordinate (left scale) corresponds to the neurons’ responses to the center edge alone (19.5 spikes/s). The bars labeled ‘All’ are estimates of the responses to the complete figures on preferred and non-preferred sides. These were calculated from the linear estimates and the interactions (Equation 2). The responses in the control condition (no center edge) are plotted at the bottom of the graph (scale on the right). Here, the bars represent the mean responses to each single fragment and to figures in which only the center edge was missing (‘All’), and zero corresponds to the activity before stimulus onset. The error bars represent the 95% confidence limits determined by bootstrapping, as explained in Methods.
Fig. 6 Population means of the effects of contour fragments in the receptive field surround. A, Main effects. The shaded bars show linear estimates of the influence of each type of surround fragment, and estimates of the combined effects of all fragments, in …
The figure shows that, on average, the effects on the preferred side were all positive and of almost equal size for all types of fragments. The mean effects on the non-preferred side were negative and stronger than those on the preferred side. Here, the fragments closer to the CRF produced slightly stronger suppression than farther fragments.
The effects of the complete figures ( All) were much weaker than the sum of the effects of the single fragments. Had the effects of the single fragments been additive, the response to the complete figure on the preferred side would have been 3.8 spikes/s greater than the response to the center edge alone, but in fact the response to the complete figure was 0.5 spikes/s smaller than the center edge response. On the non-preferred side, adding up the effects of the single fragments would give a suppressive effect of −17.2 spikes/s, but the response to the complete figure was only 4.9 spikes/s smaller than the center edge response. This non-additivity indicates interactions between the fragments in modulating the center edge response. These will be discussed below.
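The size of the non-additivity can be made explicit with a short calculation. It assumes that the estimated effect of the complete figure on one side is the sum of the seven main effects and the pairwise interaction terms on that side, as the description of the ‘All’ estimates suggests; the symbols $\Delta_{\mathrm{All}}$, $d_i$ and $f_{ij}$ are used here for illustration only.

```latex
% Net interaction contribution = complete-figure effect minus sum of main effects
% (all values in spikes/s, relative to the center edge response)
\begin{align*}
  \Delta_{\mathrm{All}} &= \sum_i d_i + \sum_{i<j} f_{ij}
  \quad\Longrightarrow\quad
  \sum_{i<j} f_{ij} = \Delta_{\mathrm{All}} - \sum_i d_i \\
  \text{preferred side:}\quad     & -0.5 - 3.8 = -4.3 \\
  \text{non-preferred side:}\quad & -4.9 - (-17.2) = +12.3
\end{align*}
```

Under this assumption the interactions are net negative on the preferred side and net positive on the non-preferred side, so in both cases they pull the combined effect back toward the center edge response.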
In the absence of the center edge, the surround fragments produced virtually zero responses on average ( bottom). The virtual absence of responses in the single fragment controls demonstrates that the surround stimuli were in fact outside the CRF. Remarkably, the combinations of the 7 surround fragments ( bottom, All) did produce responses on both sides. These were small, but much stronger than expected based on linear summation of the single fragment responses. Note also that the combination of 7 fragments on the non-preferred side produced excitation in the absence of the center edge, but had strong suppressive effects in the presence of the center edge ( top, All). Finding positive responses to the combinations of the seven surround fragments despite the absence of single-fragment responses might suggest a summation-and-threshold mechanism. However, such a mechanism would not explain the suppressive effect in the presence of the center edge response.
As the non-additivity of the main effects indicates, there were interactions between the fragments. The pairwise interactions of fragments are plotted in . The interactions on the preferred side were mainly negative, whereas those on the non-preferred side were mainly positive. The result is the sub-additivity of effects seen in .
Comparison between V1 and V2
We also analyzed the data from V1 and V2 separately (Fig. S3). Although our sample of BOS selective V1 neurons is small (8 neurons, 15 tests), the results indicate that those V1 cells that are BOS selective use context integration mechanisms similar to those of their counterparts in V2.
Discounting gain normalization
The plot of the population main effects () shows that the suppressive effects on the non-preferred side were stronger than the enhancement effects on the preferred side. We wondered if this asymmetry was due to a gain normalization mechanism of the kind proposed by Heeger (1992), in which the gain of a neuron depends on the total amount of ‘stimulus energy’ in the receptive field. This means that the observed response $r_{obs}$ is the product of the response $r$ that would be obtained without the gain normalization mechanism and a gain factor $g$ that depends on the stimulus energy $E$, $g = g(E)$, and decreases as $E$ increases. In the case of V2 neurons, the energy summation region might include the surround, and the stimulus energy would then be given by the number of contour fragments $n$ in the display, $g = g(n)$. The average response strength of the neurons in our sample did indeed decrease with the number of fragments. In , crosses represent $\bar{r}(n)$, the mean response across all stimuli in which the number of fragments was $n$. Dots connected by solid lines show the mean responses for preferred and non-preferred sides separately. Being a nonlinear operation, gain normalization would produce interactions between fragments. This raises the question of whether the interactions we found () are due in part, or even entirely, to the gain normalization mechanism.
Fig. 7 Discounting gain normalization. The plots show the average strength of responses as a function of the number of contour fragments in the display. Crosses, mean responses across all stimuli with the given number of fragments. Black dots and black lines …
With $\bar{r}(n)$ denoting the mean response across all stimuli with $n$ fragments, $\bar{r}(1)$ is the strength of response to the center edge alone, and $\bar{r}(n)/\bar{r}(1)$ is the gain reduction produced by adding $n-1$ surround fragments. We can formally ‘undo’ this effect by multiplying each single response $r_i$ by $\bar{r}(1)/\bar{r}(n)$, where $n$ is the number of fragments present for response $r_i$. The dots connected by dashed lines in show the mean responses after this scaling.
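The scaling step can be sketched as follows. This is an illustration under an assumption about the garbled formula: each response is multiplied by the ratio of the mean response to displays with one fragment (center edge alone) to the mean response to displays with the same number of fragments. The function name `undo_gain` and the toy numbers are hypothetical.

```python
import numpy as np

def undo_gain(responses, n_fragments):
    """Scale each response by mean(n = 1) / mean(n) for its fragment count n.

    Assumes that displays with exactly one fragment are present and that
    every per-count mean response is non-zero.
    """
    responses = np.asarray(responses, dtype=float)
    n_fragments = np.asarray(n_fragments)
    # Mean response for each fragment count (pooled over both figure sides).
    mean_by_n = {int(n): responses[n_fragments == n].mean()
                 for n in np.unique(n_fragments)}
    scale = np.array([mean_by_n[1] / mean_by_n[int(n)] for n in n_fragments])
    return responses * scale

# Toy data: the mean response halves with each added fragment.
r = [20.0, 20.0, 12.0, 8.0, 6.0, 4.0]
n = [1, 1, 2, 2, 3, 3]
scaled = undo_gain(r, n)
```

After scaling, the mean response pooled over sides is the same for every fragment count, while differences between stimuli with the same count (for example, preferred versus non-preferred side) are preserved in proportion.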
The regression coefficients obtained from the scaled responses are plotted in . The main effects were more symmetric (). The interactions also became slightly more symmetric after the scaling, but the general pattern remained the same – negative on preferred, positive on non-preferred side (). Thus, gain normalization is not the cause of this characteristic pattern of interactions.
Population means of the effects of contour fragments after scaling responses to compensate for hypothetical gain control (see Results for explanation). A, Main effects. B, Interactions. Conventions as in .
The pattern of main effects and interactions shown in can be understood by assuming that the neurons have opposite modulatory inputs from contour integration mechanisms on the two sides, and that the signals at these integration stages saturate. Saturation of the modulatory signals results in incomplete summation of the surround influences from either side. In the model of Equation 2, this would produce negative interactions between the positive (enhancing) factors and positive interactions between the negative (suppressive) factors, exactly as observed.
Without the assumption of gain normalization it is hard to explain how the negative interactions on the preferred side can be so strong that the surround effect in the All condition is negative despite the linear effects of the seven fragments all being positive (). For example, if one assumes response saturation at the recorded neurons, this would explain the negative interactions, but saturation at this point cannot reverse the sign of the All effect. Also, a compressive nonlinearity at the output, as in response saturation, would predict the interactions on the non-preferred side to be negative too, but they are positive. The assumption of gain normalization, which is plausible in itself, reveals the intrinsic symmetry of the BOS mechanism ().
We conclude that one can interpret the pattern of surround effects as the combined result of gain normalization and a mechanism that produces symmetrical enhancement and suppression on the two sides.
Specificity of the surround mechanism
Selectivity for contrast polarity
In all our experiments we tested figures of both contrast polarities. For a dark-light edge in the CRF, we measured the influence of the fragments of dark figures on the left, and the influence of the fragments of light figures on the right; for a light-dark edge in the CRF, the contrasts of both figures were reversed correspondingly. In – we showed averages over the two contrasts. However, in most of the cells in our sample the responses depended on the contrast polarity of the edge in the CRF, and the question arises whether the influence of the surround fragments also depends on their contrast polarity. To see if that was the case, we selected the subsample of cells that were contrast polarity selective (47 neurons, 78 tests, significant effect of C in model Eq. 2, p<0.01) and calculated the surround fragment effects for preferred and non-preferred contrast polarity (in the following called “positive” and “negative” contrasts). Fragments of positive contrast produced much stronger effects than fragments of negative contrast (; for the interactions see Fig. S4). Of course, the CRF responses differed also in strength (24.7 versus 15.8 spikes/s).
Fig. 9 The surround influence depends on edge contrast polarity. The fragmented figure tests included figures of either contrast polarity (e.g. light and dark). In cells that were contrast polarity selective, the two contrast conditions produced different strength …
If the surround influence were multiplicative and independent of contrast polarity, then the fragment effects would be proportional to the CRF responses. To see if this was the case, we have plotted the mean surround fragment effect over the strength of the CRF responses (; the solid line connects the points for the preferred figure side, the dashed line those for the non-preferred side). If the surround modulation had been independent of contrast polarity, then the effects for positive contrast could be extrapolated from the effects for negative contrast, as shown by the dotted lines in . However, this was clearly not the case. While the CRF responses for positive contrast were 56% stronger than those for negative contrast, the differential effect of the surround fragments (preferred – non-preferred side) more than doubled (increase of 127%). This shows that the surround influences themselves are contrast polarity selective in these cells, matching the selectivity of the CRF.
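The comparison can be written out as a short calculation. The symbols $\Delta^{+}$ and $\Delta^{-}$ for the differential surround effect (preferred minus non-preferred side) at positive and negative contrast are introduced here for illustration; the numbers are those quoted in the text.

```latex
\begin{align*}
  % Ratio of CRF responses for positive vs. negative contrast
  \frac{24.7}{15.8} &\approx 1.56
    && \text{(CRF response 56\% stronger)} \\
  % Prediction if the surround influence were purely multiplicative
  \Delta^{+}_{\mathrm{predicted}} &\approx 1.56\,\Delta^{-} \\
  % Observed increase of 127%
  \Delta^{+}_{\mathrm{observed}} &\approx 2.27\,\Delta^{-}
\end{align*}
```

The observed differential effect is thus roughly 1.45 times larger than the proportional prediction ($2.27 / 1.56 \approx 1.45$).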
Control with scrambled figures
In models of BOS coding it is generally assumed that the surround influence is orientation specific (Kikuchi and Fukushima, 2003; Zhaoping, 2005; Baek and Sajda, 2005; Craft et al., 2007). This specificity is important for implementing some of the Gestalt rules that govern figure-ground organization in perception. We performed a control experiment in a subset of the neurons in which we compared squares with scrambled figures. The latter were obtained by rotating each of the seven surround fragments by 90 deg (). The scrambled figures consisted of the same amount of contour as the regular figures. All combinations of fragments were tested, exactly as was done in the main experiment, but with scrambled and regular figure displays randomly interleaved.
A comparison of the effects of regular and scrambled figures is shown in . The scrambling reduced the differences between preferred and non-preferred sides. Especially on the preferred side, the scrambled figures did not have much of an effect at all, whereas regular figures produced clear enhancement. This indicates that the surround input comes from orientation tuned neurons and that the integration of these inputs is orientation specific. The scrambled figure experiment is discussed further in the section on the time course below.
Fig. 10 Orientation selectivity of the surround influence: Comparison of the effects of surround fragments of squares and scrambled figures (). Means of 18 tests in 12 cells are plotted as in . Error bars represent 95% confidence intervals. Overall, …
Comparison with model predictions
In principle, there are three different ways in which information can spread laterally in the visual system (Angelucci et al., 2002): by convergence of forward connections; through intracortical connections (‘horizontal fibers’); and by divergence of backward projections. The spreading of forward connections is believed to account for the CRFs, which are small and thus provide only little context. The non-classical surround modulation is thought to be mediated by horizontal fibers and by feedback from higher-level areas.
Accordingly, models of BOS have used three principles of context integration: feed forward mechanisms (e.g., Sakai and Nishimura, 2006), signal propagation via horizontal fibers (e.g., Zhaoping, 2005; Baek and Sajda, 2005), and feedback loops including higher-level areas (e.g., Craft et al., 2007; Jehee et al., 2007). These three principles are illustrated in .
Fig. 11 Models of border ownership selectivity. Three different principles of context integration have been proposed: A, Feed forward mechanisms, B, lateral propagation in cortex via horizontal fibers, and C, feedback from a higher level cortical area. D, Integrations …
It has been suggested that the surround influence in BOS neurons is similar to the ‘non-classical surrounds’ of receptive fields in primary visual cortex that can be demonstrated by applying a grating in an annular region around the CRF and measuring its effect on the response evoked by the center stimulus. Such gratings generally have a suppressive effect. Using patches of gratings, Walker et al. (1999) found that the surrounds are usually non-uniform. Often, a grating patch that covered only a small fraction of the annulus had the same effect as the whole annulus. Walker et al. concluded that surrounds often consist of a single localized sensitive region.
Based on this and other similar studies of V1, Sakai and Nishimura (2006) proposed that the BOS modulation in neurons of monkey V2 observed by Zhou et al. (2000) could be fully explained by assuming that each neuron possesses two localized sensitive regions in the surround, a suppressive one on the non-preferred side and a facilitatory one on the preferred side (). They also showed that the observed variation between neurons in BOS selectivity for different stimulus geometries could be explained by assuming that the location and size of these regions vary between neurons, and were able to simulate this variation by randomly picking the location and size of the regions.
How patchy is the surround?
The assumption that the modulatory surrounds consist of only two ‘hot spots’ seems to contradict the rather uniform distribution of surround influences on either side in the population means (). But this uniform appearance could of course be the result of averaging the surrounds of many neurons, each of which might have a non-uniform structure. The results in the example neurons (cf. ) also suggest that BOS modulation originates from multiple locations on either side of the CRF. However, in the data of the individual neurons only few of these effects were significant. Thus, it is not clear if all the locations contribute small amounts that add up to the total BOS signal, or if most of the surround is insensitive and contributes nothing, as was found for the non-classical surrounds in V1. The population average is not conclusive, and the data from single neurons are too noisy to allow a conclusion about the exact spatial distribution of sensitivity in the surround.
To resolve this dilemma we devised the following method. We first ranked the fragment (main-) effects of each neuron, separately for preferred and non-preferred sides, and then averaged them for each rank. If the majority of neurons had a ‘hot spot’ in their surround, this method would show a high value for the first rank and much smaller values for the other ranks. And for surrounds like those described by Walker et al. in cat V1 which mainly consist of one local suppressive region, our rank-and-average method should turn out a strong negative value for the last rank on the non-preferred side (or two negative values, if two neighboring fragments stimulated the suppressive region), and values close to zero for the other ranks. In fact, the interpretation of the results is not quite so straightforward because, due to random variation of the estimated effects, the ranking tends to exaggerate the differences across ranks. Even in the case of a perfectly uniform sensitivity distribution, the value for the first rank would be higher than the value for the last rank just by sorting the random variations of effects in each neuron.
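The ranking bias described above is easy to reproduce with simulated data. The following sketch applies a rank-and-average procedure to pure noise, that is, to a population with perfectly uniform (here zero) sensitivity. The function name, the noise level and the random seed are arbitrary choices for illustration.

```python
import numpy as np

def rank_and_average(effects):
    """Sort each neuron's effects in descending order, then average by rank."""
    effects = np.asarray(effects, dtype=float)   # shape: (neurons, fragments)
    ranked = -np.sort(-effects, axis=1)          # column 0 holds rank 1 (largest)
    return ranked.mean(axis=0)

rng = np.random.default_rng(1)
# 100 simulated data sets, 7 fragments on one side, no true effect at all.
noise_only = rng.normal(0.0, 1.0, size=(100, 7))
profile = rank_and_average(noise_only)
print(np.round(profile, 2))
```

Although every true effect is zero, the averaged profile falls steadily from a clearly positive value at rank 1 to a clearly negative value at rank 7. A measured rank profile therefore has to be compared with such a noise-only profile rather than with a flat line.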
To derive model predictions for comparison with our results, we therefore generated from each set of coefficients of a real neuron a number of synthetic sets according to the model hypotheses, and then performed the same rank-and-average procedure on the synthetic sets of coefficients. Each synthetic coefficient was the sum of a mean and a noise term. The mean was assigned according to the hypothesis to be tested: For the null hypothesis, the means of all effects were set to zero. For the Sakai-Nishimura hypothesis, we took the regression coefficients of a neuron and assigned the sum of the main effects on either side to the location of the largest effect (in absolute terms) on that side, and zero to the other locations. The noise terms were generated from the data of each neuron by randomly re-sampling the responses within stimulus conditions, and fitting the model of Equation 2. The differences between the coefficients of each fit and their bootstrapped means were added as noise terms to the hypothetical means defined above.
To obtain confidence limits for the model predictions we thus generated 2,000 synthetic sets of coefficients, each with 100 members (the number of neuronal data sets). As was done with the real neurons, we ranked the coefficients of each synthetic set, and averaged them by rank across sets. This produced probability distributions of the effects from which the means and 95% ranges were determined.
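The construction of the synthetic coefficient sets can be sketched as follows. This is an illustration of the procedure as described, not the authors' code: the function names are hypothetical, the bootstrap noise is assumed to be precomputed per neuron, and the 95% range is taken as the 2.5th to 97.5th percentile.

```python
import numpy as np

def hot_spot_means(side_effects):
    """Single 'hot spot' hypothesis for one side: the summed main effect is
    placed at the location of the largest absolute effect, zero elsewhere."""
    side_effects = np.asarray(side_effects, dtype=float)
    means = np.zeros_like(side_effects)
    means[np.argmax(np.abs(side_effects))] = side_effects.sum()
    return means

def synthetic_rank_profiles(hyp_means, noise, n_sets, rng):
    """Mean and 95% range of rank-and-average profiles over synthetic sets.

    hyp_means : (neurons, fragments) hypothetical mean effects
    noise     : (neurons, draws, fragments) bootstrap deviations per neuron
    """
    n_neurons, n_draws, _ = noise.shape
    profiles = []
    for _ in range(n_sets):
        pick = rng.integers(n_draws, size=n_neurons)     # one noise draw per neuron
        coefs = hyp_means + noise[np.arange(n_neurons), pick]
        ranked = -np.sort(-coefs, axis=1)                # rank within each neuron
        profiles.append(ranked.mean(axis=0))             # average across neurons
    profiles = np.array(profiles)
    return profiles.mean(axis=0), np.percentile(profiles, [2.5, 97.5], axis=0)
```

For the null hypothesis, `hyp_means` would simply be an array of zeros. For the hot spot hypothesis, each row would be built by applying `hot_spot_means` to one side of a neuron's fitted main effects.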
The means of the ranked effects characterize the variation of effects across the surround regions in an average neuron (, crosses). Blue bands show the 95% confidence limits under the null hypothesis. The slopes of these bands show the spurious variation produced by random variation of responses. The prediction from the null hypothesis also simulates the small difference between the grand means of the two sides that is introduced by assigning a preferred side in each neuron (assignment bias). The difference between the data means and the means of the null prediction is plotted in . This shows the true variation of fragment influences across the surround. Four of the seven fragments on the preferred side had facilitating effects, while two had suppressive effects, on average. On the non-preferred side, all seven fragments had suppressive effects.
Fig. 12 The variation of sensitivity within the surround. A, The linear estimates of fragment effects were ranked according to size for each neuron, and averaged across neurons. Crosses represent the means. The blue bands indicate the 95% confidence range of ...
As pointed out, the asymmetry between the sides might be due to a separate gain control stage. When computing the variation of surround effects after removing the gain control effect by scaling the responses, as described, we find facilitation from 5/7 fragments on the preferred side, and suppression from 6/7 fragments on the non-preferred side. With or without scaling, 11 of the 14 fragments (79%) contributed positively to the BOS signals on average.
To compare our experimental data with the predictions from Sakai and Nishimura’s model, we first had to address an obvious discrepancy. The model allows only for positive effects on the preferred side and/or negative effects on the non-preferred side, whereas in our data many neurons had negative effects on both sides. To preserve the general idea of the model, we therefore performed the regressions (Equations 1) after scaling the responses (see Discounting gain normalization above). As a result, the maximum effect on the preferred side was now positive, and the minimum effect on the non-preferred side negative, in nearly all neurons, as in Sakai and Nishimura’s model. In essence, what we assume here is that the actual neuronal responses involve an additional gain normalization mechanism that was not included in that model.
The mean ranked effects are plotted in together with the predictions from the model (red shaded bands). For comparison we also calculated the results that would be obtained had all fragments on either side influenced the responses equally (blue shaded bands). Most of the data points lie outside the red band, indicating significant deviation from the model predictions. Specifically, the model predicts exceedingly large values at rank 1 on the preferred side and rank 7 on the non-preferred side which are not apparent in the data. Instead, the data points show a smooth progression similar to that obtained under the even distribution assumption (blue), but somewhat steeper. The sensitivity distributions on both sides are clearly much more uniform than postulated by Sakai and Nishimura’s model. Note that the even distribution assumption is illustrated only as a baseline for comparison. The feedback schemes do not necessarily predict an even distribution of surround influences. As illustrated , the grouping cell model (Craft et al., 2007) rather predicts a gradual decrease with distance.
Walker et al. also showed (in cat V1) that, when two locations of test gratings contributed surround modulation, then these locations were usually adjacent. This was further support for their conclusion that surround modulation often originates from a single compact region. To see if our data are compatible with such a model in which the surround patches are large enough so that each patch would generally be stimulated by two fragments, we calculated the combined effects of pairs of adjacent fragments and performed another rank-and-average test, this time assigning the sum of all the measured effects on either side to the pair that had the strongest combined effect. The result (Fig. S5) was similar to that shown in , again at variance with the patchy surround model.
In conclusion, the analysis of the population data shows that the surround structure we have seen in the example neurons is typical. BOS selective neurons receive modulatory influences from large portions of their receptive field surrounds, and the receptive field axis divides the surrounds into watersheds of suppressive and facilitatory influences that are fairly symmetric. The apparent patchiness seen in the modulation maps of single neurons is partly due to the random variation of responses. The modulation maps obtained with different figure sizes, as shown in , also indicated that the surround influence is not due to isolated patches, but essentially emerges wherever the surround is stimulated by a contour.
The time course of surround influences
The time course of the surround influence is essential for understanding the underlying mechanism. Specifically, the latency of the surround effects poses an important constraint on models. Sakai & Nishimura (2006) assumed that context is integrated by feed-forward mechanisms, but stopped short of explaining how this could be implemented in the visual cortex. This question is not trivial, because V1 and V2 are retinotopically organized areas and, consequently, the ‘region of identical stimulation’ in our BOS test () is mapped onto corresponding cortical regions surrounding the neuron under study. The interior of these regions is devoid of relevant context information, and this information must therefore be transmitted to the neuron over some distance. For the figures in our study the cortical distances that need to be bridged are considerable. Neurophysiologically realistic models of BOS have proposed either signal propagation via horizontal fibers within V2 (e.g., Zhaoping, 2005; Baek and Sajda, 2005), or feedback loops including higher-level areas (e.g., Craft et al., 2007; Jehee et al., 2007).
The cortical distances to be bridged can exceed the maximum length of horizontal fibers, which is about 5 mm in V2 (Levitt et al., 1994; Stepniewska and Kaas, 1996). In foveal cortex, this corresponds to no more than a degree of visual angle (Gattass et al., 1981; Dow et al., 1985). Over larger distances, signals must be relayed, which requires that each neuron in a chain of relays fires spikes. This means that these neurons must also be activated directly from their CRFs by the visual stimulus (if a neuron were activated by a distant feature alone, this would imply a very large CRF, for which there is no evidence). Thus, the context information must be propagated along the figure contours, or through neurons that are activated by the surface of the figure. Here, the use of Cornsweet figures is decisive, because the interior of these figures is not only devoid of contours, but completely unchanged when the figures are turned on and off. Thus, the presentation of a Cornsweet figure does not directly activate any neurons that have CRFs inside the figure. Therefore, intracortical connection models must assume that signals propagate over cascades of neurons along the contour representation, as shown in (Zhaoping, 2005).
For the larger figures in our study, the path lengths of signal propagation would be on the order of 10–20 mm (see Craft et al., 2007, for details). Because of the limited conduction velocity of intracortical fibers, signal transmission over such distances would introduce significant delays. Assuming a conduction velocity of 0.2 m·s−1 (Grinvald et al., 1994; Bringuier et al., 1999), the delays would be on the order of 50–100 ms. These estimates are somewhat uncertain, because data on horizontal fiber conduction velocity are scarce (we are not aware of any for V2). Nevertheless, they show that it should be possible to distinguish between different integration schemes on the basis of latencies: In the case of intracortical signal propagation, effects of fragments that are far from the CRF should arrive with longer latencies than effects of fragments close to the CRF. For mechanisms using back projections from a higher level the situation is different, because here the length of the pathways does not increase linearly with the lateral distance in cortex, and because the projections between areas consist of myelinated fibers that are about ten times faster than intracortical fibers (Girard et al., 2001).
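The delay estimate follows directly from dividing path length by conduction velocity, as the short calculation below shows. The function name is hypothetical, and the calculation ignores synaptic delays at the relay neurons, so it gives only the conduction component.

```python
def conduction_delay_ms(path_length_mm, velocity_m_per_s):
    """Conduction delay in milliseconds.

    Millimeters divided by meters per second gives milliseconds,
    because (1e-3 m) / (1 m/s) = 1e-3 s.
    """
    return path_length_mm / velocity_m_per_s


# Path lengths and velocity quoted in the text for horizontal fibers.
short_path = conduction_delay_ms(10, 0.2)  # about 50 ms
long_path = conduction_delay_ms(20, 0.2)   # about 100 ms

# Illustration only: the same nominal lengths at a tenfold higher velocity,
# as quoted for myelinated fibers. The actual feedback path lengths differ
# and are not specified here.
fast_short = conduction_delay_ms(10, 2.0)  # about 5 ms
fast_long = conduction_delay_ms(20, 2.0)   # about 10 ms
```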
Effects of close and distant locations
The fragmented-figure method allowed us to calculate the time course for the effect of each single fragment. To address the question of conduction delays, we calculated the contributions to the BOS signal from the closest fragments (Near corners), and from the farthest fragments (Far corners and Far edge). In each case, the average response difference (preferred – non-preferred side) was calculated across the subset of tests in which only these fragments were present. The results are relatively ‘noisy’ because they are based on small fractions of the data (1/32 and 1/16, respectively).
Two findings are remarkable (). First, the fragments far from the CRF produced a strong BOS signal in the absence of contours connecting to the CRF (dotted line). As explained above, this is not possible in the lateral propagation scheme. Second, the onset of this signal does not show the expected delay caused by signal propagation, but rather the opposite: it arrives earlier than the signal produced by the near corners (dashed line). The signal from the far fragments reached half amplitude at 91 ms (95% CI 81–100) and the signal from the near corners at 107 ms (95% CI 90–123). (We did not include the Near edges in the comparison, because they are at an intermediate distance; inclusion of the Near edge data in one or the other group did not change the ordering of latencies.) Thus, it is highly unlikely that BOS signals in V2 are generated entirely by interactions within V2 (as suggested by Zhaoping, 2005, and ). At least a major portion of the signal must be generated by mechanisms that use feedback loops through white matter, otherwise the nearly simultaneous arrival of the contributions from near and far locations is hard to explain. Note that we do not argue against the possibility that horizontal fiber connections in V2 contribute. Indeed, the paradoxical observation that the ‘near’ contribution arrives slightly later than the ‘far’ contribution could be explained by assuming that the former involves at least in part horizontal fiber connections while the latter does not. Thus, it is possible that both kinds of mechanisms are combined.
Fig. 13 The time course of the influences from near and far locations in the surround. Thick lines show the response difference between preferred and non-preferred figure locations (border ownership signal, left scale); dashed line, Near corners in the absence ...
Influence of figure size
To study the effect of figure size on the time course of context integration we compared the BOS signals in the neurons that were tested with large and small figures. The plots in show the mean BOS signals for squares of 3–4 deg size (solid thick line) and for squares of 6–8 deg size (dashed thick line). The signal for small figures emerged earlier than the signal for large figures (92 vs. 113 ms). Thus, the assignment of BOS for larger figures takes longer.
Fig. 14 The time course of the surround influences for small and large figures. Thick lines show the difference between preferred and non-preferred figure locations (border ownership signal, left scale) for small squares (solid line) and large squares (dotted ...
Although an increase of the latency of BOS signals with figure size is predicted by the lateral propagation model, the measured difference of 21 ms is shorter than the expected 50–100 ms. An increase of latency with figure size would also be expected in the grouping cell model (, Craft et al., 2007), because the length of the feedback loop is likely to increase somewhat when a figure is made larger. Larger figures would often straddle the vertical meridian, with the result that parts of the figure are represented in the hemisphere opposite the recorded neuron, requiring an extra path length through the corpus callosum. Likewise, if a figure straddles the horizontal meridian, it would be represented in widely separated parts of extrastriate cortex, which would also probably lead to an increase in length of white matter connections (Jeffs et al., 2009), compared to smaller figures which are more likely to fit entirely within the representation of one quadrant. These conjectures can be tested by analyzing the influence of the retinal location of contours on the latency of the BOS signal (N.R. Zhang, P.J. O’Herron and R. von der Heydt, work in progress).
It may seem odd that large figures lead to longer latencies than small figures, while far fragments produce shorter latencies than near fragments (). However, these two findings are not in contradiction. In the subsample of neurons that were tested with two sizes, the BOS signal for large figures showed the same paradoxical delay of Near influences relative to Far influences as seen in , whereas for small figures, the signals emerged simultaneously (Fig. S6). In other words, increasing the figure size delayed the influence of the Near corners more than the influence of the Far fragments. This supports our conclusion that the influence of the Near corners might involve horizontal fibers, whereas the integration of more distant features occurs via the fast feedback scheme.
Scrambled versus regular figures
The time course of the BOS signals for scrambled and regular figures is shown in . The data are from the subset of cells tested with scrambled figures. The curves show the averages over all combinations of fragments in each condition. The scrambled figures generated a reduced BOS signal. Interestingly, the signal for scrambled figures emerged at virtually the same time as the signal for regular figures (99 vs. 97 ms at half amplitude), but the two signals diverged over time: in the interval between 200 and 350 ms, scrambling reduced the signal to half its value (3.1 vs. 6.3 spikes·s−1).
Fig. 15 The border ownership signals for squares (solid line) and scrambled figures obtained from the squares by rotating each fragment by 90 deg (dotted line). Each curve is an average across all fragmentation conditions. The signal for scrambled figures is ...
It may seem surprising that scrambled figures generate BOS signals at all. However, the scrambled figures also have contour fragments located almost exclusively either on the preferred or the non-preferred side. Thus, the displays are strongly asymmetric for scrambled as well as for regular figures. In comparing the signals for the two conditions, note also that the results shown in and are averages across neurons, and that not all neurons may be most BOS selective for square shapes. A neuron that is more selective for elongated shapes, for example, would be sensitive to contours parallel to the RF at the locations where our squares have contours orthogonal to the RF. In such a neuron, the Near Edges of our squares would have a stronger influence in the scrambled condition than in the regular condition. Thus, averaging across neurons might have reduced the effect of scrambling in and . We do not have sufficient data from tests with different shapes to show whether the shape selectivity of the context integration mechanism varies between neurons.
Summary of latencies
The latencies of BOS signals and their confidence limits under the various conditions are summarized in . The figure also includes the mean BOS signal (the mean of all responses to preferred-side displays minus the mean of all responses to non-preferred-side displays), which reached half amplitude at 101 ms (CI 99, 103).
Fig. 16 Summary of the latencies of border ownership signals for fragmented figure displays. Values on the abscissa indicate the time point of half amplitude, as estimated by fitting sigmoid functions (see Methods). Mean, average across all displays and all neurons ...
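To illustrate what a half-amplitude latency estimate from a sigmoid fit involves, the sketch below fits a logistic function to a signal time course. The logistic form, the grid-search fitting strategy, the function name, and the synthetic test signal are all assumptions made for the example; the fitting procedure actually used is the one described in Methods.

```python
import numpy as np


def half_amplitude_latency(t_ms, signal, t_half_grid, slope_grid):
    """Estimate the time of half amplitude by fitting a logistic sigmoid.

    Assumed model: signal(t) = base + amp / (1 + exp(-(t - t_half) / slope)).
    For each (t_half, slope) pair on the grid, base and amp are found by
    linear least squares; the pair with the smallest squared error wins.
    """
    t = np.asarray(t_ms, dtype=float)
    y = np.asarray(signal, dtype=float)
    best_error, best_t_half, best_slope = np.inf, None, None
    for t_half in t_half_grid:
        for slope in slope_grid:
            sigmoid = 1.0 / (1.0 + np.exp(-(t - t_half) / slope))
            design = np.column_stack([np.ones_like(t), sigmoid])
            coef, *_ = np.linalg.lstsq(design, y, rcond=None)
            error = np.sum((design @ coef - y) ** 2)
            if error < best_error:
                best_error, best_t_half, best_slope = error, t_half, slope
    return best_t_half, best_slope


# Synthetic, noise-free test signal with a known half-amplitude time of 100 ms.
t = np.arange(0, 300, 5)
y = 1.0 + 6.0 / (1.0 + np.exp(-(t - 100.0) / 12.0))

t_half, slope = half_amplitude_latency(
    t, y, t_half_grid=np.arange(60, 141, 1), slope_grid=np.arange(4, 31, 2)
)
```

With real, noisy data, confidence limits on `t_half` could be obtained by repeating the fit on re-sampled responses, which is one way to arrive at intervals of the kind quoted in the text.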
The latency estimates in are based on the pooled responses to the various fragmented figures. For complete figures, the latencies are slightly shorter. From the standard test, we obtained 90 ms (CI 86, 95) for Cornsweet figures, and virtually the same for solid figures (88 ms; CI 84, 93). These latencies are still markedly longer than the 68 ms found previously for BOS signals in V2 (Zhou et al., 2000). This discrepancy might be due to a difference in stimulus presentation. Zhou et al. presented only one figure per fixation period, whereas in the present study, we presented five or six different conditions within a fixation period. O’Herron and von der Heydt (2009) have recently shown that BOS signals depend on the stimulus history. As a result, latencies are longer when a figure is flipped from one side to the other than when a figure is presented on a blank screen at the beginning of the fixation period. Thus, the frequent reversals of figure side in the present experiments may have led to the longer latencies.