Our results provide the first description of a sparse coding scheme in area V4, a major intermediate stage in ventral pathway visual cortex. Sparse coding is considered an important goal of sensory transformation because it increases representational capacity [4] and reduces metabolic energy requirements [17]. It is reasonable to speculate that the V4 coding scheme evolved in response to these adaptive advantages, though sparseness is not the only constraint that might produce an acute curvature bias. At the modeling level, we found that the acute curvature bias was not produced by minimizing various rearrangements of the terms in the RD expression (Figure S4A), but there must be many mathematical constraints that would produce a similar bias. On the evolutionary level, there could be many other advantages to selective representation of acute curvature, perhaps relating to higher ecological relevance for object parts with acute curvature. Thus, there may be some other constraint that drove the acute curvature bias. However, regardless of how and why it evolved, the curvature bias seems likely to produce sparser object responses in area V4, and sparseness has strong implications for computational efficiency, metabolic efficiency, and memory storage.
This conclusion derives from the assumption that our modeling results are appropriate for interpreting the observed acute curvature bias in area V4. The models were closely based on previously validated models of intermediate visual neurons [12, 13, 15], but mechanisms of intermediate and higher-level vision remain controversial, and no current model can be regarded as definitive. Area V4 neurons might operate in ways not captured by our models that affect sparseness. Moreover, we cannot make any claim, based on these analyses, about the absolute level of sparseness in V4 responses. We are only claiming that, given the tuning bias toward acute curvature, V4 responses are likely to be sparser than they would be without such a tuning bias. This seems logical, apart from any specific modeling results, given the low frequency of curved contours in relation to flat contours (Figure S4B). Neurons tuned for less common image elements are bound to respond less frequently. V4 neurons show a strong tuning bias for acute curvature. Given the relatively low frequency of acute curvature, these neurons are bound to respond more sparsely than neurons without such a bias.
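The frequency argument above can be illustrated with a toy calculation. The following is a minimal sketch under stated assumptions, not the models used in this study: the Gaussian tuning function, the normalized curvature scale, the 10% acute-curvature probability, the threshold, and the names `gaussian_response`, `sample_curvature`, and `response_fraction` are all hypothetical, and `response_fraction` is a simple stand-in rather than the RD expression.

```python
import math
import random


def gaussian_response(stimulus, preferred, width=0.15):
    """Response in (0, 1] of a toy unit with Gaussian curvature tuning (assumed form)."""
    return math.exp(-((stimulus - preferred) ** 2) / (2 * width ** 2))


def sample_curvature(rng, p_acute=0.1):
    """Draw one contour curvature on a hypothetical normalized scale in [-1, 1].

    Assumption: acute curvature (|c| >= 0.6) occurs with probability p_acute;
    otherwise curvature is flat or shallow (|c| <= 0.4).
    """
    if rng.random() < p_acute:
        magnitude = rng.uniform(0.6, 1.0)
    else:
        magnitude = rng.uniform(0.0, 0.4)
    # Convex and concave curvature are taken to be equally likely.
    return magnitude if rng.random() < 0.5 else -magnitude


def response_fraction(preferred_values, n_stimuli=5000, threshold=0.5, seed=0):
    """Fraction of (unit, stimulus) pairs whose response exceeds the threshold."""
    rng = random.Random(seed)
    stimuli = [sample_curvature(rng) for _ in range(n_stimuli)]
    active = sum(
        gaussian_response(s, p) > threshold
        for p in preferred_values
        for s in stimuli
    )
    return active / (len(preferred_values) * n_stimuli)


# Two toy populations: one tuned to acute curvature, one to flat/shallow curvature.
acute_units = [-0.9, -0.8, -0.7, 0.7, 0.8, 0.9]
flat_units = [-0.3, -0.2, -0.1, 0.1, 0.2, 0.3]

acute_fraction = response_fraction(acute_units)
flat_fraction = response_fraction(flat_units)
print(f"acute-tuned population: {acute_fraction:.3f}")
print(f"flat-tuned population:  {flat_fraction:.3f}")
```

Because both populations see the same stimulus sample, the difference between the two fractions reflects only how often each population's preferred curvature occurs, which is the point of the argument; the specific values depend entirely on the assumed parameters.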
In early visual cortex, sparse coding is achieved by exploiting local statistical regularities in natural images to reduce redundancy of neural signals [1, 2]. Gabor-like RF structures in V1 reduce redundancies due to local spatial frequency correlations in natural images [5]. Nonlinear interactions with the non-classical surround exploit image correlations that extend beyond the classical RF [6]. Coding in early visual cortex is constrained to minimize information loss, since the rest of the brain gets most of its detailed visual information directly or indirectly from V1 [18]. Redundancy reduction based on local statistical regularities can be achieved without substantial loss of information.
Sparsification in mid-level cortex might require other mechanisms due to the different constraints of intermediate shape processing. Neurons in mid-level visual cortex integrate information across larger RFs and non-classical surrounds [18]. Statistical correlations are bound to be lower on this larger scale, because physical relatedness between object parts is statistically weaker across greater distances. Thus, redundancy reduction may not be an option for sparsification at this scale.
However, mid-level cortex is also more specialized, with less need for complete preservation of image information, and more scope for emphasizing information required for specific aspects of visual perception [18]. V4, in particular, is part of the ventral pathway [18, 19], which emphasizes shape, color, and texture in the service of object perception. Given this specialization, further sparsification could be achieved by biasing representation toward image features with high object information content but lower probability of occurrence. In this way, a given object could be represented in terms of a small number of uncommon but diagnostic elements.
This alternate kind of sparse coding strategy appears to be implemented in V4 by emphasizing the representation of acute contour curvature, which is appropriately uncommon and diagnostic. Acute curvature was approximately an order of magnitude less common than flat or shallow curvature in our natural object set (Figure S4B). This reflects the fact that, on the scale of visual perception, natural objects have mostly smooth rather than highly intricate boundaries. Thus, sparse coding simulations based on primarily acute curvature tuning () had low response densities (0.22 and 0.11, respectively). In contrast, non-sparse simulations based on tuning for more common flat/shallow contour regions had high response densities (, 0.80). (See also Figure S4C.) At the same time, regions of acute curvature are still highly informative about object identity. In our simulations, accuracy remained high for the sparsest condition (RD = 0.11, accuracy = 97%) even when the remaining low-curvature model neurons (−0.4 < c′ < 0.4) were removed (accuracy = 85%). Curved contour regions are also perceptually salient [20–22] and more perceptually informative than flat contours [1, 23].
Bias toward representation of uncommon features with specialized information content could be a general strategy for sparse coding in higher-level cortex. Some evidence suggests that object coding is sparse at the final stages of the ventral pathway in IT (inferotemporal) cortex and medial temporal lobe structures like the hippocampus [7, 8, 24]. Sparseness at these higher levels could be achieved by selectivity for more complex features [15, 25, 26] with even higher information content. Bias toward tuning for acute curvature, which has been demonstrated in IT [27], might also enhance response sparseness at this level. Alternatively, IT cortex might be optimized for discrimination at the expense of sparseness [28]. The combined simulation/adaptive search strategy used here might help to elucidate coding strategies in higher-level visual cortex as well as in other sensory modalities.