eLife. 2017; 6: e25784.
PMCID: PMC5605274

Dynamic representation of partially occluded objects in primate prefrontal and visual cortex

Nicole Rust, Reviewing Editor, University of Pennsylvania, United States

Abstract

Successful recognition of partially occluded objects is presumed to involve dynamic interactions between brain areas responsible for vision and cognition, but neurophysiological evidence for the involvement of feedback signals is lacking. Here, we demonstrate that neurons in the ventrolateral prefrontal cortex (vlPFC) of monkeys performing a shape discrimination task respond more strongly to occluded than unoccluded stimuli. In contrast, neurons in visual area V4 respond more strongly to unoccluded stimuli. Analyses of V4 response dynamics reveal that many neurons exhibit two transient response peaks, the second of which emerges after vlPFC response onset and displays stronger selectivity for occluded shapes. We replicate these findings using a model of V4/vlPFC interactions in which occlusion-sensitive vlPFC neurons feed back to shape-selective V4 neurons, thereby enhancing V4 responses and selectivity to occluded shapes. These results reveal how signals from frontal and visual cortex could interact to facilitate object recognition under occlusion.

Research organism: Rhesus macaque

Introduction

When an object is partially occluded, relevant sensory evidence available to the visual system is diminished, making the process of object recognition challenging. Nevertheless, primates are remarkably adept at recognizing partially occluded objects, a common occurrence in the natural world. The neural mechanisms that mediate this perceptual capacity are largely unknown and are the focus of this study.

Biologically inspired models of object recognition are often implemented as hierarchical, feedforward architectures (Perrett and Oram, 1993; Wallis and Rolls, 1997; Riesenhuber and Poggio, 1999) despite extensive evidence for the role of feedback signaling in visual processing (Lamme et al., 1998; Gilbert and Li, 2013). These feedforward models, and even more elaborate schemes such as artificial convolutional neural networks, remain incapable of successfully recognizing partially occluded objects, although they can perform other object recognition tasks well (Wyatte et al., 2012; Pepik et al., 2015). The failure of these models has been attributed to the exclusion of critical computations mediated by feedback signals (Yuille and Kersten, 2006; Kriegeskorte, 2015; Rust and Stocker, 2010; Tang and Kreiman, 2017). Indeed, more recent models that incorporate feedback signals show improved recognition performance for occluded objects (O'Reilly et al., 2013; Tang et al., 2014b). However, little is known about where the relevant feedback signals originate in higher cortex, where they terminate in visual cortex and how they contribute to the recognition of occluded objects. To provide new insights, we investigated the role of the prefrontal cortex in the representation of occluded objects, focusing on how responses in this area compare to responses in the visual cortex, and how frontal and visual cortical areas might interact to facilitate recognition performance.

The prefrontal cortex (PFC) plays an important role in cognitive control: the orchestration of thought and action in accordance with internal goals (Miller and Cohen, 2001). Given its high-level function, it may seem unlikely that PFC would contribute to low-level visual representations and mediate the perception and recognition of occluded objects. However, anatomical studies demonstrate that a sub-region of PFC, the ventrolateral PFC (vlPFC), receives direct projections from visual cortical areas involved in higher form processing, namely V4 and inferotemporal cortex (IT) (Barbas and Mesulam, 1985; Ungerleider et al., 2008). The vlPFC also sends projections back to these visual areas (Ninomiya et al., 2012). The existence of functional interactions between these areas is also supported by the demonstration of synchronous neural activity in the theta frequency range between lateral PFC and V4 during perceptual discrimination of visual stimuli (Liebe et al., 2012) and by the engagement of PFC in perceptual processing under conditions of greater task difficulty (Jiang and Kanwisher, 2003). Given the anatomical and physiological evidence for interactions between vlPFC and visual cortical areas, we hypothesized that vlPFC responses could contribute to the representation and recognition of objects when perceptual judgments are made more difficult by partial occlusion.

To test this hypothesis, we conducted neurophysiological recordings in rhesus monkeys while they discriminated partially occluded shapes. Based on these neuronal data, we addressed three questions. First, how do vlPFC neurons respond to partially occluded shapes compared to neurons in visual area V4? Second, are the response dynamics and tuning properties of V4 neurons consistent with the arrival of feedback signals from vlPFC? Third, is V4 neuronal discriminability for occluded shapes enhanced after the putative arrival of feedback signals from vlPFC?

Results

Responses to partially occluded shapes in ventrolateral prefrontal cortex

To determine how vlPFC contributes to the representation and recognition of partially occluded objects, we studied single neuronal responses in monkeys performing a sequential shape discrimination task. In this task, monkeys reported whether two sequentially presented shapes, the ‘reference’ and ‘test’, were the same or different by making a saccade to one of two choice targets (Figure 1A). To test discrimination under occlusion, the test stimulus was partially occluded with a field of randomly positioned dots. The level of occlusion was titrated by varying dot diameter and was quantified as the percentage of the shape area that remained visible (% visible area). In each session, two shapes were chosen from a standard stimulus set (Pasupathy and Connor, 2001; Kosai et al., 2014) to serve as the discriminanda. For both monkeys, task performance was high for unoccluded stimuli (100% visible area) and decreased gradually as the % visible area decreased (Figure 1B), that is, as occlusion increased (gray arrow).

Figure 1.
Behavioral task and monkey performance.

We analyzed the responses of vlPFC neurons during the test stimulus epoch in which occlusion level was varied. Many neurons responded strongly to occluded stimuli and weakly to unoccluded stimuli. Data from two example neurons are shown (Figure 2). The responses of the first example neuron demonstrate a preference for one of the two shapes used (compare Figure 2A–B). For both the preferred and non-preferred shape, responses were stronger when the shapes were occluded (colored lines) than unoccluded (black lines). These responses were also more discriminable when the shapes were occluded: shape selectivity was stronger for occluded than unoccluded stimuli (Figure 2C; see Materials and methods). The responses of the second example neuron were also stronger when the shapes were occluded than unoccluded (Figure 2D–F). However, this neuron showed no preference for either of the two shapes used; shape selectivity was therefore weak for occluded and unoccluded stimuli (Figure 2F). The responses of this second example neuron are consistent with sensitivity for the total area or circumference of the occluding dots. In contrast, the responses of the first example neuron are inconsistent with this interpretation because the stronger responses to occluded stimuli were also accompanied by stronger shape selectivity.

Figure 2.
Responses of example vlPFC neurons.

Most of the vlPFC neurons we recorded responded more strongly to occluded stimuli (Figure 3). Of 216 neurons that were visually responsive during the test stimulus epoch (see Materials and methods), 98 neurons (45%) were significantly modulated by occlusion level (2-way ANOVA, p<0.05). The responses of most of these occlusion-sensitive neurons were stronger for higher occlusion levels (Figure 3A). For individual neurons, this observation manifested as a negative linear regression slope between % visible area and the average responses during the test epoch (Figure 3B). Of the 98 occlusion-sensitive neurons, 71 had a negative regression slope; 59 neurons had a slope that was significantly less than zero (p<0.05). For the subset of occlusion-sensitive neurons, normalized responses were also stronger at higher occlusion levels (Figure 3C). The results were also qualitatively similar when all visually responsive vlPFC neurons were included.
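The slope criterion described above can be illustrated with a short sketch. It fits an ordinary least-squares line to mean response versus % visible area and treats a negative slope as a preference for occluded stimuli. The occlusion levels and firing rates are invented for illustration, the function name `regression_slope` is hypothetical, and the significance test applied in the study is not reproduced here.

```python
def regression_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical occlusion levels (% visible area) and mean test-epoch
# firing rates (spikes/s) for one neuron; these are not data from the study.
visible_area = [100, 80, 60, 40, 20]
mean_rate = [5.0, 8.0, 11.0, 14.0, 17.0]

slope = regression_slope(visible_area, mean_rate)

# A negative slope means the response grows as less of the shape is visible,
# that is, as occlusion increases.
prefers_occluded = slope < 0
```

In this toy example the response rises by 3 spikes/s for every 20-point drop in visible area, so the fitted slope is negative and the neuron would be counted among those preferring occluded stimuli.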

Figure 3.
Population results for vlPFC neurons.

Shape selectivity was stronger for occluded than unoccluded stimuli across the population of vlPFC neurons. For the subset of occlusion-sensitive vlPFC neurons, shape selectivity was strongest for occluded stimuli at intermediate occlusion levels (blue/green) and weakest for unoccluded stimuli (black) (Figure 3D). This observation also held for the subset of shape-selective vlPFC neurons (N = 66; Figure 3—figure supplement 1B) and for all visually responsive neurons (N = 216; Figure 3—figure supplement 1D). Even for the small subset of vlPFC neurons that responded more strongly or equally well to unoccluded stimuli (27/98 neurons had a positive regression slope; 17/98 neurons had a slope significantly greater than zero, p<0.05), shape selectivity was not stronger for unoccluded than occluded stimuli (see Figure 3—figure supplement 2A). Thus, the vlPFC neuronal population has stronger, more shape-selective responses to occluded than unoccluded stimuli.

In addition to showing a preference for occluded stimuli during the test epoch, the responses of many vlPFC neurons during this epoch also signaled whether the test and reference shapes were a match or a nonmatch. Of 216 neurons, the responses of 42 had a significant main effect of match/nonmatch condition, and the responses of 65 other neurons had a significant interaction between shape and match/nonmatch condition (two-way ANOVA, p<0.05).

The vlPFC results presented thus far differ markedly from what we and others have reported in the visual cortex regarding the representation of occluded and unoccluded objects. In monkey cortical areas V4 (Kosai et al., 2014) and IT (Kovács et al., 1995) and in human occipitotemporal cortex (Tang et al., 2014a), neuronal responses are strongest for unoccluded objects and neuronal shape selectivity declines gradually with increasing occlusion level. Thus, the strong responses and shape selectivity of vlPFC neurons for occluded stimuli cannot be inherited directly from visual cortex. Next, we examine how the signals in vlPFC compare to those in the visual cortex by analyzing neuronal response dynamics in V4 datasets collected previously (Kosai et al., 2014), recorded in the same monkeys while they performed the same behavioral task used in the vlPFC testing sessions.

Responses to partially occluded shapes in visual area V4

If feedback signals originating in vlPFC contribute to V4 responses, their influence on V4 would be evident after the onset of vlPFC responses. Additionally, their influence would manifest in V4 as stronger responses for occluded than unoccluded stimuli, consistent with the vlPFC response properties described earlier. Many V4 neurons in our dataset did not show evidence of feedback modulation in their temporal response profiles. The responses of one such example neuron (Figure 4A–B) had a temporal response profile with a single transient response phase (i.e. peak) followed by a sustained response phase. During both the transient and sustained phases, responses were stronger for unoccluded than occluded stimuli, unlike what we observed in vlPFC. However, many other V4 neurons showed a different temporal profile with two transient response peaks – one early and one late – each of which showed a different dependency on occlusion level. The responses of one such example neuron (Figure 4C–D) had two transient response peaks: the first ~82 ms and the second ~150 ms after test stimulus onset. The neuron’s responses during the first peak (Figure 4D, black bar) were strongest for the unoccluded stimulus and declined gradually with increasing occlusion level. In contrast, the neuron’s responses during the second peak (Figure 4D, red bar) were strongest for intermediate occlusion levels.

Figure 4.
Responses of example V4 neurons.

The responses of two other V4 neurons with two peaks are shown (Figure 5). For one neuron (Figure 5A), the first and second peaks occurred ~63 ms and ~191 ms after stimulus onset. For the other neuron (Figure 5B), the first and second peak occurred ~66 ms and ~218 ms after stimulus onset. Additional examples of V4 neurons with two peaks are provided (Figure 5—figure supplement 1).

Figure 5.
Responses of two additional example V4 neurons with two transient response peaks.

We developed an ad hoc peak-finding algorithm to identify V4 neurons with and without two transient response peaks (see Materials and methods; Figure 6—figure supplement 1). The algorithm detected the occurrence of two robust transient peaks separated by a sizeable intervening trough, and the results were vetted using statistical tests. Of 85 neurons, 30 neurons (35%; 14 neurons recorded in Monkey O and 16 neurons recorded in Monkey M) were classified as having two peaks (Figure 6A) and 55 neurons were classified as not having two peaks (Figure 6B). The second response peak was less striking when the responses were averaged across neurons (Figure 6C) due to variability in second peak times for individual neurons (see also Figure 10—figure supplement 6). Across all V4 neurons with two peaks, the timing of the first and second peaks had a broad range, with a median of 84 ms and 214 ms, respectively (Figure 6C). In comparison, across all occlusion-sensitive vlPFC neurons, the peak response occurred later than the first V4 peak, 93–581 ms after test stimulus onset, with a median of 157 ms (Figure 6D). Thus, the median peak time in vlPFC fell between the median times of the first and second response peaks in V4.
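The idea of two robust peaks separated by a sizeable trough can be sketched as follows. This is not the algorithm from Materials and methods: the function names, the two thresholds (`min_peak`, `min_drop`) and the toy response traces are assumptions chosen only to make the criterion concrete, and the statistical vetting step is omitted.

```python
def local_maxima(rates):
    """Indices of samples higher than the left neighbor and not lower than the right."""
    return [i for i in range(1, len(rates) - 1)
            if rates[i] > rates[i - 1] and rates[i] >= rates[i + 1]]


def find_two_peaks(rates, min_peak=0.5, min_drop=0.3):
    """Return (first, second) peak indices, or None if no pair qualifies.

    min_peak: a peak must reach this fraction of the maximum response
              (hypothetical threshold).
    min_drop: the trough between two peaks must fall below the smaller
              peak by at least this fraction (hypothetical threshold).
    """
    top = max(rates)
    peaks = [i for i in local_maxima(rates) if rates[i] >= min_peak * top]
    for pos, first in enumerate(peaks):
        for second in peaks[pos + 1:]:
            trough = min(rates[first:second + 1])
            smaller_peak = min(rates[first], rates[second])
            if trough <= (1 - min_drop) * smaller_peak:
                return first, second
    return None


# Toy binned responses (arbitrary units); not data from the study.
two_peak_trace = [0, 2, 10, 4, 3, 8, 5, 4]
single_peak_trace = [0, 2, 10, 8, 7, 6, 5, 4]

two_peak_result = find_two_peaks(two_peak_trace)        # peaks at bins 2 and 5
single_peak_result = find_two_peaks(single_peak_trace)  # no second peak
```

Requiring a deep trough, rather than any two local maxima, keeps small fluctuations on a decaying sustained response from being counted as a second peak.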

Figure 6.
Population results for V4 and vlPFC neurons.

If feedback signals from vlPFC contribute to V4 activity during the second response peak, we expect that V4 responses would differ in their dependence on occlusion level over time. We therefore assessed neuronal sensitivity to occlusion during the first and second response peaks in V4. Data from three example V4 neurons are shown (Figure 7A–C). For the first neuron (Figure 7A; same neuron as in Figure 4C–D), responses during the first peak (black) declined gradually as occlusion level increased. In contrast, responses during the second peak were strongest at intermediate occlusion levels. Thus, the difference in responses between the first and second peak (gray) was largest at intermediate occlusion levels. The two other example neurons showed similar results (Figure 7B–C; same neurons as in Figure 5A–B, respectively): the difference in responses between the first and second peak was larger for occluded stimuli than unoccluded stimuli for both neurons.

Figure 7.
Comparison of V4 responses during the first and second peaks.

We compared the first and second peak responses for V4 neurons with and without two peaks. For both groups of neurons, responses during the first peak (69–99 ms) declined gradually as occlusion level increased. There was no significant difference between neurons with and without two peaks in their responses during the first peak at any occlusion level (t Test, p>0.1). However, later in the test stimulus epoch (199–229 ms), around the time of the second peak, the responses of the two groups of neurons had different trends. For neurons with two peaks, the difference in responses between the first and second peak was largest for intermediate occlusion levels and small for unoccluded stimuli and high occlusion levels (Figure 7D, dark gray). Thus, the response difference curve had an inverted U shape, as seen for the example neurons (compare dark gray curves, Figure 7A–D). In contrast, for neurons without two peaks, the difference in responses between the two time points was small and similar in magnitude across all occlusion levels (Figure 7D, light gray). The difference in responses between the first and second peaks was significantly greater for neurons with two peaks than other neurons at intermediate occlusion levels (compare curves in 7D; t Test, p<0.05, asterisks). This finding is explained by the observation that V4 neurons without two peaks had responses that declined gradually over time, for all occlusion levels (Figure 6B). In contrast, neurons with two peaks showed a relative increase in responses to occluded stimuli during the second peak (Figure 6A), a pattern that mirrored the responses of occlusion-sensitive neurons in vlPFC.
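The comparison between the two analysis windows can be sketched in a few lines. The window boundaries (69–99 ms and 199–229 ms) are taken from the text; everything else is assumed for illustration, including the 1 ms sampling, the sign convention (late minus early), the occlusion levels and the toy response values.

```python
EARLY_WINDOW = (69, 99)    # ms after test stimulus onset, around the first peak
LATE_WINDOW = (199, 229)   # ms after test stimulus onset, around the second peak


def window_mean(rates, start_ms, end_ms):
    """Mean of a 1 ms resolution response trace over [start_ms, end_ms]."""
    segment = rates[start_ms:end_ms + 1]
    return sum(segment) / len(segment)


def late_minus_early(rates):
    """Response difference between the late and early windows."""
    return window_mean(rates, *LATE_WINDOW) - window_mean(rates, *EARLY_WINDOW)


def toy_trace(early_rate, late_rate, duration_ms=300):
    """Piecewise-constant toy response: one rate before 150 ms, another after."""
    return [early_rate if t < 150 else late_rate for t in range(duration_ms)]


# Hypothetical (early, late) rates keyed by % visible area; not study data.
toy_rates = {100: (30, 12), 60: (20, 22), 20: (8, 6)}
differences = {level: late_minus_early(toy_trace(early, late))
               for level, (early, late) in toy_rates.items()}
```

With these invented numbers the difference is largest at the intermediate level, which is the inverted U shape described for neurons with two peaks.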

To quantify how shape selectivity evolves during the test stimulus epoch, we examined average neuronal shape selectivity across time for unoccluded and occluded stimuli for V4 neurons with two peaks (Figure 8A) and those without (Figure 8B). For unoccluded stimuli (black lines), shape selectivity was similar in magnitude and time course for the two groups, reaching a maximum value at ~120 ms. For occluded stimuli (colored lines), shape selectivity was similar for the two groups (t Test, p=0.6) early in the test stimulus epoch, around the time of the first peak (69–99 ms). However, shape selectivity for neurons with two peaks was significantly stronger (t Test, p<0.01) later in the test stimulus epoch, around the time of the second peak (199–229 ms). This is because for neurons with two peaks, shape selectivity for occluded stimuli increased over time and reached a maximal value closer to the time of the second peak (Figure 8—figure supplement 1).

Figure 8.
Dynamics of neuronal shape selectivity in V4.

We demonstrate this enhanced shape selectivity for occluded stimuli later in the test stimulus epoch in two ways. First, we compared the magnitude of shape selectivity at different periods (Figure 8C, early and late). Second, we compared the timing of peak selectivity for occluded and unoccluded stimuli (Figure 8D–E). For neurons with two peaks, shape selectivity around the time of the second peak was significantly stronger than around the time of the first peak (t Test, p<0.01; Figure 8C). This observation did not hold for neurons without two peaks (p=0.92) or for unoccluded stimuli for either group of neurons (p>0.5). The timing of maximal shape selectivity for unoccluded stimuli occurred significantly earlier than the second peak (Figure 8D, median 131 vs. 214 ms, respectively; t Test, p<0.01). In contrast, the timing of maximal shape selectivity for occluded stimuli occurred around the time of the second peak (Figure 8E, median 188 vs. 214 ms, respectively; t Test, p>0.98).

The enhanced shape selectivity for occluded stimuli around the time of the second peak occurred even for neurons that had stronger responses during the first peak than the second peak (Figure 8—figure supplement 2A). This finding suggests that response magnitude during the second peak does not fully account for the strength of shape selectivity. However, for neurons that had stronger shape selectivity during the second peak, the relative magnitude of responses during the second peak was larger for the preferred than non-preferred shapes (Figure 8—figure supplement 2B). This differential enhancement of responses to preferred shapes serves to amplify shape selectivity during the second peak.

Given that we classified neurons based on an ad hoc algorithm with customized parameters (see Materials and methods and Figure 6—figure supplement 1), we sought to ensure that the findings did not depend on the choice of parameters used and that the algorithm did not yield false-positives. To address these concerns, we examined population results for neurons with and without two peaks using different choices of threshold parameters (Figure 8—figure supplements 3 and 4). Additionally, we developed a model-based procedure that was independent of the ad hoc peak finding algorithm to identify neurons whose responses to occluded stimuli were stronger than expected from a linear scaling of responses to unoccluded stimuli (Figure 8—figure supplements 5 and 6). We found good correspondence between the model-based and algorithm-based approaches in terms of the neurons identified as having two peaks. Population results generated using different parameter choices for the ad hoc algorithm and using the model-based procedure were remarkably similar to those presented earlier (Figures 6 and 8).

Collectively, these results support the hypothesis that occlusion-sensitive signals in vlPFC are relayed to V4 and that these feedback signals contribute to V4 responses during the second peak, enhancing neuronal selectivity for occluded shapes. These putative feedback signals may be well suited to enhance perceptual discriminability of partially occluded objects.

A model of V4–vlPFC interactions

To demonstrate the plausibility of feedback signals from vlPFC contributing to V4 responses to occluded stimuli, we constructed a two-layer dynamical model of V4 and vlPFC interactions (Figure 9; see Materials and methods). In this model, shape-selective V4 units send feedforward inputs to vlPFC units (Figure 9, light gray arrows). The shape preference of each vlPFC unit is inherited from the V4 unit which provides the strongest input. vlPFC units also receive a gain modulation signal that increases with increasing occlusion level (dashed box), imparting a preference for occluded stimuli that is not observed in the feedforward V4 inputs to vlPFC. Additionally, vlPFC units send feedback inputs onto V4 units (medium gray arrows) with connection strengths that are proportional to the feedforward signals from each V4 unit. Importantly, feedback signals from vlPFC first pass through a rectifying nonlinearity prior to their arrival in V4 (Equation 7, Materials and methods). The vlPFC feedback signals contribute to two key response features of the V4 units: a second transient response peak and a dynamic preference for occlusion level over the test stimulus epoch.

Figure 9.
Model of V4–vlPFC interactions.

To demonstrate model performance, we present data for a simulated V4 and vlPFC unit (Figure 10). In the model, feedforward input to the V4 unit is modulated both by shape and occlusion level (Figure 10A): it is strongest when the preferred shape is unoccluded and is progressively weaker for higher occlusion levels. This pattern is consistent with our V4 neuronal data and captures responses during the first peak for V4 neurons with two peaks, as well as the responses of V4 neurons without two peaks. Occlusion-dependent gain modulation of feedforward input from V4 produced vlPFC unit responses that were weak to unoccluded stimuli and stronger to occluded stimuli (Figure 10B). Furthermore, the V4 unit receiving feedback signals from vlPFC had two transient response peaks: one earlier and one later than the response peak of the vlPFC unit (Figure 10C). The V4 unit’s responses during the first peak were strongest for the unoccluded stimulus and declined with increasing occlusion level. In contrast, the V4 unit’s responses during the second peak were strongest at intermediate occlusion levels. Thus, the response dynamics generated by this V4–vlPFC interaction model successfully recapitulated the dynamics observed in our neuronal recordings (Figures 2 and 4).

Figure 10.
Example model results.

Parsimonious model construction

Our model was constructed to include only the minimal set of mechanisms that were needed to account for the main features of the neurophysiological data. We started with the simplest feedforward model composed of two V4 units and two vlPFC units, and we included four mechanisms to achieve the desired dynamics in V4 and vlPFC responses: (1) feedback from vlPFC to V4; (2) synaptic adaptation in the feedforward inputs from V4 to vlPFC; (3) half-wave rectification of feedback signals from vlPFC to V4; (4) occlusion-dependent gain modulation input to vlPFC. We included feedback from vlPFC to V4 because a network with only feedforward connections from V4 to vlPFC units cannot generate the second response peak in V4 units (Figure 10—figure supplement 1A). Thus, in our model, feedback from vlPFC to V4 is necessary to reproduce the response dynamics observed in V4. Second, without synaptic adaptation on the feedforward connections from V4 to vlPFC (Equation 9, Materials and methods), the feedforward-feedback loop reinforces activity in V4 and vlPFC units positively (Figure 10—figure supplement 1B). The resulting ‘ringing’ and ‘blow up’ in simulated model responses are inconsistent with the data, thus arguing for the inclusion of an adaptation mechanism. Indeed, such an adaptation mechanism is often used in models to soften positive feedback loops (e.g. Wei and Wang, 2016). Third, without half-wave rectification of the feedback input from vlPFC to V4 units, the model produces a large second response peak even to presentation of non-preferred stimuli (Figure 10—figure supplement 2), which conflicts with the data (Figure 8—figure supplement 2B). Thus, without half-wave rectification, the enhanced shape selectivity during the second peak in the V4 neuronal data (Figure 8C) is not reproduced by the model. Given that rectifying nonlinearities can occur when synaptic inputs are transformed into output spikes, the model suggests that feedback from vlPFC may arrive in V4 after passing through a synapse. Indeed, anatomical observations of disynaptic feedback connections between V4 and vlPFC exist (Ninomiya et al., 2012). Fourth, when all other mechanisms are in place but the gain modulation is removed, vlPFC responses decrease with increasing occlusion level (Figure 10—figure supplement 1C). Consequently, V4 shape selectivity under occlusion is not enhanced during the second peak. To consider the possibility that gain modulation of vlPFC may be mediated by signals from V4 or IT cortex, we verified that our simulations were unaffected by delays of up to ~50 ms in the arrival of gain modulation relative to the arrival of shape selective signals (Figure 10—figure supplement 3).
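How the four mechanisms might be wired together can be shown with a heavily reduced toy simulation. It is not the model from Materials and methods: it uses one V4 unit and one vlPFC unit instead of two of each, a sustained rather than transient feedforward drive, and invented equations, time constants, delays and gain function. All names and parameter values are assumptions. Because every signal in this reduced version is nonnegative, the rectifier never clips; it is included only to mark where it sits in the loop.

```python
def rectify(x):
    """Half-wave rectification: negative values are clipped to zero."""
    return max(0.0, x)


def simulate(occlusion, w_fb=1.0, steps=300, dt=1.0, tau=10.0, delay=40):
    """Euler simulation of one V4 unit and one vlPFC unit (toy sketch).

    occlusion: 0.0 (unoccluded) up to, but not including, 1.0.
    w_fb:      feedback weight from vlPFC to V4; 0.0 disables feedback.
    Returns (v4_trace, pfc_trace), one sample per time step.
    """
    drive = 1.0 - occlusion        # feedforward drive to V4 weakens with occlusion
    gain = 1.0 + 4.0 * occlusion   # occlusion-dependent gain on vlPFC (assumed form)
    v4, pfc, efficacy = 0.0, 0.0, 1.0
    v4_trace, pfc_trace = [], []
    for t in range(steps):
        # Signals crossing between areas arrive after a fixed delay.
        v4_delayed = v4_trace[t - delay] if t >= delay else 0.0
        pfc_delayed = pfc_trace[t - delay] if t >= delay else 0.0
        ff_to_pfc = gain * efficacy * v4_delayed   # gain-modulated, adapting input
        fb_to_v4 = rectify(w_fb * pfc_delayed)     # rectified feedback
        v4 += dt / tau * (-v4 + drive + fb_to_v4)
        pfc += dt / tau * (-pfc + ff_to_pfc)
        # Synaptic adaptation: efficacy is depleted by presynaptic activity
        # and recovers slowly toward 1.
        efficacy += dt * ((1.0 - efficacy) / 200.0 - 0.02 * efficacy * v4_delayed)
        v4_trace.append(v4)
        pfc_trace.append(pfc)
    return v4_trace, pfc_trace


# Same stimulus with and without feedback from vlPFC.
v4_with_feedback, _ = simulate(occlusion=0.5, w_fb=1.0)
v4_feedforward_only, _ = simulate(occlusion=0.5, w_fb=0.0)
```

Setting `w_fb=0.0` removes the feedback loop, which mirrors the feedforward-only comparison described above: in this sketch the late V4 response is larger when feedback is present, and the feedforward-only vlPFC response is larger for an occluded than an unoccluded stimulus because of the gain term.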

Heterogeneity in model V4 and vlPFC responses

To generate the simulated responses shown (Figure 10), we chose model parameters that reproduced the response dynamics of example neurons (Figures 2A, 4 and 5). However, V4 neurons show substantial diversity in the magnitude and timing of the second response peak (Figure 5—figure supplement 1). vlPFC neurons also show diversity in terms of their shape selectivity and the dependence of their responses on occlusion level. Therefore, we systematically varied the model parameters governing synaptic strengths and delays to generate diverse simulated response dynamics and patterns in V4 and vlPFC model units. We varied the relative strengths of the feedforward input from the two V4 units to vlPFC, and we verified that a second response peak was observed in the V4 unit responses even when vlPFC units are only weakly shape-selective (Figure 10—figure supplement 4). By varying the feedback connection strengths, and the synaptic delays between the two areas, we were able to generate a range of second peak magnitudes (Figure 10—figure supplement 5A) and second peak times (Figure 10—figure supplement 5B). Finally, the population average response across model V4 units (Figure 10—figure supplement 6) resembles the V4 population data (Figure 6A).

Discussion

To determine the contributions of prefrontal cortex to the representation and recognition of partially occluded objects, we compared the response dynamics of vlPFC and V4 neurons in monkeys discriminating shapes in the presence and absence of occluders. Our study provides three new insights. First, neuronal responses in vlPFC are strongest for occluded stimuli and weaker for unoccluded stimuli, in contrast to neuronal responses in visual areas V4 and IT (Kosai et al., 2014; Kovács et al., 1995; Tang et al., 2014a). Second, the responses of many V4 neurons have two transient peaks, the second of which emerges after the onset of vlPFC responses and shows a stronger preference for occluded stimuli. Third, neuronal shape selectivity for occluded stimuli in V4 is enhanced during the second transient peak. Our results support the hypothesis that feedback signals from vlPFC mediate V4 responses during the second transient peak and that these signals facilitate object recognition under occlusion.

Representation of occluded stimuli in vlPFC

Our results demonstrate that visual representations in vlPFC do not always mirror representations in visual cortex and suggest that vlPFC may play an important role in representing objects. We used different experimental approaches for the V4 and vlPFC recordings, but these methodological differences cannot account for differences in how V4 and vlPFC neurons represent occluded and unoccluded stimuli. In V4 recording sessions, but not in vlPFC sessions, we tailored the stimulus color and shape to the preferences of the neuron; this may explain the preponderance of V4 neurons that responded preferentially to unoccluded shapes in our dataset. Without tailoring stimuli we would expect a roughly equal proportion of neurons showing responses that increased and decreased with increasing occlusion level. We found, however, that 72% of vlPFC neurons responded preferentially to occluded shapes, a proportion that deviates significantly from the null hypothesis (binomial test, p<0.01). Furthermore, because the visible difference between any two shapes declines with increasing occlusion level, we expect shape selectivity to decline regardless of whether we tailored visual stimuli. The enhanced shape selectivity we observed in vlPFC under occlusion defies this expectation.
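The binomial comparison against an equal split can be reproduced approximately with an exact calculation. The sketch assumes that the 72% figure corresponds to the 71 of 98 occlusion-sensitive neurons with a negative regression slope reported in the Results, and that the test is two-sided; neither detail is stated explicitly in this passage. The function names are hypothetical.

```python
from math import comb


def upper_tail(k, n):
    """P(X >= k) for X ~ Binomial(n, 0.5), using exact integer counts."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n


def two_sided_p(k, n):
    """Two-sided p-value under a null of equal proportions (p = 0.5).

    The null distribution is symmetric, so the upper tail at the more
    extreme of k and n - k is doubled and capped at 1.
    """
    extreme = max(k, n - k)
    return min(1.0, 2 * upper_tail(extreme, n))


# Assumed counts: 71 of 98 occlusion-sensitive vlPFC neurons preferred
# occluded stimuli (negative slope), as reported in the Results.
p_value = two_sided_p(71, 98)
rejects_equal_split = p_value < 0.01
```

Under these assumptions the p-value falls below the 0.01 threshold quoted in the text, whereas an even split such as 49 of 98 gives a p-value of 1.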

The stronger responses to occluded stimuli in vlPFC cannot be attributed to neuronal preferences for the color of the occluding dots. We verified in control experiments that the preference for occluded stimuli was independent of dot color (data not shown). Given that many vlPFC neurons are selective for the shape of the occluded stimulus, it is unlikely that vlPFC responses solely reflect task difficulty level or attentional demands. If difficulty or attention could fully explain vlPFC responses, occlusion-sensitive neurons would not be shape-selective (i.e. the PSTHs in Figure 2A and Figure 2B would be identical). We also verified in control experiments that vlPFC neuronal responses were weaker when the occluding dots were in the same color as the background – an observation suggesting that the vlPFC responses we observed rely on explicit occlusion-related signals.

The dependence of vlPFC responses on occlusion level varied across neurons. The responses of some neurons increased gradually with increasing occlusion level whereas the responses of other neurons increased abruptly, even at the lowest occlusion levels. Further experiments are needed to determine whether neuronal sensitivity to occlusion is determined by feedforward inputs, by gating in vlPFC or by the difficulty of the perceptual discrimination.

We propose that vlPFC responses arise from the modulation of occlusion-dependent, shape-selective feedforward signals from V4 by another feedforward signal that is dependent only on the occlusion level. In our simple behavioral task, where the occluding dots have a different color than the occluded shapes, a neuron sensitive to the color and area of the occluding dots could signal the level of occlusion. Indeed, in one monkey performing the same behavioral task used in the current study, we found that the responses of many IT neurons were consistent with encoding the total area of the occluders (Namima and Pasupathy, 2016). However, in the natural world, where there are multiple objects and the attributes of the occluders are not known a priori, identifying which object is occluded, and by how much, could be challenging. Extending our simple model to tackle more complex, naturalistic cases would likely require the incorporation of attention and memory processes.

Implications for decision-making and recognition

To perform the sequential shape discrimination task used in the current study, the reference stimulus held in memory must be compared to the test stimulus on the screen. Given its role in working memory, the PFC is a plausible neural locus for this comparison (Fuster, 1989; Kim and Shadlen, 1999; Romo and de Lafuente, 2013). Our results suggest, however, that the comparison of reference and test stimuli is unlikely to be implemented in vlPFC. We found stronger neuronal selectivity in vlPFC for occluded than unoccluded test stimuli. Thus, if behavioral performance depended on comparisons implemented in vlPFC, discriminability would be higher for occluded stimuli and lower for unoccluded stimuli – the opposite of the performance we observed (Figure 1B). The weak neuronal responses in vlPFC to unoccluded stimuli are consistent with a report of weak neuronal selectivity in this area for stimulus color in monkeys performing a color change detection task (Lara and Wallis, 2014). Together, these findings challenge the notion that vlPFC activity mediates perceptual discriminations of form and color directly. The comparison of sensory representations could be implemented in other parts of the PFC or in sensory cortex, where signals correlated with monkeys’ behavioral decisions have been reported (Kim and Shadlen, 1999; Eskandar et al., 1992; Miller and Desimone, 1994; Wallis and Miller, 2003; Romo and Salinas, 2003; Zaksas and Pasternak, 2006; Kosai et al., 2014). The evidence for functional connectivity between V4 and lateral PFC during memory maintenance also supports the implementation of decision computations in visual cortex (Liebe et al., 2012).

The finding that neuronal responses in vlPFC are stronger for occluded stimuli is consistent with two possibilities—a role for this area in decision-making and a role in the recognition of occluded objects. Given that the monkeys were required to report their perceptual judgments, it is possible that the vlPFC responses we recorded reflect this area’s engagement in facilitating decisions under limited sensory evidence. When shape selectivity in visual cortex is weakened by the presence of occlusion (Kosai et al., 2014; Kovács et al., 1995; Tang et al., 2014a), decision-making becomes difficult. In this case, vlPFC feedback might serve to amplify weak signals expressly to facilitate decisions. In this regard, our results are consistent with vlPFC’s engagement in tasks of greater difficulty or cognitive demand (Crittenden and Duncan, 2014). Importantly, when the task becomes difficult, rather than reflecting task difficulty per se, we propose that vlPFC responses amplify behaviorally relevant signals to facilitate perceptual decisions.

An alternative possibility is that the preference of vlPFC neurons for occluded stimuli may be related to the specific engagement of vlPFC in the recognition of occluded objects. Previous work has argued that the processing of complex visual scenes containing clutter and occlusions may be guided by higher cognitive, memory processes (Cavanagh, 1991; Kveraga et al., 2007). For example, image representations in early and mid-level areas of the ventral visual pathway may be relayed to higher processing stages where they are compared to stored representations of object prototypes, leading to the recognition of objects in the scene (Kveraga et al., 2007). This recognition process may then guide the grouping of appropriate contours and regions, thereby facilitating object segmentation and scene understanding (McDermott, 2004; Kveraga et al., 2007). Our results are also broadly consistent with the possibility that vlPFC activity embodies a recognition signal that is fed back to V4 to refine object representations. Further experiments are needed to differentiate between the two alternative roles for vlPFC.

Response dynamics in V4

Studies of object representation and recognition often consider spiking activity only within the first hundred milliseconds after stimulus onset (e.g. Hung et al., 2005). The rationale for choosing this early temporal epoch for analysis is based on the argument that successful categorization can be achieved by feedforward processes alone (VanRullen and Thorpe, 2001; Serre et al., 2007). However, neuronal responses to visual stimuli depend not only on signals carried by feedforward connections but also on signals carried by feedback and horizontal connections. Feedback and horizontal connections may modulate neuronal responses based on stimulus context and behavioral goals, and confer selectivity for more complex visual stimuli (Lamme and Roelfsema, 2000; Gilbert and Li, 2013).

The relative contributions of feedforward, feedback and horizontal connections to neuronal responses are hard to disentangle experimentally, but examining the response dynamics provides useful insights. For example, recent studies comparing V4 and V1 response dynamics during contour grouping and scene segmentation tasks suggest that feedback from V4 to V1 enhances the representation of figures and suppresses the representation of backgrounds in V1 (Chen et al., 2014; Poort et al., 2012). Similarly, our results suggest that feedback from vlPFC to V4 enhances the representation of behaviorally relevant, occluded shapes in V4.

Our model simulations suggest that the enhancement of V4 shape selectivity could be mediated even by weakly tuned vlPFC neurons. This may be because we used only two discriminanda in each experimental session, thereby simplifying the object recognition problem. In this case, any given vlPFC neuron receiving differential input from the two subsets of V4 neurons that signal the two shape discriminanda could contribute to enhanced V4 shape selectivity. We cannot rule out the possibility that IT responses also contribute to V4 responses during the second transient peak. However, our IT recordings suggest this is unlikely because, as in V4, shape selectivity in IT is stronger for unoccluded than occluded stimuli (Namima and Pasupathy, 2016). Thus, putative feedback from IT may not be well-suited for enhancing V4 shape selectivity at intermediate occlusion levels.

The V4 neurons in our data set showed a broad range of second peak times and peak magnitudes. The diversity in timing of the second peak is likely because V4 and vlPFC are connected via feedforward and feedback pathways that are direct and indirect, and these different pathways are expected to have different conduction times. It is also possible that feedback signals from vlPFC to V4 are carried by a sparse projection and then distributed more broadly across V4 via horizontal connections, resulting in longer delays (than expected for disynaptic transmission) between the peak of vlPFC responses and the second peak of V4 responses. The strength of connections between V4 and vlPFC is likely heterogeneous, which could explain the range of second peak magnitudes we documented. Our simulations demonstrate that even weak functional interactions between the two areas could result in a small, second response peak in V4 that may be undetectable in highly variable responses. Overall, the heterogeneous properties of the second response peak support the possibility that V4 neurons with and without two peaks lie along a continuum.

While our model simulations successfully reproduced the diversity of neuronal dynamics observed in V4 in terms of the amplitude and timing of the second transient response peak, this result was achieved by tweaking the parameters of response timing and connection strengths in different instantiations of the two-layer V4–vlPFC interaction model. We do not know whether a large set of interconnected neurons, each with a large number of incoming inputs, could exhibit diverse response dynamics despite differences in response timing and connection strengths. In such a network, ‘averaging’ across the many inputs each neuron receives could dampen diversity in the response dynamics. It is also possible that in the limiting case where the neural population is large in size, but the number of active incoming connections per neuron is relatively small, substantial variation in response dynamics could persist across neurons. A detailed study of the relationships between network connectivity, network size and neuronal dynamics would be useful to validate the proposed model.

We studied vlPFC and V4 neuronal responses in the same monkeys and using the same behavioral paradigm, thereby facilitating direct comparisons of measurements made in the two cortical areas. Nevertheless, our approach to studying vlPFC contributions to V4 had several limitations. First, we conducted V4 and vlPFC recordings in separate sessions so we cannot compare V4 and vlPFC activity on individual trials. Second, we used only two discriminanda in each behavioral session, so we do not know whether vlPFC neurons are sufficiently sensitive to shape information to mediate the recognition of occluded objects. Third, we did not instruct monkeys to report their behavioral decisions as soon as possible, so we cannot infer the precise epoch of V4 neural activity that mediates perceptual judgments. Specifically, we do not know whether V4 neuronal responses during the second transient peak, which have stronger shape selectivity, contributed to the monkeys’ perceptual decisions. Fourth, we do not know whether the engagement of vlPFC neurons in our study was contingent on the monkeys reporting their perceptual decisions, or whether these same neurons would also be engaged in the recognition of partially occluded objects in natural viewing conditions. We hope that future studies will answer these outstanding questions and probe the causal link between V4 and vlPFC activity using perturbations of neuronal activity.

Partial occlusions pose a major challenge to the successful recognition of visual objects because they reduce the evidence available to the brain. Recognizing partially occluded objects could require solving an ill-posed inverse problem (Helmholtz, 1910; Yuille and Kersten, 2006), one that lacks a unique solution because the retinal image of an occluded object is often compatible with multiple interpretations (e.g. see Bregman's B illusion, Bregman, 1981). As a result, recognition must rely not only on information about the physical object but also on information about the occlusion, scene context and perceptual experience. Our results provide support for the hypothesis that feedback signals from vlPFC, which carry information about occlusions, contribute to object representation in V4 and to object recognition under occlusion. Other brain regions, for example IT cortex, are likely to be involved and should be studied. Future experiments are needed to reveal the detailed algorithms used by neurons and circuits to solve object recognition under occlusion.

Materials and Methods

Experimental subjects

Two adult male rhesus macaques (Macaca mulatta) were prepared for neurophysiological recordings using sterile surgical procedures. For experiments in prefrontal cortex, recording chambers were centered over the principal sulcus and targeted the ventrolateral prefrontal cortex (vlPFC), located ventral to and along the caudal third of the principal sulcus. The stereotaxic, central coordinates of prefrontal recording chambers were derived based on structural MRI images for each animal, and were ~21 mm anterior of interaural zero and ~19 mm lateral to the midline. For experiments in visual cortex, recording chambers were centered on the dorsal surface along the prelunate gyrus and targeted area V4, extending between the lunate sulcus and the superior temporal sulcus. Recordings from the two areas were carried out serially in the same monkeys, starting with V4 then vlPFC. All animal procedures conformed to NIH guidelines and were approved by the Institutional Animal Care and Use Committee at the University of Washington.

Neurophysiology

Extracellular recordings were performed using epoxy-coated tungsten microelectrodes (250 µm, FHC) lowered into cortex through an acute microdrive system (Gray Matter Research, 8-channel). Voltage signals were amplified and band-pass filtered (0.1–8 kHz) using a recording system (Plexon Systems, 16-channel). The waveforms of single units were isolated manually using spike-sorting software (Plexon Systems, Offline Sorter). The results reported in the current study are based on 381 vlPFC neurons (260 and 121 from Monkey M and Monkey O, respectively) and on 85 V4 neurons (41 from Monkey M and 44 from Monkey O). A subset of the V4 neurons (62 of 85 neurons) contributed to a previous study (see Kosai et al., 2014).

Visual stimuli

Visual stimuli were presented on a calibrated CRT monitor (1600 × 1200 pixels; 97 Hz frame rate; 57 cm in front of the monkey). Stimuli were presented against an achromatic gray background of mean luminance 5.4 cd/m². Stimulus onset and offset times were based on photodiode detection of synchronized pulses in one corner of the monitor. Stimulus presentation and behavioral events were controlled by custom software written in Python (Pype, originally developed by Jack Gallant and James Mazer; Mazer, 2013). Eye position was monitored using a 1 kHz infrared eye-tracking system (Eyelink 1000; SR Research).

Behavioral task

Monkeys performed a sequential shape discrimination task in the presence and absence of occluders (Figure 1A). Each trial began with the presentation of a central point (0.1°), which the monkey had to fixate within a circular window of radius 0.75°. After the monkey acquired fixation, two stimuli were presented: a ‘reference’ stimulus, followed by a ‘test’ stimulus. The reference stimulus was always an unoccluded, 2D shape. The test stimulus was a 2D shape that was unoccluded or partially occluded by a field of randomly positioned dots. Occlusion level was quantified as the percentage of the shape area that remained visible (‘% visible area’) and was titrated by varying the diameter of the occluding dots (for details, see Kosai et al., 2014). Each stimulus was presented for 600 ms, with an inter-stimulus interval of 200 ms between the reference and the test stimuli. Following a 50 ms delay, the fixation point was extinguished and two peripheral choice targets appeared (left and right dots, 6° eccentric; Figure 1A). The monkey reported whether the two shapes presented were the same or different via a saccade to the right or left target, respectively, within 500 ms of target onset. The monkey received liquid reward for correct performance. In cases where the monkey broke fixation or failed to respond, the trial was repeated later in the session. Behavioral trials were separated by an inter-trial interval of 2 s.

Approach to data collection

V4 data were collected by studying one neuron at a time and tailoring the shapes, occluding dots, colors and position of the test stimulus to the preferences of the neuron recorded. Based on preliminary characterizations (for details, see Kosai et al., 2014) we chose two shapes as the discriminanda: one preferred and one non-preferred. The two shapes were presented in the neuron’s preferred color whereas the occluding dots were presented in a contrasting, non-preferred color. The reference stimulus was presented at central fixation whereas the test stimulus was presented at the center of the neuron’s RF. The vlPFC data were collected by studying several neurons simultaneously, an approach that precluded tailoring the shapes and occluding dots to the preferences of individual neurons. To equate behavioral task difficulty across V4 and vlPFC recording sessions, we chose, at random, stimulus parameters for each vlPFC recording session from among those used for V4 recording sessions.

Two shapes were used in each behavioral session, yielding four trial conditions per occlusion level (2 shapes × 2 behavioral outcomes). We studied each neuron’s responses to the two shapes under four or more occlusion levels, including the unoccluded case. In V4 recordings, we sampled 4–9 occlusion levels (median = 6). In vlPFC recordings, we sampled 5–6 occlusion levels (median = 5). We only included data from neurons tested with at least seven repeated presentations of each trial condition and of each occlusion level tested. The median number of repeats was 24 for V4 recordings and 15 for vlPFC recordings.

Data analysis

Visually responsive and occlusion-sensitive vlPFC neurons

We identified visually responsive vlPFC neurons by comparing the firing rate during a 150 ms window, beginning 80 ms after test stimulus onset, to the firing rate during the fixation epoch before reference stimulus onset. Neuronal responses to the test stimulus were often transient (see Figure 2), motivating us to calculate firing rates in a 150 ms window rather than the full duration of stimulus presentation. The 80 ms offset was introduced to account for the visual response latency of neurons. Among 381 vlPFC neurons, 216 (142/260 in Monkey M and 74/121 in Monkey O) were significantly responsive during the test stimulus epoch (t-test, p<0.01). All further data analyses were restricted to these visually responsive neurons (57% of vlPFC neurons recorded).
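The response window described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the analysis code used in the study: the function names are hypothetical, spike times are assumed to be given in ms relative to reference stimulus onset, and the test stimulus is taken to appear 800 ms later (600 ms reference presentation plus the 200 ms inter-stimulus interval described in the behavioral task).

```python
# Hypothetical helpers; the names and the spike-time convention are assumptions.
TEST_ONSET_MS = 600.0 + 200.0  # reference duration + inter-stimulus interval


def firing_rate(spike_times_ms, start_ms, duration_ms):
    """Firing rate in spikes/s within [start_ms, start_ms + duration_ms)."""
    end_ms = start_ms + duration_ms
    n_spikes = sum(1 for t in spike_times_ms if start_ms <= t < end_ms)
    return n_spikes / (duration_ms / 1000.0)


def rate_in_test_window(spike_times_ms, offset_ms=80.0, window_ms=150.0):
    """Rate in the 150 ms window that begins 80 ms after test stimulus onset."""
    return firing_rate(spike_times_ms, TEST_ONSET_MS + offset_ms, window_ms)
```

Per-trial rates from this window and from the fixation epoch would then be passed to a t-test routine from a statistics library. The duration of the fixation baseline window is not stated here and would have to be chosen by the analyst.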

To assess whether neuronal responses to the test stimulus were modulated by shape and/or occlusion level, we conducted a 2-way ANOVA on activity during the same response window defined above, with stimulus shape and occlusion level as factors. Among the 216 visually responsive neurons, the responses of 98 neurons (71/260 in Monkey M and 27/121 in Monkey O) showed a significant dependence on occlusion level (p<0.05) and the responses of 66 neurons showed a significant dependence on stimulus shape (p<0.05).

Shape selectivity

To examine the dynamics of neuronal shape selectivity, we performed a sliding-window Receiver Operating Characteristic (ROC) curve analysis on responses to the preferred and non-preferred shapes at each occlusion level. For V4 neurons, the preferred shape was that which evoked the largest average response across all occlusion levels. For vlPFC, because many neurons did not respond to unoccluded shapes, we computed the average response for each shape across all occlusion levels (i.e. visible area <100%) during the test epoch, and identified the preferred shape as that which evoked the largest average response. At every time point (1 ms steps), we counted spikes in a centered window of duration 75 ms (for V4) or 150 ms (for vlPFC). We then assessed shape selectivity by computing the area under the ROC curve derived from the spike count distributions of responses to preferred and non-preferred shapes. Shape selectivity values ranged from 0.5 (unselective) to 1.0 (very selective). To identify the time of maximal shape selectivity for occluded stimuli (Figure 8D, red bars), we also computed shape selectivity as described above, pooling across all levels of occlusion tested for each neuron.
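A sliding-window ROC analysis of this kind can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, each trial is assumed to be a list of spike times in ms relative to test stimulus onset, and the area under the ROC curve is computed through its equivalence with the probability that a preferred-shape count exceeds a non-preferred-shape count (ties counted as one half).

```python
def roc_auc(pref_counts, nonpref_counts):
    """Area under the ROC curve: P(pref > nonpref) + 0.5 * P(pref == nonpref)."""
    wins = 0.0
    for p in pref_counts:
        for n in nonpref_counts:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pref_counts) * len(nonpref_counts))


def count_in_window(spike_times_ms, center_ms, width_ms):
    """Number of spikes in a window of width_ms centered on center_ms."""
    half = width_ms / 2.0
    return sum(1 for t in spike_times_ms if center_ms - half <= t < center_ms + half)


def sliding_auc(pref_trials, nonpref_trials, centers_ms, width_ms=75.0):
    """Shape selectivity (ROC area) at each window center, e.g. in 1 ms steps."""
    result = []
    for c in centers_ms:
        pref = [count_in_window(trial, c, width_ms) for trial in pref_trials]
        nonpref = [count_in_window(trial, c, width_ms) for trial in nonpref_trials]
        result.append(roc_auc(pref, nonpref))
    return result
```

With `width_ms=75.0` the sketch matches the V4 window; passing `width_ms=150.0` matches the window used for vlPFC. Because the preferred shape is defined from the average response, the value at an individual time point can fall below 0.5 in this sketch.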

Population response histograms

To generate population response histograms (Figures 3 and 6), we normalized the responses of each neuron to the maximum across all occlusion levels then averaged the data at each occlusion level for all neurons. We did not test all neurons at the same occlusion levels (for each neuron, we tested 4–9 occlusion levels), so the number of neurons contributing to the average histograms varied for each occlusion level; these numbers are listed in the figures. For both cortical areas, population response histograms for the occlusion conditions of 44% and 27% visible area were based on only a few neurons and were therefore excluded.
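The normalization and averaging steps can be sketched as below. This is an illustration under assumptions rather than the original analysis code: the function name is hypothetical, each neuron is assumed to be a dictionary mapping occlusion level (% visible area) to a PSTH of equal length, and "the maximum across all occlusion levels" is taken to mean the largest PSTH value across that neuron's occlusion levels.

```python
def population_average(neurons):
    """Return {level: (mean normalized PSTH, number of contributing neurons)}."""
    sums, counts = {}, {}
    for psths in neurons:
        # Normalize by this neuron's peak across all tested occlusion levels.
        peak = max(max(psth) for psth in psths.values())
        if peak <= 0:
            continue  # skip silent neurons (an assumption of this sketch)
        for level, psth in psths.items():
            normalized = [v / peak for v in psth]
            if level not in sums:
                sums[level] = [0.0] * len(normalized)
                counts[level] = 0
            sums[level] = [a + b for a, b in zip(sums[level], normalized)]
            counts[level] += 1
    return {lvl: ([s / counts[lvl] for s in sums[lvl]], counts[lvl]) for lvl in sums}
```

Returning the per-level neuron count alongside the average makes it straightforward to exclude occlusion levels that were tested in only a few neurons.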

Peak latency

To find the time to peak response for each neuron, we first constructed an average response histogram from the Gaussian-smoothed (σ = 10 ms) PSTHs across all occlusion levels. We then identified the time of maximal response between 50 and 600 ms after test stimulus onset. This temporal window allowed us to identify peaks associated with responses to the test stimulus rather than responses related to the preceding reference stimulus, memory delay or saccades that followed the test stimulus.
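A minimal sketch of this peak-latency computation follows. It is not the authors' code: the function names are hypothetical, the PSTH is assumed to be sampled in 1 ms bins with index 0 at test stimulus onset, the Gaussian kernel is truncated at ±4σ, and bins beyond the ends of the PSTH are treated as zero.

```python
import math


def gaussian_smooth(psth, sigma_ms=10.0, bin_ms=1.0):
    """Smooth a PSTH with a unit-area Gaussian kernel truncated at +/- 4 sigma."""
    radius = int(round(4 * sigma_ms / bin_ms))
    kernel = [math.exp(-0.5 * (k * bin_ms / sigma_ms) ** 2)
              for k in range(-radius, radius + 1)]
    total = sum(kernel)
    kernel = [w / total for w in kernel]
    n = len(psth)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - radius
            if 0 <= idx < n:  # zero padding outside the PSTH (assumption)
                acc += w * psth[idx]
        smoothed.append(acc)
    return smoothed


def peak_latency_ms(psth, start_ms=50, end_ms=600, sigma_ms=10.0):
    """Time (ms) of the maximal smoothed response within [start_ms, end_ms]."""
    smoothed = gaussian_smooth(psth, sigma_ms)
    stop = min(end_ms, len(smoothed) - 1)
    return max(range(start_ms, stop + 1), key=lambda t: smoothed[t])
```

Restricting the search to the 50 to 600 ms window means that a large response before 50 ms does not affect the reported latency.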

Peak finding algorithm

To identify V4 neurons with two transient peaks in their responses to occluded shapes, we devised an ad hoc algorithm, described below. This procedure was designed to identify neurons with a robust second transient response peak that could not be attributed to small, noisy ripples in the response. For each neuron, we first constructed an average PSTH of its responses to the preferred shape at different occlusion levels, smoothed with a Gaussian function (σ = 10 ms). We only included occlusion levels that evoked a response that was at least 33% of the maximal response to the unoccluded preferred shape. We then used a zero-crossing algorithm to identify local peaks within 300 ms of stimulus onset. Small peaks (<50% of the first transient peak) and small trough-to-peak modulation ratios (<15% of local peak magnitude) were rejected as false positives (see Figure 6—figure supplement 1 for a schematic of the procedure and examples of rejection cases). For each putative peak that met the peak amplitude and modulation criteria, we asked whether there was a statistically significant response increase relative to the preceding trough. To assess statistical significance, we conducted a paired t-test between single trial spike counts within a 30 ms window centered at the peak and at the preceding trough (p<0.05, Bonferroni corrected). Of 85 V4 neurons, 43 had no robust peaks beyond the first transient that passed the peak amplitude and modulation criteria. Of the remaining 42 neurons, 30 had a second peak that showed a statistically significant response increase relative to the preceding trough; these neurons were classified as having two peaks. Specifically, 29/30 neurons had exactly one peak that qualified as the second response peak.
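The amplitude and modulation criteria can be sketched as follows. This is a simplified illustration, not the published algorithm: the function names are hypothetical, the input is assumed to be an already smoothed PSTH in 1 ms bins aligned to stimulus onset, the first transient is assumed to be the earliest local peak, and the final paired t-test on single-trial spike counts is left out.

```python
def find_local_extrema(y):
    """Return (peaks, troughs): indices where the first difference changes sign."""
    peaks, troughs = [], []
    for i in range(1, len(y) - 1):
        left = y[i] - y[i - 1]
        right = y[i + 1] - y[i]
        if left > 0 and right <= 0:
            peaks.append(i)
        elif left < 0 and right >= 0:
            troughs.append(i)
    return peaks, troughs


def candidate_second_peaks(smoothed_psth, max_ms=300,
                           min_rel_amp=0.5, min_modulation=0.15):
    """Return (peak, preceding trough) pairs that pass both criteria."""
    y = smoothed_psth[:max_ms + 1]
    peaks, troughs = find_local_extrema(y)
    if not peaks:
        return []
    first = peaks[0]  # assumed to be the first transient peak
    accepted = []
    for p in peaks[1:]:
        preceding = [t for t in troughs if t < p]
        if not preceding:
            continue
        trough = preceding[-1]
        # Reject peaks smaller than 50% of the first transient peak.
        big_enough = y[p] >= min_rel_amp * y[first]
        # Reject trough-to-peak modulation below 15% of the local peak.
        modulated = (y[p] - y[trough]) >= min_modulation * y[p]
        if big_enough and modulated:
            accepted.append((p, trough))
    return accepted
```

Each accepted pair identifies where the 30 ms spike-count windows for the significance test would be centered.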

Dynamic network model

To evaluate whether interactions between vlPFC and V4 could account for the observed response dynamics, we constructed a network model (Figure 9). The model includes two stages of cortical processing that are intended to map onto areas vlPFC and V4. The V4 stage comprises two units ($V4_1$ and $V4_2$), each selective for one of the shapes used in a testing session. The vlPFC stage also comprises two units ($PFC_1$ and $PFC_2$) that receive excitatory feedforward input from V4 units. Each vlPFC unit inherits a preference for stimulus shape from the V4 unit that provides the strongest input (e.g. $PFC_1$ receives the strongest feedforward input from $V4_1$). In the model simulations presented here, feedforward and feedback connection strengths are proportional. However, we have verified that the results hold for a broad range of connection strengths, as long as each vlPFC unit sends stronger feedback input to the V4 unit which provides its dominant feedforward input.

All model parameters (listed in Table 1) were chosen to reproduce the response dynamics observed in the experimental data. In addition, we also varied synaptic weights and delays over a range of values (see Table 2) to compare the heterogeneity of model units to observed data (Figure 10—figure supplements 3–5).

Table 1.
Parameters used for the model (fixed)
Table 2.
Parameters used for the model (varied)

We modeled the dynamical firing rate response of each model V4 unit, $r_{V4,i}$, as:

$$\tau_{V4} \frac{dr_{V4,i}(t)}{dt} = -r_{V4,i}(t) + F\left(U_{V4,i}(t) - r_{thr,V4}\right) + \eta(t),$$
(1)

and the firing rate response of each model vlPFC unit, $r_{PFC,i}$, as:

$$\tau_{PFC} \frac{dr_{PFC,i}(t)}{dt} = -r_{PFC,i}(t) + f_{\text{-PFC}}\, F\left(U_{PFC,i}(t) - r_{thr,PFC}\right) + \eta(t)$$
(2)

where $\tau_{V4}$ and $\tau_{PFC}$ denote the time constants of the responses; $t$ denotes time; $F(x)$ is a nonlinear function of the Naka-Rushton form given by:

$$F(x) = \begin{cases} f_{max} \dfrac{x^N}{\sigma^N + x^N}, & x \ge 0, \\ 0, & x < 0, \end{cases}$$
(3)

where $r_{thr,V4}$ and $r_{thr,PFC}$ are the firing rate thresholds, and $\eta$ is a Gaussian white noise term with a standard deviation of 300*dt and a timestep dt = 0.01 ms. We omitted the noise term for some simulations (Figure 10—figure supplements 1–6). Note that the precise form of the nonlinear function $F(x)$ is not critical; any monotonically increasing nonlinear function with saturation and threshold, along with the dynamics of firing rates defined in Equations (1) and (2), provides a standard firing rate model (Dayan and Abbott, 2005).
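The rate dynamics of Equations (1) and (3) can be illustrated with a short Python sketch. It is not the published MATLAB code: the function names are hypothetical, and the default time constant as well as the values of f_max, sigma and N used when calling these functions are placeholders rather than the Table 1 parameters. The noise term is omitted in the relaxation helper, as it was in some of the simulations.

```python
def naka_rushton(x, f_max, sigma, n):
    """Naka-Rushton nonlinearity: saturating for x >= 0, zero for x < 0."""
    if x < 0:
        return 0.0
    xn = x ** n
    return f_max * xn / (sigma ** n + xn)


def euler_step(r, drive, tau_ms, dt_ms, noise=0.0):
    """One Forward Euler update of tau * dr/dt = -r + drive + noise."""
    return r + (dt_ms / tau_ms) * (-r + drive + noise)


def relax_to_steady_state(u, r_thr, f_max, sigma, n,
                          tau_ms=10.0, dt_ms=0.01, t_end_ms=100.0):
    """Integrate a single unit with constant input u, starting from r = 0."""
    drive = naka_rushton(u - r_thr, f_max, sigma, n)
    r = 0.0
    for _ in range(int(round(t_end_ms / dt_ms))):
        r = euler_step(r, drive, tau_ms, dt_ms)
    return r
```

With a constant input the rate relaxes toward F(u - r_thr) with time constant tau. For a vlPFC unit, the drive would additionally be multiplied by the occlusion-dependent gain, as in Equation (2).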

For V4 model units, the input $U_{V4,i}$ was the sum of two sources: (i) excitatory feedforward input from upstream visual areas, $u_{FF\to i}$, and (ii) excitatory feedback inputs from vlPFC, $u_{PFC(1)\to i}$ and $u_{PFC(2)\to i}$. The feedforward input $u_{FF\to i}$ confers shape selectivity to the V4 model units and a dependence of their responses on occlusion level (Figure 9). For the preferred stimulus, this input is strong and declines gradually with increasing occlusion level. For the non-preferred shape, this input is weak, as is the modulatory influence of occlusion level. The feedforward input, $u_{FF\to i}$, was constructed by first convolving a difference of Gaussian filter $k$ (the standard kernel normalized difference of $g_1 = 15\exp[-(t-30)^2/800]$ and $g_2 = 10\exp[-(t-50)^2/800]$) with a 500 ms-long ramp $R_i(c,t)$, followed by cubing, normalization and half-wave rectification:

$$u_{FF\to i} = \left[ \frac{(k * R_i)^3}{\left(\max[k * R_i]\right)^2} \right]_+$$
(4)

The ramp function (R), defined separately for the preferred (i = 1) and nonpreferred (i = 2) shapes, increases monotonically with the percentage of visible area (c) and declines over time with a support of 500 ms, that is

$$R_i(c,t) = \begin{cases} (2.5c + 20) - 0.05t, & i = 1, \\ (12c^{1/3} + 120) - 0.05t, & i = 2, \end{cases}$$
(5)

when $30 \le t \le 530$. $R_i$ is 0 otherwise.

Equations 4 and 5 were designed to simulate the input to V4 units (e.g. Figure 10A) with an onset latency of 30 ms, a strong initial transient response and a gradually declining sustained response, collectively lasting ~500 ms. Note that the precise function defining $u_{FF\to i}$ is not critical as long as it produces strong input signals for the preferred shape that decrease with increasing occlusion level, thus capturing the observed V4 neuronal response properties.

For the vlPFC units, the input, $U_{PFC,i}$, is the sum of the excitatory feedforward inputs from both V4 units, $u_{V4(1)\to i}$ and $u_{V4(2)\to i}$. In addition, the vlPFC units receive a gain modulation signal, $f_{\text{-PFC}}$, that depends on the occlusion level. We modeled $f_{\text{-PFC}}$ as a nonlinear, cubic function of the % visible area, $c$. The function’s output was lowest for the unoccluded shape ($c = 100\%$) and increased for higher occlusion levels (Figure 9, f-). The coefficients were fit so that the model responses closely resembled the neuronal data, but the qualitative results were independent of the coefficient values used:

$$f_{\text{-PFC}} = -0.0017c^3 + 0.39c^2 - 29.6c + 806.$$
(6)
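As a quick numerical check of Equation (6), the cubic gain can be evaluated directly. The function name below is hypothetical; the coefficients are those printed in the equation, and `c` is the % visible area.

```python
def f_pfc(c):
    """Occlusion-dependent gain of Equation (6); c is the % visible area."""
    return -0.0017 * c ** 3 + 0.39 * c ** 2 - 29.6 * c + 806.0
```

With these coefficients the gain is about 46 for the unoccluded shape (c = 100) and about 88.5 at c = 50, so in this sketch the vlPFC drive is roughly doubled when half of the shape area is hidden.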

Inputs between model units were modulated by connection weights: $w_{sff}$ for the stronger feedforward inputs from V4 units to vlPFC units of the same shape preference (e.g. V4 unit 1→ vlPFC unit 1), $w_{sfb}$ for the corresponding feedback inputs (e.g. vlPFC unit 1→ V4 unit 1), $w_{wff}$ for the weaker feedforward inputs from V4 units to vlPFC units of a different shape preference (e.g. V4 unit 1→ vlPFC unit 2), and $w_{wfb}$ for the corresponding feedback inputs (e.g. vlPFC unit 1→ V4 unit 2). Thus, the feedback input from vlPFC unit j onto V4 unit i, $u_{PFC(j)\to i}$, was implemented as follows:

$$u_{PFC(j)\to i}(t) = \begin{cases} w_{sfb}\,\left[r_{PFC,j}(t - \tau_{d,fb}) - r_{thr1}\right]_+, & i = j, \\ w_{wfb}\,\left[r_{PFC,j}(t - \tau_{d,fb}) - r_{thr1}\right]_+, & i \ne j. \end{cases}$$
(7)

where the responses of vlPFC units were thresholded ($r_{thr1}$) and half-wave rectified. This threshold on vlPFC firing rates was introduced to reduce the magnitude of the second transient peak in V4 unit responses to the non-preferred shape (see Figure 10—figure supplement 2).
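The delayed, thresholded and rectified feedback term can be sketched as below. This is a minimal illustration, not the published code: the function and argument names are hypothetical, the vlPFC rate is assumed to be stored as a list with one value per simulation time step, the delay is expressed in time steps, and the input is taken to be zero until enough history has accumulated.

```python
def feedback_input(r_pfc_history, step, delay_steps, weight, r_thr1):
    """Feedback from one vlPFC unit to one V4 unit at simulation step `step`."""
    src = step - delay_steps
    if src < 0:
        return 0.0  # no vlPFC activity available yet (assumption)
    # Threshold the delayed vlPFC rate, then half-wave rectify.
    return weight * max(r_pfc_history[src] - r_thr1, 0.0)
```

The same function covers both cases of the feedback equation: the caller passes the stronger weight when the two units share a shape preference and the weaker weight otherwise.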

The feedforward excitatory input from V4 unit j to vlPFC unit i was implemented as:

$$u_{V4(j)\to i}(t) = \begin{cases} w_{sff}\, r_{V4,j}(t - \tau_{d,ff}), & i = j, \\ w_{wff}\, r_{V4,j}(t - \tau_{d,ff}), & i \ne j. \end{cases}$$
(8)

The feedforward and feedback temporal delays between vlPFC and V4 unit responses, $\tau_{d,ff}$ and $\tau_{d,fb}$, were chosen to be consistent with the difference in time between the vlPFC and V4 response peaks observed in our neuronal data.

To prevent the second response peak of V4 units from inducing a second response peak in vlPFC units (see Figure 10—figure supplement 1B), the feedforward connections from V4 to vlPFC included an adaptation term, as follows:

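$$\frac{dw_{ff}}{dt} = \frac{1}{\tau_a}\left(w_{\infty,ff} - w_{ff}\right), \qquad w_{\infty,ff} \to 0 \ \text{ if } \ r_{PFC,i} \ge r_{thr2}$$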
(9)

where the weight $w_{ff}$ of connections from V4 to vlPFC represents both $w_{wff}$ and $w_{sff}$, and evolves with time scale $\tau_a$. When vlPFC activity exceeds the value of $r_{thr2}$ (10 spk/sec, see Table 1), the steady state feedforward connection from V4 to vlPFC, $w_{\infty,ff}$, goes to 0, and any subsequent input from V4 will fail to activate vlPFC. The feedback connectivity weight was time-independent and set to steady state values: $w_{sfb} = w_{\infty,sfb}$, $w_{wfb} = w_{\infty,wfb}$.
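The adaptation of the feedforward weight can be sketched with a single Forward Euler update. This is an illustrative sketch, not the published code: the function name is hypothetical, the value of tau_a used when calling it is a placeholder, and the steady-state target is set to 0 only while the vlPFC rate is at or above the threshold, which is one possible reading of the description above.

```python
def adapt_weight(w, w_inf, r_pfc, r_thr2, tau_a_ms, dt_ms):
    """One Euler step of dw/dt = (target - w) / tau_a.

    The target is 0 while the vlPFC rate is at or above r_thr2 and the
    steady-state weight w_inf otherwise (an assumption of this sketch).
    """
    target = 0.0 if r_pfc >= r_thr2 else w_inf
    return w + (dt_ms / tau_a_ms) * (target - w)
```

While vlPFC activity stays above threshold, repeated updates decay the weight exponentially toward 0, so that later V4 activity has progressively less effect on the vlPFC unit.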

The set of differential equations, with stochastic noise term $\eta$, was solved using the Forward Euler Method in MATLAB. The initial firing rate values for $r_{V4,i}$ and $r_{PFC,i}$ were set to 0 spikes per second. The initial connectivity weights were equivalent to the steady-state weights $w_{\infty,sff}$, $w_{\infty,wff}$, $w_{\infty,sfb}$ and $w_{\infty,wfb}$; these and other parameters are given in Tables 1 and 2.

The code for the full model is available on Github (Choi, 2017). A copy is archived at https://github.com/elifesciences-publications/V4-PFC-dynamics.

Acknowledgements

We thank Wyeth Bair, Gregory Horwitz and Dina Popovkina for helpful discussions and comments on the manuscript, and Yoshito Kosai for assistance with animal training and V4 data collection. Technical support was provided by the Bioengineering group at the Washington National Primate Research Center. This work was funded by NEI grant R01EY018839 to A Pasupathy, Vision Core grant P30EY01730 to the University of Washington, P51 grant OD010425 to the Washington National Primate Research Center, NSF grant DMS-1056125 to E Shea-Brown, and Washington Research Foundation Innovation Postdoctoral Fellowship in Neuroengineering to H Choi.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Funding Information

This paper was supported by the following grants:

  • http://dx.doi.org/10.13039/100001906 Washington Research Foundation Innovation Postdoctoral Fellowship in Neuroengineering to Hannah Choi.
  • http://dx.doi.org/10.13039/100000001 National Science Foundation DMS-1056125 to Eric Shea-Brown.
  • http://dx.doi.org/10.13039/100000053 National Eye Institute R01EY018839 to Anitha Pasupathy.
  • http://dx.doi.org/10.13039/100000002 National Institutes of Health OD010425 to Anitha Pasupathy.
  • http://dx.doi.org/10.13039/100000053 National Eye Institute P30EY01730 to Anitha Pasupathy.

Additional information

Competing interests

No competing interests declared.

Author contributions

Data curation, Writing—review and editing.

Data curation, Writing—original draft, Writing—review and editing.

Software, Investigation, Methodology, Writing—original draft, Writing—review and editing.

Conceptualization, Supervision, Writing—review and editing.

Conceptualization, Resources, Software, Formal analysis, Supervision, Funding acquisition, Validation, Investigation, Visualization, Methodology, Writing—original draft, Project administration, Writing—review and editing.

Ethics

Animal experimentation: All animal procedures conformed to NIH guidelines and were approved by the Institutional Animal Care and Use Committee at the University of Washington (IACUC Protocol #4133-01).

References

  • Barbas H, Mesulam MM. Cortical afferent input to the principalis region of the rhesus monkey. Neuroscience. 1985;15:619–637. doi: 10.1016/0306-4522(85)90064-8.
  • Bregman AS. Asking the “What-For” questions in auditory perception. In: Kuy M, Pomerantz J, editors. Perceptual Organisation. Erlbaum; 1981.
  • Cavanagh P. What's up in top-down processing? In: Gorea A, editor. Representation of Vision: Trends and Tacit Assumptions in Vision Research. 1991. pp. 295–304.
  • Chen M, Yan Y, Gong X, Gilbert CD, Liang H, Li W. Incremental integration of global contours through interplay between visual cortical areas. Neuron. 2014;82:682–694. doi: 10.1016/j.neuron.2014.03.023.
  • Choi H. V4-PFC-dynamics. [ac02e4b];2017 https://github.com/myhannahchoi/V4-PFC-dynamics
  • Crittenden BM, Duncan J. Task difficulty manipulation reveals multiple demand activity but no frontal lobe hierarchy. Cerebral Cortex. 2014;24:532–540. doi: 10.1093/cercor/bhs333.
  • Dayan P, Abbott LF. Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems. MIT Press; 2005.
  • Eskandar EN, Richmond BJ, Optican LM. Role of inferior temporal neurons in visual memory. I. Temporal encoding of information about visual images, recalled images, and behavioral context. Journal of Neurophysiology. 1992;68:1277–1295.
  • Fuster J. The Prefrontal Cortex. New York: Raven; 1989.
  • Gilbert CD, Li W. Top-down influences on visual processing. Nature Reviews Neuroscience. 2013;14:350–363. doi: 10.1038/nrn3476.
  • Helmholtz H. Treatise on Physiological Optics. New York: Dover; 1910.
  • Hung CP, Kreiman G, Poggio T, DiCarlo JJ. Fast readout of object identity from macaque inferior temporal cortex. Science. 2005;310:863–866. doi: 10.1126/science.1117593.
  • Jiang Y, Kanwisher N. Common neural mechanisms for response selection and perceptual processing. Journal of Cognitive Neuroscience. 2003;15:1095–1110. doi: 10.1162/089892903322598076.
  • Kim JN, Shadlen MN. Neural correlates of a decision in the dorsolateral prefrontal cortex of the macaque. Nature Neuroscience. 1999;2:176–185. doi: 10.1038/5739.
  • Kosai Y, El-Shamayleh Y, Fyall AM, Pasupathy A. The role of visual area V4 in the discrimination of partially occluded shapes. Journal of Neuroscience. 2014;34:8570–8584. doi: 10.1523/JNEUROSCI.1375-14.2014.
  • Kovács G, Vogels R, Orban GA. Selectivity of macaque inferior temporal neurons for partially occluded shapes. Journal of Neuroscience. 1995;15:1984–1997.
  • Kriegeskorte N. Deep Neural Networks: A New Framework for Modeling Biological Vision and Brain Information Processing. Annual Review of Vision Science. 2015;1:417–446. doi: 10.1146/annurev-vision-082114-035447.
  • Kveraga K, Ghuman AS, Bar M. Top-down predictions in the cognitive brain. Brain and Cognition. 2007;65:145–168. doi: 10.1016/j.bandc.2007.06.007.
  • Lamme VA, Supèr H, Spekreijse H. Feedforward, horizontal, and feedback processing in the visual cortex. Current Opinion in Neurobiology. 1998;8:529–535. doi: 10.1016/S0959-4388(98)80042-1.
  • Lamme VA, Roelfsema PR. The distinct modes of vision offered by feedforward and recurrent processing. Trends in Neurosciences. 2000;23:571–579. doi: 10.1016/S0166-2236(00)01657-X.
  • Lara AH, Wallis JD. Executive control processes underlying multi-item working memory. Nature Neuroscience. 2014;17:876–883. doi: 10.1038/nn.3702.
  • Liebe S, Hoerzer GM, Logothetis NK, Rainer G. Theta coupling between V4 and prefrontal cortex predicts visual short-term memory performance. Nature Neuroscience. 2012;15:456–462. doi: 10.1038/nn.3038.
  • Mazer J. pype3. [5e3bd9a];2013 https://github.com/mazerj/pype3
  • McDermott J. Psychophysics with junctions in real images. Perception. 2004;33:1101–1127. doi: 10.1068/p5265.
  • Miller EK, Desimone R. Parallel neuronal mechanisms for short-term memory. Science. 1994;263:520–522. doi: 10.1126/science.8290960.
  • Miller EK, Cohen JD. An integrative theory of prefrontal cortex function. Annual Review of Neuroscience. 2001;24:167–202. doi: 10.1146/annurev.neuro.24.1.167.
  • Namima T, Pasupathy A. Neural responses in the inferior temporal cortex to partially occluded and occluding stimuli. Society for Neuroscience Abstracts. 2016.
  • Ninomiya T, Sawamura H, Inoue K, Takada M. Segregated pathways carrying frontally derived top-down signals to visual areas MT and V4 in macaques. Journal of Neuroscience. 2012;32:6851–6858. doi: 10.1523/JNEUROSCI.6295-11.2012.
  • O'Reilly RC, Wyatte D, Herd S, Mingus B, Jilk DJ. Recurrent Processing during Object Recognition. Frontiers in Psychology. 2013;4:124. doi: 10.3389/fpsyg.2013.00124.
  • Pasupathy A, Connor CE. Shape representation in area V4: position-specific tuning for boundary conformation. Journal of Neurophysiology. 2001;86:2505–2519.
  • Pepik B, Benenson R, Ritschel T, Schiele B. What is holding back convnets for detection? Lecture Notes in Computer Science. 2015:517–528. doi: 10.1007/978-3-319-24947-6_43.
  • Perrett DI, Oram MW. Neurophysiology of shape processing. Image and Vision Computing. 1993;11:317–333. doi: 10.1016/0262-8856(93)90011-5.
  • Poort J, Raudies F, Wannig A, Lamme VA, Neumann H, Roelfsema PR. The role of attention in figure-ground segregation in areas V1 and V4 of the visual cortex. Neuron. 2012;75:143–156. doi: 10.1016/j.neuron.2012.04.032.
  • Riesenhuber M, Poggio T. Hierarchical models of object recognition in cortex. Nature Neuroscience. 1999;2:1019–1025. doi: 10.1038/14819.
  • Romo R, Salinas E. Flutter discrimination: neural codes, perception, memory and decision making. Nature Reviews Neuroscience. 2003;4:203–218. doi: 10.1038/nrn1058.
  • Romo R, de Lafuente V. Conversion of sensory signals into perceptual decisions. Progress in Neurobiology. 2013;103:41–75. doi: 10.1016/j.pneurobio.2012.03.007.
  • Rust NC, Stocker AA. Ambiguity and invariance: two fundamental challenges for visual processing. Current Opinion in Neurobiology. 2010;20:382–388. doi: 10.1016/j.conb.2010.04.013.
  • Serre T, Oliva A, Poggio T. A feedforward architecture accounts for rapid categorization. PNAS. 2007;104:6424–6429. doi: 10.1073/pnas.0700622104.
  • Tang H, Buia C, Madhavan R, Crone NE, Madsen JR, Anderson WS, Kreiman G. Spatiotemporal dynamics underlying object completion in human ventral visual cortex. Neuron. 2014a;83:736–748. doi: 10.1016/j.neuron.2014.06.017.
  • Tang H, Buia C, Madsen J, Anderson WS, Kreiman G. A role for recurrent processing in object completion: neurophysiological, psychophysical and computational evidence. arXiv. 2014b 1409.2942 https://arxiv.org/abs/1409.2942
  • Tang H, Kreiman G. Recognition of occluded objects. In: Zhao Q, editor. Computational and Cognitive Neuroscience of Vision. Singapore: Springer-Verlag; 2017.
  • Ungerleider LG, Galkin TW, Desimone R, Gattass R. Cortical connections of area V4 in the macaque. Cerebral Cortex. 2008;18:477–499. doi: 10.1093/cercor/bhm061.
  • VanRullen R, Thorpe SJ. Is it a bird? Is it a plane? Ultra-rapid visual categorisation of natural and artifactual objects. Perception. 2001;30:655–668. doi: 10.1068/p3029.
  • Wallis G, Rolls ET. Invariant face and object recognition in the visual system. Progress in Neurobiology. 1997;51:167–194. doi: 10.1016/S0301-0082(96)00054-8.
  • Wallis JD, Miller EK. From rule to response: neuronal processes in the premotor and prefrontal cortex. Journal of Neurophysiology. 2003;90:1790–1806. doi: 10.1152/jn.00086.2003.
  • Wei W, Wang XJ. Inhibitory Control in the Cortico-Basal Ganglia-Thalamocortical Loop: Complex Regulation and Interplay with Memory and Decision Processes. Neuron. 2016;92:1093–1105. doi: 10.1016/j.neuron.2016.10.031.
  • Wyatte D, Curran T, O'Reilly R. The limits of feedforward vision: recurrent processing promotes robust object recognition when objects are degraded. Journal of Cognitive Neuroscience. 2012;24:2248–2261. doi: 10.1162/jocn_a_00282.
  • Yuille A, Kersten D. Vision as Bayesian inference: analysis by synthesis? Trends in Cognitive Sciences. 2006;10:301–308. doi: 10.1016/j.tics.2006.05.002.
  • Zaksas D, Pasternak T. Directional signals in the prefrontal cortex and in area MT during a working memory for visual motion task. Journal of Neuroscience. 2006;26:11726–11742. doi: 10.1523/JNEUROSCI.3420-06.2006.

Decision letter

Nicole Rust, Reviewing Editor
Nicole Rust, University of Pennsylvania, United States

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "Dynamic representation of partially occluded objects in primate prefrontal and visual cortex" for consideration by eLife. Your article has been favorably evaluated by David Van Essen (Senior Editor) and three reviewers, one of whom is a member of our Board of Reviewing Editors. The reviewers have opted to remain anonymous.

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this letter to crystallize our concerns going forward. We feel the work is important and interesting but key issues remain unresolved that must be addressed satisfactorily to produce an acceptable manuscript.

At this point we are unable to render a binding recommendation and require a response from you indicating the feasibility of your completing the essential tasks in a reasonable period of time – around 2 months. The Board member and reviewers will consider your response and provide a binding decision.

General assessment:

This paper characterizes responses of V4 and vlPFC neurons to partially occluded visual stimuli and suggests that feedback from vlPFC to V4 boosts V4 responses to occluded shapes and helps resolve stimulus identity. The authors recorded vlPFC and V4 neurons from the same monkeys performing the same task, although in separate experimental sessions. They demonstrate that vlPFC neurons respond more strongly and more selectively to occluded stimuli, unlike V4 neurons, which commonly respond most strongly and selectively to unoccluded images. The authors suggest that a subset of V4 neurons respond to occluded images with two distinct peaks (but not all reviewers are convinced that the distinction between subpopulations with one and two peaks is real). The second peak follows the vlPFC response peak and shows similar characteristics to vlPFC. Based on these observations, the authors construct a two-layer neural network in which V4 and vlPFC model units are reciprocally connected and vlPFC responses shape the second peak of V4 units.

The reviewers find the proposal that PFC interacts with V4 to resolve the challenge of solving shape discrimination in the presence of occlusion to be timely and of significant interest. At the same time, the reviewers identified problematic issues with the data analysis that must be resolved before this work could be considered for publication.

Summary:

General concerns about the experimental aspects of the paper include the reproducibility of the main result and the impact of using different approaches when recording from the two brain areas that are compared. General concerns about the model include that it may be unnecessarily complex for the current illustrations provided, and that, even with this complexity, the current illustrations do not reflect population average effects.

Essential revisions:

1) The main claims of the paper rest on the assertion that V4 shape selectivity increases as a function of time. The concerns about this claim are two-fold:

1A) The current illustration is made via an argument that there are two subpopulations of neurons, those that have a second, shape selective peak and those that do not. The reviewers are concerned that the existence of these two subpopulations will not be reproducible. This includes some confusion about the methods that were employed to identify the two peaks, as well as the suspicion that the specific parameters used for this identification were overfit to this particular data set.

The way that the two peaks are identified (subsection “Peak finding algorithm”) is hard to understand: "were within ± 12 ms of at least one third of the peaks identified based on the PSTHs for individual occlusion levels".

How sensitive are these findings to the parameters of the ad hoc peak finding algorithm? Similarly, one may think that the algorithm for detecting double-peak V4 cells has many false positives and the true frequency of cells with two peaks is lower than that suggested in the paper. Can you clarify?

In Figure 6C, how can the "Time to peak" for V4 neurons be greater than 300 ms, if the second peak was required to be before 300 ms ("The second peak was constrained to be no later than 300 ms after test stimulus onset")?

The manuscript states: "However, shape selectivity for occluded shapes, particularly at intermediate occlusion levels (visible area 72–95%), was different for the two groups: neurons with two peaks had significantly greater shape selectivity than neurons without two peaks in the time interval ~200–260 ms after test stimulus onset (t-Test, p < 0.05)." The supporting figure is Figure 8A vs. B. What exactly is this stating? Is the significance tested for each occlusion level separately, and was it significant for visible area conditions of 72, 82, 90, and 95%? If so, was Bonferroni correction or another correction applied? How was this time period selected (~200–260 ms)? Was there a correction for testing multiple time periods? Do the results hold in different time periods?

To examine the tuning of the "second peak" the authors use as baseline the activity in an earlier phase of the visual response. Hence, if the neurons respond strongly early on, their response will decrease more strongly during the second peak. In other words, there is suspicion that the result presented in Figure 7, where activity of the second peak decreases if the visible area is large, is simply an artifact of this erroneous choice of a time window to compute the baseline. Instead the authors should use the pre-stimulus activity as baseline for all epochs. This choice implies in the subsection “Response change”, in the first equation that b depends on i, in incoming input. In Figure 7D, E the y-axis is in units of "normalized response change". How was the normalization performed?

The average PSTH of the V4 subpopulation with two response peaks is quite distinct from the single cell examples. For the population, the second peak is not obvious and responses to unoccluded stimuli stay above the occluded shapes, unlike the single cells in Figures 4 and 5. What causes the discrepancy? Is it the variability of the time of the second peak in V4 neurons? It will be helpful to have a supplementary figure with more single cell examples.

1B) More generally, the claim is that V4 shape selectivity changes as a function of time, and missing is the more direct comparison of shape selectivity for the same neurons early versus later in the response.

2) The broad tuning of neurons in vlPFC seems qualitatively inconsistent with it playing a role in enhancing V4 shape selectivity. The model does not currently resolve this nor is an explanation provided.

3) There are concerns that the functional differences between the responses in PFC and V4 may follow from differences in the experimental paradigm. These include:

3A) What was the effect of tailoring stimuli for neurons in one area and not doing so in another area? Can the latency of responses or their tolerance to occlusion be influenced by tailoring stimuli for neurons?

3B) It is unclear why the authors focus on the vlPFC neurons that are influenced by occlusion. There are also many neurons that do not care about the occlusion and we wonder whether these cells include some neurons that are tuned to the shape of the stimuli. If yes, is their tuning better or worse than that of the neurons that are influenced by the occlusion?

What would Figure 3D look like if you included all of the 216 vlPFC neurons that were responsive during the test epoch?

Some cells have a stronger response if the visible area increases. Is it possible that these neurons are better tuned to the shapes than those neurons that are most active for high levels of occlusion?

Related to this last point: is it conceivable that the neurons that increase their response if there is more occlusion are tuned to some aspect of the occluders, e.g. the total surface area or the total perimeter of the occluding dots?

This latter possibility seems to be supported by the finding that the vlPFC preference for occlusion decreased when the occluders had the same color as the background (as is stated in the first paragraph of the subsection “Representation of occluded stimuli in vlPFC”).

There were 381 vlPFC neurons, 98 were significantly modulated by occlusion. How many of these 98 neurons are in monkey M vs. O?

4) Can the model replicate all aspects of the data? Including:

The shape selectivity index in vlPFC is considerably lower than in V4. Can the model in Figure 9 work with reduced shape selectivity?

V4 cells are divided into two subpopulations, one with two peaks and another with a single peak. How can the feedback model explain that many V4 cells do not show two peaks in their responses to occluded stimuli? Are the authors assuming inhomogeneous feedback from vlPFC to V4? Alternatively, they may be suggesting that the V4 population includes a continuum of responses that vary from single peak to double peaks and include in-between responses.

Can the model be adjusted to replicate population data in Figure 6? In its current form the model seems to best explain single cell examples. It is unclear how well these examples represent the population.

5) The gain mechanism that the authors propose in their model may be envisioned as PFC receiving two signals – an intertwined shape and occlusion signal and an occlusion-only signal, and then correcting the intertwined estimate with occlusion information. This seems like a bit of a chicken-and-egg problem: how and why would the brain extract occlusion level from occluded stimuli but not shape (preceding the locus at which it disambiguates shape)? Where could this V4-independent but occlusion-dependent gain modulation be coming from? The authors suggest IT as a potential source, but IT units receive input from V4 and also feed back to V4. Does the model assume that gain modulation arrives at vlPFC with the same latency as the V4 inputs? Why can't feedback from IT to V4 be the source of the second V4 response peak? The authors cite their SfN abstract for occlusion selective signals in IT. It will be useful to provide more explanation about the results, especially for readers who did not stop by the SfN poster. What if there are two visible stimuli: one that is occluded and one that is not? Would the gain modulation then be different for the two stimuli? What if one stimulus occludes another stimulus?

6) Can the model be simplified? The proposed dynamic model has multiple degrees of freedom and several nonlinearities. Is all this complexity necessary? In the current version of the paper, it is difficult to gain intuition about the model based on the text. Simplifying the model can shine light on which components and nonlinearities are indispensable and therefore reasonable targets for follow-up studies. For example, the adaptation term for V4 projections to vlPFC seems a little arbitrary. Why does such an adaptation not happen for other connections in the model? A similar question can be asked about the half-rectified feedback from vlPFC to V4. Why are other connections in the model not half-rectified? More generally, we encourage the authors to explore the model space a little more extensively for simpler model architectures. As it stands, the network seems like a high-parameter model that one can tweak to get many different types of outputs. Establishing the necessity of the proposed architecture and parameterization requires a little more work.

7) On the interpretation of the effects in V4:

In the first paragraph of the subsection “Representation of occluded stimuli in vlPFC” the authors argue that the responses of some of the vlPFC neurons that are stronger if occlusion is stronger do not depend on the increased task difficulty or attentional demands, but this reasoning is unclear. Furthermore, the authors seem to have changed their mind in the fourth paragraph of the subsection “Representation of occluded stimuli in vlPFC”, where they argue that vlPFC may amplify weak signals and that vlPFC is engaged in tasks of greater difficulty or cognitive demand.

In the first paragraph of the subsection “Representation of occluded stimuli in vlPFC” the authors argue that because many vlPFC neurons were tuned to shape, these neurons cannot reflect task difficulty or attentional demands. This argument does not hold because neurons may well be tuned to multiple aspects of a task.

8) How did you know whether penetrations were indeed in vlPFC? Was histology performed?

9) [Additional comment sent to the authors in response to authors’ plan for revision]: The authors should clarify if there is a real dichotomy between neurons with one versus two peaks, as well as how broad the distribution of the timing for the second peak is. Can they really convince the reader that they are not simply amplifying noise with their analysis? Furthermore, they now make the point that the tuning is stronger during the second peak. We would like to know if this is not simply predicted by the presence of extra spikes – i.e. the presence of a peak implies some extra spikes at a certain point in time.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Dynamic representation of partially occluded objects in primate prefrontal and visual cortex" for further consideration at eLife. Your revised article has been favorably evaluated by David Van Essen (Senior Editor), a Reviewing Editor, and two reviewers.

The manuscript has been improved but there are some remaining issues that need to be addressed before acceptance, as described below by reviewer #3. We envision that these revisions will be straightforward to carry out and that verification can be handled by the Reviewing Editor.

Reviewer #2:

The authors have addressed the most critical comments. There are structural weaknesses in the dataset that keep alternative interpretations plausible. However, I believe the authors' interpretation is strengthened by the new analyses and the paper passes threshold for publication.

Reviewer #3:

In their revision, Fyall et al. have addressed many of my concerns satisfactorily. They have made it clearer that some of the V4 neurons have a second peak that is more prominent in case of occlusion (Figure 4C represents a compelling example). Also, the peak detection method is now more convincing and it is also better documented. It remains unclear whether activity in vlPFC indeed contributes to late V4 activity and it is therefore conceivable that there are additional areas that could contribute to late V4 activity. Yet, I do realize that demonstrating the causal link between vlPFC and V4 would require a different approach, which would be beyond the scope of the present contribution. However, establishing such a causal link might be an important topic for future research, and the authors could mention this point, which could be added to the paragraph of suggested future work (subsection “Response dynamics in V4”, fifth paragraph).

Remaining points:

1) I find it difficult to understand why the vlPFC neurons do not respond so well when the occluders have the same color as the background (subsection “Representation of occluded stimuli in vlPFC”, second paragraph). I would suspect that the processes for shape recognition would remain the same. Or did the monkeys' performance show signs that this was not the case?

2) Several p-values are missing. Three examples:

"The responses of most of these occlusion-sensitive neurons (71/98) increased with increasing occlusion level".

"Even for the small subset of vlPFC neurons that responded more strongly to unoccluded stimuli (27/98), shape selectivity was not stronger for unoccluded than occluded stimuli (see Figure 3—figure supplement 2A)."

"Shape selectivity for occluded shapes was significantly higher during the second peak than during the first peak."

3) "Of 85 neurons, 30 neurons (~35%) were classified as having two peaks". How were these cells distributed across the two monkeys?

4) The model with interactions between vlPFC and V4 still seems somewhat simplistic as there are only a few neurons and the variation (in effect size and timing) across neurons shown in the figures is actually a variation across neurons in different models rather than a variation of neurons within the same model. In networks with many units and reciprocal connections, the network dynamics might actually work against variation across neurons. The authors should discuss this. It would be great if it were possible to show the same range of differences between neurons within the same model, but I will not insist on such a demonstration given that making such a larger model might require a substantial investment of time.

5) "We cannot rule out the possibility that IT responses also contribute to V4 responses during the second transient peak. However, our IT recordings suggest this is unlikely because, as in V4, shape selectivity in IT is stronger for unoccluded than occluded stimuli". Is it conceivable that some IT neurons also have two phases in their response where the second phase is more pronounced in the presence of occlusion? It would be great if the authors could look for this possibility in the previous data set by Namima and Pasupathy, 2016. If the two phases are there it would strengthen the paper, but it would also be interesting if that is not the case.

6) Equations 4/5: I failed to see the logic of these equations, would it be possible to clarify this? Equation 9: what is thr2?

7) I found Figure 8—figure supplement 6A confusing: how do you compute y/z for neurons with one peak?

Author response

Essential revisions:

1) The main claims of the paper rest on the assertion that V4 shape selectivity increases as a function of time. The concerns about this claim are two-fold:

1A) The current illustration is made via an argument that there are two subpopulations of neurons, those that have a second, shape selective peak and those that do not. The reviewers are concerned that the existence of these two subpopulations will not be reproducible. This includes some confusion about the methods that were employed to identify the two peaks, as well as the suspicion that the specific parameters used for this identification were overfit to this particular data set.

The way that the two peaks are identified (subsection “Peak finding algorithm”) is hard to understand: "were within ± 12 ms of at least one third of the peaks identified based on the PSTHs for individual occlusion levels".

How sensitive are these findings to the parameters of the ad hoc peak finding algorithm? Similarly, one may think that the algorithm for detecting double-peak V4 cells has many false positives and the true frequency of cells with two peaks is lower than that suggested in the paper. Can you clarify?

We appreciate the reviewers’ general concern regarding the reproducibility of our finding that some V4 neurons have two transient response peaks. We address this concern in our revision in several ways:

i) We have revised and simplified the algorithm employed for peak finding, and we have added a new supplementary figure to explain this procedure schematically (Figure 6—figure supplement 1). Specifically, the steps identified as confusing above have now been removed and replaced with a statistical test (see below).

ii) We have repeated our analyses using different parameter choices, and have added two supplementary figures (Figure 8—figure supplement 3 and Figure 8—figure supplement 4) demonstrating the robustness of the main findings described in the manuscript (Figure 6 and Figure 8) across different parameters.

iii) To mitigate concerns about false positives, we have added a statistical criterion to our peak finding algorithm. Only those neurons with a statistically significant (paired t-Test, p < 0.05, Bonferroni corrected) increase in response at the time of the putative second peak are categorized as having two peaks.

iv) We include an independent, model-based procedure for identifying neurons with two transient response peaks that does not rely on the ad hoc peak-finding algorithm. Briefly, this model-based procedure detects neurons for which the response PSTHs for occluded stimuli cannot be modeled as a linear scaling of the response PSTH for unoccluded stimuli. We present two new supplementary figures (Figure 8—figure supplement 5 and Figure 8—figure supplement 6) demonstrating a good correspondence between the results of this model-based procedure and those of the ad hoc peak-finding procedure used previously.
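
The idea behind the model-based procedure in item iv can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the least-squares fit through the origin, the residual-energy measure, and the example PSTH values are all hypothetical, and the letter does not specify the criterion actually used to flag a neuron.

```python
def fit_linear_scaling(unoccluded, occluded):
    """Least-squares gain g minimizing sum((occluded - g * unoccluded) ** 2)."""
    num = sum(u * o for u, o in zip(unoccluded, occluded))
    den = sum(u * u for u in unoccluded)
    gain = num / den if den > 0 else 0.0
    residual = [o - gain * u for u, o in zip(unoccluded, occluded)]
    return gain, residual


def unexplained_fraction(unoccluded, occluded):
    """Fraction of the occluded PSTH's energy left unexplained by pure scaling."""
    _, residual = fit_linear_scaling(unoccluded, occluded)
    total = sum(o * o for o in occluded)
    return sum(r * r for r in residual) / total if total > 0 else 0.0


# Hypothetical PSTHs (spikes/s in consecutive time bins), for illustration only.
unoccluded = [0.0, 10.0, 40.0, 20.0, 10.0, 5.0, 5.0, 5.0]
scaled_only = [0.5 * r for r in unoccluded]                     # same shape, lower gain
with_late_peak = [0.0, 5.0, 20.0, 10.0, 5.0, 25.0, 15.0, 2.5]   # extra late transient

print(unexplained_fraction(unoccluded, scaled_only))
print(unexplained_fraction(unoccluded, with_late_peak))
```

A response that is a scaled copy of the unoccluded PSTH leaves essentially no residual, whereas a response with an additional late transient leaves a large one. Any threshold on this quantity would be a further assumption.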

In Figure 6C, how can the "Time to peak" for V4 neurons be greater than 300 ms, if the second peak was required to be before 300 ms ("The second peak was constrained to be no later than 300 ms after test stimulus onset")?

We have corrected this plotting error.

The manuscript states: "However, shape selectivity for occluded shapes, particularly at intermediate occlusion levels (visible area 72–95%), was different for the two groups: neurons with two peaks had significantly greater shape selectivity than neurons without two peaks in the time interval ~200–260 ms after test stimulus onset (t-Test, p < 0.05)." The supporting figure is Figure 8A vs. B. What exactly is this stating? Is the significance tested for each occlusion level separately, and was it significant for visible area conditions of 72, 82, 90, and 95%? If so, was Bonferroni correction or another correction applied? How was this time period selected (~200–260 ms)? Was there a correction for testing multiple time periods? Do the results hold in different time periods?

For improved clarity, we have revised this analysis and we present the new results in the revised manuscript. We now use two t-Tests to ask whether shape selectivity was significantly stronger for neurons with two peaks than for neurons without two peaks in two time windows: one centered around the first peak (69–99 ms) and another centered around the second peak (199–229 ms) of V4 responses. For occluded stimuli, shape selectivity was significantly stronger for neurons with two peaks during the second peak but not during the first peak. For unoccluded stimuli, shape selectivity was not significantly different between the two neuronal subsets in either window.
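
The two-window comparison can be sketched as follows. This is a minimal illustration, not the analysis code used for the paper: the helper names are hypothetical, and the use of Welch's unequal-variance form of the t statistic is an assumption, since the text says only that t-Tests were used. Only the two window boundaries are taken from the text.

```python
import math


def window_mean(values, times_ms, start_ms, end_ms):
    """Mean of a time course over samples with start_ms <= t <= end_ms."""
    selected = [v for v, t in zip(values, times_ms) if start_ms <= t <= end_ms]
    return sum(selected) / len(selected)


def welch_t(group_a, group_b):
    """Welch's t statistic for two independent samples (assumed test variant)."""
    def mean_var(xs):
        m = sum(xs) / len(xs)
        return m, sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    mean_a, var_a = mean_var(group_a)
    mean_b, var_b = mean_var(group_b)
    return (mean_a - mean_b) / math.sqrt(var_a / len(group_a) + var_b / len(group_b))


FIRST_PEAK_MS = (69, 99)     # window around the first V4 peak, from the text
SECOND_PEAK_MS = (199, 229)  # window around the second V4 peak, from the text

# Hypothetical selectivity time course sampled every 10 ms, for illustration only.
times_ms = list(range(0, 300, 10))
selectivity = [t / 10 for t in times_ms]
print(window_mean(selectivity, times_ms, *FIRST_PEAK_MS))
```

In practice, each neuron would contribute one window-averaged selectivity value per window, and the two groups of values would be passed to `welch_t`. Converting the statistic to a p-value requires the t distribution, for example via `scipy.stats.ttest_ind(a, b, equal_var=False)`.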

To examine the tuning of the "second peak" the authors use as baseline the activity in an earlier phase of the visual response. Hence, if the neurons respond strongly early on, their response will decrease more strongly during the second peak. In other words, there is suspicion that the result presented in Figure 7, where activity of the second peak decreases if the visible area is large, is simply an artifact of this erroneous choice of a time window to compute the baseline. Instead the authors should use the pre-stimulus activity as baseline for all epochs. This choice implies in the subsection “Response change”, in the first equation that b depends on i, in incoming input. In Figure 7D, E the y-axis is in units of "normalized response change". How was the normalization performed?

As requested by the reviewers, we present average activity plots with respect to a single prestimulus baseline for both epochs in the revised manuscript (Figure 7). The results show clearly that V4 neuronal sensitivity to occlusion level changes over time. We hope that this revised analysis of the data mitigates the reviewers’ concerns regarding the time window over which baseline activity is computed. Additionally, we clarify how the response normalization was implemented in the Materials and methods section and in the figure legend. Briefly, prior to constructing the population average, each neuron’s PSTHs were normalized to the maximum value across time and occlusion level. This strategy ensured that differences in peak firing rates between neurons did not impact the population averages.
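A minimal sketch of the normalization described above, assuming each neuron's PSTHs are stored as a list of firing-rate traces, one per occlusion level. The function names and the example rates are hypothetical, and baseline subtraction is left out for brevity.

```python
def normalize_psths(psths):
    """Scale one neuron's PSTHs by their maximum across time and occlusion level.

    psths[level][time_bin] holds a firing rate (spikes/s).
    """
    peak = max(max(row) for row in psths)
    if peak <= 0:
        raise ValueError("neuron has no positive response to normalize")
    return [[rate / peak for rate in row] for row in psths]


def population_average(neurons):
    """Average the normalized PSTHs across neurons, per level and time bin."""
    normalized = [normalize_psths(n) for n in neurons]
    n_levels = len(normalized[0])
    n_bins = len(normalized[0][0])
    return [
        [sum(n[lv][t] for n in normalized) / len(normalized) for t in range(n_bins)]
        for lv in range(n_levels)
    ]


# Two hypothetical neurons with very different peak rates (2 levels x 3 bins).
weak = [[1.0, 2.0, 1.0], [0.5, 1.0, 0.5]]
strong = [[10.0, 40.0, 10.0], [5.0, 20.0, 5.0]]
avg = population_average([weak, strong])
```

Because each neuron is scaled to a maximum of 1 before averaging, the high-rate neuron in this example contributes no more to the population trace than the low-rate one.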

The average PSTH of V4 subpopulation with two response peaks is quite distinct from the single cell examples. For the population, the second peak is not obvious and responses to unoccluded stimuli stay above the occluded shapes, unlike the single cells in Figures 4 and 5. What causes the discrepancy? Is it the variability of the time of the second peak in V4 neurons? It will be helpful to have a supplementary figure with more single cell examples.

We clarify this point in the revised Results section. We agree with the reviewers that the second response peak is less obvious in the population response averages than in the responses of individual example neurons. Our new model simulations (Figure 10—figure supplement 6) verify that this difference could arise due to variability in the timing of the second peak across V4 neurons. As requested by the reviewers, we include a new supplementary figure with additional example V4 neuronal responses from six neurons with two transient response peaks (Figure 5—figure supplement 1). We would like to note that for many neurons with two peaks, the responses to occluded stimuli do not exceed responses to unoccluded stimuli during the second peak (e.g. see Figure 5—figure supplement 1A, B, D). Rather, our results suggest that the difference in responses to occluded stimuli is larger than that for unoccluded stimuli during the second peak (Figure 7).
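The timing argument can be illustrated with a toy calculation. This is not the model from the manuscript: the Gaussian transients, their widths and amplitudes, and the spread of second-peak times are all hypothetical values chosen only to show how jitter in peak time flattens a population average.

```python
from math import exp


def unit_response(t, second_peak_time, width=15.0):
    """Toy PSTH: a fixed first transient at 80 ms plus a later second transient."""
    first = exp(-((t - 80.0) ** 2) / (2 * width ** 2))
    second = 0.6 * exp(-((t - second_peak_time) ** 2) / (2 * width ** 2))
    return first + second


times = list(range(0, 401, 5))  # ms after stimulus onset

# Hypothetical second-peak times, spread over 160-280 ms across 13 units.
peak_times = [160 + 10 * i for i in range(13)]

# Population average across units at every time point.
population = [
    sum(unit_response(t, pt) for pt in peak_times) / len(peak_times)
    for t in times
]

# Height of the late response (t >= 150 ms) for one unit versus the average.
single_unit_second_peak = max(unit_response(t, 220.0) for t in times if t >= 150)
population_second_peak = max(r for t, r in zip(times, population) if t >= 150)
```

In this sketch every unit has a second peak of the same height, yet the late response in the population average is less than half as tall, because the peaks of different units do not line up in time.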

1B) More generally, the claim is that V4 shape selectivity changes as a function of time, and missing is the more direct comparison of shape selectivity for the same neurons early versus later in the response.

As requested by the reviewers, we have added a new scatter plot providing a direct comparison of shape selectivity for occluded stimuli during the first and second peak for V4 neurons with two peaks (Figure 8C). We describe this analysis in the revised Results section. Shape selectivity for occluded stimuli was significantly stronger during the second peak than the first peak. In contrast, we observed no significant difference in shape selectivity for V4 neurons without two peaks. We also observed no significant difference in shape selectivity for unoccluded stimuli for V4 neurons with or without two peaks.

2) The broad tuning of neurons in vlPFC seems qualitatively inconsistent with it playing a role in enhancing V4 shape selectivity. The model does not currently resolve this nor is an explanation provided.

We address this point in the revised Discussion section. Our new model simulations demonstrate that feedback signals from weakly shape-selective vlPFC neurons could enhance shape selectivity in V4 (Figure 10—figure supplement 4). In the behavioral task used, the occluded shape used in each experimental session is restricted to one of two choices, thus simplifying the object recognition problem. In this case, any given vlPFC neuron receiving differential input from V4 neurons that signal the two shape discriminanda could contribute to enhanced V4 shape selectivity. We would need additional experiments to determine how vlPFC and V4 might interact in more general situations. It is likely that recognition and memory-related mechanisms would contribute.

3) There are concerns that the functional differences between the responses in PFC and V4 may follow from differences in the experimental paradigm. These include:

3A) What was the effect of tailoring stimuli for neurons in one area and not doing so in another area? Can the latency of responses or their tolerance to occlusion be influenced by tailoring stimuli for neurons?

The lack of stimulus tailoring cannot explain away our two major vlPFC findings: i) the large fraction of vlPFC neurons that respond preferentially to occluded stimuli, ii) the stronger shape selectivity under occlusion. We address this point extensively in the revised Discussion section.

Tailoring stimuli in V4 may explain the preponderance of neurons in our V4 dataset that responded preferentially to unoccluded shapes. Without tailoring stimuli in vlPFC, we would expect a roughly equal proportion of neurons showing responses that increased and decreased with increasing occlusion level. However, we found that 72% of vlPFC neurons responded preferentially to occluded stimuli, a proportion that deviates significantly from the null hypothesis (binomial test, p < 0.01).
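As a rough check of this comparison, an exact two-sided binomial test can be sketched as follows. It assumes that the 72% figure corresponds to 71 of the 98 occlusion-sensitive neurons reported elsewhere in this response and that the null probability is 0.5; the function name is hypothetical and the exact procedure used in the manuscript may differ.

```python
from math import comb


def binomial_two_sided_p(k, n, p0=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k under the null probability p0.
    """
    probs = [comb(n, i) * p0 ** i * (1 - p0) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # The small tolerance guards against floating-point ties.
    return min(1.0, sum(pr for pr in probs if pr <= observed * (1 + 1e-9)))


# Assumed counts: 71 of 98 neurons respond preferentially to occluded stimuli.
p_value = binomial_two_sided_p(71, 98)
```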

Furthermore, because the visible difference between any two shapes declines with increasing occlusion, we expect shape selectivity to decline with increasing occlusion regardless of whether we tailored the stimuli to each neuron. Stronger shape selectivity in the vlPFC under occlusion defies this expectation. We do not expect that stimulus tailoring (or lack thereof) influenced neuronal response latency for two reasons. First, in vlPFC, we did not find a statistically significant difference in response latency between neurons that were shape-selective (N=66) and neurons that were visually responsive but not shape-selective (N=150) (t Test, p = 0.55). Second, as part of an ongoing study in the lab, we find that V4 response latencies are similar for preferred and non-preferred stimuli (Zamarashkina et al., VSS Abstracts 2017).

3B) It is unclear why the authors focus on the vlPFC neurons that are influenced by occlusion. There are also many neurons that do not care about the occlusion and we wonder whether these cells include some neurons that are tuned to the shape of the stimuli. If yes, is their tuning better or worse than that of the neurons that are influenced by the occlusion?

We address this point with a new supplementary figure (Figure 3—figure supplement 1), and revised text in the Results section. Briefly, of 216 visually-responsive vlPFC neurons, 66 were shape selective and, of these, 41 were occlusion-sensitive (two-way ANOVA, p < 0.05). Thus, few neurons (25/216) were classified as shape-selective but not occlusion-sensitive by our statistical tests. We now show average shape selectivity for four different vlPFC neuronal subsets (Figure 3—figure supplement 1): (1) shape-selective but not occlusion-sensitive neurons (N=25), (2) all shape-selective neurons (N=66), (3) neurons that carry task relevant information, i.e. either shape-selective or occlusion sensitive (N=123), (4) all visually-responsive neurons (N=216). Shape selectivity was stronger for shape selective neurons than occlusion-sensitive neurons. Notably, however, shape selectivity was strongest at intermediate occlusion levels, for both types of neurons.

What would Figure 3D look like if you included all of the 216 vlPFC neurons that were responsive during the test epoch?

As requested by the reviewers, we present these data in a new supplementary figure (Figure 3—figure supplement 1D). Compared to Figure 3D, the overall magnitude of shape selectivity is reduced due to the inclusion of more neurons, but otherwise the trends are similar. Specifically, shape selectivity is strongest for stimuli at intermediate occlusion levels in both figures.

Some cells have a stronger response if the visible area increases. Is it possible that these neurons are better tuned to the shapes than those neurons that are most active for high levels of occlusion?

We address this point with a new supplementary figure (Figure 3—figure supplement 2) and revised text in the Results section. This figure shows shape selectivity across occlusion level separately for vlPFC neurons that prefer lower occlusion levels (higher % visible area, Figure 3—figure supplement 2A) and for neurons that prefer higher occlusion levels (lower % visible area, Figure 3—figure supplement 2B). The data are more variable in A due to the inclusion of fewer neurons. Nevertheless, two patterns are evident: (1) even for neurons that respond more strongly to unoccluded stimuli (Figure 3—figure supplement 2A), shape selectivity is not stronger for unoccluded than occluded stimuli (compare black and colored lines); (2) neurons that prefer higher occlusion levels are more shape-selective under occlusion (compare colors in Figure 3—figure supplement 2A versus Figure 3—figure supplement 2B).

Related to this last point: is it conceivable that the neurons that increase their response if there is more occlusion are tuned to some aspect of the occluders, e.g. the total surface area or the total perimeter of the occluding dots?

This latter possibility seems to be supported by the finding that the vlPFC preference for occlusion decreased when the occluders had the same color as the background (as is stated in the first paragraph of the subsection “Representation of occluded stimuli in vlPFC”).

We now clarify this point in the revised Discussion section. vlPFC neurons that prefer higher occlusion levels may be sensitive to the total area of the occluding dots. This possibility is consistent with the finding that neuronal preference for the occluding dots is reduced when these dots are rendered in the same color as the background. However, it is important to note that many occlusion-sensitive vlPFC neurons are also selective for the occluded shape, so their responses reflect the characteristics of both the occluding dots and the occluded shape.

There were 381 vlPFC neurons, 98 were significantly modulated by occlusion. How many of these 98 neurons are in monkey M vs. O?

We include these numbers in the revised Materials and methods section. In monkey M, 71/260 neurons (27%) were occlusion-sensitive. In monkey O, 27/121 neurons (22%) were occlusion-sensitive.

4) Can the model replicate all aspects of the data? Including:

The shape selectivity index in vlPFC is considerably lower than in V4. Can the model in figure 9 work with reduced shape selectivity?

We have revised our model in response to this question and verified that the model can produce vlPFC unit responses with different magnitudes of shape selectivity. Specifically, each vlPFC model unit now receives input from both V4 units and the relative connection strengths from V4 to vlPFC units govern the strength of shape selectivity in the vlPFC unit. Figure 10—figure supplement 4 demonstrates that our model captures the overall trends observed in the neuronal data (i.e. two transient response peaks in V4, and a preference for occluded stimuli in vlPFC) under a wide range of vlPFC shape selectivity magnitudes.

V4 cells are divided into two subpopulations, one with two peaks and another with a single peak. How can the feedback model explain that many V4 cells do not show two peaks in their responses to occluded stimuli? Are the authors assuming inhomogeneous feedback from vlPFC to V4? Alternatively, they may be suggesting that V4 population includes a continuum of responses that vary from single peak to double peaks and include in-between responses.

We address this question with a supplementary figure (Figure 10—figure supplement 5) and new text in the Discussion section. Figure 10—figure supplement 5 demonstrates that, by varying the strength of feedback connections from vlPFC to V4, the model can produce V4 unit response dynamics that vary along a continuum, with some units having one peak and others having two peaks. Together with the broad range of second peak magnitudes observed in our neuronal data, these model simulations support the hypothesis that V4 neurons with and without two peaks form a continuum. It is also possible that projections from vlPFC to V4 are inhomogeneous. Anatomical estimates of the proportion of V4 neurons that receive projections from vlPFC and quantification of the diversity and strength of these projections would be needed to fully resolve this question.

Can the model be adjusted to replicate population data in Figure 6? In its current form the model seems to best explain single cell examples. It is unclear how well these examples represent the population.

We address this question in two supplementary figures (Figure 10—figure supplement 5 and Figure 10—figure supplement 6). Figure 10—figure supplement 5 demonstrates that, by varying the synaptic delays and strengths in the feedback from vlPFC to V4, the model can produce a variety of V4 responses, with second response peaks that have a range of magnitudes and peak times. Figure 10—figure supplement 6 demonstrates that a simulated population average of 258 V4 model units produced a population-level PSTH that resembled the neuronal data.

5) The gain mechanism that the authors propose in their model may be envisioned as PFC receiving two signals – an intertwined shape and occlusion signal and an occlusion-only signal, and then correcting the intertwined estimate with occlusion information. This seems like a bit of a chicken-and-egg problem: how and why would the brain extract occlusion level from occluded stimuli but not shape (preceding the locus at which it disambiguates shape)?

We expand on this point in the revised Discussion section. In our behavioral task, the occluding dots and occluded shapes have different colors. In this simple case, extracting information about occlusion level is easy, and can be mediated by a neuron sensitive to the total area and color of the occluding dots. In the more complex case of natural vision, extracting information about occlusion level could be substantially harder. Future experiments in which features of occluding objects (e.g. color, shape and size) are varied across trials will be needed to extend our findings to more naturalistic cases.

Extracting information about the occluded shape is hard even for our simple task because occlusion makes parts of the shape inaccessible to the visual system. So, a neuron sensitive to boundary form or to the area rendered in a specific color will show reduced responses under occlusion and thus can only provide an intertwined shape and occlusion signal.

Where could this V4-independent but occlusion-dependent gain modulation be coming from? The authors suggest IT as a potential source, but IT units receive input from V4 and also feedback to V4. Does the model assume that gain modulation arrives at vlPFC with the same latency as the V4 inputs? Why can't feedback from IT to V4 be the source of the second V4 response peak?

We address these questions with a supplementary figure (Figure 10—figure supplement 3) and new text in the revised Results and Discussion sections. It is possible that occlusion-only signals arise first at the level of V4 or, alternatively, IT. Future experiments comparing the preponderance and latency of occlusion-only signals in the two cortical areas are needed to uncover their likely origin. We have now implemented changes to the model to allow for the occlusion signals to arrive either from V4 or from IT cortex (Figure 10—figure supplement 3) by varying the delay in the time of arrival of the gain modulation signal relative to shape-sensitive signals.

We cannot rule out the possibility that IT responses contribute to V4 responses during the second transient peak. However, our IT recordings (Namima and Pasupathy, 2016) suggest this is unlikely because, as in V4, shape selectivity in IT is stronger for unoccluded than occluded stimuli. Thus, putative feedback from IT may not be well-suited for enhancing V4 shape selectivity at intermediate occlusion levels.

The authors cite their SfN abstract for occlusion selective signals in IT. It will be useful to provide more explanation about the results, especially for readers who did not stop by the SfN poster.

As requested by the reviewers, we expand on this point in the revised Discussion section. Briefly, recordings performed in IT cortex of one monkey performing the same behavioral task used in the current study show that the responses of many IT neurons are consistent with encoding the total area of the occluding dots (Namima and Pasupathy, 2016).

What if there are two visible stimuli: one that is occluded and one that is not? Would the gain modulation then be different for the two stimuli? What if one stimulus occludes another stimulus?

We speculate on these points briefly in the revised Discussion section, although these questions are best addressed with new experiments. When multiple stimuli are presented within a V4 neuron’s receptive field, past studies (e.g. Moran and Desimone, 1985) suggest that neuronal responses are dictated largely by the attended stimulus. Extending our results to the case of two stimuli, one occluded and one unoccluded, we expect gain modulation to be low or high depending on whether the unoccluded or the occluded stimulus is the attended, behaviorally-relevant stimulus.

When one stimulus occludes another and the attributes of the occluders are not known a priori, identifying which object is occluded, and by how much, could be challenging. To extend our simple model to tackle more complex, naturalistic cases would likely require the incorporation of attention and memory processes.

6) Can the model be simplified? The proposed dynamic model has multiple degrees of freedom and several nonlinearities. Is all this complexity necessary? In the current version of the paper, it is difficult to gain intuition about the model based on the text. Simplifying the model can shine light on which components and nonlinearities are indispensable and therefore reasonable targets for follow-up studies. For example, the adaptation term for V4 projections to vlPFC seems a little arbitrary. Why such an adaptation does not happen for other connections in the model? A similar question can be asked about the half-rectified feedback from vlPFC to V4. Why other connections in the model are not half-rectified? More generally, we encourage the authors to explore the model space a little more extensively for simpler model architectures. As it stands, the network seems like a high-parameter model that one can tweak to get many different types of outputs. Establishing the necessity of the proposed architecture and parameterization requires a little more work.

We address this question in the revised manuscript with six supplementary figures (Figure 10—figure supplements 1–6) and two new Results sections titled, “Parsimonious model construction” and “Heterogeneity in V4 and vlPFC responses”.

We demonstrate the necessity of four mechanisms included in the model: i) feedback from vlPFC to V4, ii) synaptic adaptation in the feedforward inputs from V4 to vlPFC, iii) a half-wave rectifying nonlinearity on feedback signals from vlPFC to V4, and, iv) gain modulation signals on vlPFC. First, a network with only feedforward connections from V4 to vlPFC units fails to generate a second response peak in V4 units (Figure 10—figure supplement 1A). Second, without synaptic adaptation on the feedforward connections from V4 to vlPFC, the feedforward-feedback loop positively reinforces neuronal activity in V4 and vlPFC units (Figure 10—figure supplement 1B), resulting in “ringing” and “blow up” of responses, which is inconsistent with the neuronal data. Third, without half-wave rectification on feedback signals from vlPFC units to V4 units, the model produces a large second peak even in V4 responses to non-preferred stimuli (Figure 10—figure supplement 2), which is inconsistent with the neuronal data (Figure 8—figure supplement 2). Given that such nonlinearities occur at synapses, our model suggests that feedback from vlPFC may arrive in V4 di-synaptically. Finally, we also illustrate the necessity of the occlusion-dependent gain modulation on vlPFC, without which vlPFC unit responses decrease with increasing occlusion (Figure 10—figure supplement 1C). We also repeated our model simulations for a range of parameter values associated with V4–vlPFC feedforward connection strengths (Figure 10—figure supplement 4), feedback synaptic strengths and delays (Figure 10—figure supplement 5 and Figure 10—figure supplement 6) and the timing of gain modulation (Figure 10—figure supplement 3). We hope these results validate the necessity of the proposed model architecture and parameter regimes.

7) On the interpretation of the effects in V4:

In the first paragraph of the subsection “Representation of occluded stimuli in vlPFC” the authors argue that the responses of some of the vlPFC neurons that are stronger if occlusion is stronger does not depend on the increased task difficulty or attentional demands, but this reasoning is unclear. Furthermore, the authors seem to have changed their mind in the fourth paragraph of the subsection “Representation of occluded stimuli in vlPFC”, where they argue that vlPFC may amplify weak signals and that vlPFC is engaged in tasks of greater difficulty or cognitive demand.

We have revised relevant portions of the text to clarify our points:

“Given that many vlPFC neurons are selective for the shape of the occluded stimulus, it is unlikely that the observed vlPFC responses solely reflect task difficulty level or attentional demands. If difficulty or attention could fully explain vlPFC responses, then occlusion sensitive neurons would not be shape-selective (i.e. the PSTHs in Figure 2A and Figure 2B would be identical).”

“Our results are consistent with vlPFC’s engagement in tasks of greater difficulty or cognitive demand (Crittenden and Duncan, 2014). Importantly, when the task becomes difficult, rather than reflecting difficulty per se, we propose that vlPFC responses amplify behaviorally relevant signals to facilitate perceptual decisions.”

In the first paragraph of the subsection “Representation of occluded stimuli in vlPFC” the authors argue that because many vlPFC neurons were tuned to shape, that these neurons therefore cannot reflect task difficulty or attentional demands. This argument does not hold because neurons may well be tuned to multiple aspects of a task.

We have revised the text to clarify our point (see our responses to the previous point). The reviewers are correct in that vlPFC neurons can be tuned to multiple aspects of a task. Our assertion is simply that the responses of vlPFC neurons in our dataset cannot solely reflect attentional demands or task difficulty because these responses also signal the identity of the occluded shape.

8) How did you know whether penetrations were indeed in vlPFC? Was histology performed?

We have added text to the Materials and methods section to address this question. Briefly, we conducted structural MRIs for each monkey and localized the principal sulcus and arcuate sulcus based on these data. We then positioned vlPFC recording chambers based on these stereotactic coordinates, centered roughly at 21 mm anterior of interaural zero and 19 mm lateral to the midline.

With regards to histological confirmation, the monkeys are currently contributing to neurophysiological studies of IT, thus precluding histological analysis.

9) [Additional comment sent to the authors in response to authors’ plan for revision]: The authors should clarify if there is a real dichotomy between neurons with one versus two peaks, as well as how broad the distribution of the timing for the second peak is. Can they really convince the reader that they are not simply amplifying noise with their analysis?

We address this concern with several new analyses and supplementary figures (Figure 8—figure supplement 5 and Figure 8—figure supplement 6). We summarize our approach to these manuscript revisions below.

i) Our revised peak-finding algorithm now includes a statistical test. Only those neurons that show a statistically significant increase in response around the time of a putative second peak are classified as having two peaks.

ii) To further guard against false positives, we present a new, model-based procedure for identifying neurons with two peaks that is independent of the ad hoc peak-finding algorithm. We show that this model-based procedure works well for example neurons (Figure 8—figure supplement 5) and that the results generated show good correspondence with the results from the revised peak-finding algorithm (Figure 8—figure supplement 6).

iii) Rather than a complete dichotomy, we propose a continuum of functional response properties, akin to the continuum of V1 simple and complex cells. The strength of connections between V4 and vlPFC are likely to be heterogeneous, and our model simulations demonstrate that even weak functional interactions between the two areas could result in a small, second response peak in V4 that may be undetectable in highly variable responses (Figure 10—figure supplement 4 and Figure 10—figure supplement 5).

iv) With regard to the timing of the second peak, we designed our peak-detection algorithm to identify response transients within a 300 ms window from test stimulus onset. This choice of temporal window was motivated by the fact that a large fraction of vlPFC neurons responded within 150 ms of test stimulus onset. No additional neurons passed the criterion for significance when we extended this window to 500 ms (to exclude responses associated with the offset of the test stimulus at 600 ms). Thus, the distribution of second peak times shown (Figure 6C) cannot be attributed to our choice of temporal window.
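The statistical criterion in point (i) can be sketched in a simplified form. The test used in the revised peak-finding algorithm is not specified in this response, so a one-sided sign-flip permutation test on per-trial spike counts stands in for it here. The function names, the windows implied by the two count lists, and the example counts are all hypothetical.

```python
import random


def sign_flip_p_value(differences, n_perm=2000, seed=0):
    """One-sided sign-flip permutation test for mean(differences) > 0."""
    rng = random.Random(seed)
    observed = sum(differences) / len(differences)
    count = 0
    for _ in range(n_perm):
        # Under the null, each paired difference is equally likely to be +d or -d.
        flipped = [d if rng.random() < 0.5 else -d for d in differences]
        if sum(flipped) / len(flipped) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)


def has_second_peak(trough_counts, peak_counts, alpha=0.05):
    """Accept a putative second peak only if counts rise significantly from the trough."""
    diffs = [p - t for p, t in zip(peak_counts, trough_counts)]
    return sign_flip_p_value(diffs) < alpha


# Hypothetical per-trial spike counts in a trough window and a later window.
trough = [4, 5, 3, 6, 4, 5, 3, 4, 5, 4]
clear_peak = [7, 7, 7, 7, 7, 7, 8, 6, 8, 8]
no_peak = [5, 4, 5, 4, 5, 4, 3, 5, 4, 4]

two_peaks = has_second_peak(trough, clear_peak)
one_peak = has_second_peak(trough, no_peak)
```

Requiring a significant rise, rather than accepting any local maximum, is what keeps trial-to-trial noise from being classified as a second peak in this sketch.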

Furthermore, they now make the point that the tuning is stronger during the second peak. We would like to know if this is not simply predicted by the presence of extra spikes – i.e. the presence of a peak implies some extra spikes at a certain point in time.

We address this concern with a new analysis and a supplementary figure (Figure 8—figure supplement 2). Shape selectivity for occluded stimuli was stronger at the time of the second peak even for neurons with two peaks whose responses to the preferred shape were stronger during the first peak than the second peak (Figure 8—figure supplement 2A). Thus, extra spikes at the time of the second peak cannot entirely predict stronger shape selectivity during this epoch. For neurons that showed stronger shape selectivity during the second peak, the magnitude of the second peak relative to the first was larger for the preferred shape than the non-preferred shape (Figure 8—figure supplement 2B). We argue that this differential enhancement of responses to the preferred shape serves to amplify shape selectivity.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Reviewer #3:

In their revision, Fyall et al. have addressed many of my concerns satisfactorily. They have made it clearer that some of the V4 neurons have a second peak that is more prominent in case of occlusion (Figure 4C represents a compelling example). Also, the peak detection method is now more convincing and it is also better documented. It remains unclear whether activity in vlPFC indeed contributes to late V4 activity and it is therefore conceivable that there are additional areas that could contribute to late V4 activity. Yet, I do realize that demonstrating the causal link between vlPFC and V4 would require a different approach, which would be beyond the scope of the present contribution. However, establishing such a causal link might be an important topic for future research, and the authors could mention this point, which could be added to the paragraph of suggested future work (subsection “Response dynamics in V4”, fifth paragraph).

We thank the reviewer for raising this point, which we have now incorporated into the Discussion.

Remaining points:

1) I find it difficult to understand why the vlPFC neurons do not respond so well when the occluders have the same color of the background (subsection “Representation of occluded stimuli in vlPFC”, second paragraph). I would suspect that the processes for shape recognition would remain the same. Or did the monkeys' performance show signs that this was not the case?

We did not observe a difference in behavioral performance (% correct) when the occluders were in the same color as the background. However, given that we used a fixed duration task design, we do not know if behavioral reaction times might have been affected by the color of the occluders. The stronger vlPFC responses we recorded when the occluders were in a contrasting color supports our hypothesis that vlPFC responses arise from the modulation of occlusion-dependent, shape-selective feedforward signals from V4 by another feedforward signal that is dependent only on occlusion level. The latter signal may be weaker when the occluders were in the same color as the background, thus yielding weaker vlPFC responses.

2) Quite some p-values are lacking, three examples:

"The responses of most of these occlusion-sensitive neurons (71/98) increased with increasing occlusion level".

"Even for the small subset of vlPFC neurons that responded more strongly to unoccluded stimuli (27/98), shape selectivity was not stronger for unoccluded than occluded stimuli (see Figure 3—figure supplement 2A)."

We have re-written the relevant text to clarify this analysis, and to provide the results of significance testing for these two instances pointed out by the reviewer.

Neurons that were classified as occlusion-sensitive based on a two-way ANOVA (p < 0.05) were divided into two groups based on the sign of the regression slope between occlusion level and responses: 71 neurons had a negative slope, indicating that their responses were stronger for occluded stimuli, whereas 27 neurons had a positive or zero slope, indicating that their responses were strongest or comparable for unoccluded stimuli. Of the 98 occlusion sensitive neurons, 59 neurons had a slope that was significantly less than zero whereas 17 neurons had a slope that was significantly greater than zero (p < 0.05).
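The grouping by slope sign can be sketched as below. The sketch assumes that the regressor is % visible area, so that a negative slope means stronger responses under heavier occlusion, consistent with the description above. The function names, the visible-area levels and the firing rates are hypothetical, and the significance test on the slope is omitted.

```python
def regression_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def occlusion_preference(visible_area, rates):
    """Group a neuron by the sign of its response slope against % visible area."""
    slope = regression_slope(visible_area, rates)
    # Zero slope is grouped with the positive slopes, as in the text.
    return "prefers occluded" if slope < 0 else "prefers unoccluded or flat"


# Hypothetical % visible area levels (100 = unoccluded) and mean rates (spikes/s).
visible_area = [72, 82, 90, 95, 100]
rates_a = [30.0, 26.0, 22.0, 18.0, 15.0]  # falls as more of the shape is visible
rates_b = [12.0, 15.0, 19.0, 22.0, 25.0]  # rises as more of the shape is visible

group_a = occlusion_preference(visible_area, rates_a)
group_b = occlusion_preference(visible_area, rates_b)
```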

"Shape selectivity for occluded shapes was significantly higher during the second peak than during the first peak."

We have added the significance level to the end of the sentence (t Test, p < 0.01).

3) "Of 85 neurons, 30 neurons (~35%) were classified as having two peaks". How were these cells distributed across the two monkeys?

Of the 30 neurons classified as having two peaks, 14 were recorded in monkey O and 16 were recorded in monkey M. We include this information in the revised manuscript.

4) The model with interactions between vlPFC and V4 seems still somewhat simplistic as there are only a few neurons and the variation (in effect size and timing) across neurons shown in the figures is actually a variation across neurons in different models rather than a variation of neurons within the same model. In networks with many units and reciprocal connections, the network dynamics might actually work against variation across neurons. The authors should discuss this. It would be great if it would be possible to show the same range of differences between neurons within a same model, but I will not insist on such a demonstration given that making such a larger model might require a substantial investment of time.

We agree with the reviewer that in a model with many neurons and many incoming connections per neuron, dynamics across neurons may be similar due to “averaging” across all the inputs converging onto each neuron. However, in the limiting case where the neural population is large in size, but the number of active incoming connections per neuron is relatively small, substantial variation in inputs and in the response dynamics of individual neurons could persist. A detailed study of the precise conditions of connectivity and active network size needed to account for the observed data would be an interesting future direction for validating the model presented in this manuscript. We now mention this point in Discussion.

5) "We cannot rule out the possibility that IT responses also contribute to V4 responses during the second transient peak. However, our IT recordings suggest this is unlikely because, as in V4, shape selectivity in IT is stronger for unoccluded than occluded stimuli". Is it conceivable that some IT neurons also have two phases in their response where the second phase is more pronounced in the presence of occlusion? It would be great if the authors could look for this possibility in the previous data set by Namima and Pasupathy, 2016? If the two phases are there it would strengthen the paper, but it would also be interesting if that is not the case.

Our preliminary analyses reveal that only a few IT neurons (8/102) exhibit a second transient response peak that is generally smaller in amplitude than what we have observed in V4. Pending further analyses of the IT dataset, our current thinking is that the second transient response peak is much more prevalent and prominent in V4 than in IT cortex.

6) Equations 4/5: I failed to see the logic of these equations, would it be possible to clarify this? Equation 9: what is thr2?

We have revised the Methods section to clarify the logic behind Equations 4 and 5, and the term rthr2.

Equations 4 and 5 were designed to simulate the input to V4 units (e.g. Figure 10A) with an onset latency of 30 ms, a strong initial transient response, and a gradually declining sustained response, collectively lasting ~500 ms. Specifically, a difference of Gaussian filter (k) was convolved with the ramp (R), then cubed, normalized and half-wave rectified (Equation 4) to produce the desired input. The ramp function (R), defined separately for the preferred (i = 1) and nonpreferred (i = 2) stimuli (Equation 5), increases monotonically with the % visible area (c) and declines over time with a support of 500 ms (30 < t < 530 ms). Note that the precise function defining uFFi is not critical as long as it produces strong input signals for the preferred shape that decrease with increasing occlusion level, thus capturing the observed V4 neuronal response properties.
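The sequence of operations (convolve, cube, normalize, half-wave rectify) can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of Equations 4 and 5: the linear ramp, the causal form of the difference-of-Gaussians kernel, the kernel widths and surround weight, and the normalization by the peak of the fully visible case are all hypothetical choices, and only a single stimulus (one value of i) is shown.

```python
import numpy as np

def ramp(t_ms, c):
    """Hypothetical ramp R: zero outside 30 < t < 530 ms, scaled by the
    visible fraction c (0 to 1), declining linearly over the 500 ms support."""
    inside = (t_ms > 30) & (t_ms < 530)
    return np.where(inside, c * (1.0 - (t_ms - 30) / 500.0), 0.0)

def causal_dog_kernel(sigma_center=10.0, sigma_surround=40.0,
                      surround_weight=0.5, n_lags=150):
    """Difference-of-Gaussians filter k over non-negative lags (ms).
    All parameter values are assumptions chosen for illustration."""
    lags = np.arange(n_lags, dtype=float)
    center = np.exp(-lags**2 / (2 * sigma_center**2))
    surround = np.exp(-lags**2 / (2 * sigma_surround**2))
    return center / center.sum() - surround_weight * surround / surround.sum()

def feedforward_input(t_ms, c, c_ref=1.0):
    """Convolve k with R, cube, normalize, then half-wave rectify."""
    k = causal_dog_kernel()

    def filtered_cubed(visible_fraction):
        # Causal convolution: keep the first len(t_ms) samples of the full output
        filtered = np.convolve(ramp(t_ms, visible_fraction), k)[: len(t_ms)]
        return filtered ** 3

    # Assumed normalization: peak of the reference (fully visible) case
    reference_peak = np.max(np.abs(filtered_cubed(c_ref)))
    return np.maximum(filtered_cubed(c) / reference_peak, 0.0)

t_ms = np.arange(600)                   # 1 ms time steps
u_full = feedforward_input(t_ms, 1.0)   # unoccluded
u_half = feedforward_input(t_ms, 0.5)   # 50% visible area
```

With these choices the input is zero up to 30 ms, peaks shortly after onset, and is weaker at the lower visible fraction, which is the qualitative behavior described above.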

In Equation 9, rthr2 represents a threshold value on vlPFC activity. When vlPFC activity exceeds rthr2 (10 spk/sec, see Table 1), the steady state feedforward connection from V4 to vlPFC, w∞,ff, goes to 0, and any subsequent input from V4 will fail to activate vlPFC. This synaptic adaptation term was introduced to prevent the second response peak of V4 units from inducing a second response peak in vlPFC units (see Figure 10—figure supplement 1B).
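A minimal sketch of this thresholding logic is given below. It is not Equation 9 itself: the first-order relaxation of the weight toward its steady state, the time constant, the maximum weight, and the function names are assumptions; only the threshold value (10 spk/sec) comes from the text above.

```python
R_THR2 = 10.0  # spk/sec, threshold on vlPFC activity (Table 1)

def steady_state_ff_weight(r_vlpfc, w_max=1.0):
    """Steady-state V4-to-vlPFC weight: w_max until vlPFC activity exceeds
    R_THR2, then 0. w_max is a hypothetical value."""
    return w_max if r_vlpfc <= R_THR2 else 0.0

def step_weight(w, r_vlpfc, dt=1.0, tau=50.0, w_max=1.0):
    """One Euler step relaxing w toward its steady state.
    Assumption: first-order dynamics with a hypothetical time constant tau (ms)."""
    return w + (dt / tau) * (steady_state_ff_weight(r_vlpfc, w_max) - w)

# Hold vlPFC activity above threshold and let the weight adapt for 500 ms
w = 1.0
for _ in range(500):
    w = step_weight(w, r_vlpfc=15.0)
```

Once the weight has decayed toward zero, later V4 activity (such as the second response peak) contributes little feedforward drive to the vlPFC unit in this sketch.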

7) I found Figure 8—figure supplement 6A confusing: how do you compute y/z for neurons with one peak?

We now provide additional clarification in the legend of this figure supplement.

V4 neurons were classified as having one response peak if: i) their responses did not include a candidate second peak with a sizable amplitude and trough-to-peak modulation, or ii) their responses did include a candidate second peak, but it did not pass the statistical criterion. For 43 V4 neurons without a candidate second peak, we cannot compute y/z and these neurons are assigned y/z = 0 in Figure 8—figure supplement 6A. For 12 neurons with a candidate second peak that did not pass the statistical criterion, we can compute y/z and the values are as shown.


Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd