Human recognition performance is characterized by abrupt changes in perceptual states. Understanding the neuronal dynamics underlying such transitions could provide important insights into mechanisms of recognition and perceptual awareness. Here we examined patients monitored for clinical purposes with multiple subdural electrodes. The patients participated in a backward masking experiment in which pictures of various object categories were presented briefly followed by a mask. We recorded ECoG from 445 electrodes placed in 11 patients. We found a striking increase in gamma power (30–70 Hz) and evoked responses specifically associated with successful recognition. The enhanced activation occurred 150–200 ms after stimulus onset and consistently outlasted the stimulus presentation. We propose that the gamma and evoked potential activations reflect a rapid increase in recurrent neuronal activity that plays a critical role in the emergence of a recognizable visual percept in conscious awareness.
An essential feature of human sensory perception is the abrupt changes that can be induced in perceptual states due to minimal changes in incoming stimuli, leading to the sigmoidal shape of the psychometric curve (Green and Swets, 1988). This sharp increase in performance level is indicative of crossing the perceptual threshold, a value of the sensory input levels around which the transition into perceptual awareness occurs (Quiroga et al., 2008). Furthermore, correlations between candidate neuronal signals and subjects’ perceptual reports, at the sensory threshold, can provide evidence for the involvement of these signals in the mediation of perception (Grill-Spector et al., 2000).
Several studies of the human visual system have utilized this approach in the search for cortical regions that play a role in recognizing a visual image, for instance during an image categorization task. Previous research has demonstrated that such category recognition is also characterized by perceptual thresholds (Grill-Spector and Kanwisher, 2005). In the context of the present study, this specific aspect of perception will be termed object recognition (i.e., of image category).
A useful technique for examining recognition thresholds is the so-called “backward masking” paradigm (Breitmeyer, 1984). In backward masking, a coherent “target” image, e.g., a face or a tool, is briefly presented, shortly followed by a visual “mask,” which is a meaningless picture, typically consisting of high-contrast random elements aimed at disrupting the recognition process (see Figure 3 for a specific implementation). Manipulating the length of the interval between onset of target and mask (known as the stimulus onset asynchrony, or SOA) and relating it to the level of correct image recognition, or recognition performance, yields a typical sigmoid psychometric function with a consistent threshold per subject (Del Cul et al., 2007; Grill-Spector et al., 2000).
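As a rough illustration of the sigmoid relationship between SOA and recognition performance described above, the following Python sketch models performance with a logistic function. The function name `recognition_rate` and the threshold, slope, and chance values are hypothetical choices for illustration and are not fitted to the data of this study.

```python
import math

def recognition_rate(soa_ms, threshold_ms=35.0, slope_ms=5.0, chance=0.0):
    """Logistic psychometric function: fraction of targets recognized at a given SOA.

    threshold_ms is the SOA at which performance is halfway between chance and 1.
    All default values are illustrative assumptions, not estimates from the study.
    """
    logistic = 1.0 / (1.0 + math.exp(-(soa_ms - threshold_ms) / slope_ms))
    return chance + (1.0 - chance) * logistic

# Evaluate the curve at a few SOAs (in ms) spanning the range used in masking tasks.
rates = {soa: recognition_rate(soa) for soa in (16, 33, 66, 200)}
```

With a steep slope, a small change in SOA around the threshold moves performance from near chance to near ceiling, which is the abrupt transition the paradigm exploits.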
In fMRI studies of backward masking, we (Grill-Spector et al., 2000) and others (Bar et al., 2001) have found a striking correlation between the BOLD signal and recognition performance. This correlation was specific to high-order ventral stream visual areas such as the lateral occipital complex and the fusiform face area (Kanwisher et al., 1997; Malach et al., 1995). These results, demonstrating a nonlinear dependency of both neuronal activity and recognition performance on sensory inputs, support the notion that neuronal activity levels in these areas are a correlate of perception. However, fMRI studies are limited by the sluggish nature of the hemodynamic response, which may fail to detect complex activity patterns involving rapid neuronal dynamics (Nir et al., 2008).
Recently, backward masking studies using single-unit recordings, conducted in patients implanted in medial-temporal lobe (MTL) structures for clinical diagnostic purposes, have also revealed a robust response tightly correlated with recognition (Quiroga et al., 2008). While this result appears to be in agreement with the fMRI findings from human visual areas, it should be noted that the recordings were obtained in regions that are likely further downstream from the visual representations proper, as was also evident by the long latency (~300 ms) of the visual responses of the MTL neurons. Hence, the possibility remains that the recognition-related activity originates locally in the MTL structures rather than in the visual representations proper.
An alternative approach for addressing this issue is through electroencephalography (EEG)—measurement of mass electrical activity from large groups of neurons as recorded on the scalp (Başar, 1980). So far, attempts to study the activity of high-order visual areas in the context of backward masking have been confined to this noninvasive method. Recently, one such study used scalp EEG to investigate response linearity in different processing stages, using source localization modeling and response latency as a measure of the hierarchical level of the sources (Del Cul et al., 2007). It was argued that the responses in high-order visual areas are actually a graded function of the sensory inputs rather than the highly nonlinear “ignition-like” dynamics suggested by fMRI.
Although EEG can be used to reveal global cortical activations, it is limited by a lack of spatial resolution. A powerful compromise between fMRI and EEG, allowing both high temporal resolution and a relatively good spatial resolution, is provided by electrocorticography (ECoG), the subdural measurement of local field potentials (LFP) (Allison et al., 1999; Kreiman et al., 2006; Liu and Newsome, 2006).
Recent electrophysiological recordings in human sensory cortex reveal that the power in the induced high frequencies of the LFP (“broadband gamma power”) reflects the global firing rate in the sampled region (Nir et al., 2007) and is significantly correlated to the BOLD signal (Logothetis et al., 2001; Mukamel et al., 2005; Niessing et al., 2005; Nir et al., 2007; Privman et al., 2007). On the other hand, the time-locked visual evoked response critically depends on tight synaptic synchrony. Importantly, ECoG recordings in high-order visual areas reveal robust category-selective gamma (Lachaux et al., 2005) and evoked (Allison et al., 1999) activations surpassing those found in BOLD (Privman et al., 2007) and EEG (Bentin et al., 1996) signals.
Here we report on ECoG recordings in patients participating in a backward masking task. Our results reveal that high-order object- and face-selective electrodes show an abrupt increase in activity whenever the subjects crossed the recognition threshold. This recognition-related effect was evident both in the amplitude of the evoked N170 component (Allison et al., 1999; Bentin et al., 1996) and in the increase in broadband gamma power, which far outlasted the physical stimulus presentation, suggesting a reverberatory source for the nonlinear activation. The results point to the abrupt, “ignition-like” increase in reverberatory neuronal activity as a likely correlate of perceptual awareness in the human visual system.
Our data were based on 445 ECoG electrodes placed in 11 patients (see Experimental Procedures for details) who performed a backward masking (BM) experiment. All patients also underwent an initial object category (OC) experiment, in which images belonging to six categories (faces, houses, man-made objects, as well as cars, birds, and inverted faces) were presented at the rate of 1 Hz, while subjects were engaged in a one-back memory task (see Experimental Procedures for details). The purpose of this experiment was to establish the response selectivity of the electrodes under normal viewing conditions. Following this experiment, the patients conducted the BM task in which pictures were flashed briefly followed by a mask that was separated by various durations of interstimulus intervals (see Figure 3 below and Experimental Procedures).
The location of all electrodes in the entire patient population is shown in Figure 1, projected on a cortical reconstruction of a healthy subject from a previous fMRI study by our group. Individual cortical reconstructions and precise electrode placements for four patients are shown in Figure S1. The relationship of these electrodes to visually activated and inactivated cortical areas, as defined by conventional fMRI mapping (see Experimental Procedures), is shown in Figure S2. Note the extensive coverage, including frontal, parietal, and occipito-temporal sites (for a detailed description of the anatomical locations of key electrodes in the patients, see Table S1). The electrical activity recorded in each electrode was carefully examined for any sensory or task-related responses.
Our analysis focused on two aspects of the neural activity: (1) the “evoked” response, obtained with standard visual-evoked potential (VEP) analysis methods (Allison et al., 1999), and (2) “induced” changes in spectral power of the signal. The evoked responses and induced power changes reflect two different aspects of the neural response. The evoked response is the averaged ECoG signal, time locked to the stimulus onset, and consequently depends on synchronized neuronal inputs that arrive at a fixed latency from stimulus onset (similar to P100 and N170 [Allison et al., 1999; Bentin et al., 1996; Privman et al., 2007]). In contrast, the induced power changes are measured in each trial, ignoring the spectral phase, and then averaged across trials, and are hence more loosely tied to the stimulus onset (Başar-Eroglu et al., 1996). Note that unless the evoked response is subtracted, the induced power changes will also reflect the time-locked signal. Induced power changes are usually examined within the limits of a certain frequency band. Here we obtained the induced signal by first band-pass filtering the raw signal in the 30–70 Hz (gamma) range and then averaging the changes in power in this band, rather than the raw signal, relative to stimulus onset and across trials (see Supplemental Data). This measure is termed the gamma band limited power (BLP).
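The distinction between the evoked response and the induced gamma BLP can be sketched in Python with NumPy. This is a minimal illustration on synthetic data, not the pipeline used in the study: the brick-wall FFT filter is an assumed stand-in for whatever band-pass filter was actually applied (see Supplemental Data), and the function names, sampling rate, and trial counts are hypothetical.

```python
import numpy as np

def bandpass_fft(trials, fs, low_hz=30.0, high_hz=70.0):
    """Brick-wall band-pass via FFT (an assumed stand-in for the study's filter)."""
    spectrum = np.fft.rfft(trials, axis=-1)
    freqs = np.fft.rfftfreq(trials.shape[-1], d=1.0 / fs)
    spectrum[..., (freqs < low_hz) | (freqs > high_hz)] = 0.0
    return np.fft.irfft(spectrum, n=trials.shape[-1], axis=-1)

def evoked_response(trials):
    """Average the raw signal across trials: keeps only phase-locked activity."""
    return trials.mean(axis=0)

def gamma_blp(trials, fs):
    """Band-limited power: filter each trial, square it, then average across trials."""
    return (bandpass_fft(trials, fs) ** 2).mean(axis=0)

# Synthetic data: 200 trials of 1 s at 1000 Hz, with "stimulus onset" at 0.5 s.
rng = np.random.default_rng(0)
fs, n_trials = 1000, 200
t = np.arange(fs) / fs
post = t >= 0.5
# A 50 Hz burst after onset whose phase is random on every trial (not time locked).
phases = rng.uniform(0, 2 * np.pi, size=(n_trials, 1))
trials = post * np.sin(2 * np.pi * 50 * t + phases) + 0.1 * rng.standard_normal((n_trials, fs))

evoked = evoked_response(trials)  # random phases largely cancel in the average
blp = gamma_blp(trials, fs)       # power survives because phase is ignored
```

Because the burst is not phase locked, it nearly vanishes from the trial average but produces a clear post-onset rise in BLP. The sketch does not subtract the evoked response before computing power, so a time-locked component, if present, would also contribute to the BLP, as noted in the text.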
We first identified the electrodes that were activated by the BM task. Electrodes were defined as task-related during BM when their gamma BLP showed a significant increase up to 1 s following stimulus onset compared to the prestimulus baseline (see Experimental Procedures) in trials with SOA of 200 ms, where recognition performance was almost perfect (96.8%). We focused on the induced gamma response because (1) we have previously shown (Mukamel et al., 2005; Nir et al., 2007) that it is the gamma BLP, rather than the evoked response, which is correlated with the global firing of neurons and (2) analysis of the data from the current experiment indicated that the evoked signal may reflect nonlocal aspects of neural activity (see below).
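The task-relatedness criterion, a significant post-stimulus increase in gamma BLP relative to the prestimulus baseline, can be sketched as a paired comparison across trials. The exact test and threshold used in the study are given in its Experimental Procedures; the paired t statistic, the one-sided p < 0.01 criterion, and all data values below are assumptions made for illustration.

```python
import math
from statistics import mean, stdev

def paired_t(post, pre):
    """Paired t statistic for per-trial post-stimulus vs. prestimulus power."""
    diffs = [a - b for a, b in zip(post, pre)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

# Hypothetical per-trial mean gamma BLP (arbitrary units) for one electrode.
pre_blp = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.3, 1.0]
post_blp = [1.9, 2.4, 1.6, 2.0, 2.2, 1.5, 2.6, 1.8]

t_stat = paired_t(post_blp, pre_blp)
# One-sided critical value for p < 0.01 with 7 degrees of freedom (standard t table).
T_CRIT_DF7 = 2.998
is_task_related = t_stat > T_CRIT_DF7
```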
This analysis revealed 90 out of 445 (20.2%) task-related electrodes. Note that these responses may have included, in addition to the visual response, also later processes such as the verbal report, auditory responses, etc. In contrast to the gamma responses, examining the evoked responses revealed a more widespread activation pattern. We found 143 electrodes that showed a significant change in evoked response amplitude at short and mid latency in response to visual stimuli. Of these electrodes, 83 (58.0%) were located outside visual areas in temporal, frontal, and parietal regions. We discuss these electrodes below.
Of the task-related electrodes, we limited our investigation in the present study to the short- to mid-latency (up to 250 ms post-target onset, see Experimental Procedures) gamma responses, since these were more likely to be associated with the initial visual recognition state. This analysis revealed 48 (53.3%) of the task-related electrodes that responded with short to mid latency. Figure 1 delineates in yellow and orange the locations of the short-mid latency visual responsive electrodes. The long-latency electrode responses (250–1000 ms) will be dealt with in a separate publication.
The short-mid latency gamma responses were almost exclusively (45 of the 48) confined to visual areas, mainly in occipito-temporal cortex (compare to well-established borders of visual areas in healthy subjects as mapped with fMRI, Figure 1, Figure S2, and Figure S4, see Experimental Procedures for details). Of the three remaining electrodes, one was located in anterior temporal cortex and the other two were located in motor regions. None of the three responded in the essentially visual OC experiment (see below).
Closer inspection of the neuroanatomical location of the responsive visual cortex electrodes revealed that 12 (26.7%) were located in low-order retinotopic cortex, as judged by overlap with fMRI maps obtained in healthy individuals and normalized to Talairach coordinates (see Experimental Procedures and Table S1); the majority (33, 73.3%) were located in higher-order visual areas (see Figure 1 and Figure S4). The former group was labeled low-order and the latter high-order visual electrodes.
An important difference between the two groups of visual electrodes was in their stimulus selectivity. This parameter was particularly important for the present study: because the mask shortly followed the target in the BM paradigm, a weak response to the mask was critical for allowing us to isolate the responses to the target. Of the visual electrodes, 20 had a significantly larger gamma increase (two-sample t test, p < 0.01) during the BM experiment (SOA = 200 ms) in response to their preferred object category than to the mask stimulus (see Experimental Procedures). Such a significant difference was found exclusively in the high-order electrodes. We termed these electrodes target-selective. Among the 20 target-selective electrodes, the majority (14) showed a preferential response to face images and 6 to man-made objects. As can be seen in Figure S4, these electrodes were consistently located in high-order visual cortex, in or close to category-selective regions known from fMRI scans.
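The target-selectivity criterion can be sketched as a two-sample t test comparing per-trial gamma increases for the preferred category against those for the mask. The pooled-variance (Student) form, the helper name `two_sample_t`, and the data values are assumptions for illustration; the study's exact procedure is described in its Experimental Procedures.

```python
import math
from statistics import mean, variance

def two_sample_t(a, b):
    """Student's two-sample t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2

# Hypothetical per-trial gamma increases (arbitrary units) for one electrode.
target_trials = [2.1, 2.5, 1.9, 2.8, 2.3, 2.6, 2.0, 2.4, 2.7, 2.2]
mask_trials = [1.1, 0.9, 1.4, 1.0, 1.2, 0.8, 1.3, 1.1, 0.9, 1.0]

t_stat, df = two_sample_t(target_trials, mask_trials)
# Two-sided critical value for p < 0.01 with 18 degrees of freedom (standard t table).
T_CRIT_DF18 = 2.878
target_selective = t_stat > T_CRIT_DF18
```

Requiring the target response to exceed the mask response, rather than merely testing for a difference, matches the stated goal of selecting electrodes in which the mask contributes little to the signal.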
The target selectivity was also clearly evident during the object category experiment. An illustration of such responses is provided in Figure 2, which depicts three examples of visual responses of high-order and one low-order visual electrode (a full “gallery” of the responses is shown in Figure S5). The responses are presented as time-frequency decompositions, which depict the trial-averaged spectral power changes induced (i.e., ignoring spectral phase) by the visual stimuli. The decompositions are shown without subtracting the averaged evoked responses. A separate presentation of the averaged gamma power modulations (BLP) as well as the evoked responses of all category-selective electrodes is shown in Figure S3.
Although target selectivity was determined on the basis of a different trial set from that used for subsequent recognition-related analysis, namely those trials with SOA of 200 ms, they did belong to the same experimental paradigm. To rule out the possibility of any interdependence between selection criteria and response analysis, we also used an alternative electrode classification, where the criterion was external to the BM paradigm, namely category selectivity in the OC experiment (see Experimental Procedures). The electrode set was largely unchanged, as were the results—see Figure S13 and the discussion of Figure 6 below.
Inspecting the spectral decompositions revealed that in addition to the enhanced activation in high-frequency gamma responses induced by the visual stimuli, there was a prominent reduction in spectral power at low (15–25 Hz) frequencies. This effect, which has been termed event-related desynchronization (ERD) (Crone et al., 2006), was a highly robust and common feature of the LFP responses both in high- and low-level electrodes. However, as can be discerned in Figure 2, the ERD response was much more broadly tuned to object category, showing an essentially nonselective response to the different categories.
Interestingly, the target-selective electrodes showed a sustained gamma response following target presentation, which, during the BM experiment, substantially outlasted the 16 ms target stimulus duration (556.1 ms ± 182.7 ms, mean ± SD). A similar effect was noted in these electrodes also during the OC experiment (453.9 ms ± 131.1 ms), again outlasting the 250 ms target presentation duration.
Comparing the evoked and induced gamma responses in these electrodes revealed that 90% (18 out of 20) showed an early negative evoked component, albeit often with a relatively smaller amplitude than the gamma response.
To evaluate the relationship between target recognition and neural activation, we compared the electrode responses during successful recognition with those generated when subjects failed to recognize the target images. We first analyzed the target-selective electrodes. Discarding one such electrode which lacked sufficient responses at the critical SOA left 19 target-selective electrodes for recognition-related analysis. This analysis revealed a significant recognition-related gamma BLP enhancement (p < 0.01, see Experimental Procedures) in 15 out of 19 (78.9%) of the electrodes (10 of 14 face electrodes and 5 out of 5 man-made object electrodes). As for the evoked response, 15 of 19 electrodes (12 of 14 face electrodes) showed a significant (p < 0.01) recognition-related evoked response. The category selectivity and recognition effect for each of the high order electrodes are detailed in Table S2.
Figure 4 shows examples of recognition-related changes in the gamma BLP (see Figures 4A and 4B) and amplitude of evoked response (Figures 4C and 4D) following the preferred category presentation for four highly selective electrodes. The figure depicts the averaged gamma BLP (Figure 4A) and evoked (Figure 4C) responses at the perceptual threshold. Responses when subjects were able to successfully recognize the target image (“recognition” trials) are indicated in red, and when they failed to do so (“no-recognition” trials) in blue. As can be seen, responses in the recognition trials were characterized by a step-like rise in gamma BLP, which was significantly higher (p < 0.001, by t test, 100–300 ms after stimulus onset) compared to the no-recognition trials. A similar increase in response amplitude was observed in the negative-going N170 evoked response (p < 0.01). The response in the control condition (green curve), in which a blank was shown instead of the target image (followed by the usual mask), showed a tendency to be weaker than in the no-recognition condition, although this effect rarely achieved significance (e.g., p < 0.01 in 1 of the 4 shown electrodes).
In these examples, the sets of recognition and no-recognition images were of the same category and were visually similar, although not identical (for the full set of recognized and no-recognition face images in four of the patients, see Figure S6). To control for the possibility that slight differences between the two exemplar sets contributed to the enhancement effect, we compared the activation to the two sets of images but at longer SOAs, when both sets of exemplars could be recognized. Under such conditions no significant difference was found in the response profiles for the two sets (Figures 4B and 4D).
The results from these electrodes presented in time-frequency decomposition are shown in a spectrogram form in Figure 5. Top spectrograms in each panel show the responses during successful recognition, while middle spectrograms show the responses when recognition failed. It can be seen that the sustained gamma BLP responses shown in Figure 4 were due mainly to frequencies above 50 Hz. The evoked (N170) component of the response is reflected in the low-frequency range of roughly up to 20 Hz, where a transient rise in the power occurred preferentially in the “recognition” condition.
In order to assess how general the activity enhancement associated with recognition was, and in order to rule out the possibility that subtle, within-category exemplar differences may have accounted for the enhanced activations, we calculated the neuronal responses during successful and unsuccessful recognition trials, but this time keeping target exemplars identical across the two perceptual conditions while allowing a minimal change in SOA duration between the target-recognized and unrecognized trials (see Experimental Procedures). To isolate the target contribution to the response, we first subtracted out the mask-only (“blank trial”) responses in each electrode.
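The mask-subtraction step can be sketched as a sample-by-sample difference between the trial-averaged response to target plus mask and the trial-averaged mask-only ("blank trial") response. The helper name and the time courses below are hypothetical and serve only to show the operation.

```python
def subtract_mask(target_plus_mask, mask_only):
    """Subtract the mask-only ("blank trial") time course, sample by sample."""
    if len(target_plus_mask) != len(mask_only):
        raise ValueError("time courses must have the same number of samples")
    return [tm - m for tm, m in zip(target_plus_mask, mask_only)]

# Hypothetical trial-averaged gamma BLP time courses for one electrode.
mask_only = [0.0, 0.5, 1.0, 0.5, 0.0]
recognized = [0.0, 1.5, 3.0, 2.5, 1.0]
not_recognized = [0.0, 0.75, 1.5, 1.0, 0.25]

# What remains after subtraction is attributed to the target in each condition.
target_rec = subtract_mask(recognized, mask_only)
target_norec = subtract_mask(not_recognized, mask_only)
```

The subtraction assumes that target and mask responses add approximately linearly, which is an assumption of this sketch and is not a claim made in the text.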
Figure 6 shows the grand average of all 14 face-selective electrodes. Although the precise neuroanatomical locations of these electrodes varied from patient to patient, they were all located in high-order occipito-temporal cortex (see Figure S4). Similar analysis for the object-selective electrodes is shown in Figure S7. The mean (±SEM) increase in gamma BLP (in normalized units, see Supplemental Data) is shown for all recognition (red) and no-recognition (blue) trials. Similar to the effect shown in individual electrodes in Figure 5, successful recognition was associated with a global and highly significant (p = 5.43 × 10⁻⁵ at t = 255 ms, paired t test) increase in gamma BLP. The gamma BLP enhancement effect was sustained, far outlasting the stimulus presentation, and persisted from 90 ms to 680 ms after stimulus presentation (Figure 6C). There was also a significant recognition-related increase in the amplitude of the N170 component (Figure 6A). See quantitative details in Supplemental Data. The significance of these results was verified using a shuffle control (see Figure S14 and Supplemental Data). As remarked above, the same analysis was also conducted for groups of electrodes as classified on the basis of category selectivity in OC rather than target selectivity in BM. The results are shown in Figure S13 and are similar to those shown in Figure 6 and Figure S7.
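One common form of shuffle control for a paired comparison across electrodes is a sign-flip permutation test, sketched below in Python. The procedure actually used in the study is described in its Supplemental Data and may differ; the function name, the number of shuffles, and the data values are assumptions made for illustration.

```python
import random
from statistics import mean

def sign_flip_p_value(rec, norec, n_shuffles=10000, seed=0):
    """One-sided permutation p-value for mean(rec - norec) > 0 across electrodes.

    Each shuffle randomly swaps the condition labels within an electrode,
    which is equivalent to flipping the sign of that electrode's difference.
    """
    rng = random.Random(seed)
    diffs = [r - n for r, n in zip(rec, norec)]
    observed = mean(diffs)
    hits = 0
    for _ in range(n_shuffles):
        shuffled = mean([d if rng.random() < 0.5 else -d for d in diffs])
        if shuffled >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_shuffles + 1)

# Hypothetical gamma BLP (normalized units) for 14 electrodes at one time point.
recognition = [1.8, 2.1, 1.5, 2.4, 1.9, 2.2, 1.7, 2.0, 2.3, 1.6, 2.5, 1.9, 2.1, 1.8]
no_recognition = [1.0, 1.2, 0.9, 1.3, 1.1, 1.0, 0.8, 1.2, 1.1, 0.9, 1.4, 1.0, 1.3, 1.1]

p_value = sign_flip_p_value(recognition, no_recognition)
```

Unlike the paired t test, this check makes no assumption about the distribution of the differences, which is why a shuffle control is a useful complement when the number of electrodes is small.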
Note that the target duration was kept constant at 16 ms during both recognition and no-recognition trials. However, the average SOA was somewhat shorter in the no-recognition trial set than in the recognition one (28.8 ± 2.65 ms and 40.0 ± 4.72 ms, respectively). To examine the possibility that the longer SOA contributed to the increased neuronal response observed in the successful recognition trials, we examined the effect of SOA duration on electrode responses in recognition trials only. The results are shown in Figures 6B and 6D. As can be seen, a much larger extension of SOA duration (from 16 to 66 ms) without the accompanying perceptual change failed to produce a significant increase in gamma BLP (p > 0.08). Interestingly, extending the SOA to 200 ms did produce a significant gamma BLP enhancement compared to the shorter SOAs, which reached a maximal difference of 54% (at 200–380 ms; p < 10⁻⁴, see Figure S8). There were insufficient trials at SOAs of 33 ms and 66 ms for conducting the same analysis for the no-recognition condition.
Apart from the gamma range, we performed the same analysis for low-frequency (15–25 Hz) BLP, at which a stimulus-triggered ERD was evident (see above). We found, as expected, an early onset, recognition-related enhancement in power which was related to the increase in the N170 evoked response amplitude; however, we failed to find a significant recognition-related differential effect on the negative-going ERD (see Figure S11).
An interesting question is whether the activity enhancement associated with successful recognition was specific for the optimal category or was a more general effect. To address this issue, we analyzed the responses to the second-best category. Applying the same analysis as in Figure 6 to the 14 face electrodes in trials with man-made objects as targets and to 5 man-made object electrodes with face targets yielded a significant (p < 0.01), albeit smaller, recognition-related increase in gamma responses for both face-selective and object selective electrodes (see Figure S10).
In low-level visual areas, electrodes typically responded strongly to the mask stimulus, obscuring possible differences in the response to the targets in trials with small SOA. In an attempt to nevertheless examine such potential modulations, we subtracted the mask-only response from the target + mask responses and analyzed this subtracted signal during successful and unsuccessful recognition. At the single-electrode level, no electrode showed the recognition-related gamma increase observed in the target-selective, high-level visual electrodes (p > 0.01).
A population-level analysis of the low-level visual electrodes also failed to reveal a significant difference, in the mask-subtracted average evoked amplitude or gamma BLP, associated with the change in perceptual state (n = 12 electrodes, see Figures S9A and S9B; p > 0.1 for gamma, p > 0.025 for evoked, paired t test).
The same analysis carried out for those high-order electrodes which were not found to be target-selective and had sufficient data (n = 10, see Figures S9C and S9D) showed only a long latency increase of gamma BLP in the recognition versus no-recognition condition (p = 0.0087 at t = 515 ms). This increase was likely related to the nonvisual aspects of the BM task.
In addition to electrodes located in visual areas, we also found mid-latency (up to 250 ms post-target onset) responses in more frontal regions. These responses differed from those of the visual electrodes in several regards. First, they were strictly of the evoked type, lacking a clear association with increase in gamma BLP. Second, the main peak of activation in these electrodes had a similar latency to that of the evoked N170 peak, but with an inverted polarity, i.e., showing a positive deflection. Finally, the evoked responses among adjacent electrodes tended to be similar.
At the population average level (n = 77 electrodes with sufficient data, see Figures S9E and S9F), these electrodes showed no significant difference in evoked or gamma BLP response between the two recognition conditions at short latencies. Evoked responses diverged briefly 470 ms after target onset (p = 0.0077), whereas the gamma BLP showed a clear increase at long latency (p = 8.1 × 10⁻⁷ at t = 960 ms), which may have been related to the verbal responses in the “recognition” condition, as this group of electrodes also covered motor regions. This late effect is not discussed in the current work.
The main finding reported here is that reaching perceptual awareness, in our experimental paradigm, was associated with an abrupt and significant increase in both broadband gamma power and evoked responses as measured with ECoG (e.g., see Figure 4) in high-order visual areas.
Due to technical limitations, we could not study the neural responses associated with perceptual changes while the visual inputs were kept precisely identical (Quiroga et al., 2008), as the recognition and no-recognition trials in our experiment differed slightly in their stimulus composition. Consequently, one cannot entirely rule out the possibility that minute changes in visual inputs contributed to the observed changes in neural activity in our recording sites.
Thus, a critical issue is whether the observed changes in neuronal activity were associated with reaching perceptual awareness (i.e., during the transition from no-recognition to recognition trials) or were due to these subtle changes in stimulus properties. We addressed this issue in the analysis shown in Figure 6 for face-selective electrodes and in Figure S7 for man-made object-selective electrodes. Here a clear, significant increase in both gamma power and evoked responses was associated with successful recognition. Note that it is highly unlikely that this change was solely due to differences in the physical attributes of the stimuli between the recognition and no-recognition conditions. First, the duration of target presentation and the target exemplar sets were identical for both conditions. Second, although there was a small (~11 ms) difference in mean SOA between the two conditions, this difference was unlikely to be the source of the neuronal activity enhancement, since changing the SOA by a larger amount (50 ms) during successful recognition trials failed to produce a significant change in the neural response (Figures 6B and 6D).
In summary, we can conclude that significant increases in neural activity—both gamma power and evoked response amplitude—were tightly associated with successful recognition. This finding suggests that this enhancement in neural activity in high-order visual areas plays a critical role in the emergence of a conscious visual percept.
The short and mid (up to 250 ms) latency visual responses described in the current study, of both the evoked and gamma BLP types, were fairly localized. They were found mainly in electrodes located in visual areas of the occipito-temporal cortex (see Figure 1).
Although these results do not refute models which posit spread to fronto-parietal cortex as an obligatory stage in perceptual awareness (Del Cul et al., 2007; Gaillard et al., 2009; Rees et al., 2002), they do constrain such a process to begin at least 300 ms following stimulus onset. Such long-latency activations were indeed observed in several frontal electrodes in our study, but these responses will be examined in detail in another publication. On the other hand, we cannot rule out the possibility that short-latency gamma power effects may have occurred in fronto-parietal sites which were not covered by our electrode placements, or that additional aspects of neuronal activity (e.g., precise spike timing changes) may play a significant role in addition to the observed increases in LFP activity.
Regarding a possible relationship to recognition thresholds, our results did not reveal a significant difference in low-level electrodes between the recognition and no-recognition states. However, note that subtle recognition-related modulations of the target responses may have been obscured by the strong response to the mask. Furthermore, it is important to note that the LFP signal reflects averaged mass activity of large neuronal groups (Nir et al., 2007), so recognition-related effects may have occurred within small neuronal assemblies at these low-order electrode sites, or may have been reflected in more subtle neuronal activity patterns to which the LFP was not sensitive enough.
Recently, it has been reported in scalp EEG recordings that recorded bursts of gamma activity are in fact an artifact of visually induced eye movements (Yuval-Greenberg et al., 2008). Could our findings be accounted for by differences in eye movement patterns during the two perceptual states?
Several points argue against such a contribution in our data. First, the gamma EEG artifact is a spatially homogenous signal extending throughout the entire EEG electrode set, while a prominent feature of our gamma recordings was their highly localized nature (see Figure 1). Second, the eye-movement gamma artifact is characterized by a narrow time window (about 100 ms) (Yuval-Greenberg et al., 2008), while our recordings show responses lasting ~450 ms for the target-selective high-order electrodes (see above). Moreover, this response duration clearly varied with electrode location. Finally, the eye-movement artifacts are very broad-band while the visually induced gamma responses in our recordings were mainly in the higher (50–70 Hz) gamma sub-band, and the BLP time-course in this sub-band was clearly different from that in the range 30–50 Hz. Thus, multiple lines of evidence rule out the possibility that our neural activity was contaminated by eye-movement artifacts.
A small number (three) of the electrodes in our sample that showed short-mid latency, task related responses, were located outside of occipito-temporal visual cortex. However, these electrodes failed to show a similar response to the more prominent visual stimuli during the OC experiment, indicating that their responses were not associated with the visual aspect of the responses but rather with the nonvisual aspects specific to the BM task.
With regard to the evoked responses, we found, in addition to the responses in visual areas, clear evoked responses in more anterior cortical electrodes. These responses consisted only of evoked activity without the accompanying gamma power increases and were of opposite signal polarity to the visual evoked potentials of the occipital visual regions, particularly the N170 component. This signal inversion suggests a possible similarity between the anterior responses and the positive-going peak termed the vertex positive potential (VPP) (Botzel and Grusser, 1989; Jeffreys, 1989; Joyce and Rossion, 2005) in scalp EEG recordings, the source of which is still debated. However, at this stage we cannot rule out the alternative possibility—namely, that the frontal evoked responses do represent some generalized activity that is reflected in the local neuronal population. Clearly, more data are needed to resolve this issue.
Can the present results be informative with regard to the firing activity of the neuronal population contributing to the recorded ECoG signals? Recently, we have demonstrated, using multielectrode recordings of single neurons in auditory cortex of patients, that gamma power increase in human sensory cortex indicates an overall increase in neuronal firing rates (Nir et al., 2007), suggesting that reaching perceptual awareness in our study was linked to an abrupt and long-lasting increase in spiking activity among a large population of neurons.
Furthermore, we have shown previously that in human sensory cortex, BOLD fMRI provides a reliable measure of both gamma LFP and the overall spiking activity (Mukamel et al., 2005). Thus, the present results are also compatible with earlier fMRI findings showing highly nonlinear increases in BOLD responses in human high-order visual areas associated with perception in the backward masking paradigm (Bar et al., 2001; Grill-Spector et al., 2000). Such enhanced responses were also found to be related to a perceptual transition in the context of ambiguous figure perception (Andrews et al., 2002; Hasson et al., 2001), and in the fMRI contrast response, which shows strong nonlinearity in high-order visual areas (Avidan et al., 2002).
Interpreting the stimulus-locked evoked response, which is likely to be dependent on synchronous synaptic inputs, is more complex. However, the significant recognition-related increase in the evoked response amplitude points to an abrupt and time-locked neuronal response to stimulus onset.
A simple model that can account for the observed findings suggests that conscious perception of a visual target (e.g., a face) was associated with a rapidly increasing burst of neuronal firing in high-order visual cortex. Importantly, this burst of activity was not merely a reflection of changes in the visual inputs but was rather a highly disproportionate perception-related activation even when the visual inputs were largely constant. Such a highly nonlinear stimulus-response relationship can be metaphorically described as an “ignition”—i.e., a process in which a tiny change in visual inputs (“lighting a match”) can induce a large change in neural activity (igniting the “flame”). This intense and long-lasting neural activation can explain, on the one hand, the observed increase in gamma and evoked responses and, on the other hand, the emergence of a visual percept.
What could be the anatomical substrate for such perceptually relevant “ignitions” of neuronal activity in high-order visual cortex? Among the various potential circuits, an attractive candidate would be the dense “halos” of local intrinsic connections, which, intriguingly, are particularly pronounced in high-order visual areas of the primate brain (Amir et al., 1993). More generally, it has been proposed that input “amplification” in such local cortical circuits is a fundamental property of the cortical circuitry (Douglas and Martin, 2007). Furthermore, rapid “ignitions” are supported by network models (Loebel et al., 2007; Tsodyks et al., 2000) and recognition performance (Kirchner and Thorpe, 2006). Such models are compatible with the notion that at least part of the evoked response may be attributable to local intrinsic processing rather than synchronized inputs from low-order areas. This is compatible with our inability to find significant recognition effects in low-order visual electrodes, yet finding a robust recognition effect reflected in the evoked response amplitude in downstream areas. Interestingly, neuronal ignitions have been demonstrated to spontaneously occur even in in vitro-grown neuronal networks (Eytan and Marom, 2006).
An important characteristic of nonlinear, recurrent networks is that the temporal evolution of their activity is dissociated from that of their inputs, often showing sustained activity lasting far beyond input termination. Indeed, we often found gamma response duration to be longer than 100 ms (see Figure 2 and Figure 4–Figure 6), outlasting both target presentation (a mere 16 ms) and SOA. Such temporal nonlinearity, amounting to short-term memory of the system, is compatible with models of reverberatory network activity (Amit, 1992). Similar temporal nonlinearities have also been observed in the BOLD response, specifically in high-order visual areas (McKeeff et al., 2007; Mukamel et al., 2004). Our present results are thus compatible with such local nonlinear behavior.
On the other hand, a prominent aspect of ignition-like behavior, or “network spiking,” is its all-or-none nature. This was not the case in our data—instead, we found that trials in which the patients failed to recognize the stimuli often showed a low but significant level of activation (e.g., Figure 6C, blue line). A number of factors may account for this low-amplitude activity. However, establishing its potential sources will require further study.
Previous ECoG studies in which recognition was manipulated through inversion of Mooney images—which, unlike the present study, constituted a substantial change in the physical properties of the stimuli (Lachaux et al., 2005)—revealed a complex behavior in which some electrodes showed an effect while others did not.
While this paper was under review, a relevant ECoG study (Gaillard et al., 2009) reported a backward masking experiment using word stimuli as targets, arguing that recognition effects were mainly of long latency and were widespread throughout the brain. However, unlike Gaillard et al. (2009), our study focused on electrodes in brain regions where the target and mask responses were clearly distinguishable; hence, it is difficult to compare these results to the present findings.
Finally, a number of studies in behaving monkeys have similarly demonstrated highly nonlinear responses in the backward masking paradigm (Kovacs et al., 1995; Op de Beeck et al., 2007; Rolls and Tovee, 1994).
In the broader context, an analogous technique of dissociating stimulus inputs from perception using perceptual switches has been used both in human fMRI experiments (Hasson et al., 2001; Tong et al., 1998) and with behaving monkeys (Logothetis, 1998; Maier et al., 2008; Wilke et al., 2006). All these studies demonstrate tight correlation between neuronal activity and perceptual states. Taken together with the present results, the available evidence converges on the conclusion that rapid and persistent increases in neuronal activity in high-order visual cortex are a necessary component of the mechanism underlying the emergence of a conscious visual percept. Whether such nonlinear “ignitions” are also sufficient for perceptual awareness remains a difficult issue, the resolution of which will undoubtedly require a large body of future studies.
Recording of electrical activity was obtained from 11 neurosurgical patients (7 female) with pharmacologically intractable epilepsy, monitored for potential surgical treatment. Electrode location was based solely on clinical criteria. Each patient was implanted with subdural electrode arrays containing 40–80 contact electrodes (Adtech, Racine, WI). In total, 445 electrodes were examined. Electrodes were arranged in one-dimensional strips or in two-dimensional grids placed directly on the cortical surface. Each electrode was 2 mm in diameter, with 8 mm spacing between adjacent electrodes. Recordings were monopolar and were referenced to an extracranial electrode. The signal was sampled at a rate of 200 Hz and filtered electronically between 1 and 70 Hz (Grass Technologies). Stimulus-triggered electrical pulses were recorded along with the ECoG data for precise synchronization of the stimuli with the electrical responses.
All sessions were conducted at the patient’s quiet bedside while the patient was sitting upright in bed, after periods of at least 3 hr without any identifiable seizures. Stimuli were presented via a standard laptop screen, and verbal responses were recorded using a portable recording device.
Patient age was 31.5 ± 7.5 years (mean ± SD). All patients functioned in the average to low-average general cognitive range (IQ range: 80–107, mean ± SD: 90.8 ± 10.2). All patients regularly received standard medication for treatment of epilepsy (including oxcarbazepine, benzodiazepine, phenazepam, topiramate, and valproic acid), although doses were typically lowered during hospitalization.
Patients provided written informed consent to participate in the experiment. The experimental protocol was approved by the Tel Aviv Sourasky Medical Center Committee for Activities Involving Human Subjects.
Computed tomography (CT) scans following electrode implantation were coregistered to the preoperative MRI using iPlan Stereotaxy software (BrainLAB) to determine electrode positions. The three-dimensional brain image thus mounted with electrode locations was normalized to Talairach coordinates (Talairach and Tournoux, 1988) and rendered in BrainVoyager software in two dimensions as a surface mesh, enabling precise localization of the electrodes both with relation to the subject’s anatomical MRI scan and in standard coordinate space. For joint presentation of all subjects’ electrodes and to aid comparison to previous fMRI mapping performed in our lab (see below), electrode locations were projected onto a cortical reconstruction of a specific healthy subject, which is routinely used to visualize results in our mapping studies (Figure 1). The spatial coverage of the recording electrodes is described in the Supplemental Data.
Images were presented on a standard laptop display (60 Hz refresh rate). All images were grayscale, with a width of ~8° (700 pixels), and were superimposed with a small red fixation dot.
In the OC experiment (Privman et al., 2007), we used stationary images from six categories (faces, houses, man-made objects, cars, birds, and inverted faces). Each image was presented for 250 ms, followed by a gray screen for an interval of 750 ms. The subject’s task was to fixate on the central fixation dot and to overtly perform a one-back memory task.
In the BM experiment, each trial consisted of a target image belonging to one of three categories (faces, houses, and man-made objects), followed by a single mask stimulus. The subject’s task was to report the category of the objects they perceived in each trial.
To prevent category discrimination by low-level image contours, target images were blurred around the edges, along an elliptic profile, by Gaussian spatial smoothing. The mask image was a pattern consisting of high-contrast random elements. Exactly the same mask was used in all the trials.
The target image was shown for 16 ms at the start of each trial block. Between target and mask presentation, a blank gray screen was presented for a duration of 0–184 ms, yielding a varying stimulus onset asynchrony (SOA), defined as the time interval between target and mask onset, of 16–200 ms. The mask was presented for a total duration of 250 ms, followed by another blank screen lasting until the end of the trial block, for a total block duration of 3 s. The subject had to make the verbal response during this interval (see Figure 3).
Subjects were instructed to verbally report the category of the object they perceived in each trial and to report recognition failure (“unable to recognize”) when this occurred. Hence, there were four options: “face,” “house,” “object,” and “did not recognize.” The behavioral responses were manually extracted offline from the auditory soundtrack recording. A trial in which a correct response was given was counted as a “successfully recognized” trial, i.e., one in which target recognition had occurred. Successful recognition rates of the subjects under all conditions are presented in Table S3 and Figure S12. Overall, subjects were significantly better at recognizing the face stimuli (e.g., at 16 ms SOA, average successful recognition was 57.2% for faces and 22.3% for all other categories). In 79.8% of the no-recognition trials, subjects reported “unable to recognize,” and in the rest (20.2%) they made an erroneous report by naming one of the remaining two categories. The breakdown into the three possible response types in no-recognition trials is given in Table S3. Since it may be argued that the neural processing could differ between trials where no category was perceived and those in which the wrong one was perceived, we also performed the analysis shown in Figure 6 without taking into account the erroneously reported trials. The results were very similar to those of the original analysis (see Figure S15), and we therefore included both types of responses together.
Most sessions focused on the “critical” SOA for the subject, i.e., the integer multiple of 16.67 ms (the monitor refresh time) for which the rate of successful recognition of the optimal category (faces or man-made objects) was closest to 0.5. A small proportion (16%) of easy trials with a long (200 ms) SOA and relatively clear image were interspersed within the sequence of trials in each session. Also interspersed in each session were “blank” trials, in which a blank screen replaced the initial target. These were used as a control for the effect of the mask alone.
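The selection of the critical SOA can be illustrated with a minimal Python sketch. It assumes that SOAs are expressed as whole numbers of display refreshes at 60 Hz and that per-SOA recognition rates are already available for one subject and one category. The function names and the example rates are hypothetical and are not taken from the study.

```python
FRAME_MS = 1000.0 / 60.0  # one refresh of a 60 Hz display, about 16.67 ms


def soa_ms(n_frames):
    """SOA in ms for a target-to-mask onset gap of n_frames refreshes."""
    return n_frames * FRAME_MS


def critical_soa_frames(recognition_rate_by_frames):
    """Return the frame count whose recognition rate is closest to 0.5.

    recognition_rate_by_frames: dict {n_frames: fraction recognized}
    (hypothetical input). Ties resolve to the shorter SOA, which is an
    assumption of this sketch, not a rule stated in the text.
    """
    return min(sorted(recognition_rate_by_frames),
               key=lambda n: abs(recognition_rate_by_frames[n] - 0.5))


# Made-up recognition rates, for illustration only.
rates = {1: 0.20, 2: 0.45, 3: 0.80, 12: 0.98}
best = critical_soa_frames(rates)
print(best, soa_ms(best))
```

Under these assumptions, one frame corresponds to the shortest SOA (nominally 16 ms) and twelve frames to the 200 ms SOA used for the easy trials.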
Each experimental session contained 100 trials, with a short pause for rest after each 20 trials. All sessions employed the same stimulus set of 46 (with two repetitions each) or 92 target images, shown in pseudorandom order. The eight remaining trials were blank and mask trials.
(A) “Task-related” electrodes were defined in the BM task, which included both visual stimulation and verbal responses. Quantitatively, they were defined as those electrodes that showed a significant gamma power increase at some time point in the interval 0–1000 ms post-target onset, compared to the baseline period, defined as the 500 ms preceding target onset, in trials with SOA of 200 ms.
(B) “Visually responsive” (Figure 1) electrodes were defined as those with short- to mid-latency responses (up to 250 ms post-target onset). Latency was estimated quantitatively as the time when gamma power first became significantly greater than its prestimulus baseline value. Categorization of the electrodes was always based on gamma BLP response, although a similar analysis was done for comparison for the evoked response (see text).
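The latency criterion can be sketched in Python as follows. The sketch assumes that a per-time-point p value for the gamma power increase over baseline has already been computed (the test itself is not reproduced here), that index 0 corresponds to target onset, and that samples are 5 ms apart (200 Hz sampling). The function names are hypothetical.

```python
def first_significant_latency_ms(p_values, alpha=0.01, fs_hz=200):
    """Time (ms after target onset) of the first sample with p < alpha.

    p_values: per-time-point p values, index 0 = target onset (assumed).
    Returns None if no time point reaches significance.
    """
    for i, p in enumerate(p_values):
        if p < alpha:
            return i * 1000.0 / fs_hz
    return None


def is_visually_responsive(p_values, max_latency_ms=250, alpha=0.01,
                           fs_hz=200):
    """True if the response latency is at most max_latency_ms."""
    latency = first_significant_latency_ms(p_values, alpha, fs_hz)
    return latency is not None and latency <= max_latency_ms


# Synthetic example: significance first reached at sample 30 (150 ms).
p = [0.5] * 30 + [0.001] + [0.5] * 10
print(first_significant_latency_ms(p), is_visually_responsive(p))
```

This sketch takes the first significant sample at face value; whether a minimum run of consecutive significant samples was required is not stated in the text.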
(C) “Low and high order”—visually responsive electrodes were further classified as belonging to low-order and high-order regions according to their Talairach-normalized location on the cortex with relation to multi-subject maps obtained in previous fMRI visual localizer experiments by our group (Levy et al., 2001). Electrodes falling within positive regions of the “pattern > object” BOLD contrast, corresponding to early visual cortex (see Figure 1), were defined as “low-order,” whereas electrodes located in the remaining visually responsive regions were labeled “high-order.” Both groups of electrodes were analyzed separately.
(D) “Target-selective” electrodes were defined as those electrodes whose gamma BLP response was significantly higher for at least one target category than for the mask, in a 200 ms time bin centered at the time point of maximal response in the interval 200–400 ms post-target onset (see Statistical Analysis in Experimental Procedures). Target-selective electrodes were examined separately from others for recognition effects (see Figure 4–Figure 6). All target-selective electrodes also belonged to the neuroanatomically identified high-order electrode category.
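The peak-centered time bin used for the selectivity test can be illustrated with a short Python sketch. It assumes a trial-averaged gamma BLP trace sampled at 200 Hz with index 0 at target onset. The function names and the handling of bin edges (inclusive, clipped to the trace) are assumptions of this sketch.

```python
FS_HZ = 200  # sampling rate reported for the recordings


def ms_to_samples(ms, fs_hz=FS_HZ):
    """Convert a duration in ms to a whole number of samples."""
    return int(round(ms * fs_hz / 1000.0))


def peak_centered_mean(blp, search_ms=(200, 400), bin_ms=200, fs_hz=FS_HZ):
    """Mean of blp in a bin_ms-wide bin centered on its peak in search_ms.

    blp: list of gamma power values, index 0 = target onset (assumed).
    Returns (peak_time_ms, bin_mean).
    """
    lo = ms_to_samples(search_ms[0], fs_hz)
    hi = min(ms_to_samples(search_ms[1], fs_hz), len(blp) - 1)
    peak = max(range(lo, hi + 1), key=lambda i: blp[i])
    half = ms_to_samples(bin_ms / 2.0, fs_hz)
    start = max(0, peak - half)          # clip the bin to the trace
    stop = min(len(blp), peak + half + 1)
    segment = blp[start:stop]
    return peak * 1000.0 / fs_hz, sum(segment) / len(segment)


# Synthetic trace with a single peak at 300 ms.
trace = [0.0] * 200
trace[60] = 1.0
print(peak_centered_mean(trace))
```

The per-trial bin means obtained this way for a target category and for the mask would then be the inputs to the two-sample comparison described in Statistical Analysis.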
As an alternative, and entirely independent, criterion for selecting the high-order selective electrodes, we used object selectivity in the OC experiment. Note that this criterion was based on data external to that of the BM experiment. An electrode was considered selective in this respect if it showed, during the OC experiment, a significantly (p < 0.01) higher gamma BLP response (one-tailed two-sample t test) to one category than to the other two, in a 200 ms time bin centered at the time point of maximal response in the interval 200–400 ms post-target onset.
Additional details concerning data preprocessing are provided in the Supplemental Data.
For each of the target-selective electrodes, individually, mean evoked and gamma BLP responses were compared for recognition and no-recognition trials at the critical SOA (t test, see below, and see Figures 4A and 4C). Temporal smoothing of 200 ms was first applied to the gamma BLP. In addition, we calculated the mean responses to the same image sets in all “recognition” trials with SOA of 200 ms (Figures 4B and 4D).
The effect of recognition was also examined at the population level for target-selective electrodes (divided into face- and man-made object electrodes, n = 14 and n = 5, respectively; see Figure 6 and Figure S7; for the same analysis based on object selectivity, as discussed above, see Figure S13). In this analysis, the recognition and no-recognition responses were averaged per electrode in such a way as to balance the two trial sets in terms of stimulus exemplars, while collapsing the measures across different SOAs, as follows. Only images which appeared in at least one recognition and one no-recognition trial at the critical SOA and, if required, the next largest SOA available, were considered. From each response we subtracted the mean blank + mask response for its SOA. For each exemplar, all recognition trials and all no-recognition trials were averaged separately to obtain the mean recognition and no-recognition responses for that exemplar. All exemplars were then averaged per condition, yielding an average “recognition” and “no-recognition” response per electrode.
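The stimulus-balanced averaging can be sketched in Python for a single electrode. The sketch assumes that each trial has been reduced to one scalar response (for example, one time point), that trials have already been restricted to the SOAs of interest, and that the mean blank + mask response is known per SOA. The data layout and function name are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def balanced_means(trials, mask_baseline_by_soa):
    """Stimulus-balanced recognition and no-recognition means.

    trials: iterable of (image_id, soa_ms, recognized, response)
    mask_baseline_by_soa: {soa_ms: mean blank + mask response}
    Returns (recognition_mean, no_recognition_mean) for one electrode.
    """
    per_image = defaultdict(lambda: {True: [], False: []})
    for image_id, soa, recognized, response in trials:
        # Remove the mask-only contribution for this trial's SOA.
        corrected = response - mask_baseline_by_soa[soa]
        per_image[image_id][bool(recognized)].append(corrected)
    # Keep only exemplars seen under both perceptual outcomes.
    balanced = [c for c in per_image.values() if c[True] and c[False]]
    if not balanced:
        raise ValueError("no exemplar appears in both conditions")
    # Average trials within each exemplar, then exemplars per condition.
    rec = mean(mean(c[True]) for c in balanced)
    norec = mean(mean(c[False]) for c in balanced)
    return rec, norec


# Made-up trials, for illustration only.
trials = [("a", 16, True, 5.0), ("a", 16, False, 2.0), ("a", 16, True, 7.0),
          ("b", 33, True, 9.0),
          ("c", 16, False, 1.0), ("c", 33, True, 4.0)]
print(balanced_means(trials, {16: 1.0, 33: 2.0}))
```

Because every retained exemplar contributes equally to both conditions, differences between the two means cannot be attributed to differences in the image sets.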
As a control for the effect of the slightly different mean SOAs between the two conditions, we compared mean responses, but this time only for the recognition trials, for the different SOAs that contributed to the “stimulus-balanced” means of Figures 6A and 6C. No significant differences between recognition trials with different SOAs were found at the time points corresponding to the recognition effect (Figures 6B and 6D).
In all cases, statistical significance was determined by means of the Student’s t test. Where the actual timing of the neural event was important (defining task-related response and its latency, effect of recognition), the comparisons were made per individual time point, whereas in the remaining cases (testing for category selectivity) they were carried out on gamma BLP averaged over a time bin of 200 ms, centered on the time of maximal response for the specific electrode between 200 and 400 ms after stimulus onset.
Tests for increase of gamma BLP levels were one-tailed, whereas tests for change of evoked amplitude were two-tailed. In tests for electrode response versus baseline, an independent one-sample t test was used. In the test for target selectivity, target category and mask response were compared using the independent two-sample t test. In testing for effect of recognition in the individual target-selective electrodes (including the long-SOA control), we used the independent two-sample t test, whereas at the population level, recognition and no-recognition responses were compared across electrodes using the paired two-sample t test. The individual-SOA comparison control (see Figures 6B and 6D) was done using the independent two-sample t test.
α was set to 0.01 for all tests. Where necessary, Bonferroni correction was applied to account for comparisons over multiple target categories, SOAs, and electrodes.
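The population-level comparison and the correction for multiple comparisons can be illustrated with a minimal Python sketch. It computes the paired t statistic across electrodes and a Bonferroni-adjusted significance level. The function names and the example values are hypothetical, and the number of comparisons actually corrected for is not specified here.

```python
from math import sqrt
from statistics import mean, stdev

ALPHA = 0.01  # per-test significance level stated in the text


def paired_t(recognition, no_recognition):
    """Paired t statistic and degrees of freedom across electrodes."""
    if len(recognition) != len(no_recognition) or len(recognition) < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    diffs = [r - n for r, n in zip(recognition, no_recognition)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance")
    return mean(diffs) / (sd / sqrt(len(diffs))), len(diffs) - 1


def bonferroni_alpha(n_comparisons, alpha=ALPHA):
    """Per-comparison threshold after Bonferroni correction."""
    return alpha / n_comparisons


# Made-up per-electrode means, for illustration only.
t_stat, dof = paired_t([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
print(t_stat, dof, bonferroni_alpha(5))
```

Converting the statistic to a p value requires the Student t distribution with the returned degrees of freedom, which is not part of the Python standard library; a statistics package such as SciPy (`scipy.stats.ttest_rel`) could be used for that step.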
For the shuffle control, see Supplemental Data.
For comparison of known early and high-order visual areas, we have used our maps obtained from previous experiments using face, man-made object, and house images for category-selective regions (Levy et al., 2001) and retinotopic stimuli (vertical and horizontal meridian stimulations) for delineating the borders of early retinotopic areas (see Hasson et al., 2003, for details).
We thank the participants for volunteering to take part in the study and D. Yossef, S. Nagar, R. Cohen, C. Yosef, G. Yehezkel, and the EEG technicians for assistance at the Tel Aviv Medical Center. This study was funded by the Israel Science Foundation 160/07 and Minerva grants to R.M. and by the WIS-Ichilov fund to R.M. and I.F.
Supplemental Data include Supplemental Experimental Procedures, References, three tables, and 15 figures and can be found with this article online at http://www.cell.com/neuron/supplemental/S0896-6273(09)00883-6.