Multistable perception occurs when a single physical stimulus leads to two or more distinct percepts that spontaneously switch (reverse). Previous ERP studies have reported reversal negativities and late positive components associated with perceptual reversals. The goal of the current study was to localize the neural generators of the reversal ERP components in order to evaluate their correspondence with previous fMRI results and to better understand their functional significance. A Necker-type stimulus was presented for brief intervals while subjects indicated their perceptions. Local auto-regressive average source analyses and dipole modeling indicated that sources for the reversal negativity were located in inferior occipital-temporal cortex. Generators of the late positive component were estimated to reside in inferior temporal and superior parietal regions.
Ambiguous figures are physically unchanging images that can be alternately perceived in two or more different ways. One of the best known is the Necker cube, which can be perceived as a three-dimensional (3D) cube facing either leftward or rightward. Such figures are experimentally useful for analyzing the neural basis of perceptual experience because sensory input remains fixed while distinct perceptual changes (“reversals”) occur.
Recordings of event-related potentials (ERPs) have identified two components associated with perceptual reversals, the “reversal negativity” (RN) and the “late positive component” (LPC; Basar-Eroglu, Struber, Stadler, Kruse, & Basar, 1993; Britz, Landis, & Michel, 2009; Isoglu-Alkac et al., 1998; Kornmeier & Bach, 2004, 2005, 2006; Kornmeier, Ehm, Bigalke, & Bach, 2007; O’Donnell, Hendler, & Squires, 1988; Pitts, Gavin, & Nerger, 2008; Pitts, Nerger, & Davis, 2007; Struber, Basar-Eroglu, Miener, & Stadler, 2001). The RN component, which is maximal over parietal-occipital scalp regions, begins at ~170 ms after the stimulus, peaks at 250 ms, and persists until ~350 ms. This component can be isolated by presenting an ambiguous figure intermittently, time-locking ERP recordings to stimulus onset, and comparing trials with the same percept as on the previous trial (stable) to those with the alternate percept (reversal). The LPC, which has a central/parietal scalp distribution, begins at approximately 300 ms, peaks around 450 ms, and persists beyond 550 ms. The LPC was initially reported in studies that presented sustained ambiguous figures and time-locked recordings to subjects’ motor responses indicating perceptual reversal (e.g., Basar-Eroglu et al., 1993), but it was also evident in tasks employing the intermittent presentation method (e.g., Pitts et al., 2008).
Although the ERP studies discussed above have consistently identified the RN as the earliest reversal-related component, the underlying neural generators have not been localized. FMRI studies using similar tasks have reported activations in posterior parietal and/or ventral occipital-temporal cortex associated with perceptual reversals (Inui et al., 2000; Kleinschmidt, Buchel, Zeki, & Frackowiak, 1998; Slotnick & Yantis, 2005). On the basis of the parietal-occipital scalp topography (voltage distribution) of the RN, we hypothesized that neuroanatomical sources might be located in one of these two (parietal or occipital-temporal) regions.
Estimating locations of neural generators of ERP components is often referred to as the “inverse problem,” because one must use the electrical potentials that are directly measurable at the scalp to estimate the neural sources of these signals (Silva & Foreid, 2005). Although, mathematically, any set of scalp potentials has an infinite number of “inverse solutions,” by incorporating a few basic assumptions about the sources and the volume conductor it is possible to derive a best-fit solution for a given set of ERPs (Michel et al., 2004). Currently, dipole modeling is the most widely used approach for deriving such solutions. A commonly used algorithm for dipole modeling is the Brain Electrical Source Analysis (BESA; Scherg & Picton, 1991). BESA assumes that sources are pointlike current dipoles. For a given time window, dipole locations and orientations are determined through an iterative process that minimizes the difference between the ERP scalp distribution and the potentials that would be produced by proposed sources (Grave de Peralta Menendez, Gonzalez, Lantz, Michel, & Landis, 2001). The BESA approach has limitations, however: the number of sources for a given ERP component must be known a priori, and the source space is assumed to be a nondiscrete sphere that does not incorporate neuroanatomical information.
Another approach for solving the inverse problem is distributed source modeling (Michel et al., 2004). Distributed source estimation is useful when sources are likely to occupy an extended cortical region as opposed to a single point. To utilize this approach one must apply a mathematical model to relate the measured scalp potentials to source intensities within a source space. The source space is a dense 3D grid of fixed point sources at all possible locations in which only orientation and source strength are free to vary. With this approach, no a priori assumptions about the number of sources are necessary. Because each point in the solution space is assigned a source intensity value, it is possible to carry out statistical analyses across individual subjects to determine the significance of the estimated sources. Limitations of the distributed source approach depend on the specific algorithms employed and the constraints incorporated into each algorithm (Michel et al., 2004). Minimum norm estimation, for example, is limited by its sensitivity to noise and its tendency to favor surface, as opposed to deep, neural sources (Silva & Foreid, 2005). Local Autoregressive Averaging (LAURA) source estimation is a type of minimum norm approach that incorporates biophysical constraints (e.g., the known locations of gray matter) into its solution space, includes regularization algorithms for noise estimation, and corrects for the surface-over-deep source bias by incorporating coefficients based on the laws of signal falloff over distances (for details, see Grave de Peralta Menendez et al., 2001; Grave de Peralta Menendez, Murray, Michel, Martuzzi, & Gonzalez, 2004; Michel et al., 2001, 2004).
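The general idea behind a minimum norm estimate can be illustrated with a small sketch. The code below is not LAURA and is not the implementation used in this study: it shows only the generic regularized minimum norm solution, s = Lᵀ(LLᵀ + λI)⁻¹v, and omits the depth weighting, gray matter constraints, and local autoregressive terms described above. The lead field values, the function names, and the regularization parameter `lam` are hypothetical and chosen purely for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def minimum_norm(leadfield, potentials, lam):
    """Regularized minimum norm source estimate (generic form, an illustration only).

    leadfield:  n_sensors x n_sources matrix (hypothetical forward model)
    potentials: measured scalp voltages, one value per sensor
    lam:        regularization weight (hypothetical; 0 means no regularization)
    """
    n_sensors = len(leadfield)
    n_sources = len(leadfield[0])
    # Gram matrix L L^T with lam added on the diagonal
    gram = [[sum(x * y for x, y in zip(leadfield[i], leadfield[j]))
             + (lam if i == j else 0.0)
             for j in range(n_sensors)]
            for i in range(n_sensors)]
    weights = solve(gram, potentials)
    # Project the sensor-space weights back into source space: s = L^T w
    return [sum(leadfield[i][k] * weights[i] for i in range(n_sensors))
            for k in range(n_sources)]


def forward(leadfield, sources):
    """Scalp potentials predicted by a set of source strengths."""
    return [sum(g * s for g, s in zip(row, sources)) for row in leadfield]


# Toy example: 2 sensors, 3 candidate sources (more unknowns than measurements).
toy_leadfield = [[1.0, 0.0, 1.0],
                 [0.0, 1.0, 1.0]]
toy_potentials = [1.0, 1.0]
estimate = minimum_norm(toy_leadfield, toy_potentials, lam=0.0)
print(estimate)
print(forward(toy_leadfield, estimate))
```

Because there are more candidate sources than sensors, many source patterns reproduce the same scalp potentials; the minimum norm criterion picks the one with the smallest overall source strength. In practice a numerical library would replace the hand-written solver.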
In the current study, the neuroanatomical generators of the RN and LPC components were estimated by employing a combination of dipole and LAURA source localization techniques. Combining these two techniques allowed us to capitalize on their individual strengths and cross-validate their solutions. Specifically, LAURA solutions for the RN and LPC components were calculated for individual subjects and were used to test the statistical significance of the estimated neural sources. The LAURA solutions were then utilized to constrain the BESA dipole models.
Twenty-four people (16 women; ages 18–46 years, mean age 21 years) participated in this experiment. Eye dominance was determined via simple dichoptic tests, and visual acuity was assessed with a high-contrast Bailey–Lovie acuity chart. Visual acuity for all participants was 20/40 or better. All procedures adhered to federal regulations and were approved by the Colorado State University institutional review board; written informed consent was obtained from each person prior to participation in the experiment.
A modified version of the Necker cube (see Slotnick & Yantis, 2005) served as the stimulus for the experiment (Figure 1). The image of the cube subtended a 4° × 1.5° viewing angle and was centrally presented on a computer monitor with a frame rate of 85 Hz. Stimuli were viewed monocularly with the dominant eye to eliminate binocular depth cues that can occasionally lead to “flatter” appearances of these two-dimensional stimuli. Participants maintained their gaze on a small (0.2°) centrally located fixation cross that was visible throughout all stimulus presentations and blank interstimulus intervals (ISIs). Stimuli were presented with 800 ms durations and fixed (500 ms) ISIs (Figure 1). The stimulus durations were deliberately kept short to ensure that, on a given trial, only one perception was experienced. ISIs were set at 500 ms to ensure that reversal rates were comparable to those found under continuous viewing conditions (see Klink et al., 2008; Kornmeier & Bach, 2004). Each run consisted of 50 trials and lasted 1 min 5 s.
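The run duration follows directly from the stimulus timing. The short sketch below checks the arithmetic; the constant and function names are illustrative and not taken from the study's software.

```python
STIMULUS_MS = 800      # stimulus duration reported in the text
ISI_MS = 500           # blank interstimulus interval reported in the text
TRIALS_PER_RUN = 50    # trials per run reported in the text


def run_duration_s(stimulus_ms, isi_ms, n_trials):
    """Total run time in seconds for back-to-back stimulus + ISI cycles."""
    return n_trials * (stimulus_ms + isi_ms) / 1000.0


duration = run_duration_s(STIMULUS_MS, ISI_MS, TRIALS_PER_RUN)
minutes, seconds = divmod(duration, 60)
print(f"{int(minutes)} min {int(seconds)} s")  # prints "1 min 5 s"
```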
For each trial, subjects were instructed to make a button-press response with either their left or right thumb to indicate which direction the Necker cube appeared to be facing. Half of the subjects responded with the left/right hand for left/right facing cubes and half responded with the right/left hand for left/right facing cubes. The subjects’ task was to respond as soon as they were confident of their perception. Because prolonged viewing of Necker cube stimuli occasionally results in “flat” or “in-between” perceptions, subjects were given the option of not responding if they were ever unsure of their perception. The nonresponse trials were excluded from further analysis.
Participants were comfortably seated 1.4 m from the computer monitor. Prior to any EEG recordings, subjects practiced viewing the stimulus. If a subject was initially unable to perceive both left and right facing cubes, the experimenter helped guide the subject by tracing the outline of the two possible “near-faces” on the computer monitor until the subject could easily perceive both configurations. Subjects then practiced reporting their perceptions during four 1-min blocks. The practice trials helped familiarize subjects with the timing of stimulus presentation, the importance of fixating on the fixation cross, and the operation of the response box.
Following the practice trials, 22 experimental blocks were administered while the continuous electroencephalogram (EEG) was recorded, resulting in a total of 1,100 trials per subject. Short breaks after each block helped to alleviate subject fatigue. Each experimental session, including EEG preparation, lasted approximately 1 h.
The EEG was recorded using a 128-channel Geodesic EEG System, NetAmps 200 (Electrical Geodesics Inc.). Each carbon-fiber electrode consisted of a silver-chloride carbon fiber pellet, a lead wire, a gold plated pin, and a potassium chloride-soaked sponge. Horizontal and vertical electrooculograms (EOGs) were recorded by means of electrodes at the left and right external canthi and electrodes below each eye, respectively. A vertex electrode served as the reference for all scalp and EOG channels during the recordings. The experimenter individually adjusted each of the sensors until its impedance was less than 50 kΩ. Impedances were checked halfway through the experiment and were adjusted if necessary. Analog voltages were amplified with a gain of 10,000, hardware bandpass-filtered at 0.1–100 Hz, and digitized at a sampling rate of 500 Hz.
ERPs were time-locked to stimulus onset, baseline corrected at −200 to 0 ms, and low-pass filtered at 50 Hz. Trials were discarded from analysis if they contained an eyeblink or eye movement artifact (EOG > 70 µV) or if more than 20% of electrode channels exceeded defined signal amplitudes (sustained amplitude > 200 µV or transient amplitude > 100 µV). On average, 12% of trials per individual were rejected due to a combination of these artifacts. Averaged-mastoid-referenced ERPs were computed for each channel by calculating the differences between each channel and an average of the left and right mastoid channels.
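Two of the steps above, baseline correction and re-referencing to the averaged mastoids, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the processing pipeline used in the study: waveforms are assumed to be plain lists of voltages in µV with a matching list of sample times in ms, the baseline window is assumed to include −200 ms and exclude 0 ms, and all function names are hypothetical.

```python
def baseline_correct(samples, times_ms, start_ms=-200, end_ms=0):
    """Subtract the mean voltage of the prestimulus window from every sample.

    Assumption: the window is half-open, start_ms <= t < end_ms.
    """
    baseline = [v for v, t in zip(samples, times_ms) if start_ms <= t < end_ms]
    mean_baseline = sum(baseline) / len(baseline)
    return [v - mean_baseline for v in samples]


def rereference_to_mastoids(channel, left_mastoid, right_mastoid):
    """Express a channel relative to the average of the two mastoid channels."""
    return [v - (l + r) / 2.0
            for v, l, r in zip(channel, left_mastoid, right_mastoid)]


# Toy data: four samples at -200, -100, 0 and 100 ms (values are made up).
times = [-200, -100, 0, 100]
corrected = baseline_correct([1.0, 3.0, 5.0, 7.0], times)
rereferenced = rereference_to_mastoids([4.0, 6.0], [1.0, 2.0], [3.0, 2.0])
print(corrected)
print(rereferenced)
```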
Recordings were sorted according to the reported perception on each trial and the reported perception on the preceding trial. Trials in which the reported perception was the same as the preceding trial were classified as “stable” trials, and trials in which the reported perception was different from the preceding trial were classified as “reversal” trials.
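The sorting rule described above can be expressed as a short sketch. The percept codes (`"L"`, `"R"`), the use of `None` for nonresponse trials, and the function name are illustrative. How a trial that immediately follows a nonresponse trial was handled is not stated in the text; the sketch assumes such trials are left unclassified.

```python
def classify_trials(percepts):
    """Label each trial 'stable' or 'reversal' relative to the preceding trial.

    percepts: list of reported percepts, e.g. "L" or "R", with None marking
    a nonresponse trial. The first trial, nonresponse trials, and (by
    assumption) trials following a nonresponse are labeled None.
    """
    if not percepts:
        return []
    labels = [None]  # the first trial has no preceding trial to compare with
    for previous, current in zip(percepts, percepts[1:]):
        if previous is None or current is None:
            labels.append(None)
        elif current == previous:
            labels.append("stable")
        else:
            labels.append("reversal")
    return labels


reports = ["L", "L", "R", None, "R", "R"]
print(classify_trials(reports))
```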
Separate analyses of variance (ANOVAs) were carried out within the latency windows corresponding to the RN and LPC to test for significant voltage deviations relative to the mean baseline voltage. For each electrode in left (50, 57, 58[P7], 59, 63, 64[P9], 65[PO7], 66, 69, 70[O1]) and right (101, 100, 96[P8], 91, 99, 95[P10], 90[PO8], 94, 89, 83[O2]) parietal-occipital regions, the RN was measured as the mean amplitude across a 230–280-ms time window. The LPC was measured as the mean amplitude across a 400–470-ms interval for electrodes in central scalp locations (6[FCZ], 7, 106, 37[CP1], 31, 129[CZ], 80, 87[CP2], 54, 55[CPZ], 79, 62[PZ]). For each component, the effects of perceptual reversals were assessed in ANOVAs with within-subject factors of Perception (reversal vs. stable trials), Electrode Channel (channels listed above), and, for the RN, Hemisphere (left vs. right) and a between-subject factor of Response Hand (left/right vs. right/left).
The neural generators of the RN and LPC components in the grand average difference wave (ERP to reversal trials minus ERP to stable trials) were modeled using a minimum-norm linear inverse solution approach: LAURA (Grave de Peralta Menendez et al., 2004). The difference wave voltages from 125 scalp locations were interpolated to 111 channels and served as the basis of the inverse solutions. The LAURA solution space included 4,024 evenly spaced nodes (6 mm³ spacing), restricted to the gray matter of the Montreal Neurological Institute’s average brain. No a priori assumptions were made regarding the number or location of active sources. Time windows for estimating the sources of the RN and LPC components were the same as in the ERP statistical analyses. The principal cortical regions identified by LAURA for each component served as regions of interest (ROIs) for subsequent analyses (see below).
Dipole modeling (BESA 2000, version 5) was carried out on the grand average difference waves (reversal minus stable). The BESA algorithm calculates the scalp distribution that would be obtained for a given model (forward solution) and compares it to the actual ERP scalp distributions (Scherg & Picton, 1991). The algorithm interactively adjusts (fits) the location and/or orientation of the dipole sources in order to minimize the residual variance (RV) between the model and the observed spatiotemporal ERP distribution. The strategy used here was to seed mirror-symmetric pairs of dipoles at the locations of maximum source intensity in the grand average LAURA solutions. One pair of sources was identified in the RN interval and two pairs in the LPC interval. Accordingly, the final model consisted of one pair of dipoles for the RN and two pairs of dipoles for the LPC. After seeding the dipoles at these locations, they were fit in orientation, within the same time intervals as those used for LAURA, in order to account for the RN and LPC in the difference waves.
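The residual variance that the fitting procedure minimizes can be illustrated as follows. The sketch uses one common definition (squared model error divided by the squared observed signal); whether BESA applies exactly this normalization is an assumption here, and the function name and voltage values are hypothetical.

```python
def residual_variance(observed, modeled):
    """Fraction of the observed signal power not explained by the model.

    observed, modeled: equal-length lists of voltages (e.g., one value per
    channel and time point in the fitting window).
    """
    error_power = sum((o - m) ** 2 for o, m in zip(observed, modeled))
    signal_power = sum(o ** 2 for o in observed)
    return error_power / signal_power


# Toy example with made-up voltages.
observed = [1.0, -2.0, 2.0]
modeled = [1.0, -2.0, 1.0]
rv = residual_variance(observed, modeled)
print(f"residual variance: {rv:.3f}, explained variance: {1 - rv:.3f}")
```

Under this definition, a model reported as accounting for a given percentage of the variance corresponds to a residual variance of one minus that fraction.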
LAURA solutions were also computed for each individual subject’s difference waves (across the same time windows as the grand average source models), transformed into the standardized coordinate system of Talairach and Tournoux (1988) and exported into the AFNI software package (Cox, 1996) for statistical analyses. Source intensity values for the 4,024 nodes were transposed into a 91 × 91 × 109 voxel space (2 mm³ voxel sizes). One-tailed t-tests were conducted over voxels within the ROIs identified in the grand average solutions (two for the RN; four for the LPC) using an alpha threshold level of p < 1 × 10⁻⁸ to correct for multiple significance testing. For visualization purposes, statistical maps were projected onto a structural brain image supplied by MRIcro (Rorden & Brett, 2000). The centers of mass and mean t values were calculated for each ROI.
Reversals occurred on average every 3.41 s (SEM = 0.26) or every 2.62 stimulus presentations (SEM = 0.20). The right-face near perception was slightly dominant on average (53%) compared to left-face near (47%). After artifact rejection, the mean number of trials used to derive each ERP for each subject was 355 (SEM = 22). The mean reaction time (i.e., time from stimulus onset to reporting of perceptions) was 587 ms (SEM = 35 ms).
The RN and LPC occurred during time windows and at scalp locations consistent with previous reports (e.g., Pitts et al., 2008). Figure 2 shows ERPs and difference waves for both components. The RN was elicited over 200–400 ms with maximal amplitudes at 230–280 ms; the LPC was elicited over 300–600 ms with maximal amplitudes at 400–470 ms. For the RN, there was a significant main effect of perception (reversal vs. stable, see Table 1) and no main effect or interactions involving the response hand factor. The Perception × Hemisphere interaction was also significant because the RN was larger over the right parietal-occipital scalp compared to the left (see Table 1). For the LPC, an ANOVA revealed a highly significant main effect of perception and no Perception × Response Hand interaction. There was a main effect of response hand for the LPC, F(1,22) = 5.56, p = .028. Responding with the left hand to left-face-near perceptions (and vice versa) led to slightly larger amplitudes than responding with the left hand to right-face-near perceptions (and vice versa). The lack of interaction between the response hand and perception factors, however, justified collapsing across response hand for source analysis. The mean amplitudes, F, and p values for the main effects of perception for both components and the Perception × Hemisphere interaction for the RN are provided in Table 1.
LAURA source estimations, based on the grand average difference waves, revealed a strong source in the right inferior occipital-temporal/fusiform gyrus region and a weaker source in the same region in the left hemisphere during the latency of the RN component. A pair of symmetric dipoles placed at these locations and fit for orientation accounted for 91% of the variance over the interval of 230–280 ms. When the source strengths estimated by LAURA were tested across individual subjects, the RN sources were significant in both right and left hemisphere ROIs.
LAURA sources for the LPC component in the grand average were located in bilateral anterior/inferior temporal regions and superior parietal regions. Two pairs of symmetric dipoles placed at the centers of these LAURA sources and fit for orientation accounted for 95% of the variance over the interval of 400–470 ms. Statistical analyses of source strengths estimated by LAURA for each ROI confirmed their significance. Overall, the six-dipole model, including the dipoles fit to the RN and LPC, accounted for 93% of the variance across a broad (200–500 ms) time window. Figures 3 (RN) and 4 (LPC) show the LAURA grand average solutions, orientations of the dipoles, and source waveforms for each dipole. The Talairach coordinates for the centers of mass of the LAURA sources for the RN and LPC components in the grand average waveforms as well as the mean t values and associated alpha levels for LAURA statistical analyses are given in Table 2 and Table 3, respectively.
The results of this study provide evidence that a distinct electrophysiological configuration associated with perceptual reversals (the reversal negativity; RN peak at 230–280 ms) is generated by sources in ventral occipital-temporal cortex. This finding suggests a relationship between previous ERP experiments reporting an early (~170–370 ms) negativity over parietal-occipital scalp regions (Kornmeier & Bach, 2004, 2005, 2006; Britz et al., 2009; Kornmeier et al., 2007; Pitts et al., 2007, 2008) and fMRI studies reporting both posterior parietal and ventral occipital-temporal activations (Inui et al., 2000; Kleinschmidt et al., 1998; Slotnick & Yantis, 2005) associated with perceptual reversals. Due to the temporal imprecision of fMRI data and the spatial imprecision of ERP data, it was previously unclear how these two sets of results fit together. By employing inverse solutions to estimate the locations of the neural generators of ERP difference wave scalp topographies, the current study identified a right-greater-than-left inferior occipital-temporal source for the RN component. This source location matches one of the two areas identified in the previous fMRI studies (Inui et al., 2000; Kleinschmidt et al., 1998; Slotnick & Yantis, 2005). Although ERP source localization involves certain limitations (described in the introduction section), techniques have improved in recent years, making it possible to better estimate the locations of neural generators of ERP components.
The intracranial sources of the second component commonly reported in ERP studies of multistable perception, the late positive component, were, however, more difficult to estimate. This component has been described in previous reports as resembling the well-known P300 component (O’Donnell et al., 1988; Pitts et al., 2008; Struber et al., 2001). Although the neural generators of the visual P300 have proven difficult to localize, various source analysis techniques have suggested that it may stem from both parietal and inferior temporal sources (Bledowski et al., 2004; Halgren, Marinkovic, & Chauvel, 1998; Knight & Scabini, 1998; Moores et al., 2003; Yamazaki, Kamijo, Kiyuna, Takaki, & Kuroiwa, 2001). The results of our LAURA source estimation for the LPC in the current paradigm also suggest sources in superior parietal and inferior temporal regions. To better evaluate the sources of the LPC, future studies should directly manipulate and isolate potential contributing factors such as reversal probability, trial-by-trial reversal/stability sequences, and task relevance (discussed below).
Although psychologists and philosophers have been interested in multistable perception for many years, the neural networks underlying this complex phenomenon are still poorly understood. An important contribution from top-down attention mechanisms, mediated by fronto-parietal networks, has recently been inferred from brain imaging (Slotnick & Yantis, 2005), electrophysiological (Pitts et al., 2008), and neuropsychological (Windmann, Wehrmann, Calabrese, & Gunturkun, 2006) investigations. In each of these studies, subjects were asked to “voluntarily” switch their perceptions while viewing various ambiguous figures. In one study (Windmann et al., 2006), it was found that patients with prefrontal cortical lesions were unable to “speed-up” perceptual reversals voluntarily compared to matched control subjects (see also Meenan & Miller, 1994; Ricci & Blundo, 1990). In another study, Slotnick and Yantis acquired fMRI in normal volunteers while they either voluntarily perceived a specific orientation of the Necker cube or voluntarily attended to objects in locations equivalent to the end faces of the cube. Similar cortical regions were activated during perceptual and attentional transitions, including medial frontal, posterior parietal, and ventral extrastriate regions. Slotnick and Yantis argued that, although the timing of these activations could not be determined from fMRI data alone, it was likely that the parietal activity was associated with the attentional shift and preceded the more ventral extrastriate activity. The ventral extrastriate regions, in this view, are the recipients of the attention-shift biasing signal, whereas fronto-parietal networks are the source (Serences, Schwarzbach, Courtney, Golay, & Yantis, 2004). Data from the current study suggest that the RN is generated in ventral extrastriate cortex. Therefore the RN is not likely to reflect the attention-shift itself, but may be associated with the receipt of attention-biasing signals.
In the current study, however, subjects were asked to not voluntarily switch their perceptions but rather to simply allow the reversals to occur spontaneously. The role of involuntary attention shifts in multistable perception is currently unclear, although Leopold and Logothetis (1999) speculate that involuntary perceptual reversals share much of the same neural circuitry as voluntary perceptual reversals and that both are based on top-down systems. Data from fMRI studies seem to support this theory (Kleinschmidt et al., 1998; Slotnick & Yantis, 2005). Similar patterns of activity in frontal, parietal, and ventral occipital-temporal regions were identified by Slotnick and Yantis in a voluntary reversal task and by Kleinschmidt et al. in an involuntary reversal task. In a recent ERP study, Pitts et al. (2008) manipulated voluntary control over perceptual reversals and found that the RN was generated by both voluntary and involuntary reversals, its amplitude being enhanced under the voluntary reversal condition. Taken together, these results suggest that the occipital-temporal RN generators are activated by changes in perception of ambiguous figures, which in turn may be influenced by both voluntary and involuntary shifts in attention.
Kornmeier and Bach (2004) found that, in addition to endogenous reversals of an ambiguous figure, the RN component was generated when reversals were “forced” by presenting unambiguous stimulus variants. Although this finding may initially seem to suggest that RN generation is attention independent, in this task subjects were still required to discriminate unambiguous cubes in order to report reversals. Under these conditions, attention may be exogenously drawn toward stimulus features that favor one of the two perceptual interpretations. For example, brighter or occluding cube edges may attract attention and be perceived as “near,” whereas dimmer or occluded edges do not attract attention and will be perceived as “far.” Alternatively, it is possible that attention shifts mediate endogenous reversals of ambiguous figures but not exogenous reversals of unambiguous figures. The RN, in this view, still reflects the perceptual transitions that occur for both types of figures, but in the case of exogenous reversals, no input from the fronto-parietal attention network would be necessary. Future studies could further evaluate these theories by testing whether the RN is generated by reversals of unambiguous figures when subjects’ attention is distracted by a concurrent attentionally demanding task. According to the attention-shifting account, the RN should not be generated under such conditions. The alternative view predicts RN generation even under conditions of high attentional load.
An often-cited alternative to the attention-shifting hypothesis is the “neural satiation” (or “adaptation”) hypothesis (Long & Toppino, 2004). According to this view, prolonged viewing of an ambiguous figure results in neural fatigue specific to networks underlying the representation of one of the possible perceptions. When this fatigue reaches a critical threshold, networks mediating the competing representation suddenly dominate, resulting in a perceptual reversal. A competitive adaptation-recovery cycle between the two representational networks is then instantiated. Although these two interpretations (attention-based vs. neural-satiation-based) seem to be at odds, Long and Toppino convincingly argue for a hybrid theory of multistable perception. In this theory, low-level sensory factors interact with high-level cognitive factors at intermediate levels of perceptual processing. The RN component with its ventral occipital-temporal generators may well be situated at such an intermediate level. RN generation would therefore be expected under passive reversal conditions, while also being present and susceptible to modifications (i.e., amplitude enhancements) under voluntary control conditions. What remains to be resolved is whether spontaneous reversals are due to low-level adaptation-recovery cycles or involuntary shifts in attention or an interaction between the two. The RN component may ultimately prove useful in addressing these lingering questions.
If the RN reflects a transient, stimulus-locked enhancement of perceptual salience and this enhancement is ultimately controlled by fronto-parietal attention networks, then why were we unable to detect earlier electrophysiological differences generated by frontal or parietal sources? One possibility is that the shift in attention (whether voluntary or involuntary) is not time-locked to stimulus onset and therefore would not appear in the averaged waveforms. In this view, the shift in attention could occur at any time between one perception and another, that is, during the ISI or during the initial phase of the next stimulus presentation. This attention shift would nevertheless result in a perceptual reversal, and the component representing this change (the RN) is time-locked to stimulus onset.
To measure such a non-time-locked component, a different experimental or analysis technique may be required. Recently, using an almost identical experimental design (intermittent Necker cube presentation, 800 ms on, 600 ms off, spontaneous reversals), Britz et al. (2009) employed single-trial EEG spatial mapping techniques and found differences in the spatial configurations of scalp voltage maps immediately preceding stimulus onset in reversal versus stability trials. Source analyses of these voltage maps indicated reversal-related generators in right inferior parietal cortex. It is likely that this inferior parietal activity is directly associated with involuntary attention shifts that, in turn, bias perceptual representations in occipital-temporal regions.
Regardless of the eventual explanation of the role of attention shifts in perceptual reversals, data from the current study strongly suggest that the RN is generated by ventral stream mechanisms. When incorporated with analogous data from the fMRI literature as well as recent EEG findings (Britz et al., 2009), it seems likely that the RN generators may be modulated by the fronto-parietal attention network. This network controls shifts in attention (both voluntary and involuntary), which leads to changes in perceived object configuration and ultimately to perceptual reversals.
In most previous multistable perception ERP paradigms, subjects were asked to press a button whenever perceptual reversals occurred and to not respond when perceptual stability occurred (Britz et al., 2009; Kornmeier & Bach, 2004; Pitts et al., 2007, 2008; Struber et al., 2001). Kornmeier and Bach counterbalanced their response task so that, on half of the trials, subjects pressed a button to indicate reversals and, on the other half of the trials, to indicate stability. When analyzed separately, no ERP differences were found between the two response tasks, and the RN was generated in both cases (Kornmeier & Bach, 2004). Regardless of whether reversals or stability are indicated by a button press, this type of task requires the subject to hold the last perception in memory and compare the subsequent perception with the memory trace in order to make a response. Results from the current study suggest that the RN component is not dependent on such comparisons or on selectively responding to reversals (or stability). Subjects in the current study were simply asked to report which percept they experienced on each trial. Even under these conditions, the RN was still generated.
It is the case, however, that, when subjects perceived a reversal, they had to prepare and execute a response different from that of the previous trials. In the current study, as well as in previous studies, comparisons between reversal versus stable ERPs inevitably involved concurrent comparisons between response switching and no response switching. Although motor response switching may contribute to ERP modulations, components associated with such switches usually occur later in time (>300 ms), are recorded at fronto-central scalp regions, and are estimated to be generated in frontal and anterior cingulate cortices (Swainson et al., 2003). In contrast, the RN is elicited at an earlier latency (by ~170 ms), recorded over occipital-parietal regions, and, as our findings suggest, generated in ventral occipital-temporal cortex.
A related argument can be made for the persistence of the LPC in the current perceptual-reporting task. Pitts et al. (2008), who used the reversal response task, speculated that the LPC might represent visual short-term memory (VSTM) updating; that is, every time a reversal occurred, VSTM required updating, and this process might generate the LPC. The task employed in the current paradigm, however, did not require subjects to maintain any explicit information pertaining to their previous perception, and the LPC was still generated. VSTM may nevertheless be involved in LPC generation, possibly to automatically update information relevant to behavioral responses, but an explicit comparison between one’s current perception and remembered perception is not a requirement for LPC generation.
Because spontaneous reversals inevitably occur with lower frequency than stable trials and are unpredictable, the LPC might be related to perceptual novelty detection or the categorization of events based on contextual expectancies. Numerous studies have reported enhanced P300 amplitudes in response to lower-probability stimuli and/or lower perceptual expectancies (Courchesne, Hillyard, & Galambos, 1975; Donchin & Coles, 1988; Kok, 2001; Spencer, Dien, & Donchin, 2001; Squires, Wickens, Squires, & Donchin, 1976). This interpretation suggests a perceptual monitoring mechanism that works to automatically detect (and categorize) potentially interesting or behaviorally relevant novelties. If the LPC and the visual P300 (specifically the P3b) are related, multiple LPC generators in inferior temporal and parietal regions would be expected, as found in previous P3b source analyses (Halgren et al., 1998; Menon, Ford, Lim, Glover, & Pfefferbaum, 1997; Moores et al., 2003; Yamazaki et al., 2001). For example, Bledowski et al. (2004) identified an inferior temporal source for the P3b and argued that this generator is most likely associated with the categorization of visual stimuli.
In multistable perception tasks, regardless of whether the task is to report perceptual reversals across trials or to report which percept one experiences on each individual trial, a visual categorization process is required. Bledowski et al. (2004) suggested that parietal activity reflected in the P3b might represent goal-directed attention and/or visuomotor integration. Both of these processes are likely to be involved in perceptual reversal tasks, implying an additional role for visual attention, albeit at a later stage of stimulus evaluation. In this view, the RN represents the outcome of an attention-biasing signal, which influences how the figure is disambiguated, whereas the LPC represents a later processing stage in which attention is required to determine whether a reversal occurred in order to decide how one should respond.
Perceptual reversals are consistently associated with two ERP components, the reversal negativity (RN) and the late positive component (LPC). Source localization analyses suggest that the RN is generated by ventral occipital-temporal sources. These results, along with previous fMRI and EEG studies, suggest that the RN represents the receiving end of attention-biasing signals. The LPC is most likely associated with postperceptual processes such as novelty detection, categorization, and/or visual-motor integration and may share some of the same generators and functional characteristics as the visual P3b component.
This work was supported in part by NIH Grants 5 T32 MH20002 & 2 R01 EY016984-35. The Cartool software (http://brainmapping.unige.ch/Cartool.php) used for LAURA source analyses was programmed by Denis Brunet, from the Functional Brain Mapping Laboratory, Geneva, Switzerland, and is supported by the Center for Biomedical Imaging (CIBM) of Geneva and Lausanne.