The event-related potential (ERP) technique uses simple averaging across trials to minimize noise in continuous electroencephalography (EEG) recordings. It relies on the assumption that incidental voltage fluctuations occur independently of trial timing. Using ERPs, Miltner and colleagues (1997) discovered a negative voltage deflection that occurs when humans receive task-related feedback indicating errors or failures to obtain rewards. This feedback-related negativity (FRN) is similar to a second ERP component, called the error-related negativity (Falkenstein et al., 1990; Gehring et al., 1993). Holroyd and Coles' (2002) reinforcement learning theory suggests that intrinsic signaling of the mesocortical dopamine system gives rise to both the FRN and the error-related negativity.
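The averaging assumption can be illustrated with a minimal simulation. This is only a sketch: the Gaussian-shaped component, the noise level, and the function names (`average_trials`, `simulate_trials`, `rms_error`) are assumptions chosen for illustration and are not taken from any of the studies cited. If noise is independent of trial timing, the residual error of the averaged waveform should shrink as more trials are included.

```python
import math
import random

def average_trials(trials):
    """Point-by-point mean across trials (each trial is a list of voltages)."""
    n = len(trials)
    return [sum(samples) / n for samples in zip(*trials)]

def simulate_trials(n_trials, signal, noise_sd, rng):
    """Each trial is the same time-locked signal plus independent Gaussian noise."""
    return [[s + rng.gauss(0.0, noise_sd) for s in signal] for _ in range(n_trials)]

def rms_error(estimate, signal):
    """Root-mean-square deviation of an averaged waveform from the true signal."""
    squared = [(e - s) ** 2 for e, s in zip(estimate, signal)]
    return math.sqrt(sum(squared) / len(signal))

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical 5 microvolt component peaking at sample 50 of 100.
signal = [5.0 * math.exp(-0.5 * ((t - 50) / 10.0) ** 2) for t in range(100)]

# Noise is four times larger than the component on any single trial.
err_few = rms_error(average_trials(simulate_trials(4, signal, 20.0, rng)), signal)
err_many = rms_error(average_trials(simulate_trials(400, signal, 20.0, rng)), signal)

print(f"RMS error with 4 trials: {err_few:.2f} microvolts")
print(f"RMS error with 400 trials: {err_many:.2f} microvolts")
```

The same logic explains why artifacts that are time-locked to the trial, unlike incidental noise, survive averaging.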
To date, the lack of a clear animal model of error and reinforcement ERPs has been a major hindrance. Work carried out with human subjects has obvious limitations. If error and feedback ERPs could be observed in an animal model, many new corridors for research would be opened. Researchers would be able to use operant conditioning paradigms to describe relationships between these ERP components and primary reinforcement. They would be able to carry out invasive pharmacological and electrophysiological manipulations in order to tease these components apart at systems and cellular levels. And they would be able to perform intracranial recordings to place these signals in the context of the existing neurobiological error and reinforcement monitoring literature (e.g., Stuphorn et al., 2000; Ito et al., 2003; Amiez et al., 2005; Sallet et al., 2007; Emeric et al., 2008).
In a recent paper in the Journal of Neuroscience, Vezoli and Procyk (2009) claim to have discovered the first such animal model, a homologue of the FRN in nonhuman primates. To carry out this investigation, the authors adopted a problem-solving task in which monkeys chose one of four stimulus locations in search of a liquid reward. In brief, four disk-shaped stimuli were presented in the corners of a touch screen after each trial onset. Selection of a certain disk location led to delivery of reward. After discovering the rewarded disk location, monkeys were allowed to select it several times in order to obtain additional rewards. Thus, monkeys underwent a “search period” during which they guessed the rewarded stimulus location and a “repetition period” during which they reaped rewards from that location. In the version of the task used to record physiological data, visual feedback was presented to inform monkeys that their selection would or would not be rewarded, providing a necessary control for low-level sensory processing of the liquid reward itself.
In the interest of brevity, I will omit behavioral results and focus on Vezoli and Procyk's electrophysiological findings. With the exception of one control, all electrophysiological data were collected from the 500 ms period following visual feedback. The main reported finding was a significant difference in ERP waveforms when negative vs. positive feedback was delivered. Viewed as a difference wave, the putative feedback-related potential appeared as a positive deflection peaking at ~170 ms followed by a negative deflection peaking at ~300 ms. The positive deflection may have had a fronto-medial distribution, but the electrode placement precludes definite conclusions; the deflection was maximal at the posterior portion of the recording site. The reported difference in ERPs was present after 7 months of testing, albeit at slightly different latencies, and it was attenuated after systemic administration of the dopamine antagonist haloperidol. The latter observation is important, because it suggests that the putative feedback-related components may depend on intact dopaminergic signaling, placing them within the framework of the reinforcement learning theory and linking them to the FRN described in the human literature. Additionally, a small difference was noted between feedback-related components on the first correct response at the close of the search period and on consecutive correct responses during the repetition period. The putative feedback-related components appeared to be slightly larger during the repetition period. As the authors note, this observation runs contrary to predictions made by the reinforcement learning theory, but it only reached statistical significance in a single subject and was inconsistent across recording sessions. Below, I will demonstrate that the authors' behavioral task and recording techniques have left open the possibility that artifacts may obscure task-related ERP components.
I will then demonstrate how one type of artifact in particular, the electrooculogram (EOG), can provide a parsimonious explanation for all findings cited.
Several possible artifacts render these findings difficult to evaluate with confidence. First, consider the feedback stimuli. Green disks always informed the animals of upcoming reward. Red disks or yellow stars informed the animals that reward would not be obtained. Different patterns of visually evoked ERP components may have been elicited by different colored stimuli, regardless of their associated reinforcement valence. The authors do not state whether the stimuli were matched for luminance. The authors provide some evidence against this interpretation by demonstrating context-dependent effects of the negative feedback, but this logic is not unassailable. A more appropriate control would have been to counterbalance the stimuli such that green disks represented negative outcomes during half of the sessions. Then, by collapsing the grand average waveforms across visual feedback conditions, the potential for a visual confound would have been removed.
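The arithmetic behind the counterbalancing argument can be sketched with a toy additive model. All amplitudes and names here (`VISUAL`, `VALENCE`, `erp`, `difference`) are hypothetical and chosen only for illustration; the sketch assumes that the color response and the valence response simply sum, and that both stimulus mappings contribute equal numbers of trials.

```python
# Hypothetical single-sample amplitudes in microvolts; none come from the study.
VISUAL = {"green": 3.0, "red": 4.5}            # response to stimulus color alone
VALENCE = {"positive": 0.0, "negative": -2.0}  # response to feedback meaning alone

def erp(color, valence):
    """Additive toy model: recorded amplitude = color response + valence response."""
    return VISUAL[color] + VALENCE[valence]

def difference(mapping):
    """Negative-minus-positive difference for one color-to-valence mapping."""
    return erp(mapping["negative"], "negative") - erp(mapping["positive"], "positive")

# Mapping used throughout the study: green signals reward, red signals no reward.
original = {"positive": "green", "negative": "red"}
# Counterbalanced sessions: the color-to-valence assignment is swapped.
swapped = {"positive": "red", "negative": "green"}

# With a single mapping, the color effect is mixed into the difference.
confounded = difference(original)

# Collapsing across both mappings cancels the color term and leaves valence only.
collapsed = (difference(original) + difference(swapped)) / 2

print("single mapping:", confounded)
print("collapsed across mappings:", collapsed)
```

In this toy case the single-mapping difference understates the valence effect because the color effect partly opposes it; with other amplitudes a color effect alone could just as easily create an apparent valence effect.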
A more serious potential confound arises from a failure to control for eye movements after the presentation of visual feedback. The retinal potential is a large voltage asymmetry that can cause biphasic EOG artifacts associated with eye movements in EEG recordings. These artifacts can often be an order of magnitude larger than neural signals of interest and are almost always controlled for in experiments with humans (Luck, 2005). Since Vezoli and Procyk present findings concerned with the difference between two conditions, we may imagine that EOG artifacts present little or no difficulty. To be precise, if we assume that eye movements were made in the positive and negative feedback conditions with the same average and variance of latency, amplitude, and direction, we can conclude that the EOG contribution will be similar with positive and negative feedback. In this case it will simply be subtracted, leaving only neural data in the difference wave. However, these assumptions are likely to be false for several reasons. First, during the training that monkeys received on the previous version of the task, a break in fixation caused a loss of reward. This probably caused residual fixation behavior on positive feedback trials. But all incentive to fixate would have been lost with presentation of the feedback on negative trials, leaving the monkey free to shift gaze back to the location of the fixation point, or to look around the room or at the experimenter in frustration. Second, we have investigated eye movement behavior following responses in another task and have found that monkeys tend to make more eye movements of larger amplitude following errors (unpublished results).
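The subtraction argument can be made concrete with a small noise-free sketch. The waveform shapes, amplitudes, latencies, and helper names (`eog`, `condition_erp`, `difference_wave`) are all assumptions for illustration rather than values from Vezoli and Procyk. When the average EOG contribution is identical in both conditions it cancels in the difference wave; when it differs, a biphasic difference appears even with no neural difference at all.

```python
import math

TIMES = range(0, 500, 10)  # ms after feedback onset

def gaussian(t, center, width):
    return math.exp(-0.5 * ((t - center) / width) ** 2)

def eog(t, latency, amplitude):
    """Hypothetical biphasic artifact: a positive lobe followed by a negative lobe."""
    return amplitude * (gaussian(t, latency, 25.0) - gaussian(t, latency + 80.0, 25.0))

def condition_erp(neural_amp, eog_amp, eog_latency=200.0):
    """Average waveform = assumed neural component at 250 ms + average EOG."""
    return [neural_amp * gaussian(t, 250.0, 30.0) + eog(t, eog_latency, eog_amp)
            for t in TIMES]

def difference_wave(negative, positive):
    return [n - p for n, p in zip(negative, positive)]

def peak_abs(wave):
    return max(abs(v) for v in wave)

# Case 1: a neural difference of -2 microvolts and identical eye movements
# in both conditions. The EOG term cancels in the subtraction.
matched = difference_wave(condition_erp(-2.0, 30.0), condition_erp(0.0, 30.0))

# Case 2: no neural difference, but larger average eye movements after
# negative feedback. The difference wave is pure artifact.
unmatched = difference_wave(condition_erp(0.0, 30.0), condition_erp(0.0, 10.0))

print(f"matched EOG, peak difference: {peak_abs(matched):.2f} microvolts")
print(f"unmatched EOG, peak difference: {peak_abs(unmatched):.2f} microvolts")
```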
If an EOG artifact contaminated the signal in the negative feedback condition of Vezoli and Procyk's study, we would expect the following pattern of results. First, we would predict a biphasic potential in the difference wave peaking within the range of normal saccade latencies for monkeys (~150-400 ms, μ ~200 ms) and showing spread or “smear” proportional to the variance of saccade latencies (compare Figure 1 with Vezoli and Procyk Figure 3c). Second, because the EEG measures the difference in voltage between each electrode site and a common reference, and since the electrode closest to the eyes was used as reference in the current study, we would expect the difference wave to be maximal at the most posterior electrodes (see Vezoli and Procyk Figure 3c). Third, because mean saccade latencies can vary from day to day based on a number of variables, we would not expect the peak latency of the difference wave to remain stable; it should vary with recording session. As predicted, the authors report changes of 20 and 30 ms in both directions for the peak latencies of difference waves recorded 7 months after the initial recordings. Fourth, we would expect the amplitude of the difference wave to decrease after haloperidol administration. At first glance, this prediction may seem unwarranted, but it is a simple and straightforward expectation based on an assumption of lengthening saccade latencies. The authors report increased reaction times after haloperidol administration, so it is reasonable to admit the possibility that saccade latencies were similarly increased. This would place the EOG artifacts later in time, outside of the period of interest, leading to much more similarity between the ERPs (note the shortening of the abscissa in Vezoli and Procyk Figure 5 panels C and D). In order to address this confound, the authors should have removed portions of the data that were contaminated by saccade artifacts, or excluded trials containing saccades altogether. 
This would have been trivial to do, since eye movements were monitored throughout the experiment with an infrared eye tracker.
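A rough simulation can illustrate two points from the argument above: later saccades push the artifact out of the analysis window, and excluding saccade trials removes the artifact from the average. This is a sketch under stated assumptions, not the authors' pipeline. The trial structure (a dict with `saccade_onset` and `eeg` keys), the artifact shape, the latency distributions, and the deliberately large latency shift are all hypothetical, and the trials contain only EOG artifact with no neural signal or noise.

```python
import math
import random

TIMES = list(range(0, 500, 10))  # samples at 0-490 ms after feedback, 10 ms steps

def saccade_artifact(t, onset, amplitude=40.0, width=20.0):
    """Hypothetical biphasic EOG deflection time-locked to saccade onset (ms)."""
    def lobe(center):
        return math.exp(-0.5 * ((t - center) / width) ** 2)
    return amplitude * (lobe(onset) - lobe(onset + 60.0))

def simulate_trials(n, mean_latency, sd_latency, p_saccade, rng):
    """Trials hold only artifact: a jittered saccade on some trials, flat otherwise."""
    trials = []
    for _ in range(n):
        if rng.random() < p_saccade:
            onset = rng.gauss(mean_latency, sd_latency)
            eeg = [saccade_artifact(t, onset) for t in TIMES]
        else:
            onset = None  # fixation maintained throughout the trial
            eeg = [0.0] * len(TIMES)
        trials.append({"saccade_onset": onset, "eeg": eeg})
    return trials

def average(trials):
    if not trials:
        return [0.0] * len(TIMES)
    return [sum(col) / len(trials) for col in zip(*(tr["eeg"] for tr in trials))]

def reject_saccade_trials(trials, window=(0.0, 500.0)):
    """Keep trials with no saccade onset inside the analysis window."""
    start, end = window
    return [tr for tr in trials
            if tr["saccade_onset"] is None
            or not (start <= tr["saccade_onset"] <= end)]

def peak_abs(wave):
    return max(abs(v) for v in wave)

rng = random.Random(1)  # fixed seed so the sketch is reproducible
baseline = simulate_trials(200, 200.0, 40.0, 0.5, rng)  # typical saccade latencies
delayed = simulate_trials(200, 600.0, 40.0, 0.5, rng)   # assumed lengthened latencies

peak_baseline = peak_abs(average(baseline))
peak_delayed = peak_abs(average(delayed))
kept = reject_saccade_trials(baseline)

print(f"peak artifact, typical latencies: {peak_baseline:.2f} microvolts")
print(f"peak artifact, delayed latencies: {peak_delayed:.2f} microvolts")
print(f"trials kept after rejection: {len(kept)} of {len(baseline)}")
```

In practice a rejection window would probably need to extend somewhat beyond the analysis window, since a saccade beginning just outside it can still leak into the recorded epoch.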
Eye movements may have varied slightly between the first correct trial at the close of the search period and correct trials during the repetition period. If so, similar logic can be applied to explain the marginal differences observed between these conditions. Animal models of error and feedback ERPs are sorely needed to advance our understanding of the neural mechanisms of performance monitoring. However, appropriate controls must be carried out to ensure the validity of these findings. Until potential confounds are addressed, a parsimonious explanation for the findings presented by Vezoli and Procyk is that the data reflect differences in eye movements between conditions.
The author would like to thank Dr. Jeffrey Schall for his helpful comments and suggestions throughout the preparation of this manuscript.
The author has been supported by T32-MH064913 and R01-MH055806.