Participants

Forty-five healthy, right-handed, adult volunteers (22 male) participated in this study [ages, 18–31 years; mean (M) = 23.05]. Participants gave written informed consent and were financially compensated for their time ($15/hour). They received an extra bonus (M = $12.21, standard deviation (SD) = $7.75) proportional to the points earned during the experimental session. All procedures were approved by the Duke University Health System Institutional Review Board. Four participants were excluded from further data analysis due to technical difficulties during their experimental sessions, leaving a final sample of 41 participants (20 male).

Stimuli and task

We designed a probabilistic decision-making task using elements from the experimental designs of Gehring and Willoughby (2002) and Frank et al. (2004). Participants sat in front of a computer screen and performed 800 trials over the course of a single experimental session divided into 40, roughly 1.7-minute blocks. Subjects were told that each trial would start with the presentation of two symbols, and that some symbols tended to precede losses and other symbols tended to precede gains. They were instructed to try to learn which symbols were associated with which outcomes and to use that information to bet either 2 points or 8 points on each trial. Also, they were told that the probabilistic relationship between symbols and gains/losses would remain constant during the entire task. Subjects were also informed that although a monetary bonus proportional to the points earned during the session would be given, no information regarding the conversion from points to money would be provided until the end of the experiment. Before data collection, participants completed a 20-trial practice session using a set of symbols different from that used during data collection.

The temporal sequence of the task as it unfolded over a single trial is shown in . Each trial started with the presentation of a pair of symbols (Hiragana characters) and a fixation cross, which were displayed for 1500 ms. The pair of symbols presented on each trial was randomly selected, without replacement, from the set of 20 possible pairings of 5 unique symbols (). Considering that these were right-left counterbalanced, these 20 pairs actually corresponded to 10 unique (non-matching) combinations of symbols, so that each unique combination was presented twice per block.
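The pairing scheme described above can be sketched in a few lines of Python. This is only an illustration, not the presentation software used in the study: the labels A, B, M, Y and Z are the symbol labels introduced later in the text, and the function name `block_sequence` and the per-block shuffling are assumptions.

```python
import random
from itertools import permutations

SYMBOLS = ["A", "B", "M", "Y", "Z"]  # labels the text uses for the 5 symbols

def block_sequence(rng):
    """Return one block's 20 (left, right) pairs in random order.

    Hypothetical helper: shuffling the full set once per block is one way
    to sample "without replacement"; the original procedure may differ.
    """
    pairs = list(permutations(SYMBOLS, 2))  # 5 * 4 = 20 ordered pairings
    rng.shuffle(pairs)
    return pairs

block = block_sequence(random.Random(0))
# Ignoring left/right position leaves 10 unique combinations,
# each of which occurs twice within the block.
unique_combos = {frozenset(pair) for pair in block}
```

With 20 trials per block, 40 such blocks give the 800 trials of the session.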

After an inter-stimulus interval (ISI) jittered between 100 and 300 ms, two white squares with the numerals “8” and “2” depicting the wager-amount choices appeared randomly on the right and left of the fixation cross. Participants chose their wager amount for the trial by pressing a button with the hand corresponding to the side of the screen containing their wager preference. Feedback concerning the outcome of the trial was presented after an ISI jittered between 600 and 1000 ms and appeared as a green box around the chosen number if the participant won on that trial (i.e., gained that number of points) or as a red box around the chosen number if the participant lost that number of points. If no response was made within 1200 ms, the words “no response” and a box corresponding to losing 8 points were presented on the screen. The next trial started after an inter-trial interval (ITI) jittered between 800 and 1200 ms. Participants were instructed to maintain fixation on the fixation cross throughout the experimental runs.

The outcome’s valence (win or loss) on each trial was probabilistically determined according to the probability of winning [p(win)] associated with the presented stimulus pair (). The p(win) associated with each pair was calculated as an adjustment from 50% determined by each symbol: p(win)=0.5+p_{L}+p_{R}, where p_{L} and p_{R} are the adjustments associated with the symbol presented to the left and right of the screen, respectively (A = +0.3, B = +0.15, M = 0, Y = −0.15 and Z = −0.3). For example, the stimulus pair presented in corresponds to symbol labels A and Y, and following , p(win)_{AY} = chance + p(win)_{A} + p(win)_{Y} = 0.5 + 0.3 − 0.15 = 0.65.
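As a worked check of this rule, the following Python sketch computes p(win) for any pair from the per-symbol adjustments listed above. The names `ADJUSTMENT`, `p_win` and `draw_outcome` are illustrative, and drawing the outcome with a uniform random number is an assumption about how the probabilistic determination could be implemented.

```python
import random

# Per-symbol adjustments from the text (A = +0.3, ..., Z = -0.3).
ADJUSTMENT = {"A": 0.30, "B": 0.15, "M": 0.0, "Y": -0.15, "Z": -0.30}

def p_win(left, right):
    """p(win) = 0.5 + p_L + p_R for the symbols shown left and right."""
    return 0.5 + ADJUSTMENT[left] + ADJUSTMENT[right]

def draw_outcome(left, right, rng):
    """Sample 'win' or 'loss' for one trial (illustrative only)."""
    return "win" if rng.random() < p_win(left, right) else "loss"

example = p_win("A", "Y")  # 0.5 + 0.3 - 0.15, i.e. about 0.65
outcome = draw_outcome("A", "Y", random.Random(1))
```

Because the two symbols in a pair always differ, the resulting probabilities range from 0.05 (Y with Z) to 0.95 (A with B).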

Most importantly for our research questions, participants could make a choice that would influence the magnitude of outcomes, but they had no control over the valence of the result. Optimal behavior entailed betting 8 points each time that a likely winning pair (i.e., p(win) > 0.5) was presented, and betting 2 points each time that a likely losing pair (i.e., p(win) < 0.5) was presented.

Besides magnitude (small or large) and valence (win or loss), feedback in the task also conveyed information about the relative value of the feedback compared to the outcome that “would-have-been” if the alternative wager amount had been selected. This variable, which accords roughly with intuitions of “rejoice” or “regret”, was labeled as “relative outcome”. This can be seen in . Thus, the +8 and −2 outcomes reflect the best possible gain and loss, respectively, given that in each case the alternative outcome would be 6 points worse (i.e. +2 and −8, respectively).
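The relative-outcome coding can be made explicit with a small sketch. The function name `relative_outcome` and the labels "best" and "worst" are illustrative choices; the logic follows the description above, in which the counterfactual outcome has the same valence but the alternative wager amount.

```python
def relative_outcome(bet, won):
    """Return (obtained, counterfactual, label) for one trial.

    bet is the wager chosen (2 or 8); won is True for a gain.
    """
    if bet not in (2, 8):
        raise ValueError("bet must be 2 or 8")
    other_bet = 2 if bet == 8 else 8
    sign = 1 if won else -1
    obtained = sign * bet
    counterfactual = sign * other_bet  # same valence, alternative wager
    label = "best" if obtained > counterfactual else "worst"
    return obtained, counterfactual, label

all_cases = [relative_outcome(b, w) for b in (8, 2) for w in (True, False)]
```

Under this coding, +8 and −2 are the "best" outcomes and +2 and −8 the "worst", each differing from its counterfactual by 6 points.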

EEG recording and pre-processing

The electroencephalogram (EEG) was recorded continuously from 64 channels mounted in a customized, extended coverage, elastic cap (Electro-Cap International, www.electro-cap.com) using a bandpass filter of 0.01–100 Hz at a sampling rate of 500 Hz (SynAmps, Neuroscan). All channels were referenced to the right mastoid during recording. The positions of all 64 channels were equally spaced across the customized cap and covered the whole head from slightly above the eyebrows in front to below the inion posteriorly (Woldorff et al., 2002). Impedances of all channels were kept below 5 kΩ, and fixation was monitored with electro-oculogram (EOG) recordings. Recordings took place in an electrically shielded, sound-attenuated, dimly lit, experimental chamber.

Offline, EEG data were exported to MATLAB (MathWorks) and processed using the EEGLAB software suite (Delorme and Makeig, 2004) and custom scripts. The data were low-pass filtered at 40 Hz using linear finite impulse response (FIR) filtering, down-sampled to 250 Hz and re-referenced to the algebraic average of the left and right mastoid electrodes. For each participant, we implemented a procedure for artifact removal based on independent component analysis (ICA). This approach has been used in a number of studies (Debener et al., 2005; Eichele et al., 2005; Scheibe et al., 2010) to obtain EEG data with diminished contribution of ocular/biophysical artifacts. First, we visually rejected unsuitable portions of the continuous EEG data. This procedure resulted in the exclusion of 20 trials on average (± SD = 8.36 trials) from the original 800-trial-long dataset for each participant. Secondly, we separated the data into 1200-ms feedback-locked epochs, spanning from 400 ms before to 800 ms after the onset of the feedback stimulus, with a prestimulus baseline period of 200 ms. Thirdly, we performed a temporal infomax ICA (Bell and Sejnowski, 1995). With this analysis, independent components with scalp topographies and signals that could be assigned to known stereotyped artifacts (e.g., blinks) based on their distribution across trials, their component waveform, and/or their spectral morphologies, were removed from the data (Jung et al., 2000a; Jung et al., 2000b; Delorme et al., 2007). The remaining components were back-projected to the scalp to create an artifact-corrected dataset.
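The epoching step can be illustrated with a minimal single-channel sketch in plain Python. It is not the EEGLAB routine used in the study: the function names are hypothetical, and treating the epoch as ending just before +800 ms (300 samples at 250 Hz) is an assumption.

```python
FS = 250  # Hz, sampling rate after down-sampling

def ms_to_samples(ms, fs=FS):
    """Convert a duration in milliseconds to a number of samples."""
    return int(round(ms * fs / 1000))

def extract_epoch(signal, onset, pre_ms=400, post_ms=800, base_ms=200):
    """Cut one feedback-locked epoch and subtract the baseline mean.

    signal is a list of voltages for one channel; onset is the sample
    index of feedback onset. The baseline is the last base_ms before onset.
    """
    n_pre = ms_to_samples(pre_ms)
    start = onset - n_pre
    stop = onset + ms_to_samples(post_ms)  # exclusive end (assumption)
    if start < 0 or stop > len(signal):
        raise ValueError("epoch extends beyond the recording")
    epoch = signal[start:stop]
    baseline = epoch[n_pre - ms_to_samples(base_ms):n_pre]
    baseline_mean = sum(baseline) / len(baseline)
    return [v - baseline_mean for v in epoch]

# Toy signal: flat at 5 before feedback onset (sample 500), 7 afterwards.
toy_signal = [5.0] * 500 + [7.0] * 500
toy_epoch = extract_epoch(toy_signal, onset=500)
```

In the toy example the prestimulus samples become 0 and the post-feedback samples become 2 after baseline subtraction.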

Previous studies have consistently found that the FRN has a frontocentral distribution with a peak of amplitude over the standard 10–20 FCz location at around 250 ms after feedback onset (Miltner et al., 1997; Gehring and Willoughby, 2002; Nieuwenhuis et al., 2004). On the other hand, the P3 has been conceptualized as being formed by two subcomponents: the P3a with a frontocentral distribution and a maximum amplitude between 300 and 400 ms following stimulus presentation, and the P3b with a parietocentral distribution and a peak of amplitude occurring between 60 and 120 ms later (Nieuwenhuis et al., 2005; Polich, 2007). In order to assess the FRN and the P3a we used a region-of-interest (ROI) cluster of seven sensors centered on the canonical channel FCz as a frontal region of interest (frontal-ROI). In order to assess the P3b we used a parietal-ROI cluster of seven sensors centered on channel Pz.

On frontal sites, the FRN appears superimposed on the P3a, and as several studies have noted, the FRN peak can be shifted depending on the amplitude of this frontal P3 (Yeung and Sanfey, 2004; San Martin et al., 2010; Chase et al., 2011). This is consistent with the idea that scalp-recorded neuroelectrical activity corresponds to the linear sum of the activity of a discrete set of neural sources (Baillet et al., 2001). Thus, to more effectively quantify the FRN amplitude accounting for differences in the P3-induced baseline, we used a mean-amplitude-to-mean-amplitude approach. More specifically, the FRN amplitude for each trial was calculated in the frontal-ROI as the average potential across a 204–272 ms window post-feedback (i.e., relative to the feedback stimulus onset) minus the average voltage potential from a short 188–200 ms window preceding it (note that the effective sampling rate was 250 Hz, and thus these window lengths were all multiples of 4 ms). This approach accounts in part for the overlap between the FRN and P3 (cf., Yeung and Sanfey, 2004; Frank et al., 2005; Bellebaum et al., 2010; Chase et al., 2011).

In addition, given that differences between conditions were in fact observed before the onset of the FRN, we decided to also include an earlier window into our analyses. We refer to this activity as the P2, noting that it may represent an early stage of the slower-wave P3a. We measured the P2 amplitude on the frontal-ROI as the average ERP voltage potentials from a 152–184 ms post-feedback window. The P3a was quantified as the average potential from a 284–412 ms window in the frontal-ROI and the P3b as the average potential from a 416–796 ms window in the parietal-ROI, both relative to prestimulus baseline.
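The window-based measures for the four components can be sketched as follows, assuming each epoch is a list of ROI-averaged, baseline-corrected voltages sampled at 250 Hz and starting 400 ms before feedback onset. The function names and the inclusive treatment of both window endpoints are assumptions rather than details taken from the original scripts.

```python
FS = 250               # Hz
EPOCH_START_MS = -400  # time of the first sample relative to feedback

def window_mean(epoch, start_ms, end_ms):
    """Mean voltage from start_ms to end_ms (both inclusive, assumption)."""
    first = (start_ms - EPOCH_START_MS) * FS // 1000
    last = (end_ms - EPOCH_START_MS) * FS // 1000
    segment = epoch[first:last + 1]
    return sum(segment) / len(segment)

def component_amplitudes(frontal_epoch, parietal_epoch):
    """Return P2, FRN, P3a (frontal-ROI) and P3b (parietal-ROI)."""
    return {
        "P2": window_mean(frontal_epoch, 152, 184),
        # Mean-to-mean measure: FRN window minus the short preceding window.
        "FRN": (window_mean(frontal_epoch, 204, 272)
                - window_mean(frontal_epoch, 188, 200)),
        "P3a": window_mean(frontal_epoch, 284, 412),
        "P3b": window_mean(parietal_epoch, 416, 796),
    }

# Toy epoch whose "voltage" equals its own latency in ms (-400 ... 796).
ramp = [EPOCH_START_MS + 4 * i for i in range(300)]
amplitudes = component_amplitudes(ramp, ramp)
```

On the toy ramp each plain window mean equals the midpoint of its window, which makes the sample indexing easy to verify by hand.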

Overview of the data analysis

Through our analyses we wished to explore the relationship between individual differences in feedback-elicited brain activity and individual differences in choice behavior, particularly in gain-maximization and loss-minimization. However, our paradigm has learning and choice components that are difficult to distinguish from each other during the initial part of the experiment. In order to focus on the choice components of the processing, we excluded from our analyses the trials from the first quarter of the experimental session, using only the last three quarters of the session, which we took as representative of stable learned behavior (see and Behavioral results section).

Using behavioral metrics derived from subjects’ choices, we tested the hypothesis that neural differences between the worst gain (i.e., +2) and the best gain (i.e., +8) would scale with gain-maximization, while the neural differences between the worst loss (i.e., −8) and the best loss (i.e., −2) would scale with loss-minimization. In addition, we assessed the association between the amplitude of ERP components and trial-to-trial behavioral adjustment.

Behavioral data analysis

To extract individual scores in gain-maximization and loss-minimization, we characterized each subject by his/her observed probability to bet the larger amount on likely winning trials [p(win) > 0.5], neutral trials [p(win) = 0.5] and likely losing trials [p(win) < 0.5]. We then expressed these probabilities on a logit-function scale, γ = ln[*p*/(1 − *p*)], where *p* is the probability to bet the larger amount on a given trial. This logit transform allows for better characterizations of differences in probability at the low and high ends of the scale. γ coefficients were estimated for each of the three types of trials for each participant. For all our subsequent analyses, the strength of gain-maximization was measured by the γ estimate for likely winning pairs: the more positive that value was for a participant, the more likely the participant was to bet high on likely winning trials. On the other hand, the strength of loss-minimization was measured by the γ estimate for likely losing pairs multiplied by −1, such that the more positive this value was for a participant the more likely the participant was to bet low on likely losing trials.
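A minimal sketch of these two behavioral scores is given below. It assumes that γ is obtained by applying the standard logit, ln[p/(1 − p)], directly to the observed proportion of high bets, which may differ from the estimation procedure actually used. The function names and the example counts are hypothetical.

```python
import math

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("logit is undefined for p = 0 or p = 1")
    return math.log(p / (1.0 - p))

def behavioral_scores(n_high_win, n_win, n_high_loss, n_loss):
    """Return (gain-maximization, loss-minimization) scores.

    n_high_win / n_win: high bets and total trials with p(win) > 0.5.
    n_high_loss / n_loss: high bets and total trials with p(win) < 0.5.
    """
    gain_max = logit(n_high_win / n_win)
    loss_min = -logit(n_high_loss / n_loss)  # sign flipped, as in the text
    return gain_max, loss_min

# Hypothetical participant: bets high on 90% of likely winning trials
# and on 10% of likely losing trials.
gain_max, loss_min = behavioral_scores(90, 100, 10, 100)
```

For this hypothetical participant both scores equal ln(9), about 2.2, reflecting equally strong gain-maximization and loss-minimization.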

Gain-maximization, loss minimization, and ERPs for worst vs. best outcomes

Our first ERP data analysis tested the hypothesis that the difference between the neural responses elicited by feedback stimuli indicating the worst (+2) versus best (+8) gains would scale with gain-maximization and that the difference between the neural responses elicited by feedback stimuli indicating the worst (−8) versus best (−2) losses would scale with loss-minimization. As such, we computed four ERPs for each participant, each ERP corresponding to averaged EEG activity time-locked to the presentation of each of the four feedback stimulus types. After removing the first quarter of the data to minimize the effect of learning (see above), 154 trials on average went into the ERP for +8 (SD = 32.70), 139 into the ERP for +2 (SD = 39.18), 207 trials into the ERP for −2 (SD = 40.01), and 84 trials into the ERP for −8 (SD = 35.45). Then we computed the difference in the ERP signal for conditions +2 minus +8 and −8 minus −2 for each participant. Finally, we performed a multiple linear regression using gain-maximization and loss-minimization scores for each participant (i.e., γ_{gain-max} and γ_{loss-min}) as explanatory variables for such ERP differences, according to the following equations:

ERP_{(+2) − (+8)} = β_{0} + β_{1}γ_{gain-max} + β_{2}γ_{loss-min} + ε

ERP_{(−8) − (−2)} = β_{0} + β_{1}γ_{gain-max} + β_{2}γ_{loss-min} + ε

where ε is a vector of error terms.

We performed this analysis separately for the P2, FRN, P3a and P3b components. For correction of multiple comparisons, we applied the step-down method described by Holm (1979).
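The Holm step-down correction can be sketched in plain Python as below. The function name `holm_adjust` and the example p-values are illustrative only, and which tests formed a family in the original analysis is not specified here.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original order.

    The smallest p-value is multiplied by m, the next by m - 1, and so on;
    a running maximum keeps the adjusted values monotone, and each value
    is capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

# Hypothetical p-values for four tests (e.g., one per ERP component).
raw_p = [0.01, 0.04, 0.03, 0.005]
adjusted_p = holm_adjust(raw_p)
```

A test is declared significant at level α when its adjusted p-value is at or below α, which is equivalent to the sequential rejection rule of the step-down procedure.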

ERP correlates of behavioral adjustment

The last analysis was intended to assess directly the association between each ERP component and the behavioral adjustment on subsequent trials. We considered each trial *t* in terms of the two cue symbols that were presented (S1_{t} and S2_{t}) and the bet that was wagered. We then asked whether the wager choice (high or low) was the same or different on the next trial that presented S1_{t} and/or S2_{t}. This analysis was done separately for the next appearance of each one of these symbols, regardless of whether they appeared paired together or individually with other symbols. That is, a trial on which symbols labeled A and Y were presented was compared both with the next trial to present A and the next trial to present Y (which may have been the same trial). We defined a switch variable that assumed the value 0 if the same bet was chosen on each of these trials, 1 if the bet was repeated on one of these trials but changed on the other, and 2 if neither trial resulted in the same bet. Note that this particular scoring ignores the temporal delay between subsequent presentations of the stimuli and precludes any interaction between the elements of each pair.
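The switch variable can be computed as in the following sketch. The trial representation (a tuple of left symbol, right symbol and bet), the function name `switch_scores`, and the use of `None` for trials where a symbol never reappears are assumptions made for illustration; how such trials and no-response trials were handled is not stated in the text.

```python
def switch_scores(trials):
    """Score each trial 0, 1 or 2 by how many symbol-wise bets changed.

    trials is a list of (left_symbol, right_symbol, bet) tuples in
    presentation order. For each symbol on trial t, the bet is compared
    with the bet on the next trial showing that symbol (either side).
    """
    scores = []
    for t, (s1, s2, bet) in enumerate(trials):
        n_switches = 0
        complete = True
        for symbol in (s1, s2):
            nxt = next((later for later in trials[t + 1:]
                        if symbol in later[:2]), None)
            if nxt is None:  # symbol never shown again (assumption: skip)
                complete = False
                break
            n_switches += int(nxt[2] != bet)
        scores.append(n_switches if complete else None)
    return scores

example_trials = [("A", "Y", 8), ("A", "B", 8), ("Y", "Z", 2), ("A", "Y", 2)]
example_scores = switch_scores(example_trials)
```

In the example, the first trial scores 1: the bet is repeated on the next trial containing A but changed on the next trial containing Y.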

Evoked responses corresponding to the P2, FRN, P3a, and P3b components were each entered as dependent variables to fit a linear mixed model using different levels of ‘*outcome*’ and ‘*adjustment level*’ as fixed effects and a participant’s identifier (μ) as a random effect.

In this model, ‘*outcome*’ was a categorical variable with three levels (+2, −2 and −8), and ‘*adjustment level*’ was likewise a categorical variable with two levels (1 = switching for one symbol, 2 = switching for both symbols). We used the +8 outcome and the not-switching condition as the constant (reference condition) for the model. In other words, the β estimates that we report below for the fixed effects reflect expected deviations from the ERP components elicited by the best gain (i.e., +8), when such a result was followed by the same choice (i.e., a large bet) on the immediately following trial(s) wherein the same cue symbols were presented. For correction of multiple comparisons we applied the method described by Holm (1979).

After removing the data from the first quarter of the trials, 49 trials went into each of these 12 ERPs on average (SD = 27.64). The condition with the maximum number of trials across participants was ‘not switching after −2’ [Mean (M) = 93.23; SD = 38.85; range = 11–167] and the condition with the minimum number of trials was ‘not switching after −8’ (M = 20.51; SD = 19.31; range = 8–109). No condition was associated with fewer than 7 trials in any of the 41 participants included in the analysis.