The majority of decision-related research has focused on how the brain computes decisions over outcomes that are positive in expectation. However, much less is known about how the brain integrates information when all possible outcomes in a decision are negative. To study decision-making over negative outcomes, we used fMRI along with a task in which participants had to accept or reject 50/50 lotteries that could result in more or fewer electric shocks compared to a reference amount. We hypothesized that behaviorally, participants would treat fewer shocks from the reference amount as a gain, and more shocks from the reference amount as a loss. Furthermore, we hypothesized that this would be reflected by a greater BOLD response to the prospect of fewer shocks in regions typically associated with gain, including the ventral striatum and orbitofrontal cortex. The behavioral data suggest that participants in our study viewed all outcomes as losses, despite our attempt to induce a status quo. We find that the ventral striatum showed an increase in BOLD response to better potential gambles (i.e., fewer expected shocks). This lends evidence to the idea that the ventral striatum is not solely responsible for reward processing but that it might also signal the relative value of an expected outcome or action, regardless of whether the outcome is entirely appetitive or aversive. We also find a greater response to worse gambles in regions previously associated with aversive valuation, suggesting an opposing but simultaneous valuation signal to that conveyed by the striatum.
Many real-world decisions involve the possibility of both good and bad outcomes, but sometimes the choices are between bad and worse. Consider, for example, an individual who purchases a cell phone plan only to realize that the reception with that carrier is terrible. The individual is then faced with the decision to either stay with the carrier and suffer bad reception, or pay an exorbitant cancellation fee. In either case, the outcome is bad. The recent advent of neuroeconomics has brought new methods of analysis to the study of human decision-making, but the vast majority of these studies have focused on decisions in which all possible outcomes are non-negative (Knutson et al., 2001; McClure et al., 2004; Padoa-Schioppa and Assad, 2006; Preuschoff et al., 2006; Tobler et al., 2007). Because relatively few studies have examined decisions made entirely in the domain of losses, it is not clear how the brain gauges relative value when all of the outcomes are bad. One hypothesis regarding valuation in the brain suggests that the utility of positive outcomes is evaluated by a separate neural system from that of negative outcomes. In its simplest form, the dual-systems hypothesis associates the ventral striatum and the orbitofrontal cortex (OFC) exclusively with the evaluation of gains (Mirenowicz and Schultz, 1996), and the amygdala and insula exclusively with the evaluation of losses (Yacubian et al., 2006).
There is some evidence that the striatum and other orbitostriatal structures are involved in both gain and loss processing (Delgado et al., 2003; Seymour et al., 2007; Tom et al., 2007). However, most of these studies pitted a potential gain against a loss, used a medium that is generally rewarding (money), or focused solely on the anticipation of the gain or loss. It is therefore not clear whether, when all outcomes are negative, the striatal system would still be engaged or whether a separate system would perform the decision-making processing. Some research suggests that anticipation and experience of aversive stimuli activate the striatum (LaBar et al., 1998; Becerra et al., 2001; Jensen et al., 2003; Seymour et al., 2004). Indeed, there are populations of dopaminergic neurons that respond to aversive stimuli (Coizet et al., 2006; Matsumoto and Hikosaka, 2009). Thus, we hypothesized that the striatal system also processes the value of non-rewarding stimuli during the decision-making process itself, as opposed to solely the anticipation of the stimuli. For painful outcomes (electric shocks), we predicted that fewer electric shocks from a reference amount would be viewed as a “gain” and more electric shocks as a “loss.” Furthermore, we predicted that the ventral striatum would be involved in processing “gains” (fewer electric shocks) even though the overall outcome medium was always unpleasant.
To test these hypotheses, we used fMRI along with a gambling task involving electric shocks. In a manner similar to that in the task used by Tom et al. (2007), participants were asked to accept or reject a 50/50 gamble of “more” or “fewer” electric shocks compared to a reference amount that they received at the beginning of each trial. If participants rejected the gamble, they received the reference amount of shocks. If they accepted the gamble, they either received “more” or “fewer” shocks from the reference amount. Using this task, we tested whether participants’ choice behavior was consistent with an adaptation of their status quo to the reference level of shocks. Our analysis of the neuroimaging data focused on the period in which participants decided whether to accept or reject these lotteries. This allowed us to identify regions involved specifically in decision-making as opposed to the anticipation of the outcomes.
Thirty-six participants (18 female, 18 male; 18–45 years) were recruited from the Emory University campus. All participants were right-handed, reported no psychiatric or neurological disorders, or other characteristics that might preclude them from safely undergoing fMRI, and provided informed consent to experimental procedures approved by the Emory University Institutional Review Board. Participants received a base pay of $40.
A Biopac STM100C stimulator module with a STMISOC isolation unit (Biopac Systems, Inc., CA, USA) was used to deliver electric shocks cutaneously to the dorsum of the left foot through shielded, gold electrodes placed 2–4 cm apart. The STMISOC unit controlled current output to the electrodes, with each pulse lasting 15 ms. The stimulator module was connected via a serial interface to a laptop which controlled the timing and delivery of the shocks.
Prior to scanning, shock intensity was calibrated by finding each participant's “maximum shock intensity”, Imax. Participants were told that their maximum shock intensity would be set to the highest intensity that they could bear. For the calibration procedure, each trial consisted of 18 shocks over 340 ms (the maximum number per trial in the subsequent experiment). The current was slowly increased until participants notified the experimenter that they could not bear it anymore, and this current level was set as their Imax. The current level for all shocks throughout the experiment was set at 90% of Imax.
To gain familiarity with the different numbers of shock outcomes, participants were passively exposed to all possible outcomes. An attempt to induce a status quo of 10 shocks was made by subjecting participants to the 10 shocks at the beginning of each trial. On each outcome, the number of shocks (SN) was evenly spaced in time over 340 ms, yielding an inter-pulse interval of 340/(SN − 1) ms. This was done to avoid confounding the number of shocks with the total duration of shocks. The number of shocks, SN, within a trial was 2, 3, 4, 5, 6, 8, 10, 12, 15, or 18. These numbers were determined based on previous literature that suggests that the Weber fraction for many stimuli ranges from 0.01 to 0.10, meaning that a difference of at least 1–10% between stimuli is needed in order for them to be distinguishable from each other (Teghtsoonian, 1971; Lavoie and Grondin, 2004). To ensure that participants could distinguish between different numbers of shocks, a difference in number of at least 25% between shocks was used. The status quo was set at 10 shocks, and so a relative gain was framed as “2, 4, 5, 6, 7, or 8 less” and a relative loss as “2, 5, or 8 more.”
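The spacing rule above can be illustrated with a minimal sketch. It assumes that the 340 ms window runs from the onset of the first pulse to the onset of the last pulse, which the text does not state explicitly; the function name `pulse_onsets_ms` is hypothetical and is not part of the stimulus-delivery software used in the study.

```python
# Illustrative sketch only: evenly spaced pulse onsets within a fixed window.
# Assumption: the window is measured from first-pulse onset to last-pulse onset.

SHOCK_COUNTS = [2, 3, 4, 5, 6, 8, 10, 12, 15, 18]  # SN values listed in the text
WINDOW_MS = 340.0                                   # total window from the text


def pulse_onsets_ms(n_shocks: int, window_ms: float = WINDOW_MS) -> list:
    """Return onset times (ms) for n_shocks pulses evenly spaced over window_ms."""
    if n_shocks < 2:
        raise ValueError("the interval 340/(SN - 1) is only defined for SN >= 2")
    interval = window_ms / (n_shocks - 1)  # inter-pulse interval, 340/(SN - 1)
    return [i * interval for i in range(n_shocks)]


if __name__ == "__main__":
    for sn in SHOCK_COUNTS:
        onsets = pulse_onsets_ms(sn)
        # The interval shrinks as SN grows, while the total duration stays fixed.
        print(sn, round(onsets[1] - onsets[0], 2))
```

Because every train spans the same window, only the pulse count (and hence the pulse rate) differs between outcomes, which is the confound control described above.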
Following the calibration phase of the experiment, participants entered the scanner to begin the experimental phase, which was modeled after a monetary gambling paradigm used by Tom et al. (2007). Each trial began with a status quo (10 shocks), which was indicated by the presentation of a circle with the text “Sure Thing” centered in the middle (see Figure 1). After 2 s, this circle turned yellow, indicating the impending onset of the shocks, which occurred after a further 2 s. Following an interstimulus interval (ISI) of 3 s, a 50/50 gamble appeared with the words “Accept” and “Reject” below it. This gamble consisted of two possible outcomes, indicated by separate, equally sized slices of the circle, where the left side was always more potential shocks and the right side always fewer potential shocks. The number of shocks more and less than the reference amount varied between trials, such that every possible combination of shocks was presented.
Two seconds after presentation of the gamble, participants were allowed to “Accept” or “Reject” the gamble by using a button box in the scanner. If participants accepted the gamble, a pink ball flipped between options for a varying amount of time between 3 and 6 s, landing with a 50/50 chance on the more shocks or fewer shocks outcome. The side on which the ball landed turned yellow, indicating the outcome of the gamble and impending shocks, which occurred 4.7 s after the outcome was revealed. If participants rejected the gamble, an identical presentation including the ball-flip and outcome selection occurred. However, in this case the reference shocks were the only possible outcome. After the shocks were administered, the outcome remained on screen for 3 s, and was followed by an inter-trial interval (ITI) of 3 s. The experimental phase consisted of three runs with 18 trials per run (54 trials in total). Trials were randomly ordered for each run within-subjects, but remained the same between-subjects. COGENT 2000 (FIL, University College London) was used for stimulus presentation and response acquisition for this phase.
To confirm that participants could distinguish between the different numbers of shocks and that increasing shocks were increasingly averse, participants rated all possible sets of shocks relative to the reference shocks (after the above procedure but while still in the scanner). A visual analog scale (VAS) was presented on screen, with a white arrow in the center labeled as “reference shocks.” Participants were given the reference shocks, and then were given another set of shocks, blinded to the number. They were asked to rate “How much better or worse it is from your reference,” by moving the arrow on screen either left (“better”) or right (“worse”). All possible sets of shocks were given three times each for a total of 30 data points.
Functional imaging was performed with a Siemens 3 T Trio whole-body scanner. T1-weighted images (TR = 2300 ms, TE = 3.04 ms, flip angle = 8°, 192 × 146 matrix, 176 sagittal slices, 1 mm cubic voxel size) were acquired for each subject prior to the three experimental runs. For each experimental run, T2*-weighted images using an echo-planar imaging sequence were acquired, which show blood oxygen level-dependent (BOLD) responses (echo-planar imaging, TR = 2350 ms, TE = 30 ms, flip angle = 90°, FOV = 192 mm × 192 mm, 64 × 64 matrix, 35 3-mm thick axial slices, and 3 mm³ voxels).
fMRI data were analyzed using SPM5 (Wellcome Department of Imaging Neuroscience, University College London) using a standard 2-stage random-effects regression model. Data were subjected to standard preprocessing, including motion correction, slice timing correction, normalization to an MNI template brain, and smoothing using an isotropic Gaussian kernel (full-width half-maximum = 8 mm).
Four main regressors were included in the first-level models. (1) The status quo shock at the beginning of each trial was modeled as an impulse function. (2) The “decision” period, during which a decision to accept or reject the gamble was required, was modeled from the onset of gamble presentation until button press. The expected value of the gamble was also included as a parametric modulator for this period. (3) The “ball” period, in which the gamble outcome was resolved over a varying period of time between 3 and 6 s, was modeled as a variable duration function. (4) The “wait” period was modeled from the display of the gamble outcome to the receipt of the shocks. For this period, the number of shocks received was included as a parametric modulator. Subject motion parameters were also included as regressors. All regressors were convolved with the standard hemodynamic response function (HRF).
Because we were interested in investigating the neural basis of decision parameters that affect choice, the second-level analysis focused on the decision period (#2 above). To identify regions involved in valuation during choice, we first identified regions showing correlations with the expected value of the gamble. We assumed shocks are “bad” and have negative value; for example, the reference shocks would have an expected value (EV) of −10. We calculated the expected value of the gambles with the equation: EVgamble = −10 + (number of shocks less − number of shocks more)/2. EVgamble ranged from −7 for the best gamble to −13 for the worst gamble. This parameter was expected to directly affect choice, because a less negative EVgamble would indicate a better gamble and a more negative EVgamble a worse gamble, assuming individuals find electric shocks unpleasant. To further analyze the interaction between potential outcomes with less or more shocks within identified regions, we performed an ROI analysis using beta estimates from a different first-level model in which the number of shocks less and the number of shocks more than the reference amount were modeled by separate parametric modulators. This allowed us to identify the extent to which better and worse potential outcomes separately contributed to EVgamble.
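As a worked illustration of the expected-value formula, the following sketch enumerates the gambles implied by the task description (six "less" amounts crossed with three "more" amounts, relative to a reference of 10 shocks) and computes EVgamble for each. The names `ev_gamble`, `SHOCKS_LESS`, and `SHOCKS_MORE` are hypothetical labels introduced here for illustration only.

```python
# Illustrative sketch only: expected value of each 50/50 shock gamble,
# treating every shock as one unit of negative value.
from itertools import product

REFERENCE = 10                      # status quo number of shocks
SHOCKS_LESS = [2, 4, 5, 6, 7, 8]    # possible reductions from the reference
SHOCKS_MORE = [2, 5, 8]             # possible increases from the reference


def ev_gamble(n_less: int, n_more: int, reference: int = REFERENCE) -> float:
    """Expected value of a 50/50 gamble between (reference - n_less) and
    (reference + n_more) shocks, with shocks counted as negative value."""
    better_outcome = -(reference - n_less)
    worse_outcome = -(reference + n_more)
    return 0.5 * better_outcome + 0.5 * worse_outcome


# Every combination of "less" and "more" amounts: 6 x 3 = 18 gamble types.
gambles = list(product(SHOCKS_LESS, SHOCKS_MORE))
evs = {g: ev_gamble(*g) for g in gambles}

if __name__ == "__main__":
    for (n_less, n_more), ev in sorted(evs.items(), key=lambda kv: kv[1]):
        print(f"{n_less} less / {n_more} more: EV = {ev}")
```

Expanding the two-outcome average gives −10 + (n_less − n_more)/2, which matches the equation in the text; symmetric gambles (2/2, 5/5, 8/8) therefore share the reference value of −10.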
Finally, another first-level model was constructed in order to extract BOLD responses for each individual gamble type during the decision period. Instead of a single lottery period modulated by the number of shocks less and number of shocks more than the reference amount, this model included each lottery period associated with a different gamble as a separate regressor, such that there were 18 columns in the design matrix for the decision period, along with the remaining regressors that appeared in the primary first-level model described above. This allowed the average BOLD activity during the decision period for each separate gamble to be extracted. These values were then used to create “heat maps” of activation which give snapshots of how a particular region responds to all possible gambles.
For monetary payments, if an individual prefers to receive a certain payment rather than a gamble with the same expected value, he is said to be risk averse. If he instead prefers the gamble, he is said to be risk seeking. Prior research with monetary payments shows that on average, individuals are risk-averse (risk-seeking) for positive (negative) payoffs. We consider whether the shock quantities in our experiment are treated in the same way. To consider the issue, participant behavior in symmetric lotteries was analyzed. Symmetric lotteries were lotteries with the same amount of shocks less and more than the reference amount, and therefore had the same expected value as the reference shocks. Averaged across all runs for all participants, the symmetric lotteries were chosen over the reference shocks 74% of the time, which suggests risk-seeking behavior. For the individual symmetric lotteries of 8/8, 5/5, and 2/2, participants chose the lottery 56%, 78%, and 89% of the time, respectively. Interestingly, this was significantly different between the three symmetric lottery types (F(2,105) = 10.21; p < 0.0001).
As another indicator of overall risk-preference, the average indifference point across participants was determined by graphing the probability of choosing the lottery as a function of the expected value of the gambles. A sigmoidal curve, shown in Figure 2, was fit to the data using a logistic function to determine the average indifference point. If participants on average were risk neutral, their indifference point would equal the expected value of the reference shocks (−10). If participants were risk-seeking, their indifference point would be less than −10. The average indifference point was −10.94 shocks (f(−10.94) = 0.500 ± 0.218), indicating risk-seeking behavior. The reference point of −10 did not lie within the 95% confidence interval of the logistic fit (f(−10) = 0.720 ± 0.214), and therefore it is likely that this observed indifference point was significantly different from risk-neutrality.
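The indifference-point logic can be sketched as follows. This is not the fitting routine used in the study: it is a plain Newton-Raphson logistic fit written for illustration, and the acceptance proportions are synthetic, generated from a logistic curve with an assumed slope of 1 and an assumed indifference point of −10.94 so that the fit has something to recover. All function names are hypothetical.

```python
# Illustrative sketch only: fit p(accept) = 1 / (1 + exp(-(b0 + b1 * d))),
# where d = EV + 10 is the gamble's EV relative to the reference, and read off
# the indifference point as the EV at which p(accept) = 0.5.
import math


def logistic(d: float, b0: float, b1: float) -> float:
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * d)))


def fit_logistic(ds, ps, n_iter: int = 50):
    """Newton-Raphson maximization of the Bernoulli log-likelihood,
    using acceptance proportions ps observed at centered EVs ds."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for d, p in zip(ds, ps):
            mu = logistic(d, b0, b1)
            w = mu * (1.0 - mu)
            g0 += p - mu            # gradient w.r.t. intercept
            g1 += (p - mu) * d      # gradient w.r.t. slope
            h00 += w                # entries of X'WX
            h01 += w * d
            h11 += w * d * d
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def indifference_point(b0: float, b1: float, reference_ev: float = -10.0) -> float:
    """EV at which the fitted acceptance probability equals 0.5."""
    return reference_ev - b0 / b1


# Synthetic data (assumption, not the study's data): the 18 gamble EVs,
# centered on the reference, with proportions from a known logistic curve.
centered_evs = [(l - m) / 2 for l in (2, 4, 5, 6, 7, 8) for m in (2, 5, 8)]
TRUE_SLOPE, TRUE_INDIFF = 1.0, -10.94
synthetic_ps = [logistic(d, -TRUE_SLOPE * (TRUE_INDIFF + 10.0), TRUE_SLOPE)
                for d in centered_evs]

b0_hat, b1_hat = fit_logistic(centered_evs, synthetic_ps)
estimated_indifference = indifference_point(b0_hat, b1_hat)
```

An estimated indifference point below −10 means participants still accept gambles whose expected value is somewhat worse than the sure reference shocks, which is the risk-seeking pattern described above.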
To determine individual risk-preference, the curvature of the utility function, u(x) = x^α, was estimated for each participant using a non-linear least-squares regression. Participant values were not normally distributed nor were they lognormal, and therefore non-parametric statistics were used to test for significance. A Wilcoxon signed-rank test indicated that, on average, participant α values (median α = 0.934, SD = 0.309) were significantly different from one (p = 0.0381). Due to the method of estimation (where a larger expected value is a more unfavorable gamble), an α < 1 indicates convexity over losses and therefore a preference for risk-seeking behavior, whereas an α = 1 indicates a risk-neutral preference. In addition, average VAS ratings for each possible outcome in the study were computed and normalized to the reference shock ratings. When plotted, these ratings revealed a convex function resembling a value function over losses (see Figure 3). The slopes of the VAS ratings over more and less potential shocks were computed for each participant, using linear regression. A paired-samples t-test revealed that the slope for less potential shocks (M = 1.974, SD = 0.644) was significantly greater than the slope for more potential shocks (M = 0.920, SD = 0.548), p < 0.001, consistent with a convex value function.
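To make the link between α and risk preference concrete, the sketch below assumes that x is the number of shocks and that x^α is treated as a disutility (larger is worse), in line with the estimation convention described above. Under that assumption, α < 1 makes the 50/50 gamble less aversive in expectation than the sure reference shocks whenever the two have the same expected number of shocks. The function names are hypothetical.

```python
# Illustrative sketch only: power-function disutility over shock counts.
# Assumption: x is the number of shocks and x ** alpha is a disutility.


def disutility(n_shocks: float, alpha: float) -> float:
    return n_shocks ** alpha


def expected_disutility(outcomes, alpha: float) -> float:
    """outcomes is a list of (n_shocks, probability) pairs."""
    return sum(p * disutility(x, alpha) for x, p in outcomes)


def prefers_gamble(n_less: int, n_more: int, alpha: float,
                   reference: int = 10) -> bool:
    """True if the 50/50 gamble has strictly lower expected disutility
    than receiving the reference number of shocks for sure."""
    gamble = [(reference - n_less, 0.5), (reference + n_more, 0.5)]
    return expected_disutility(gamble, alpha) < disutility(reference, alpha)


if __name__ == "__main__":
    # Symmetric 8/8 lottery evaluated at the reported median alpha,
    # at risk neutrality, and at a hypothetical alpha above one.
    for alpha in (0.934, 1.0, 1.2):
        print(alpha, prefers_gamble(8, 8, alpha))
```

With α = 1 the symmetric gamble and the sure outcome are valued equally, so the strict comparison returns False; this corresponds to the risk-neutral case in the text.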
The expected value of the gambles, EVgamble, was used to identify brain regions involved in the valuation of gambles during the decision period (see Figure 4). Used as a parametric modulator, this allowed for identification of regions of the brain whose BOLD signal correlated with the objective gamble value. Positive correlations between EVgamble and BOLD activity were found in the visual cortex, intraparietal sulcus, frontal eye fields, and the left ventral striatum, among other areas (see upper portion of Table 1). A less negative EVgamble indicated a better gamble, which demonstrates that these regions responded in a graded manner to comparatively better possible outcomes – even though all outcomes were still painful. Given that all outcomes were aversive, it is interesting that ventral striatum activity increased for relatively “less bad” outcomes. Regions with negative EVgamble correlations, or a greater response for worse outcomes (more expected shocks), included the posterior cingulate, anterior cingulate (ACC), inferior parietal lobule, insula, and the lateral OFC (lower portion of Table 1).
To determine how these regions responded to the individual components of the gambles (less or more potential shocks), beta values for the lottery × number of shocks less and lottery × number of shocks more conditions were extracted from regions identified in the lottery × EVgamble contrast mentioned above. The left ventral striatum showed significant positive and negative correlations with the number of potential shocks less (better) and number of potential shocks more (worse), respectively. Other areas identified in the positive EVgamble contrast revealed the same relationship: a significant positive correlation with the number of potential shocks less and negative correlation with the number of potential shocks more. The opposite trend was seen for several regions identified in the negative EVgamble contrast: significant positive correlations with the number of potential shocks more and negative correlations with the number of potential shocks less were observed in the insula, intraparietal sulcus, and dorsomedial prefrontal cortex (DMPFC). To visualize activity to each individual gamble type, we extracted beta values from ROIs in the lottery × EVgamble contrast for each gamble type. In the left ventral striatum, gambles with a higher EVgamble were associated with less deactivation, and gambles with a lower EVgamble were associated with more deactivation, as revealed in a heat map (see Figure 5). A heat map of beta values from the DMPFC for each gamble revealed less activation to gambles with a higher EVgamble, and more activation to gambles with a lower EVgamble (see Figure 5). In other words, more potential shocks elicited above-baseline BOLD activity in these regions. Similar activity was observed in the genual ACC (see Figure 5), with less deactivation for gambles with a lower EVgamble.
Contrary to the simplest form of the dual-systems view, which would predict no response from the ventral striatum to gambles consisting solely of losses, our results indicate that the ventral striatum encodes information regarding value irrespective of the type of outcome (e.g., “more” or “less” shocks) and whether the outcomes are globally “good” or “bad” (e.g., appetitive or aversive). In particular, the positive correlation of left ventral striatal activity with the expected value of the shock lotteries supports its role in valuation and extends this to include the relative valuations of “bads.” While previous neuroimaging studies have demonstrated the role of the striatum in integrating the value of rewards with a variety of costs (Tom et al., 2007; Croxson et al., 2009; Talmi et al., 2009), our results extend these findings to the domains of pain and loss even when there is no possibility of gain.
That these decisions were viewed as occurring in the loss domain is reinforced by the fact that, despite being exposed to the reference shocks for each trial, participants viewed every outcome as a “loss.” This was evidenced by consistent risk-seeking behavior over the full range of lotteries and a larger slope for less shocks than more shocks relative to the status quo for the VAS ratings. These results are consistent with past research showing risk-seeking behavior over hypothetically painful outcomes (Eraker and Sox, 1981). Interestingly, this risk-seeking behavior cannot explain the changes in striatal activation as others have suggested (Fiorillo et al., 2003; Preuschoff et al., 2006) because the variance of the best and worst lotteries is the same in our task. One possible reason for this lack of status quo inducement is the transient nature of the reference shocks. Although participants were presented with reference shocks between each trial, the majority of the time participants were not experiencing painful stimuli. It is possible that a constant painful stimulus, such as would arise with the use of capsaicin to induce a constant state of pain which can then be attenuated or exacerbated with temperature, might be more effective in inducing a status quo (Seymour et al., 2005).
It is important to distinguish between the loss of something desirable, which has been investigated in a considerable number of prior studies, and the receipt of something undesirable, which has received less attention. Previous neuroeconomic studies of loss aversion have shown that the ventral striatum deactivates to the prospect of monetary loss (Tom et al., 2007). Similarly, striatal deactivation has been observed with increased effort and pain to obtain a monetary gain (Croxson et al., 2009; Talmi et al., 2009). These results point to the integrative role of the striatum in determining net value for monetary rewards but do not directly address its role in the relative valuation of things that are universally bad. Evidence exists, however, that the striatum dynamically scales for relative coding of value (Seymour and McClure, 2008). In a similar manner, dopamine neurons have been observed to adaptively code reward value (Tobler et al., 2005), so it is plausible that the striatum could exhibit adaptive signaling even in the realm of painful outcomes – for which we find strong evidence here.
Beyond the striatum's adaptive coding of value, its more general role in pain processing has been hotly debated (Leknes and Tracey, 2008). Some studies have shown ventral striatal activity during the anticipation of painful stimuli (Becerra et al., 2001; Jensen et al., 2003), a finding echoed by PET evidence of dopamine release to pain (Scott et al., 2006), while others have argued this activity merely reflects the anticipated relief (Baliki et al., 2010). Still others have suggested that the ventral striatum functions more generally in motivated behavior (Horvitz, 2000; Zink et al., 2003, 2004; Delgado et al., 2004; Nicola et al., 2004; Leknes and Tracey, 2008). Our results showed increased ventral striatal activity in anticipation of fewer shocks, which suggests that the striatum is not simply functioning to prime the system to avoid pain – i.e., an analgesic effect (Scott et al., 2006, 2007; Wood and Holman, 2009). If that were the case, we would expect to see increased striatal activity to more potential shocks. Instead, we observed the opposite trend, precluding an analgesic explanation.
Although the aforementioned discussion pertains to the role of the striatum in relative valuation, we also find evidence for such signals in cortical regions classically associated with pain and punishment evaluation (Bechara et al., 1998; O'Doherty et al., 2001; Koyama et al., 2005; Kringelbach, 2005; Raij et al., 2005; Seymour et al., 2005). These regions appear to signal valuation in an inverse manner from the striatum, with both systems operating in synchrony during the decision period. Indeed, evidence for the co-existence of both appetitive-valuation and aversive-valuation signals in the brain exists, with the aversive-valuation signals residing in some of the same regions that we observe, namely in the lateral OFC and genual ACC (O'Doherty et al., 2001; Small et al., 2001; Gottfried et al., 2002; Seymour et al., 2005; Nitschke et al., 2006). In our study, the lateral OFC and genual ACC convey this valuation information during decision-making itself over painful stimuli, as opposed to only during passive learning tasks. This builds on prior evidence for these structures' roles in signaling bad outcomes, perhaps to facilitate reversal-learning or changes in action, as has been suggested (Kringelbach and Rolls, 2003; Seymour et al., 2005). Furthermore, given past research showing lateral OFC and genual ACC activation to non-painful but aversive stimuli, such as monetary loss (O'Doherty et al., 2001; Liu et al., 2007) and unpleasant odors (Gottfried et al., 2002; Rolls et al., 2003), this information is likely coded in a “common currency”, as has been suggested of activity in the orbitofrontal-striatal system (Montague and Berns, 2002; Murray et al., 2007).
In addition to valuation, the increase in BOLD response to worse gambles that we observed could be related to attention or cognitive control in general, which refers to the process by which attention, memory, and other cognitive abilities are shifted to accomplish a variety of goals. In addition to the lateral OFC and ACC, we found that the DMPFC signaled worse gambles with above-baseline activation in a location that has been recently implicated in decision-related control (Venkatraman et al., 2009), and that has been shown to be more active for more difficult decisions and for decisions that run counter to overall behavioral strategy (Paulus et al., 2002; Zysset et al., 2006; Hampton and O'Doherty, 2007). Though our experimental design does not allow us to separate these functions from valuation or vice-versa, it is likely the case that there exists a complex interplay between regions signaling aversive valuation, such as the lateral OFC, and higher-level decision-control regions which integrate these signals, possibly the DMPFC. Much like the striatum, activity in the DMPFC has been demonstrated during decision-making over a variety of stimuli, suggesting that its role might be independent of the type of outcome that is being decided on (Rushworth et al., 2005; Pochon et al., 2008; Venkatraman et al., 2009).
The current body of research in decision-making points to the idea of a universal valuation system that signals how “good” or “bad” a potential outcome is, relative to some reference point. Structures that were originally thought to be involved solely in reward processing during decision-making are increasingly being shown to be involved in the processing of punishing stimuli as well. Similar activity in these orbitofrontal-striatal regions is observed between more abstract punishments (e.g., monetary losses) and painful stimuli as we have shown here, much like the similar activation patterns for a wide variety of rewarding stimulus modalities. Future research might focus on how a baseline is determined for this valuation activity and whether it is directly related to the status quo, and whether loss aversion can be observed for non-monetary outcomes once a status quo has been set.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Supported by grants from NIDA (R01 DA016434 and R01 DA025045).