Reward-predicting cues evoke activity in midbrain dopamine neurons that encodes fundamental attributes of economic value including reward magnitude, delay and uncertainty. Here, we demonstrate that dopamine release in rat nucleus accumbens encodes anticipated benefits but not effort-based response costs unless they are atypically low. This neural separation of costs and benefits indicates that mesolimbic dopamine scales with the value of pending rewards but does not encode the net utility of the action to obtain them.
For individuals to prosper in diverse environments, they need to use predictive sensory information to optimize outcomes in a flexible manner. Decision-making processes weigh the benefits of a reward against the cost of obtaining it to determine the overall subjective value (utility) of the transaction1,2. One neural substrate strongly implicated in this valuation process is dopamine. Midbrain dopamine neurons encode fundamental economic parameters pertaining to predicted rewards (magnitude, probability, delay and uncertainty) in their firing rate3–6 and innervate areas implicated in economic decision making (prefrontal cortex, amygdala, dorsal striatum and nucleus accumbens)7–9. Moreover, dopamine in the nucleus accumbens core (NAcc) enables animals to respond to cues and overcome effortful response costs10,11. However, to fully understand decision-making computations encoded by the mesoaccumbens dopamine pathway, the nature of the valuation signal needs to be deconstructed: specifically, how it accounts for changes in anticipated costs as well as benefits.
We employed fast-scan cyclic voltammetry to record phasic dopamine transmission in NAcc while rats performed decision-making tasks that independently manipulated either benefits or costs. All procedures were approved by the University of Washington Institutional Animal Care and Use Committee. Animals were trained to select between a reference option (sixteen lever presses for one food pellet) and an alternative that differed in either the reward magnitude (four or zero food pellets: benefit conditions) or response requirement (two or thirty-two lever presses: cost conditions) as described in Supplementary Methods. Cues signaling the availability of the reference and/or alternative options were presented either separately in “forced” trials or simultaneously in choice trials (Fig. 1a). Forced trials allowed the evaluation of cue-evoked dopamine for one option without the confound of another option being present, and choice trials provided a measure of behavioral preference. Data were evaluated after animals reached a behavioral criterion of choosing one option on ≥75% of choice trials. To prevent side-bias, the assignment of high-/low-utility options to the two levers was always reversed from the previous session, and counterbalanced sessions for each contingency pair were included in the analysis.
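The behavioral criterion above can be illustrated with a minimal sketch. It assumes that each choice trial is recorded as a label naming the chosen option; the function name `reached_criterion`, the labels and the example session are hypothetical and are not taken from the Supplementary Methods.

```python
from typing import Sequence


def reached_criterion(choices: Sequence[str], threshold: float = 0.75) -> bool:
    """Return True if any single option was chosen on at least
    `threshold` of the choice trials in one session.

    `choices` holds one label per choice trial (hypothetical encoding).
    """
    if not choices:
        return False  # no choice trials, so no measurable preference
    counts: dict[str, int] = {}
    for choice in choices:
        counts[choice] = counts.get(choice, 0) + 1
    # Fraction of trials on which the most frequently chosen option was picked.
    return max(counts.values()) / len(choices) >= threshold


# Invented session for illustration only: 16 of 20 choices go to one option.
example_session = ["alternative"] * 16 + ["reference"] * 4
print(reached_criterion(example_session))  # 16/20 = 0.80, which is >= 0.75
```

The check is deliberately symmetric: it asks only whether one option dominates, not which one, matching a criterion stated as "choosing one option on ≥75% of choice trials".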
Across all contingency pairs, animals consistently chose the option with the highest benefit or lowest cost (Fig. 1b; see Supp. fig. 4a for rate to criterion). Subjective preference was also evident on post-criterion forced trials, where response latencies were significantly faster to higher-benefit or lower-cost options (all p<0.001; Supp. fig. 4c). Furthermore, when the high-benefit (4 pellets for 16 lever presses) and the low-cost (1 pellet for 2 lever presses) options were presented as concurrent choices in a decision-making session, animals were indifferent, demonstrating equivalent utility (Supp. fig. 5). Thus, not only was the utility of reward options successfully modulated as expected by both benefit and cost conditions (i.e. increased utility conferred to the option with greater benefit or lower cost), but the additional utility conferred by increased benefits was also equivalent to that conferred by decreased costs.
Despite predictable behavior, cue-evoked NAcc dopamine release did not track utility under all conditions. Manipulating reward magnitude led to a corresponding increase (main effect of reward size: F(1,5)=15.61, p=0.01) or decrease (F(1,4)=19.88, p=0.01) in cue-evoked dopamine compared to the reference option (Fig. 1b, Supp. fig. 6). Manipulations of response cost, on the other hand, did not always alter dopamine release. When the response cost of the alternative was increased, there was no difference in dopamine release between the reference and alternative option (main effect of response cost: F(1,4)=0.05, p=0.84, Fig. 1b) despite the strong behavioral preference for the reference option. When the response cost was reduced, there was greater dopamine release to the low-cost cue than to the reference (F(1,4)=25.38, p=0.007), but this was only significant in the first of two counterbalanced sessions in each animal (session×option interaction: F(1,4)=10.92, p=0.03; Supp. fig. 6). Post-hoc tests indicated that this effect was driven by a reduction in dopamine release to the low-cost (p=0.0006) but not the reference cue (p=0.20) across sessions.
To further investigate across-session effects, we performed a regression analysis between utility encoding and experience with any "alternative" contingency prior to recording. Experience-related changes in cue-evoked dopamine release were only observed in the reduced-cost condition, where the preferential dopamine release for the low-cost cue diminished over time (Pearson’s r=−0.830, p=0.005, n=9; Spearman’s rho=−0.817, p=0.007; Fig. 2a). Additional experimentation with a cohort of animals given more experience (>9 sessions) with the high-benefit option prior to recording verified that both behavioral preference and preferential encoding of the higher benefits were maintained with extended training (p=0.007, t=4.08, df=6, n=7 sessions; Fig. 2b). Conversely, in a parallel experiment with the low-cost option, cue-evoked dopamine release did not preferentially encode the low-cost option after additional experience prior to recording (p=0.16, t=1.55, df=8, n=9 sessions), even though behavioral preference was preserved (Fig. 2b). These data are consistent with the notion that, while preferential encoding of high benefit by dopamine release is stable over training, low costs are only preferentially encoded early in training. Further analyses of the neurochemical data with respect to contextual framing, choice trials and within-session learning are included in the Supplementary Results.
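For readers unfamiliar with the two correlation statistics reported above, the following is a minimal sketch of how Pearson’s r and Spearman’s rho can be computed for paired per-session values. It assumes that "utility encoding" is summarized as one number per recording session and that experience is counted in prior sessions; the function names and the example values are hypothetical placeholders, not the recorded data, and the sketch does not compute p values.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between paired samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values: Sequence[float]) -> list[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Invented placeholder values, NOT data from the study.
prior_sessions = [1, 2, 3, 4, 5, 6]
encoding_index = [0.9, 0.7, 0.8, 0.4, 0.3, 0.1]
print(pearson_r(prior_sessions, encoding_index))
print(spearman_rho(prior_sessions, encoding_index))
```

Reporting both statistics, as the text does, guards against the result depending on a linear relationship or on a single extreme session, since Spearman’s rho uses only the rank order of the values.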
In making sound economic choices, one must consider a reasonable cost to obtain an outcome based on its perceived benefit. The data presented here demonstrate that phasic NAcc dopamine transmission reliably reflects the magnitude of the benefit, but only correlates with effort-discounted utility in situations where the response cost is both novel and better than the reference. Incorporating these findings with previous studies showing that dopamine enables effortful responses, we reason that representation of reward magnitude by phasic dopamine provides a threshold to determine worthwhile cost expenditures in familiar situations10–12. Moreover, in novel situations dopamine provides an additional opportunistic mechanism for exploitation of low-cost rewards that become available unexpectedly12–13. Thus, we show a dissociation between dopaminergic encoding of anticipated costs and benefits, demonstrating that, while dopamine release in the nucleus accumbens scales with the value of a pending reward, it is not sufficient to describe the net utility of the action to obtain it.
We would like to thank Scott Ng-Evans for invaluable technical support, Christina Akers and Sheena Barnes for assistance, and Jeremy Clark, Stefan Sandberg and Matthew Wanat for helpful comments. This work was funded by the National Institutes of Health (R01-MH079292, R21-AG030775, P.E.M.P.) and a Wellcome Trust Advanced Training Fellowship (M.E.W.). J.O.G. was supported by the National Institute of General Medical Sciences (T32-GM007270, Kimelman).
Author contributions: MEW and PEMP conceived the study. JOG and MEW collected and analyzed the data. All authors contributed to experimental design and preparation of the manuscript.