Recent theories addressing mesolimbic dopamine’s role in reward processing emphasize two apparently distinct functions, one in reinforcement learning (i.e. prediction error) and another in incentive motivation (i.e. the invigoration of reward-seeking elicited by reward-paired cues). Here we evaluate the latter.
Using fast-scan cyclic voltammetry, we monitored, in real-time, dopamine release in the nucleus accumbens core of rats (n=9) during a Pavlovian-to-instrumental transfer task in which the effects of a reward-predictive cue on an independently-trained instrumental action were assessed. Voltammetric data were parsed into slow and phasic components to determine whether these forms of dopamine signaling were differentially related to task performance.
We found that a reward-paired cue, which increased reward-seeking actions, induced an increase in phasic mesolimbic dopamine release and produced slower elevations in extracellular dopamine. Interestingly, phasic dopamine release was temporally-related to and positively correlated with lever-press activity, generally, while slow dopamine changes were not significantly related to such activity. Importantly, the propensity of the reward-paired cue to increase lever-pressing was predicted by the amplitude of phasic dopamine release events, indicating a possible mechanism through which cues initiate reward-seeking actions.
These data suggest that those phasic mesolimbic dopamine release events thought to signal reward prediction error may also be related to the incentive motivational impact of reward-paired cues on reward-seeking actions.
Recent theories addressing mesolimbic dopamine’s role in reward processing emphasize two apparently distinct functions. Actor-critic reinforcement learning models (1–3) posit phasic mesolimbic dopamine signaling serves a teaching function, reporting errors in reward prediction to areas involved in reward learning, such as the nucleus accumbens core (NAc). This hypothesis is based largely on data showing that both dopamine cell firing (4, 5) and phasic mesolimbic dopamine release (6) shift from reward delivery to reward-predictive cues during Pavlovian conditioning and discrete-trial instrumental tasks (7–9). The second popular hypothesis, based mainly on pharmacological manipulation and lesion data, proposes dopamine is responsible for mediating the incentive motivational properties of reward-predictive cues (10–13), allowing them to elicit appetitive behaviors (approach) and invigorate actions instrumental to obtaining reward (14–19).
Dopamine’s putative roles as prediction error signal and mediator of incentive motivation have been argued to reflect separate neurobehavioral processes, perhaps involving distinct phasic and tonic signaling, respectively (5, 11, 20, 21). However, several recent computational models of action selection have begun to integrate these concepts (22–24) and a recent study has suggested that phasic dopamine in particular displays the properties of a teaching signal in a Pavlovian learning task, but only in individuals in which reward-paired cues acquire incentive motivational properties (13, 25). Indeed, in evaluating the role of phasic dopamine in reward-seeking we recently found evidence that these two functions of dopamine, prediction error teaching signal and incentive motivation, may be linked. For rats performing a self-paced instrumental sequence task phasic mesolimbic dopamine release shifted from the reward to more distal elements of the sequence, ultimately preceding sequence initiation (26), consistent with actor-critic models. Interestingly, however, the amplitude of phasic mesolimbic dopamine release preceding action sequence initiation predicted the speed with which that sequence was completed (26), suggesting that the very same dopamine-mediated reward prediction error signal thought to be responsible for reinforcement learning (1–3) may also be related to the incentive motivational influence of cues on reward seeking (26–28). Therefore, here we examined the profile of both phasic dopamine release and slower changes in extracellular dopamine by measuring NAc dopamine release, with fast-scan cyclic voltammetry (FSCV), during a Pavlovian-to-instrumental transfer task in which the effects of a reward-paired cue on a separately-trained instrumental reward-seeking action were assessed. Unlike more commonly-used instrumental and Pavlovian cued-response tasks, in which the reward-paired cue is directly associated with either the instrumental action (e.g. lever press) or Pavlovian conditioned response (e.g. food cup approach), in this test there is no direct association between the reward-paired cue and the action; the Pavlovian-to-instrumental transfer test, therefore, provides a pure incentive motivation measure because the reward-paired cue does not signal predictive information with respect to the reinforcement of actions (14, 29).
Detailed methods are described in the Methods in Supplement 1. Briefly, male Sprague Dawley rats (n=9) were implanted with carbon fiber microelectrodes in the NAc core. Chronically implantable microelectrodes, as described previously (30), were chosen to minimize test-day stress by allowing rats to habituate to tethering during training and by eliminating the stress of electrode implantation. Such stress can disrupt reward-related behaviors, including Pavlovian-to-instrumental transfer (31). Figure 1 presents histological data. Rats were first given Pavlovian training wherein an auditory cue (the CS+) was paired with grain pellet reward delivery. In a second training phase rats received single lever instrumental training for the same grain pellet reward, in the absence of the previously-trained Pavlovian cue. At test, FSCV was used to record dopamine concentration changes in the NAc core during a Pavlovian-to-instrumental transfer test in which rats were allowed to respond on the lever while the CS+ was presented 4 times intermixed with 4 presentations of a control cue (CS−).
Electrochemical data were analyzed using software written in LabVIEW (National Instruments). Chemometric analysis (32, 33) was used to isolate current changes due to dopamine from the FSCV data. Three main measures of dopamine were quantified and analyzed: dopamine transient frequency, dopamine transient amplitude and average non-transient slow dopamine change. Data analysis details can be found in the Methods in Supplement 1. All statistical tests were conducted with GraphPad Prism (San Diego, CA) and SPSS (Chicago, IL). For all hypothesis tests the alpha level for significance was set to 0.05.
This experiment was conducted in 3 phases. In phase 1, Pavlovian training was used to pair an auditory cue (the CS+) with grain pellet reward delivery and an alternate auditory cue (the CS−) with no reward. By the end of Pavlovian training all rats entered the food cup magazine significantly more during the CS+ probe period (at the CS+ onset prior to reward delivery) relative to the pre-CS period (t8=3.0, p=0.02). Phase 2 involved single lever instrumental training for grain pellet reward, in the absence of the previously-trained Pavlovian cues. In Phase 3 rats were given a general Pavlovian-to-instrumental transfer test whereby, following a short extinction period, both the CS+ and CS− were presented in alternation while the lever was available, but pressing was not rewarded. FSCV was used to monitor NAc dopamine concentration changes during this test. The core region of the accumbens was specifically selected here based on the described necessity of the accumbens core, but not the shell, for the expression of general Pavlovian-to-instrumental transfer (34).
The general Pavlovian-to-instrumental transfer test is an ideal measure of the Pavlovian incentive motivation that mediates the ability of a reward-paired stimulus to acquire general appetitive, motivational properties and thereby enhance reward approach behaviors and invigorate the performance of actions instrumental to gaining rewards (35–38). As seen in Figure 2, there was a clear Pavlovian-to-instrumental transfer effect marked by elevated lever pressing during the CS+ relative to the CS−. Statistical analysis showed main effects of CS (F1,8=14.25, p=0.005) and trial (F3,24=4.40, p=0.01), as well as an interaction between these factors that approached but did not reach significance (F3,24=2.58, p=0.07). Bonferroni post hoc comparisons clarified this trend; there was a significant increase in lever pressing induced by the CS+, relative to the CS−, during the first pair of trials only (p<0.001; p>0.05 for trials 2–4).
Figure 3A shows the effects of CS+ and CS− cue presentation on NAc dopamine levels averaged across trials. As can be seen in this figure, the onset of the reward-paired cue, but not the CS−, initiated a slow increase in dopamine that was sustained for the 2-min stimulus duration. Indeed, statistical analysis reveals that the average dopamine concentration change, with transient dopamine events filtered out, during the 2-min CS+ period was significantly greater than that during the CS− (t8=2.23, p=0.05; Figure 4A). Analysis of the average dopamine concentration across the 2-min CS time frame detected a main effect of CS (F1,8=5.01, p=0.05), but neither an effect of time (12, 10s epochs; F11,88=1.09, p=0.38), nor a time × CS interaction (F11,88=0.83, p=0.61). These data suggest that the reward-paired cue induced a sustained increase in extracellular dopamine concentration.
Such time-averaging of the data in Figure 3A obscures the rapid transient fluctuations in dopamine that were detected during both cue periods. To elucidate the effects of the reward-paired cue on phasic dopamine release events we identified and quantified such events during each CS period. The representative traces shown in Figure 3B and C provide examples (marked with asterisks) of dopamine transients that reached the threshold (2.5× root-mean-square noise) during the first CS− and CS+ presentation (for representative traces from trials 2–4 see Figure S2 in Supplement 1). In these single trial traces both the frequency and amplitude of phasic dopamine release events appear to be increased during the CS+ (Figure 3C) relative to the CS− (Figure 3B). Indeed, the average number of dopamine transients during CS+ presentations was significantly higher than during the CS− (t8=4.17, p=0.003, Figure 4B). Moreover, the average amplitude of these phasic dopamine release events was also greater during the CS+ relative to the CS− (t8=2.25, p=0.048; Figure 4C). Together these data suggest that reward-predictive cues result in both a sustained increase in non-transient extracellular dopamine levels and an increase in the frequency and amplitude of transient NAc dopamine release events.
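The thresholding step described above can be illustrated with a minimal sketch. It is not the authors' LabVIEW routine (the actual procedure is given in Supplement 1); the function names, the use of a separate quiet segment to estimate noise, and the simple local-maximum rule are all assumptions made for illustration. Only the 2.5× root-mean-square noise criterion comes from the text.

```python
import math

def rms(values):
    """Root-mean-square of a sequence of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def detect_transients(trace, noise_segment, k=2.5):
    """Return (threshold, peaks) for a baseline-subtracted dopamine trace.

    trace         -- hypothetical concentration samples (e.g. in nM)
    noise_segment -- hypothetical quiet stretch used to estimate noise
    k             -- threshold multiplier (2.5 in the text)
    A peak is a local maximum whose value exceeds k * RMS noise.
    """
    mean_noise = sum(noise_segment) / len(noise_segment)
    noise = rms([v - mean_noise for v in noise_segment])
    threshold = k * noise
    peaks = []
    for i in range(1, len(trace) - 1):
        is_local_max = trace[i] > trace[i - 1] and trace[i] >= trace[i + 1]
        if is_local_max and trace[i] > threshold:
            peaks.append((i, trace[i]))
    return threshold, peaks

# Toy data: noise RMS is 1, so the threshold is 2.5.
threshold, peaks = detect_transients(
    trace=[0, 1, 3, 1, 0, 2, 0, 4, 4, 1],
    noise_segment=[1, -1, 1, -1],
)
print(threshold, peaks)
```

From such a list, transient frequency is the number of peaks per CS period and transient amplitude is the mean of the peak values, which correspond to the two phasic measures compared between the CS+ and CS−.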
To understand the relationship between dopamine release events and lever pressing activity during the Pavlovian-to-instrumental transfer test, we next assessed the temporal association between dopamine transients and lever pressing. Figure 5A shows the percentage of lever presses, during the CS− and CS+, that were immediately preceded (within 5s) by a dopamine transient (separating small- and large-amplitude transients relative to pre-CS mean transient amplitude). Statistical analyses revealed a main effect of transient size (F1,8=5.22, p=0.05), but neither an effect of CS (F1,8=0.95, p=0.36), nor an interaction between these factors (F1,8=0.03, p=0.87). Thus, regardless of the predictive nature of the CS (CS− control or reward-predictive CS+), lever presses were more likely to be preceded by larger rather than smaller dopamine events during CS presentation. Indeed, despite the fact that dopamine transients were generally larger during the CS+ relative to the CS− (Figure 4C) ‘large’ dopamine transients did occur during the control CS− period and during this period these larger amplitude dopamine transients were more likely to precede lever pressing actions than small amplitude transients (for representative example see Figure S2C in Supplement 1).
The low proportion of presses that were preceded by transients, regardless of size, may in part reflect the fact that transients tended to precede bouts of lever-pressing activity. To evaluate this we conducted an additional analysis focusing on isolated presses and presses that initiated a bout of pressing activity (defined by more than 1 press with an inter-press interval of <6s). Such lever press bout initiations were preceded by dopamine transients 19.58% (sem=6.74) and 39.1% (sem=7.19) of the time during the CS− and CS+, respectively. Figure 5B presents these data separating small- and large-amplitude transients. Statistical analyses revealed a main effect of transient size (F1,8=10.79, p=0.01) with more bout initiations being preceded by large transients, as in the previous analysis, but now we also see an effect of CS (F1,8=6.45, p=0.04), with a greater proportion of bout initiations being preceded by dopamine transients during the CS+ than the CS−. However, there was no interaction (F1,8=0.75, p=0.41); that is, both large and small dopamine transients were represented in the increased proportion of press bouts that were preceded by transients during the CS+. In summary, these analyses indicate a temporal relationship between reward-seeking actions and larger amplitude phasic dopamine release events.
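The bout-initiation analysis can be sketched as follows, using the two time windows given in the text (an inter-press interval of <6s to define a bout, and a transient within the 5s before a press). The function names and the timestamp data are hypothetical, and this is an illustration of the logic rather than the authors' analysis code.

```python
def bout_initiations(press_times, max_ipi=6.0):
    """Return presses not preceded by another press within max_ipi seconds.

    These are the isolated presses plus the first press of each bout.
    """
    presses = sorted(press_times)
    starts = []
    for i, t in enumerate(presses):
        if i == 0 or t - presses[i - 1] >= max_ipi:
            starts.append(t)
    return starts

def fraction_preceded(event_times, transient_times, window=5.0):
    """Fraction of events with a transient in the preceding `window` seconds."""
    if not event_times:
        return 0.0
    hits = sum(
        1 for t in event_times
        if any(0 < t - d <= window for d in transient_times)
    )
    return hits / len(event_times)

# Hypothetical timestamps (seconds from CS onset).
presses = [10, 12, 13, 30, 50, 52]
transients = [7, 40, 49.5]

starts = bout_initiations(presses)
frac = fraction_preceded(starts, transients)
print(starts, frac)
```

Restricting the calculation to large or small transients only requires filtering `transients` by amplitude before calling `fraction_preceded`.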
We next investigated the specific relationship between dopamine signaling and cue-induced reward seeking by focusing an analysis on the period in the test where Pavlovian-to-instrumental transfer was apparent (i.e., during the first CS+ presentation (Figure 2)). These between-subjects correlation data are presented in Figure 6. As can be seen in Figure 6A, the average non-transient dopamine concentration change during the first CS+ presentation was not significantly associated with lever pressing (r9= −0.51, p=0.16), suggesting that slower dopamine changes induced by the CS+ do not track the ability of the reward-predictive cue to increase lever pressing. While dopamine transient frequency was not significantly correlated with the increase in reward-seeking actions induced by the CS+ (r9= −0.11, p=0.77, Figure 6B), the amplitude of those phasic dopamine release events did significantly positively predict the degree to which the cue invigorated reward-seeking actions during the CS+ (r9= 0.79, p=0.01, Figure 6C). Thus, in those rats for which the CS+ induced a larger increase in dopamine transient amplitude it also induced a larger increase in reward seeking. This between-subjects correlation shows a clear positive relationship between dopamine transient amplitude and cue-induced reward-seeking, suggesting a specific relationship between phasic dopamine release amplitude and incentive motivation.
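As a rough consistency check on the correlations above, a Pearson r can be converted to a t statistic with n−2 degrees of freedom. The sketch below assumes that the "r9" notation denotes n=9 rats (so 7 degrees of freedom) and uses the standard two-tailed critical value of about 2.365 for 7 degrees of freedom at alpha 0.05; the helper names are illustrative, and the raw per-rat data are not reproduced here.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_from_r(r, n):
    """t statistic (n - 2 degrees of freedom) for a Pearson r from n pairs."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

T_CRIT_DF7 = 2.365  # two-tailed, alpha = 0.05, 7 degrees of freedom

t_amp = t_from_r(0.79, 9)    # transient amplitude vs. cue-induced pressing
t_freq = t_from_r(-0.11, 9)  # transient frequency vs. cue-induced pressing
print(round(t_amp, 2), abs(t_amp) > T_CRIT_DF7)
print(round(t_freq, 2), abs(t_freq) > T_CRIT_DF7)
```

Under these assumptions only the amplitude correlation exceeds the critical value, in line with the reported p values.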
Given the above result, we next examined the relationship between accumbal dopamine signaling and reward-seeking actions generally, to evaluate whether within-session changes in dopamine signaling were correlated with lever-pressing actions. Figure 7 shows the total number of lever presses (which is the same for 7A, C and E) for each 2-min CS plotted alongside the average non-transient slow change in dopamine (7A), the total number of dopamine transients (7C) and the average amplitude of those phasic release events (7E). As can be seen from these figures the effects of the CS+ on each dopamine measure did not depend on trial; statistical analyses separating trials 1–4 showed only main effects of CS (average dopamine (7A): F1,8=5.01, p=0.05; total transients (7C): F1,8=17.35, p=0.003; transient amplitude (7E): F1,8=5.43, p=0.048), with no significant effects of trial (average dopamine: F3,24=0.80, p=0.51; total transients: F3,24=2.77, p=0.06; transient amplitude: F3,24=2.48, p=0.09), or trial × CS interaction (average dopamine: F3,24=0.71, p=0.56; total transients: F3,24=0.80, p=0.51; transient amplitude: F3,24=0.026, p=0.99). More importantly, Figure 7 provides data regarding the relationship between dopamine signaling and reward-seeking. Scatter plots showing each data point are presented in Figures 7B, D and F. To take into account each measure of dopamine and evaluate which measures (transient dopamine frequency, transient amplitude or slower dopamine changes) predicted lever pressing we conducted a linear mixed model analysis including all data points from the CS− and CS+ for each rat. This analysis, which has been described for similar data previously (39, 40), was chosen for its ability to appropriately handle data in which observations are not independent (e.g., repeated measures of dopamine across a session), and for its model-corrected error terms and incorporation of random subject effects.
Here we use this analysis to determine which dopamine measures were related to lever pressing activity across the entire test (including both CS− and CS+ periods) and with what relative strength. Average non-transient dopamine concentration change, total dopamine transients and average dopamine transient amplitude were used as fixed factors to predict the target: total lever presses per 2-min CS period. Subject identity was used as a random factor to control for electrode-specific variability in dopamine signaling (e.g., placement and sensitivity) and a variance component covariance structure was used because this provided the lowest information criterion score. The overall corrected model was found to significantly predict lever pressing activity (F3,68=5.28, p=0.002) and, specifically, dopamine transient amplitude positively predicted lever pressing activity (F1,68=11.57, p=0.001, Figure 7E,F). The coefficient estimate was 0.17 (± 0.05 SEM), indicating that every 5.88nM increment in the average dopamine transient amplitude predicted a single lever press increase. Interestingly, the frequency of this phasic dopamine release was not shown with this model to predict lever pressing (F1,68=0.36, p=0.55, coefficient = 0.127, Figure 7C,D), suggesting that phasic dopamine release amplitude, but not frequency of occurrence, is associated with lever pressing likelihood. Importantly, slower non-transient average dopamine concentration changes did not significantly predict lever pressing (F1,68=0.15, p=0.70, coefficient = −0.0009, Figure 7A,B). These data suggest that greater phasic dopamine amplitude specifically, rather than slower dopamine concentration changes, predicted greater lever pressing activity across the entire test.
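The 5.88nM figure follows directly from the amplitude coefficient, on the assumption that the coefficient is expressed in lever presses per nM of average transient amplitude (with the other fixed factors held constant). Writing the predicted change in presses as a linear function of the change in amplitude ΔA:

```latex
\Delta\,\text{presses} \approx 0.17 \times \Delta A
\quad\Longrightarrow\quad
\Delta A_{\text{one press}} = \frac{1}{0.17} \approx 5.88\ \text{nM}
```

The same reading applied to the slow-change coefficient (−0.0009) implies a negligible change in pressing per unit change in non-transient dopamine, consistent with its non-significant test.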
Taken together these results suggest that phasic dopamine release event amplitude relates to the incentive motivation that allows reward-paired cues to control the performance of reward-seeking actions and that changes in such signaling also predict reward-seeking actions more generally.
We show that both the frequency and amplitude of phasic dopamine release events as well as slower non-transient dopamine levels in the NAc were increased during presentation of a reward-predictive cue in rats performing a Pavlovian-to-instrumental transfer test. Interestingly, dopamine transients were temporally related to lever-pressing activity and the propensity of the reward-paired cue to increase lever-pressing was predicted by the amplitude of these events, indicating a potential role for phasic mesolimbic dopamine release in incentive motivation. Moreover, across both CS− and CS+ periods the amplitude of specifically phasic dopamine release was positively correlated with lever-press activity, while slower dopamine changes were not significantly related to such actions.
Our finding that a reward-predictive cue both increased phasic dopamine signaling and produced slower increases in extracellular dopamine levels in the NAc corroborates previous studies using FSCV and microdialysis, respectively, to demonstrate that reward-paired cues elicit phasic dopamine release (6) and increase tonic extracellular dopamine levels (41) in the NAc. However, both of these studies employed salient, short-duration (10s) cues that signaled imminent reward delivery, whereas here we used long-duration cues that probabilistically predicted reward. Somewhat paradoxically, these long-lasting, less predictive cues are known to be particularly effective in invigorating reward-seeking actions (42), perhaps because they serve as a context for reward delivery without eliciting strong competing conditioned behaviors (43). Interestingly, the amplitude of cue-evoked dopamine transients reported here was generally lower than reported in studies using short, highly predictive reward-paired cues (6), suggesting that cue-evoked phasic dopamine may encode the strength of reward anticipation. Additionally, since rewards were omitted when the CS+ was presented at test, it is possible that negative reward prediction errors impacted the cue-evoked dopamine responses reported here, particularly during early trials when the omission of cue-contingent reward would be “surprising”. Such effects on dopamine signaling would be expected to be negative and thereby oppose the reward-paired cues’ excitatory effects on dopamine release, and may account for the low amplitude responses we observed. Lastly, it is noteworthy that the effects of the cues on dopamine signaling were not diminished over trials, unlike their behavioral effects, suggesting that mesolimbic dopamine release was not simply a reflection of motor activity associated with lever pressing.
Regarding the relationship between dopamine and reward-related behavior, our data show that the amplitude of specifically phasic dopamine release predicted the ability of a reward-paired cue to invigorate reward-seeking actions. That is, phasic dopamine release amplitude, but neither dopamine transient frequency nor slower dopamine concentration changes, tracked the Pavlovian-to-instrumental transfer effect. These data are seemingly in-line with previous studies showing that phasic dopamine release in response to simple Pavlovian reward-paired cues (6, 13), or to cues that signal reinforcement availability or probability (8, 9, 28, 44), is related to reward seeking. However, in such cue-response tasks the cue is trained together with the response (instrumental action or Pavlovian conditioned response) and it is, therefore, unclear if cue-induced phasic dopamine relates to the prediction error/expectations of reinforcement or if it is related to the cue’s forward role in provoking reward seeking, i.e. the incentive motivational effects of the cue, or both. In the current study, because of the explicitly separate Pavlovian and instrumental training, the reward-paired cue does not signal predictive information with respect to reinforcement of the instrumental action. Therefore, any relationship between cue-induced dopamine signaling and reward-seeking can be attributed to the incentive motivational properties of the cue. In this study, however, we cannot rule out the possibility that phasic NAc dopamine release is associated with the outcome-specific, response-biasing effect of reward-paired cues, given that the Pavlovian stimulus was associated with the same outcome as the instrumental response. 
Importantly, dopamine signaling has been shown to be necessary for the general Pavlovian-to-instrumental transfer effect reflective of the incentive motivational impact of reward-paired cues (18), but unnecessary for outcome-specific Pavlovian-to-instrumental transfer (19), lending credence to the interpretation here that phasic dopamine signaling is related to cue-induced incentive motivation.
Although primarily assessing the impact of reward-paired cues on dopamine signaling and reward-seeking actions, our test also allowed for examination of the relationship between NAc dopamine signaling and reward-seeking activity more generally. Previous reports have suggested that tonic and phasic dopamine release are distinct channels of neurotransmission (45–47); while the frequency and amplitude of dopamine transients are considered reflective of phasic dopamine cell firing (48), slower non-transient dopamine concentration changes may represent tonic dopamine transmission, as previously suggested (33). Indeed, our observed changes in average dopamine levels are within the range of tonic extracellular dopamine concentrations estimated by FSCV (49) and other methods (50–53). Regardless of whether these slow dopamine changes are truly reflective of tonic dopamine release, we found that they were not significantly related to reward-seeking activity in this test, suggesting that slower dopamine concentration changes may not relate to incentive motivation. This is not to say, however, that tonic NAc dopamine does not relate to reward-seeking. Indeed, we have recently shown, using microdialysis, that changes in this measure are negatively correlated with fluctuations in the amount of effort (i.e., number of lever presses) required to obtain reward, or response cost (54). Similar results were likely not found in the current data given the study differences, including the presence of reward and explicit reward-paired cues. Ongoing studies are examining the relationship between slow and phasic dopamine concentration changes and response cost in reinforced reward-seeking tasks.
The amplitude of phasic dopamine release events did, however, predict reward-seeking activity in the current test. While we discuss above how cue-induced invigoration of reward-seeking is predicted by the amplitude of phasic dopamine release events, in the absence of such explicit experimenter-controlled stimuli, contextual cues, including the lever itself, come to control performance by virtue of their repeated pairings with reward delivery. Given the finding, for both the CS+ and CS−, that large dopamine transients predicted vigorous responding, a plausible interpretation of the data is that NAc dopamine transients enable the ability of these contextual cues to facilitate behavior, an interpretation that is supported by the observation that the amplitude of dopamine transients was greater during the CS+ than the CS−. These results are consistent with a large and growing body of evidence suggesting a role for phasic dopamine signaling in Pavlovian incentive motivation. Indeed, data from mice genetically altered to reduce phasic dopamine activity suggest that such activity is necessary for reward-motivated behaviors (55). Moreover, phasic NAc dopamine signaling has been shown to be related to both the acquisition and expression of cue-directed approach behavior (i.e., sign-tracking), another commonly used assay of incentive motivation (13). Although our results are correlational, there is evidence suggesting that dopamine D1 receptors in the NAc, which are activated preferentially by phasic dopamine release (56), are necessary, although not exclusively so, for the expression of Pavlovian-to-instrumental transfer (57).
While these data support a role for phasic dopamine signaling in Pavlovian incentive motivation, we have previously shown that phasic dopamine release in the NAc backpropagates from the reward itself through a sequence of actions instrumental to reward delivery (26), a finding that is most readily interpreted as supporting a role in reinforcement learning. However, in that study we also found that phasic dopamine release occurring before rats initiated the action sequence predicted the time it took them to complete the task, indicating that it was related to task motivation. Together with the current results, these findings suggest that phasic dopamine signaling may be involved both in reinforcement learning and in the attribution of incentive salience, allowing reward-predictive cues, both those explicitly-trained and those more subtle environmental or internal, to provoke reward seeking. This hypothesis is bolstered by several recent attempts to integrate dopamine’s putative role as a mediator of incentive motivation into the reinforcement learning framework (23, 24), as well as the suggestion that phasic dopamine release relates to a prediction error teaching signal in individuals in which reward-paired cues acquire incentive motivational properties (25). In summary, these data provide further evidence for the involvement of specifically phasic NAc dopamine signaling in incentive motivation and self-initiated reward-seeking behavior.
This research was supported by grants DA009359 and DA005010 from NIDA to N.T.M., grant T32 DA024635 from NIDA and Hatos scholarship to K.M.W. and grant DA029035 to S.B.O. The authors would like to acknowledge and thank Dr. Paul Phillips and Christina Akers for generous assistance with the electrode manufacture technique. They would also like to thank Sarah Lechner for her assistance with electrode manufacture, data collection and analysis and Venuz Greenfield for her assistance with analysis. Lastly, the authors would like to thank Dr. Scott Ng-Evans for his hardware and software assistance.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
All authors report no biomedical financial interests or potential conflicts of interest.