Neuropsychopharmacology. Nov 2012; 37(12): 2653–2660.
Published online Jul 18, 2012. doi:  10.1038/npp.2012.129
PMCID: PMC3473331
Relative Response Cost Determines the Sensitivity of Instrumental Reward Seeking to Dopamine Receptor Blockade
Sean B Ostlund,1,2* Alisa R Kosheleff,1,2 and Nigel T Maidment1,2
1Department of Psychiatry and Biobehavioral Sciences, Semel Institute for Neuroscience and Human Behavior, UCLA, Los Angeles, CA, USA
2Brain Research Institute, UCLA, Los Angeles, CA, USA
*Department of Psychiatry and Biobehavioral Sciences, Semel Institute for Neuroscience and Human Behavior, UCLA, Box 951759, 760 Westwood Plaza, Los Angeles, CA 90024, USA. Tel: +310 206 7890, Fax: +310 206 5895; E-mail: sostlund/at/ucla.edu
Received May 17, 2012; Revised June 19, 2012; Accepted June 26, 2012.
Dopamine is a critical mediator of instrumental reward seeking behavior and appears to have a particularly important role in motivating actions that require considerable effort. As with rewards, response costs can be evaluated in both absolute and relative terms. The current study investigated whether the extent to which instrumental performance is dependent on dopamine transmission is influenced by relative or absolute response cost. Three groups of rats were rewarded for lever pressing on different fixed ratio (FR) schedules that required 1 (FR-1), 10 (FR-10), or 20 (FR-20) presses for each food reward. Rats were then injected systemically with flupentixol, a dopamine receptor antagonist, or vehicle before testing all groups on an intermediate-cost (FR-10) schedule, such that only the relative cost of responding differed across groups. Rats experiencing an upshift in cost (group FR-1/FR-10) showed greater response suppression following flupentixol administration than rats experiencing no shift in cost (group FR-10/FR-10), whereas flupentixol treatment had no effect on rats experiencing a downshift in cost (group FR-20/FR-10). A second round of flupentixol tests was conducted using the rats' maintenance schedules, such that only absolute response costs differed across groups. Here, the pattern was reversed among the groups, in line with previous reports. Specifically, flupentixol had a stronger suppressive effect in group FR-20/FR-20 than in group FR-10/FR-10, and had no detectable effect in group FR-1/FR-1. These findings suggest that response costs are evaluated in both absolute and relative terms and that dopamine has a role in overcoming both kinds of cost.
Keywords: dopamine, utility, effort, cost, reinforcement, contrast
Although dopamine's specific contributions to motivated behavior continue to be debated (Dickinson et al, 2000; Berridge and Robinson, 2003; Wise, 2008; Nicola, 2010; Schultz, 2010; Glimcher, 2011), a large body of work has demonstrated its role in overcoming effort, and other forms of response cost, during the pursuit of reward (Walton et al, 2006; Salamone et al, 2007; Floresco et al, 2008b; Beeler et al, 2010). This is exemplified by the alterations in effortful reward-seeking behaviors brought about by treatments that disrupt or augment dopamine signaling (Beeler et al, 2010; Cousins and Salamone, 1994; Lindner et al, 1997; Aberman and Salamone, 1999; Hamill et al, 1999; Salamone et al, 2001; Mingote et al, 2005; Salamone et al, 1991, 1994a, 2007; Zhang et al, 2003). These findings are complemented by reports that task-related changes in extracellular dopamine levels vary with response cost (Sokolowski et al, 1998; Walton et al, 2006; Day et al, 2010; Wanat et al, 2010; Nasrallah et al, 2011; Ostlund et al, 2011).
Contemporary theories have addressed such findings by assigning dopamine a response-activating function; it is assumed that the amount of dopamine released determines the vigor of reward seeking, increasing the likelihood that more effortful or costly response options will be chosen (Robbins et al, 1998; Dickinson et al, 2000; Berridge and Robinson, 2003; Niv et al, 2007; Phillips et al, 2007; Salamone et al, 2007). Consistent with this account, recent studies have shown that anticipatory (pre-response) dopamine cell firing (Satoh et al, 2003) and mesolimbic dopamine release (Wassum et al, 2012) correlate with the vigor of reward seeking behavior.
One fundamental issue that must be resolved to improve our general understanding of cost/benefit decision-making and dopamine's role in this process is how response costs are evaluated. A common simplifying assumption is that these costs are absolute, fixed by the intrinsic properties of the action itself (in the case of effort), or the negative consequences of the action (eg, delayed reinforcement or punishment). However, behavioral findings suggest that such costs are evaluated in both absolute and relative terms, taking into account information about the cost of alternative options and/or the expected costs of a potential action based on previous experience. As evidence for the latter, rats lever pressing for food reward on a low (response-to-reward) ratio schedule of reinforcement will struggle to maintain their performance if the ratio schedule is suddenly increased (a phenomenon known as ratio strain), even though the same high ratio schedule would tend to encourage an even higher response rate if it were introduced more gradually. There is some evidence that dopamine's involvement in reward seeking involves processing relative response costs. For instance, rats pretrained to lever press on a high-effort schedule (fixed ratio (FR)-300) before receiving dopamine-depleting lesions of the nucleus accumbens exhibited a modest suppression of responding when tested on a less effortful schedule (FR-5), compared with the deficits observed in rats not exposed to a high-effort pretraining condition (Salamone et al, 1993a, 1993b, 2001; Aberman and Salamone, 1999). This suggests that expecting a high cost may lead to underestimation of more moderate costs, decreasing the dopamine dependence of responding.
In the current study, we investigated dopamine's role in overcoming relative rather than absolute response costs (see Figure 1). The basic design of the study involved training rats to lever press for food reward on either a high (FR-20), moderate (FR-10), or low (FR-1) cost reinforcement schedule before testing them under varying doses (0, 0.075, or 0.225 mg/kg) of the nonspecific dopamine receptor antagonist flupentixol. Importantly, for these initial tests, all rats were shifted to the moderate-cost schedule. Thus, the response cost at test was lower (FR-20/FR-10), higher (FR-1/FR-10), or no different (FR-10/FR-10) than what should have been expected based on their recent training history. A second round of tests was conducted to assess the effects of flupentixol on reward seeking with rats reinforced on their maintenance schedules, which differed in absolute response costs. Our findings demonstrate that dopamine is critical for adapting to relative and absolute response costs.
Figure 1
Outline of experimental procedures. During initial lever press training, rats were gradually shifted from a low-effort (fixed ratio (FR)-1) to an intermediate-effort (FR-10) reinforcement schedule. Different groups (n=12) were then given maintenance training (more ...)
Subjects and Apparatus
Thirty-six adult (~90 days) male Sprague Dawley rats (Charles River Laboratories, Wilmington, MA) were used as subjects. They were group housed in transparent plastic cages (2–3 per cage) with corncob bedding in a temperature- and humidity-controlled vivarium. Experimental procedures were conducted during the light phase of a 12-h/12-h light/dark cycle. Throughout training and testing, the rats were kept on a food deprivation regimen to maintain them at ~85% of their free-feeding body weight, giving each rat between 10 and 14 g of home chow per day after their daily training/testing sessions. Tap water was continuously available in the home cages. All procedures were in compliance with the National Research Council's Guide for the Care and Use of Laboratory Animals and were authorized by the institutional animal care and use committee of UCLA.
Behavioral procedures took place in eight identical Med Associates operant chambers (East Fairfield, VT), each equipped with a retractable lever located to the left of a recessed food cup that received deliveries of 45 mg grain-based food pellets (Bioserv, Frenchtown, NJ). A photobeam detector was used to monitor food cup approaches. A houselight (24 V, 2 W) positioned at the center-top of the opposite wall provided illumination during all experimental sessions.
Behavioral Training
Rats were initially given a 90-min magazine training session in which pellets were non-contingently delivered on a random time 5-min schedule. They were then given 11 days of instrumental conditioning. On each day, the rats were placed in the chamber and the lever was inserted, which the rats could press to earn food pellets. Sessions ended after 30 pellets were earned or 30 min had elapsed, whichever came first. Session length served as the primary measure of responding, with shorter session times indicating more vigorous task performance. Over the first seven sessions, all rats were shifted through increasingly effortful ratio schedules, earning a pellet with every press (FR-1) on days 1–2, every 5 presses (FR-5) on days 3–4, and every 10 presses (FR-10) on days 5–7. This initial training was used to ensure that all rats were experienced in responding on the FR-10 schedule, which would be used again during testing. Separate groups of rats (n=12) were then given an additional four sessions of training with either a more effortful schedule (FR-20), the same schedule (FR-10), or a less effortful schedule (FR-1).
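The session rule described above (a pellet after every N presses, with the session ending at 30 pellets or 30 min, whichever comes first) can be sketched as follows. This is only an illustrative model, not the Med Associates control code used in the study; the function name `run_fr_session`, the `SessionResult` record, and the idea of feeding in a list of press timestamps are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class SessionResult:
    pellets: int       # pellets earned in the session
    presses: int       # lever presses counted before the session ended
    duration_s: float  # session length in seconds (the primary measure)


def run_fr_session(press_times_s, ratio, max_pellets=30, max_duration_s=30 * 60):
    """Apply a fixed-ratio schedule to sorted lever-press timestamps (seconds).

    Hypothetical helper: a pellet is delivered on every `ratio`-th press, and
    the session stops at `max_pellets` or `max_duration_s`, whichever is first.
    """
    pellets = 0
    presses = 0
    for t in press_times_s:
        if t >= max_duration_s:
            break  # the time limit was reached before this press
        presses += 1
        if presses % ratio == 0:
            pellets += 1
            if pellets == max_pellets:
                # Session ends at the moment the final pellet is earned.
                return SessionResult(pellets, presses, float(t))
    return SessionResult(pellets, presses, float(max_duration_s))


# A hypothetical rat pressing once per second finishes FR-10 in 300 s.
steady = [float(i) for i in range(1, 2001)]
fr10 = run_fr_session(steady, ratio=10)
# A hypothetical rat pressing once every 10 s times out on FR-20.
slow = [10.0 * i for i in range(1, 400)]
fr20_slow = run_fr_session(slow, ratio=20)
```

Under this sketch, a shorter `duration_s` corresponds to more vigorous responding, and a session that times out reports the full 1800 s regardless of how many pellets were earned.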
Relative Response Cost Testing
After initial training, the rats were given the first of three tests to assess the effects of systemic dopamine receptor blockade on their lever press performance. Two hours before each test, rats were given an injection (i.p., 1.0 ml/kg) of 0.0, 0.075, or 0.225 mg/kg of the nonspecific dopamine receptor antagonist flupentixol (Sigma Aldrich, St Louis, MO) mixed in sterile saline. Each rat received one test under each of the three treatments, counterbalancing test order across training groups with a Latin square design. Similar to the training sessions, test sessions ended after 30 pellets had been earned or 30 min had elapsed, whichever came first. However, during this round of testing, all subjects were reinforced on the moderate-effort FR-10 schedule. Rats were given a day off after each test and were then given 3 days of retraining on their maintenance schedules (FR-1, FR-10, or FR-20) before the next test session. They were then given an additional round of retraining before undergoing absolute response cost testing.
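One way to picture the counterbalancing described above is a cyclic 3 × 3 Latin square, in which each dose appears once in each test position. The text does not say which particular square was used or how rats were assigned to its rows, so the cyclic construction, the function names, and the rat labels below are assumptions for illustration only.

```python
DOSES_MG_PER_KG = [0.0, 0.075, 0.225]  # vehicle, low dose, high dose


def latin_square(treatments):
    """Build a cyclic Latin square: row r is the treatment list rotated by r."""
    n = len(treatments)
    return [[treatments[(row + col) % n] for col in range(n)] for row in range(n)]


def assign_orders(rat_ids, treatments):
    """Cycle rats through the rows of the square (hypothetical assignment rule)."""
    square = latin_square(treatments)
    return {rat: square[i % len(square)] for i, rat in enumerate(rat_ids)}


# Hypothetical labels for the 12 rats of one training group.
group_fr1 = [f"FR1-{i:02d}" for i in range(1, 13)]
orders = assign_orders(group_fr1, DOSES_MG_PER_KG)
```

With 12 rats per group and three possible orders, this assignment gives each test order to four rats in every training group, so dose and test position are balanced within groups.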
Absolute Response Cost Testing
Rats were given a series of three tests under vehicle, low-dose, and high-dose flupentixol to assess the impact of dopamine receptor blockade on reward seeking while maintained on schedules varying in absolute response cost. These tests were conducted using the same basic procedures used during relative response cost testing, including treatment order, except that lever pressing was reinforced using each rat's maintenance training schedule at test. Again, rats were given a day off followed by 3 days of retraining between tests.
Initial Lever Press Training
All rats learned to lever press for food pellets over the first 7 days of training, when all groups were shifted through the same series of ratio schedules (Figure 2, days 1–7). Not surprisingly, the increasing response requirements associated with these schedules resulted in longer session times (Figure 2; top panel). A mixed ANOVA (session × group) detected a main effect of session (F6,198=29.62; p<0.001), but found no effect of group (F1,33=0.328; p>0.05) or group × session interaction (F12,198=0.850; p>0.05). However, for each new ratio schedule, the time it took rats to complete the session decreased over days; we detected a significant effect of session for FR-1 training (F1,35=79.84; p<0.001), FR-5 training (F1,35=15.53; p<0.001), and FR-10 training (F2,70=6.544; p<0.01). Rats appeared to learn to perform the task more efficiently (measured as the average number of presses performed before a rat checked the food cup) with practice (Figure 2; bottom panel). During FR-1 training, when each press resulted in a food pellet delivery, the rats displayed an appropriately low ratio of press-to-food cup approaches, indicating that they were, for the most part, checking the food cup after each press. However, such a low ratio is not compatible with rapid collection of food pellets when higher ratio schedules are in place, as premature approaches waste time and effort. The rats adapted to changes in response demand appropriately, increasing their ratio of press-to-approaches. This was confirmed by an ANOVA (session × group), which found a main effect of session (F6,198=114.397; p<0.001) but no effect of group (F1,33=0.098; p>0.05) or interaction between these variables (F12,198=0.207; p>0.05).
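The efficiency measure used above, the average number of presses performed before a food cup check, can be illustrated with a short sketch. It assumes the measure is simply total presses divided by total food cup approaches within a session; the exact computation is not spelled out in the text, and the event labels and function name are hypothetical.

```python
def press_to_approach_ratio(events):
    """Return presses per food cup approach for one session.

    `events` is a time-ordered sequence of "press" and "approach" labels
    (hypothetical encoding of lever and photobeam records).
    """
    presses = sum(1 for e in events if e == "press")
    approaches = sum(1 for e in events if e == "approach")
    if approaches == 0:
        raise ValueError("ratio is undefined when no food cup approaches occurred")
    return presses / approaches


# Checking after every press, as is appropriate on FR-1.
fr1_pattern = ["press", "approach"] * 30
# Checking only after each completed ratio on FR-10.
fr10_efficient = (["press"] * 10 + ["approach"]) * 30
# One premature check halfway through each FR-10 ratio.
fr10_premature = (["press"] * 5 + ["approach"]) * 60
```

In this toy example the efficient FR-10 pattern yields a ratio of 10, whereas a single premature check per ratio halves it to 5, even though the number of presses and pellets is identical.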
Figure 2
Instrumental training results, plotted separately for groups fixed ratio (FR)-1, FR-10, and FR-20. Top panel, mean session length (seconds; ±SEM) during initial lever press training (sessions 1–7) and maintenance training (sessions 8–11). (more ...)
Maintenance Schedule Training
Separate groups of rats were then given 4 days of training on different maintenance schedules: either FR-1, FR-10, or FR-20 (Figure 2, days 8–11). The new schedules had an immediate impact on the amount of time it took rats to earn 30 pellets. An ANOVA (session × group) found a significant main effect of group (F1,33=65.071; p<0.001), but found no effect of session (F3,99=1.24; p>0.05) or group × session interaction (F6,99=1.885; p=0.09). Bonferroni post hoc analysis confirmed that group FR-20 took longer than group FR-10 (p<0.001) and group FR-1 (p<0.001), and that group FR-10 took significantly longer than group FR-1 (p<0.001). As during initial training, the response requirement of the maintenance schedule caused rats to adapt their food cup checking behavior, reflected in their ratio of press-to-food cup approaches (Figure 2; bottom panel). An ANOVA found main effects of group (F2,33=37.613; p<0.001) and session (F3,99=6.180; p<0.001), as well as a significant group × session interaction (F6,99=2.651; p<0.05), indicating that their behavior continued to adapt over days. Post hoc testing found that group FR-1 displayed a lower ratio than groups FR-10 (p<0.001) or FR-20 (p<0.001), which was appropriate given the former group's minimal response requirement. Interestingly, groups FR-10 and FR-20 did not significantly differ from one another (p>0.05), even though group FR-20 had to press twice as much as group FR-10 for each reward. This premature checking behavior may have been encouraged by the absence of any explicit signal for food pellet deliveries.
Relative Response Cost Testing
We then conducted a series of tests to determine if the suppressive effect of dopamine receptor blockade on reward seeking is primarily dependent on the absolute cost of responding at test or whether this relationship is modulated by subjects' expectations about response cost based on their pretraining. During these tests all rats were reinforced on the same FR-10 schedule, such that the absolute amount of effort needed to earn reward was identical across groups. However, the relative cost of responding differed across groups: compared with their maintenance schedule, group FR-1/FR-10 experienced an upshift in cost, group FR-20/FR-10 experienced a downshift in cost, and group FR-10/FR-10 experienced no shift in cost.
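The logic of this design, in which absolute cost at test is held constant while relative cost varies with training history, can be summarized in a few lines. The function name and the string labels are illustrative choices, not terms defined by the study beyond "upshift", "downshift", and "no shift".

```python
def relative_shift(maintenance_ratio, test_ratio):
    """Classify the change in response requirement from maintenance to test."""
    if test_ratio > maintenance_ratio:
        return "upshift"
    if test_ratio < maintenance_ratio:
        return "downshift"
    return "no shift"


TEST_RATIO = 10  # all groups were tested on FR-10 in this round
design = {
    f"FR-{m}/FR-{TEST_RATIO}": relative_shift(m, TEST_RATIO)
    for m in (1, 10, 20)  # maintenance schedules of the three groups
}
```

Every group has the same absolute requirement (10 presses per pellet) at test, so any group difference in drug sensitivity must be attributed to the shift category rather than to the amount of work required.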
Consistent with the relative cost hypothesis, we found that these groups were differentially sensitive to the suppressive effects of the dopamine receptor antagonist flupentixol (Figure 3). An ANOVA performed on the time it took rats to complete the test session found a significant main effect of treatment (F2,66=14.361; p<0.001), a marginally nonsignificant effect of training group (F2,33=2.670; p=0.084), and a significant treatment × group interaction (F4,66=3.17; p<0.05). Bonferroni post hoc comparisons of mean session lengths during the saline test did not detect any significant differences between groups (p>0.05), indicating that rats rapidly adapted their performance to the FR-10 schedule used at test. Individual repeated-measures ANOVAs, conducted separately for each group, found that flupentixol treatment significantly increased session lengths in group FR-1/FR-10 (F2,22=10.74; p<0.001), which experienced an upshift in cost, and in group FR-10/FR-10 (F2,22=4.169; p<0.05), the unshifted control condition. A mixed treatment × group ANOVA performed on the data from these two groups detected a significant treatment × group interaction (F2,44=5.073; p<0.05). Analysis of the effect sizes for each group confirmed that the disruptive effect of flupentixol was greater in group FR-1/FR-10 (η2=0.494) than in group FR-10/FR-10 (η2=0.298). Although flupentixol treatment resulted in a similar numerical trend in the mean session lengths in group FR-20/FR-10, a repeated-measures ANOVA found no effect of treatment in this group (F2,22=0.993; p>0.05), and effect size analysis found that flupentixol treatment accounted for an appreciably smaller amount of the total variance in their session lengths (η2=0.083) than in either of the other groups, indicating that the downshift in response cost protected them from the suppressive effects of flupentixol.
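For readers who want to relate the F statistics to the effect sizes, the sketch below uses the standard identity for partial eta squared, η2p = (F × df_effect) / (F × df_effect + df_error). Treating the reported η2 values as partial eta squared is an assumption; the text does not state which variant was computed. Under that assumption, the values for the upshifted and downshifted groups are recovered from the reported F2,22 statistics.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Reported within-group treatment effects, each with (2, 22) degrees of freedom.
upshift = partial_eta_squared(10.74, 2, 22)    # group FR-1/FR-10
downshift = partial_eta_squared(0.993, 2, 22)  # group FR-20/FR-10

print(round(upshift, 3))    # 0.494
print(round(downshift, 3))  # 0.083
```

Read this way, flupentixol accounted for roughly half of the within-subject variance in session length after an upshift in cost, but less than a tenth after a downshift.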
Figure 3
Relative response costs modulate the suppressive effects of dopamine receptor blockade on reward seeking. Rats were pretreated with different doses of flupentixol before each of three test sessions on the intermediate-effort (fixed ratio (FR)-10) task. (more ...)
The pattern of food cup checking behavior appeared to depend on the rats' training history. Even though all rats were reinforced every 10 presses, the ratio of press-to-food cup approaches varied across groups, presumably resulting in differences in task efficiency. An ANOVA found a main effect of group (F2,33=7.737; p<0.001) but found no effect of treatment (F2,66=1.529; p>0.05) or treatment × group interaction (F4,66=0.837; p>0.05). Post hoc testing found that the press-to-approach ratio was significantly lower for group FR-1/FR-10 than for group FR-10/FR-10 (p<0.05) or group FR-20/FR-10 (p<0.01), the latter two groups not differing from each other (p>0.05). Therefore, it appears that dopamine receptor blockade did not affect this measure of task efficiency. However, given that group FR-1/FR-10 was both the most sensitive to flupentixol and the least efficient in checking the food cup, we examined whether task efficiency influenced the impact of flupentixol on performance across rats. To do so, we assessed the correlation (Pearson's, two-tailed) between rats' press-to-approach ratio during their saline test, which should reflect their efficiency on the FR-10 schedule used at test, and their session length during the high-dose flupentixol (0.225 mg/kg) test, computed as a percentage of their session length during the saline test to control for baseline response rate (Figure 4). Although no relationship was detected for groups FR-10/FR-10 (p>0.05) or FR-20/FR-10 (p>0.05), a significant negative correlation (r=−0.774; p<0.01) was found for group FR-1/FR-10. Specifically, the least efficient rats (those most likely to prematurely check the food cup) showed the strongest task suppression when treated with a high dose of flupentixol. Together with the group differences noted above, this finding suggests that the efficiency of task performance is an important factor affecting the dopamine-dependence of reward seeking behavior.
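The correlation analysis described above can be sketched in three steps: express each rat's high-dose session length as a percentage of its saline session length, compute Pearson's r against the saline press-to-approach ratio, and convert r to a t statistic with n − 2 degrees of freedom. The function names are hypothetical, no individual-rat data are reproduced here, and the only figures taken from the text are r = −0.774 and the group size of 12.

```python
import math


def percent_of_baseline(drug_session_s, saline_session_s):
    """Session length under drug as a percentage of the saline session length."""
    return 100.0 * drug_session_s / saline_session_s


def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Reported correlation for group FR-1/FR-10 (n = 12 rats).
t_stat = t_from_r(-0.774, 12)
print(round(t_stat, 2))  # -3.87
```

With 10 degrees of freedom the two-tailed critical value at the 0.01 level is about 3.17, so a t statistic near −3.87 agrees with the reported p<0.01.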
Figure 4
For rats experiencing an upshift in response requirement during relative response cost testing, individual differences in task efficiency correlated with sensitivity to the suppressive effects of flupentixol. Scatter plots for groups fixed ratio (FR)-1/FR-10 (more ...)
Absolute Cost Testing
Although the current study focused on the question of whether relative response costs modulate the dopamine-dependence of reward seeking, previous studies have already established that absolute response costs can have such an effect. To confirm this relationship under the current experimental conditions, we conducted a series of tests to characterize the effects of dopamine receptor antagonism on reward seeking supported by the rats' maintenance reinforcement schedules (ie, FR-1/FR-1, FR-10/FR-10, or FR-20/FR-20), predicting that rats with higher response demands would be the most sensitive to flupentixol. The results are presented in Figure 5. Consistent with this prediction, an ANOVA performed on test session lengths (Figure 5; top panel) found a significant main effect of treatment (F2,66=8.522; p<0.001) and a significant treatment × group interaction (F4,66=3.242; p<0.05), confirming that the suppressive effects of flupentixol depended on the rats' reinforcement schedule at test. A significant treatment effect was found for group FR-20/FR-20 (F2,22=6.337; p<0.01), but not for group FR-1/FR-1 (F2,22=1.693; p>0.05). Although a numerical trend for the treatment effect was observed in group FR-10/FR-10, it did not reach significance (F2,22=2.001; p>0.05). This is noteworthy as this group showed a significant effect under identical conditions during relative response cost testing. The null effect observed here may have resulted from the additional training they received (Choi et al, 2005), or may reflect tolerance to the suppressive effects of flupentixol because of repeated exposure. 
The ANOVA also detected a significant main effect of group (F1,33=33.004; p<0.001), and post hoc testing confirmed that group FR-20/FR-20 took longer than group FR-1/FR-1 (p<0.001) and group FR-10/FR-10 (p<0.001) to finish their test session, and that group FR-10/FR-10 took longer than group FR-1/FR-1 (p<0.01), which is not surprising given the group differences in response requirement during these sessions.
Figure 5
Absolute response costs modulate the suppressive effects of dopamine receptor blockade on reward seeking. Rats were pretreated with different doses of flupentixol before each of three test sessions on the maintenance reinforcement schedules, which were (more ...)
The maintenance schedules supported different patterns of food cup checking, but only marginal effects of flupentixol were apparent on this aspect of behavior (Figure 5; bottom panel). An ANOVA found a significant effect of group (F1,33=55.748; p<0.001), a marginally nonsignificant effect of treatment (F2,66=2.688; p=0.077), and a marginally nonsignificant interaction between group and treatment (F4,66=2.083; p=0.093). Despite these trends, further analysis found no flupentixol effect in any group (largest F value: F2,22=2.502; p=0.105 for group FR-20/FR-20). Post hoc testing revealed that the press-to-approach ratio was greater for group FR-20/FR-20 than for group FR-1/FR-1 (p<0.001) or group FR-10/FR-10 (p<0.001), and that the ratios for the latter two groups differed from each other (p<0.001).
We investigated whether dopamine signaling is required for overcoming absolute and relative response costs in rats performing an instrumental lever-pressing task for food reward. Rats exerting a substantial amount of effort for each food reward (group FR-20/FR-20) were more likely to reduce their performance when pretreated with flupentixol than rats performing low-effort (group FR-1/FR-1) or intermediate-effort (group FR-10/FR-10) tasks. However, when all rats were tested on the intermediate-effort task, flupentixol pretreatment was most effective in suppressing reward seeking in those rats experiencing an upshift in effort (group FR-1/FR-10). Furthermore, compared with rats maintained on the intermediate-effort schedule (group FR-10/FR-10), rats experiencing a downshift in effort (group FR-20/FR-10) were protected from the suppressive effects of flupentixol. These findings demonstrate that dopamine signaling is critical for adapting to changes in task demands and suggest that these relative, or subjective, response costs have a fundamental role in decision-making, even when the absolute, or objective, costs are held constant.
Interestingly, both individual and group differences in task efficiency, measured as the average number of presses performed before a rat checked the food cup, correlated with the dopamine-dependence of task performance. Thus, it appears that the way in which rats organized their reward seeking behavior influenced that behavior's vulnerability to disruption by dopamine receptor blockade. This relationship is consistent with three substantively different accounts that deserve consideration. First, if it is reasonable to take the press-to-approach ratio as a measure of how readily a rat adapts to a change in response requirement, this pattern suggests that rats experiencing the strongest ratio strain (ie, the most difficulty adapting to an upshift in response requirement) also showed the greatest sensitivity to dopamine antagonism. This interpretation is bolstered by the finding that rats shifted from FR-20 down to FR-10 at test exhibited the highest press-to-approach ratio and were also least affected by flupentixol. However, an alternative interpretation along similar lines is that these differences in the press-to-approach ratio represent objective, rather than subjective, differences in response cost since it is likely that premature visits to the food cup increase the net energy expenditure for the task. While these accounts focus on the role of costs (either relative or absolute) in determining the dopamine-dependence of performance, the flexible approach hypothesis (Nicola, 2010) provides a categorically different account. This framework assumes that dopamine transmission is required for initiating original (ie, non-habitual or undertrained) action sequences, particularly when subjects have disengaged from the instrumental task and must navigate back to the manipulandum from a new location in order to earn reward (Nicola, 2010). 
To explain why high-effort tasks tend to be more sensitive to dopamine receptor antagonism than low-effort tasks, this account notes that high-effort tasks cause subjects to disengage from the task following reward delivery (ie, the post-reinforcement pause). It is the ability to reinitiate performance during such pauses that is supposedly compromised by dopamine receptor antagonism. By extension, the flexible approach hypothesis may explain the current finding that an upshift in effort increased the dopamine-dependence of instrumental performance, if it can be assumed that this manipulation caused rats in group FR-1/FR-10 to disengage from the task. Although it is difficult to fully evaluate this account without detailed analysis of individual rats' locomotion in the operant chamber at test, the finding that upshifted rats tended to perform the task inefficiently, making unnecessary trips to the food cup between reinforced presses, indicates that they were more frequently required to transition back from the food cup to the lever than unshifted or downshifted rats. However, it is important to note that no statistically significant differences in session length were detected between groups during the control (saline) relative response cost test. Indeed, a more detailed analysis of the rats' mean latency to return to lever pressing after checking the food cup (regardless of whether or not a reward was delivered) during this test suggests that group FR-1/FR-10 (2.79 s; ±0.37) showed a tendency to be, if anything, more efficient in reinitiating task performance than group FR-20/FR-10 (4.23 s; ±0.51; Bonferroni post hoc test: p=0.05) or group FR-10/FR-10 (3.39 s; ±0.32; p>0.10). Therefore, although the current findings do not provide a critical test of the flexible approach hypothesis, they seem to be more compatible with an effort-based interpretation of dopamine function.
The finding that dopamine signaling has a role in processing response costs is consistent with a number of recent studies measuring dopamine signaling in the nucleus accumbens. For instance, one study using microdialysis to track session-to-session changes in task-related dopamine efflux in rats lever pressing for food reward on a random ratio schedule found that differences in dopamine efflux across sessions tracked fluctuations in experienced response cost (Ostlund et al, 2011). Specifically, increases in the average number of presses required to earn reward were associated with decreases in dopamine efflux. These changes in dopamine may provide an incentive motivational function, in line with a recent computational model of tonic dopamine (Niv et al, 2007). Thus, when adapting to a change in work requirement, increases in dopamine levels may invigorate behavior when rewards are cheap, whereas decreases in dopamine may discourage responding when rewards are rare or costly. These short-term fluctuations in dopamine signaling can be contrasted with patterns observed in situations involving well-trained subjects. In such cases dopamine efflux during instrumental performance tests tends to be greater for rats trained on high-effort tasks than for those trained on low-effort tasks (Salamone et al, 1994b; Sokolowski et al, 1998; Segovia et al, 2011). Thus, it may be that short-term fluctuations in dopamine related to unexpected changes in response cost are countered by a more slowly acquired pattern of dopamine signaling (Ahn and Phillips, 2007), which may be critical for overcoming more persistent response costs.
Dopamine's role in processing relative response costs may also explain its contributions to cost/benefit decision-making. It is well established that treatments that disrupt dopamine signaling bias rats towards less costly response options (Cousins and Salamone, 1994; Denk et al, 2005; Floresco et al, 2008a). Furthermore, studies measuring phasic mesolimbic dopamine signaling during cost-based decision making tasks have found some, albeit mixed, evidence that anticipatory dopamine responses are lower for high-cost options, independent of changes related to reward magnitude (Day et al, 2010; Gan et al, 2010; Wanat et al, 2010; Nasrallah et al, 2011). Importantly, for two-option choice tasks, absolute and relative costs are necessarily conflated. For instance, when choosing between options that require either 16 presses or a single press to produce reward, the difference in absolute cost may be used to encode relative costs (eg, Option A is 16 times harder than Option B). Similarly, the progressive ratio task, where each new reward becomes harder to obtain, also conflates relative and absolute costs as subjects are likely tracking changes in effort over the session. To dissociate these two aspects of response cost, we applied a within-subjects procedure to manipulate relative response costs while holding absolute costs constant across groups, allowing us to attribute group differences in sensitivity to dopamine receptor antagonism to differences in relative response cost.
Although dopamine appears to incorporate information about past and present task conditions to regulate reward seeking, it is not required for learning about the incentive value of instrumental goals (Dickinson et al, 2000; Wassum et al, 2011) or using goal representations to guide action selection (Ostlund and Maidment, 2012), suggesting that dopamine tracks nonspecific information about reward value and response costs. Although these variables are the direct products of instrumental action, the dopamine system may encode such information through a Pavlovian learning process in the background of instrumental learning. This would explain reports that dopamine receptor antagonism selectively disrupts the response-invigorating effects of Pavlovian, reward-paired cues on instrumental reward seeking (Dickinson et al, 2000; Lex and Hauber, 2008; Wassum et al, 2011; Ostlund and Maidment, 2012).
The current results demonstrate that the dopamine-dependence of instrumental performance is modulated by both its absolute and relative costs. Advancing our understanding of dopamine's role in reward seeking will likely require a deeper appreciation for how such costs are computed and implemented to guide decision-making. Such knowledge could be useful for developing treatments to encourage healthy but effortful behaviors and/or discourage unhealthy behaviors (eg, compulsive drug or food seeking) associated with dysregulated dopamine signaling.
Acknowledgments
This work was supported by grants DA09359 and DA05010 from NIDA to NTM and grant DA029035 to SBO.
Notes
The authors declare no conflict of interest.
  • Aberman JE, Salamone JD. Nucleus accumbens dopamine depletions make rats more sensitive to high ratio requirements but do not impair primary food reinforcement. Neuroscience. 1999;92:545–552. [PubMed]
  • Ahn S, Phillips AG. Dopamine efflux in the nucleus accumbens during within-session extinction, outcome-dependent, and habit-based instrumental responding for food reward. Psychopharmacology. 2007;191:641–651. [PMC free article] [PubMed]
  • Beeler JA, Daw N, Frazier CR, Zhuang X. Tonic dopamine modulates exploitation of reward learning. Front Behav Neurosci. 2010;4:170. [PMC free article] [PubMed]
  • Berridge KC, Robinson TE. Parsing reward. Trends Neurosci. 2003;26:507–513. [PubMed]
  • Choi WY, Balsam PD, Horvitz JC. Extended habit training reduces dopamine mediation of appetitive response expression. J Neurosci. 2005;25:6729–6733. [PubMed]
  • Cousins MS, Salamone JD. Nucleus accumbens dopamine depletions in rats affect relative response allocation in a novel cost/benefit procedure. Pharmacol Biochem Behav. 1994;49:85–91. [PubMed]
  • Day JJ, Jones JL, Wightman RM, Carelli RM. Phasic nucleus accumbens dopamine release encodes effort- and delay-related costs. Biol Psychiatry. 2010;68:306–309. [PMC free article] [PubMed]
  • Denk F, Walton ME, Jennings KA, Sharp T, Rushworth MF, Bannerman DM. Differential involvement of serotonin and dopamine systems in cost-benefit decisions about delay or effort. Psychopharmacology. 2005;179:587–596. [PubMed]
  • Dickinson A, Smith J, Mirenowicz J. Dissociation of Pavlovian and instrumental incentive learning under dopamine antagonists. Behav Neurosci. 2000;114:468–483. [PubMed]
  • Floresco SB, Tse MT, Ghods-Sharifi S. Dopaminergic and glutamatergic regulation of effort- and delay-based decision making. Neuropsychopharmacology. 2008a;33:1966–1979. [PubMed]
  • Floresco SB, St Onge JR, Ghods-Sharifi S, Winstanley CA. Cortico-limbic-striatal circuits subserving different forms of cost-benefit decision making. Cogn Affect Behav Neurosci. 2008b;8:375–389. [PubMed]
  • Gan JO, Walton ME, Phillips PE. Dissociable cost and benefit encoding of future rewards by mesolimbic dopamine. Nat Neurosci. 2010;13:25–27. [PMC free article] [PubMed]
  • Glimcher PW. Understanding dopamine and reinforcement learning: the dopamine reward prediction error hypothesis. Proc Natl Acad Sci USA. 2011;108(Suppl 3):15647–15654. [PubMed]
  • Hamill S, Trevitt JT, Nowend KL, Carlson BB, Salamone JD. Nucleus accumbens dopamine depletions and time-constrained progressive ratio performance: effects of different ratio requirements. Pharmacol Biochem Behav. 1999;64:21–27. [PubMed]
  • Lex A, Hauber W. Dopamine D1 and D2 receptors in the nucleus accumbens core and shell mediate Pavlovian-instrumental transfer. Learn Mem. 2008;15:483–491. [PubMed]
  • Mingote S, Weber SM, Ishiwari K, Correa M, Salamone JD. Ratio and time requirements on operant schedules: effort-related effects of nucleus accumbens dopamine depletions. Eur J Neurosci. 2005;21:1749–1757. [PubMed]
  • Nasrallah NA, Clark JJ, Collins AL, Akers CA, Phillips PE, Bernstein IL. Risk preference following adolescent alcohol use is associated with corrupted encoding of costs but not rewards by mesolimbic dopamine. Proc Natl Acad Sci USA. 2011;108:5466–5471. [PubMed]
  • Nicola SM. The flexible approach hypothesis: unification of effort and cue-responding hypotheses for the role of nucleus accumbens dopamine in the activation of reward-seeking behavior. J Neurosci. 2010;30:16585–16600. [PMC free article] [PubMed]
  • Niv Y, Daw ND, Joel D, Dayan P. Tonic dopamine: opportunity costs and the control of response vigor. Psychopharmacology. 2007;191:507–520. [PubMed]
  • Ostlund SB, Maidment NT. Dopamine receptor blockade attenuates the general incentive motivational effects of noncontingently delivered rewards and reward-paired cues without affecting their ability to bias action selection. Neuropsychopharmacology. 2012;37:508–519. [PMC free article] [PubMed]
  • Ostlund SB, Wassum KM, Murphy NP, Balleine BW, Maidment NT. Extracellular dopamine levels in striatal subregions track shifts in motivation and response cost during instrumental conditioning. J Neurosci. 2011;31:200–207. [PMC free article] [PubMed]
  • Phillips PE, Walton ME, Jhou TC. Calculating utility: preclinical evidence for cost-benefit analysis by mesolimbic dopamine. Psychopharmacology. 2007;191:483–495. [PubMed]
  • Robbins TW, Granon S, Muir JL, Durantou F, Harrison A, Everitt BJ. Neural systems underlying arousal and attention. Implications for drug abuse. Ann N Y Acad Sci. 1998;846:222–237. [PubMed]
  • Salamone JD, Correa M, Farrar A, Mingote SM. Effort-related functions of nucleus accumbens dopamine and associated forebrain circuits. Psychopharmacology. 2007;191:461–482. [PubMed]
  • Salamone JD, Cousins MS, Bucher S. Anhedonia or anergia? Effects of haloperidol and nucleus accumbens dopamine depletion on instrumental response selection in a T-maze cost/benefit procedure. Behav Brain Res. 1994a;65:221–229. [PubMed]
  • Salamone JD, Cousins MS, McCullough LD, Carriero DL, Berkowitz RJ. Nucleus accumbens dopamine release increases during instrumental lever pressing for food but not free food consumption. Pharmacol Biochem Behav. 1994b;49:25–31. [PubMed]
  • Salamone JD, Kurth PA, McCullough LD, Sokolowski JD, Cousins MS. The role of brain dopamine in response initiation: effects of haloperidol and regionally specific dopamine depletions on the local rate of instrumental responding. Brain Res. 1993a;628:218–226. [PubMed]
  • Salamone JD, Mahan K, Rogers S. Ventrolateral striatal dopamine depletions impair feeding and food handling in rats. Pharmacol Biochem Behav. 1993b;44:605–610. [PubMed]
  • Salamone JD, Steinpreis RE, McCullough LD, Smith P, Grebel D, Mahan K. Haloperidol and nucleus accumbens dopamine depletion suppress lever pressing for food but increase free food consumption in a novel food choice procedure. Psychopharmacology. 1991;104:515–521. [PubMed]
  • Salamone JD, Wisniecki A, Carlson BB, Correa M. Nucleus accumbens dopamine depletions make animals highly sensitive to high fixed ratio requirements but do not impair primary food reinforcement. Neuroscience. 2001;105:863–870. [PubMed]
  • Satoh T, Nakai S, Sato T, Kimura M. Correlated coding of motivation and outcome of decision by dopamine neurons. J Neurosci. 2003;23:9913–9923. [PubMed]
  • Schultz W. Dopamine signals for reward value and risk: basic and recent data. Behav Brain Funct. 2010;6:24. [PMC free article] [PubMed]
  • Segovia KN, Correa M, Salamone JD. Slow phasic changes in nucleus accumbens dopamine release during fixed ratio acquisition: a microdialysis study. Neuroscience. 2011;196:178–188. [PubMed]
  • Sokolowski JD, Conlan AN, Salamone JD. A microdialysis study of nucleus accumbens core and shell dopamine during operant responding in the rat. Neuroscience. 1998;86:1001–1009. [PubMed]
  • Walton ME, Kennerley SW, Bannerman DM, Phillips PE, Rushworth MF. Weighing up the benefits of work: behavioral and neural analyses of effort-related decision making. Neural Netw. 2006;19:1302–1314. [PMC free article] [PubMed]
  • Wanat MJ, Kuhnen CM, Phillips PE. Delays conferred by escalating costs modulate dopamine release to rewards but not their predictors. J Neurosci. 2010;30:12020–12027. [PMC free article] [PubMed]
  • Wassum KM, Ostlund SB, Balleine BW, Maidment NT. Differential dependence of Pavlovian incentive motivation and instrumental incentive learning processes on dopamine signaling. Learn Mem. 2011;18:475–483. [PubMed]
  • Wassum KM, Ostlund SB, Maidment NT. Phasic mesolimbic dopamine signaling precedes and predicts performance of a self-initiated action sequence task. Biol Psychiatry. 2012;71:846–854. [PMC free article] [PubMed]
  • Wise RA. Dopamine and reward: the anhedonia hypothesis 30 years on. Neurotox Res. 2008;14:169–183. [PMC free article] [PubMed]
  • Zhang M, Balmadrid C, Kelley AE. Nucleus accumbens opioid, GABAergic, and dopaminergic modulation of palatable food motivation: contrasting effects revealed by a progressive ratio study in the rat. Behav Neurosci. 2003;117:202–211. [PubMed]