Author contributions: B.W.B. designed research; V.L. and B.L. performed research; N.M. contributed unpublished reagents/analytic tools; V.L. and B.L. analyzed data; B.W.B. wrote the paper.
Two motivational processes affect choice between actions: (1) changes in the reward value of the goal or outcome of an action and (2) changes in the predicted value of an action based on outcome-related stimuli. Here, we evaluated the role of μ-opioid receptor (MOR) and δ-opioid receptor (DOR) in the nucleus accumbens in the way these motivational processes influence choice using outcome revaluation and pavlovian-instrumental transfer tests. We first examined the effect of genetic deletion of MOR and DOR in specific knock-out mice. We then assessed the effect of infusing the MOR antagonist d-Phe-Cys-Tyr-d-Trp-Arg-Thr-Pen-Thr-NH2 (CTAP) or the DOR antagonist naltrindole into the core or shell subregions of the nucleus accumbens on these tests in rats. We found that, whereas MOR knock-outs showed normal transfer, they failed to show a selective outcome revaluation effect. Conversely, DOR knock-outs showed normal revaluation but were insensitive to the influence of outcome-related cues on choice. This double dissociation was also found regionally within the nucleus accumbens in rats. Infusion of naltrindole into the accumbens shell abolished transfer but had no effect on outcome revaluation and did not influence either effect when infused into the accumbens core. Conversely, infusion of CTAP into the accumbens core abolished sensitivity to outcome revaluation but had no effect on transfer and did not influence either effect when infused into the accumbens shell. These results suggest that reward-based and stimulus-based values exert distinct motivational influences on choice that can be doubly dissociated both neuroanatomically and neurochemically at the level of the nucleus accumbens.
Choice between goal-directed actions is determined by the capacity to encode the consequences associated with specific actions and the relative incentive value assigned to those consequences (Balleine and Dickinson, 1998). Although these values can be influenced by a range of variables, including effort and temporal delay (Cardinal et al., 2001; Walton et al., 2006), they are mostly derived from two forms of incentive learning encoding: (1) the reward value of the goal or outcome of an action (i.e., the value assigned to an action based on consummatory contact with its outcome) (Dickinson and Balleine, 1994; Balleine, 2005) and (2) the predicted value of an action (i.e., the likelihood of reward based on the presence of stimuli associated with its specific outcome) (Colwill and Motzkin, 1994; Dickinson and Balleine, 2002).
We have previously shown that these incentive processes involve the basolateral amygdala (BLA); lesions or local drug infusion-induced changes in BLA function have been found to block the effects of reward- (Balleine et al., 2003; Wang et al., 2005) and stimulus-based values (Corbit and Balleine, 2005; Ostlund and Balleine, 2008) on choice. However, these effects appear also to depend on the connections of the BLA with striatal motor areas, notably the core and shell regions of the nucleus accumbens. Thus, both bilateral core lesions (Corbit et al., 2001) and disconnection of the core from the BLA (Shiflett and Balleine, 2010) have been found to abolish the influence of changes in reward value—induced by outcome devaluation—on choice. In contrast, the influence of outcome-related stimuli on choice, assessed using selective pavlovian-instrumental transfer (PIT), is abolished by bilateral lesions of the shell, rather than core (Corbit et al., 2001; Corbit and Balleine, 2011), and by disconnection of the shell from the BLA (Shiflett and Balleine, 2010).
Beyond evidence that these motivational influences on choice can be dissociated at the level of the accumbens core and shell, little is known about their neural bases. Nevertheless, given the structure of the striatum, these influences are most likely mediated by modulation of medium spiny neurons (Kreitzer, 2009). Although dopamine has long been advanced as serving this modulatory role in the accumbens, other important modulators have been established, most notably the endogenous opioid system (Zhang et al., 2003). Indeed, both agonists and antagonists of μ- and δ-receptors within the accumbens have been reported to produce robust changes in a variety of behavioral responses including the performance of consummatory and preparatory conditioned responses to interoceptive and exteroceptive stimuli paired with reward (Peciña and Berridge, 2000; Kelley et al., 2002; Wassum et al., 2009).
Here, we assessed the role of opioid receptor processes in the influence of reward-guided and stimulus-guided decisions on choice. In Experiment 1, we assessed the effect of genetic deletion of μ-opioid receptor (MOR) and δ-opioid receptor (DOR) on sensitivity to outcome devaluation and to pavlovian-instrumental transfer in mice. In Experiment 2, we assessed the effect of the μ-antagonist d-Phe-Cys-Tyr-d-Trp-Arg-Thr-Pen-Thr-NH2 (CTAP) and the δ-antagonist naltrindole on these effects when infused into the accumbens core or shell in rats.
The current experiments were conducted to assess the effects of manipulations of specific opioid receptor-related processes on the motivational control of goal-directed action using two distinct manipulations of incentive value produced by the following: (1) changes in stimulus-based values, assessed using a pavlovian-instrumental transfer protocol, and (2) changes in reward-based values, assessed using an outcome devaluation protocol. The effects of specific opioid-receptor manipulations were assessed in two experiments: In Experiment 1, the behavioral assessments were conducted on specific MOR and DOR knock-out mice. In Experiment 2, these behavioral assessments were conducted on rats using local pharmacological manipulation of opioid receptors by infusion of either the DOR antagonist naltrindole or the MOR antagonist CTAP into either the nucleus accumbens core or shell.
Knock-out mice were bred at the University of California, Los Angeles (UCLA); MOR (Matthes et al., 1996) and DOR (Filliol et al., 2000) KO mice and their WT littermates were generated from heterozygous (HET) breeding pairs, backcrossed at least nine generations onto a C57BL/6 background. The experimental subjects were 32 experimentally naive, male C57BL/6 mice, ~12 weeks of age, divided into four groups. Two groups were composed of DOR knock-outs (KO-DOR) (n = 8) and their wild-type littermates (WT-DOR) (n = 8), and two were composed of μ-opioid receptor knock-outs (KO-MOR) (n = 8) and their WT littermates (WT-MOR) (n = 8). They were housed in plastic boxes located in a climate-controlled colony room and were maintained on a 12 h light/dark cycle (lights on 7:00 A.M.; training sessions occurred between 11:00 A.M. and 3:00 P.M. each day). Several days before the behavioral procedures, the mice were handled daily and were put on a food deprivation schedule to maintain them at ~85% of their ad libitum feeding weight. All procedures were approved by the UCLA Animal Ethics Committee.
Training and testing took place in 16 MED Associates mouse operant chambers enclosed in sound- and light-resistant shells. Each chamber was equipped with a pump fitted with a syringe that delivered 0.025 ml of a 20% sucrose solution into a recessed magazine in the chamber. Each chamber was also equipped with a pellet dispenser that delivered a 20 mg grain food pellet (Bioserve Biotechnologies) when activated. The chambers contained two retractable levers that could be inserted to the left and the right of the magazine. An infrared photobeam crossed the magazine opening, allowing for the detection of head entries. The chambers also contained both a sonalert 3 kHz tone generator and a 28 V DC mechanical relay that was used to deliver a 2 Hz clicker stimulus for pavlovian conditioning. A 3 W, 24 V house light provided illumination of the operant chamber. Two microcomputers running MED Associates proprietary software (Med-PC) controlled all experimental events and recorded the behavioral responses.
All mice received one session of pavlovian training per day for the first 8 d. During this training, the levers were retracted. Each session was 1 h long and consisted of the presentation of the two conditioned stimuli (CSs) (i.e., tone or clicker), each paired with either the sucrose or the pellet outcomes. Each CS lasted 2 min and was presented four times in a pseudorandom order with a variable intertrial interval of 5 min. One-half of the animals received tone–pellet and clicker–sucrose pairings, whereas the other one-half received the opposite CS–outcome relationship. The appropriate outcome was delivered during the tone or clicker CS on a random-time 30 s schedule.
After pavlovian training, the mice received 11 d of instrumental training during which two actions (left and right lever press responses) were trained with the different outcomes (pellets and sucrose) in separate sessions each day; one-half of the mice received left lever press–pellets in one session and right lever press–sucrose in the other, whereas the remainder received the opposite action–outcome relationships. The order of the training sessions was varied over days. Each session ended when 20 outcomes were earned or when 30 min had elapsed. For the first 2 d, lever pressing was continuously reinforced. Thereafter, the probability of the outcome given a response [p(O/R)] was gradually shifted over days using an increasing random ratio (RR) schedule: a RR5 schedule (p = 0.2) was used on days 3–5, a RR10 (p = 0.1) schedule was used on days 6–8, and a RR20 (p = 0.05) schedule was used on days 9–11. During this phase, one of the WT-MOR mice failed to acquire lever pressing and was excluded from the remainder of the experiment.
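The random ratio schedules described above can be illustrated with a short simulation. This is a minimal sketch and not the Med-PC program used in the study: the function name `simulate_rr_session`, the constant press rate, and the seed are assumptions introduced for illustration. Only the outcome probabilities (p = 1/ratio), the 20-outcome cap, and the 30 min limit are taken from the text.

```python
import random


def simulate_rr_session(ratio, presses_per_min=10.0, max_outcomes=20,
                        max_minutes=30.0, seed=0):
    """Simulate one session on a random ratio (RR) schedule.

    Every press is reinforced independently with probability 1 / ratio, so
    ratio=1 is continuous reinforcement and ratio=20 gives p = 0.05.
    The constant press rate is a hypothetical value, not one from the study.
    Returns (presses_made, outcomes_earned).
    """
    rng = random.Random(seed)
    p_outcome = 1.0 / ratio
    # Presses available before the time limit at the assumed press rate.
    press_budget = int(presses_per_min * max_minutes)
    presses_made = 0
    outcomes = 0
    # The session stops at the outcome cap or when time runs out.
    while presses_made < press_budget and outcomes < max_outcomes:
        presses_made += 1
        if rng.random() < p_outcome:
            outcomes += 1
    return presses_made, outcomes


if __name__ == "__main__":
    for ratio in (1, 5, 10, 20):
        presses, outcomes = simulate_rr_session(ratio)
        print(f"RR{ratio}: p = {1 / ratio:.2f}, "
              f"presses = {presses}, outcomes = {outcomes}")
```

Under continuous reinforcement the simulated session ends after exactly 20 presses. On leaner schedules the time limit becomes more likely to end the session before the outcome cap is reached.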
After the final day of RR20 training, mice were given 2 consecutive days of pavlovian-instrumental testing. Both levers were inserted into the box, but no outcomes were delivered during the test. Responding was extinguished on both levers for 8 min to reduce the baseline rate of performance after which each CS was presented four times over the next 40 min in the following order: clicker–tone–tone–clicker–tone–clicker–clicker–tone. Stimulus presentations lasted 2 min and were separated by a 3 min fixed intertrial interval (ITI).
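The timing of the transfer test can be laid out explicitly. The sketch below is illustrative only: the name `build_transfer_test_timeline` is hypothetical, and it assumes that each 3 min ITI precedes its CS, which is the arrangement under which eight ITI-plus-CS cycles fill exactly the stated 40 min.

```python
CS_ORDER = ["clicker", "tone", "tone", "clicker",
            "tone", "clicker", "clicker", "tone"]


def build_transfer_test_timeline(extinction_min=8, iti_min=3, cs_min=2):
    """Return a list of (start_min, end_min, label) tuples for one test.

    Assumption: each ITI precedes its CS, so the eight ITI + CS cycles
    occupy 8 * (3 + 2) = 40 min after the initial extinction period.
    """
    timeline = [(0, extinction_min, "extinction")]
    clock = extinction_min
    for stimulus in CS_ORDER:
        timeline.append((clock, clock + iti_min, "ITI"))
        clock += iti_min
        timeline.append((clock, clock + cs_min, stimulus))
        clock += cs_min
    return timeline


if __name__ == "__main__":
    for start, end, label in build_transfer_test_timeline():
        print(f"{start:2d}-{end:2d} min: {label}")
```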
The day after the second pavlovian-instrumental transfer test, mice were retrained on the RR20 schedule for 2 consecutive days. The following day, they received ad libitum access to one of the two outcomes, either pellets or sucrose, for 1 h in distinct feeding cages located in a room different from that in which training had been administered. One-half of the mice in each action–outcome assignment received pellets (10 g placed in a bowl), and the remaining mice received sucrose (10 ml in a drinking bottle). The mice were then given a 5 min choice extinction test in which both levers were available but no outcomes were delivered. The same procedure was repeated 1 d later except that mice that were given ad libitum access to pellets now received sucrose, and mice that were given ad libitum access to sucrose now received pellets.
The subjects were 48 experimentally naive male Long–Evans rats obtained from Monash University Animal Research Platform. They were housed in plastic boxes (two rats per box) located in a climate-controlled colony room and were maintained on a 12 h light/dark cycle. Five days before the behavioral procedures, the rats were handled daily and were put on a food deprivation schedule to maintain them at ~85% of their ad libitum feeding weight. The Animal Ethics Committee at the University of Sydney approved all experimental procedures.
Training and testing took place in 16 MED Associates operant chambers enclosed in sound- and light-resistant shells. Each chamber was equipped with a pump fitted with a syringe that delivered 0.1 ml of a 20% sucrose solution into a recessed magazine in the chamber. Each chamber was also equipped with a pellet dispenser that delivered a 45 mg grain food pellet (Bioserve Biotechnologies) when activated. The chambers contained two retractable levers that could be inserted to the left and the right of the magazine. An infrared photobeam crossed the magazine opening, allowing for the detection of head entries. A 3 W, 24 V house light provided illumination of the operant chamber, and each chamber contained a Sonalert that, when activated, delivered a 3 kHz pure tone, and a 28 V DC mechanical relay that was used to deliver a 2 Hz clicker stimulus. A set of two microcomputers running MED Associates proprietary software (Med-PC) controlled all experimental events and recorded lever presses and magazine entries.
CTAP (Sigma-Aldrich), a selective μ-opioid receptor antagonist, and naltrindole hydrochloride (Sigma-Aldrich), a selective δ-opioid receptor antagonist, were dissolved in 0.9% (w/v) nonpyrogenic saline to obtain a final concentration of 2 μg/μl (Soderman and Unterwald, 2008; Trezza et al., 2011) and 5 μg/μl (Kelley et al., 1996; Schmidt et al., 2002), respectively. Nonpyrogenic saline infusions were used to control for any effect of the infusion procedure per se.
At the time of surgery, rats weighed between 290 and 360 g. They received an injection of 1.3 ml/kg of the anesthetic ketamine at a concentration of 100 mg/ml (intraperitoneal) and of 0.3 ml/kg of the muscle relaxant xylazine at a concentration of 20 mg/ml (intraperitoneal). Anesthetized rats were then placed in a stereotaxic frame (Stoelting Company) with the incisor bar set at −3.3 mm. The scalp was retracted to expose the skull, and 26 gauge guide cannulae (Plastics One) were bilaterally implanted through holes drilled in the skull in one of the targeted structures. Two different sets of coordinates (indicated in millimeters relative to bregma) were used for the core region of the nucleus accumbens: one for the left [anteroposterior (AP), +1.2; mediolateral (ML), −2.1; dorsoventral (DV), −6.0] and one for the right (AP, +1.2; ML, −3.2; DV, −6.2; angled 10° toward the midline in the coronal plane) hemisphere. The coordinates used for the shell region of the nucleus accumbens were the following: AP, +1.7; ML, ±0.7; DV, −6.6. The guide cannulae were maintained in position with dental cement, and dummy cannulae were kept in each guide at all times except during microinjections. Immediately after the surgical procedure, rats were injected intraperitoneally with a prophylactic (0.4 ml) dose of 300 mg/kg solution of procaine penicillin. Rats were allowed 3 d to recover from surgery, during which time they were handled and weighed daily.
CTAP, naltrindole, and saline were infused into either the core or the shell region of the nucleus accumbens by inserting a 33 gauge infusion cannula into the guide. The infusion cannulae were connected to a 25 μl glass syringe connected to an infusion pump (KD Scientific; SDR Clinical Technology) and projected 1 mm ventral to the tip of the guide. A total volume of 0.2 μl each side was delivered at a rate of 0.1 μl/min. The infusion cannula remained in place for a further 1 min after the infusion and was then removed. On the day before the first infusion, the dummy cannula was removed and the infusion pump was turned on for 3 min to familiarize the rats with the procedure and thereby minimize any stress produced by this procedure when infusions occurred.
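The infusion parameters imply a dose per side and a pump duration that are not stated directly, and the arithmetic can be sketched as follows. The function `infusion_summary` and its field names are hypothetical. The concentrations (2 μg/μl for CTAP, 5 μg/μl for naltrindole), the 0.2 μl volume, the 0.1 μl/min rate, and the 1 min dwell time are the values given in the text.

```python
# Concentrations stated in the drug preparation paragraph (ug per ul).
DRUG_CONCENTRATIONS_UG_PER_UL = {"CTAP": 2.0, "naltrindole": 5.0}


def infusion_summary(concentration_ug_per_ul, volume_ul=0.2,
                     rate_ul_per_min=0.1, dwell_min=1.0):
    """Derive the dose per side and timing of one infusion.

    dose (ug) = concentration (ug/ul) * volume (ul)
    pump time (min) = volume (ul) / rate (ul/min)
    """
    pump_min = volume_ul / rate_ul_per_min
    return {
        "dose_ug_per_side": concentration_ug_per_ul * volume_ul,
        "pump_min": pump_min,
        # The cannula stays in place for the dwell period after the pump stops.
        "cannula_in_place_min": pump_min + dwell_min,
    }


if __name__ == "__main__":
    for drug, conc in DRUG_CONCENTRATIONS_UG_PER_UL.items():
        print(drug, infusion_summary(conc))
```

On these figures each infusion runs for 2 min and the cannula stays in place for 3 min in total, which matches the 3 min familiarization run. The delivered dose works out to about 0.4 μg of CTAP or 1 μg of naltrindole per side.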
At the end of the experiment, the rats received a lethal dose of sodium pentobarbital. The brains were removed and sectioned coronally at 40 μm through the core or the shell region of the nucleus accumbens. Every third section was collected on a slide, and the sections were stained with cresyl violet. The location of cannulae tips was determined under a microscope by a trained observer who was unaware of the subjects' group designations, with boundaries defined according to the atlas of Paxinos and Watson (2007). Subjects with inaccurate cannulae placements or with extensive damage at the infusion site were excluded from the statistical analysis.
All rats received eight daily sessions of pavlovian training during which the levers were retracted. Each session lasted 60 min and consisted of the presentation of two CSs (tone or clicker), each paired with either sucrose or pellets. Each CS lasted 2 min and was presented four times in a pseudorandom order with a variable intertrial interval of 5 min. One-half of the rats received the tone paired with pellets and the clicker paired with sucrose, whereas the other one-half received tone–sucrose, clicker–pellet pairings. The sucrose or pellets were delivered on a random time 30 s schedule throughout the appropriate CS.
Following pavlovian training, all animals received 10 d of instrumental training during which two responses (left and right lever presses) were trained with two different outcomes (pellets and sucrose) in separate daily sessions. The order of the sessions was counterbalanced, as were the response–outcome relationships, which were in turn counterbalanced with the CS–outcome relationships established during pavlovian training. Each session ended when 15 outcomes were earned or when 30 min had elapsed. For the first 2 d, lever pressing was continuously reinforced (i.e., each response was reinforced). Then, the probability of the outcome given a response was gradually shifted over days using increasing random ratio schedules: a RR5 schedule (p = 0.2) was used on days 3–5 and a RR10 (p = 0.1) schedule was used on days 6–8. Rats were then given ad libitum access to food and water for 5 consecutive days before undergoing surgery. Following recovery from surgery, rats were returned to the food deprivation schedule previously used and received 2 additional days of instrumental training on a RR10 schedule.
After the final day of RR10 training, rats were given 2 consecutive days of pavlovian-instrumental tests. Both levers were inserted into the box, but no outcomes were delivered. Responding was extinguished on both levers for 8 min to establish a low rate of baseline performance. Each CS was presented four times over the next 40 min in the following order: clicker–tone–tone–clicker–tone–clicker–clicker–tone. Stimulus presentations lasted 2 min and were separated by a 3 min fixed ITI. Fifteen minutes before each test, rats were given an infusion into either the accumbens core or shell with either drug (naltrindole, CTAP) or saline vehicle. The order of infusion (drug or vehicle) was counterbalanced; rats infused with vehicle before the first test received an infusion of drug (naltrindole or CTAP) before the second test, whereas rats that had been infused with drug before the first test received an infusion of vehicle before the second test.
Beginning the day after the second pavlovian-instrumental test, rats were retrained on the levers using the RR10 schedule across 2 consecutive days. Outcome devaluation was then conducted the next day. Rats received ad libitum access to one of the two outcomes (i.e., pellets or sucrose) for 1 h in distinct feeding cages located in a room different from that used for training. One-half of the rats in each response–outcome assignment received pellets (50 g placed in a bowl), and the remaining rats received sucrose (50 ml in a drinking bottle). Immediately after the prefeeding, rats were infused into the core or the shell with either drug (i.e., naltrindole or CTAP) or vehicle (i.e., saline). Fifteen minutes after infusion, rats were given a 5 min choice extinction test in which both levers were available but no outcome was delivered. The same procedure was repeated 1 d later except that rats previously given ad libitum access to pellets now received sucrose, whereas rats that were previously given ad libitum access to sucrose now received pellets. The order of infusions was also counterbalanced as described above for pavlovian-instrumental transfer tests.
This experiment investigated the effect of MOR and DOR knock-out on pavlovian-instrumental transfer and outcome devaluation. The two lines were backcrossed onto the same C57BL/6 root stock in excess of nine generations, and hence we did not anticipate differences in performance between the two WT groups. As such, we planned to analyze the data using three groups by first comparing the two WT controls and, given they did not differ, then combining them. All analyses of training data were conducted using mixed-model ANOVA. Analyses of the test data were conducted using mixed-model ANOVA followed by simple main effects analyses to establish the source of any significant interactions. To confirm effects established using the combined WT group, we also conducted two-way analyses comparing knock-out and their wild-type controls in both the transfer and devaluation tests.
This experiment investigated the effects of infusions of the DOR antagonist naltrindole and the MOR antagonist CTAP into the accumbens core or shell on pavlovian-instrumental transfer and outcome devaluation. As in Experiment 1, all analyses of training data were conducted using mixed-model ANOVA. Analyses of the test data were conducted using mixed-model ANOVA followed by simple main effects analyses to establish the source of any significant interactions.
We first compared performance in the two WT control groups, WT-DOR and WT-MOR, across the pavlovian and instrumental training phases in Experiment 1. At no point did these groups differ (largest value of F(1,13) = 1.11) and so we collapsed them into a single WT group (n = 15) for analysis of the results of these phases of the experiment. Furthermore, neither conditioned magazine entries nor lever presses differed across the counterbalancing conditions used during the training phases of Experiment 1 (all values of F < 1), and, as such, responding was averaged across these factors. The pavlovian and instrumental training data for the WT and mutant groups are displayed in Figure 1, A and B, respectively.
It is clear from this figure that neither the DOR nor the MOR knock-out affected performance during these phases. In pavlovian conditioning, the mice clearly discriminated between the CS and pre-CS periods, and this discrimination grew larger over the course of training. However, there was no evidence that this discrimination or the rate of change in this discrimination differed by group, an assertion confirmed by the statistical analysis. ANOVA, conducted using a between-groups factor of Group (separating WT, DOR, and MOR), and within-subjects factors of CS period (separating CS and pre-CS) and of Session, found an effect of CS period (F(1,28) = 68.4; p < 0.001), of Session (F(7,196) = 3.39; p < 0.01), and a significant CS period by Session interaction (F(7,196) = 12.5; p < 0.001), but neither an effect of Group nor any interactions involving Group as a factor (all values of F < 1).
A similar pattern emerged during the instrumental training. Performance increased over the course of training but at no point appeared to differ between groups. ANOVA conducted on these data using factors of Group and of Session found an effect of Session (F(10,280) = 142.2; p < 0.001), but no effect of Group nor any interaction between these factors (largest value of F(10,280) = 1.2).
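The degrees of freedom reported for the training analyses follow from the sample size (31 mice after one exclusion) and the number of levels of each factor. The sketch below applies the textbook formulas for a balanced mixed design with one between-subjects and one within-subjects factor and no sphericity correction. It is not the authors' analysis code, and the function name `mixed_anova_df` is hypothetical.

```python
def mixed_anova_df(n_subjects, n_groups, n_within_levels):
    """Degrees of freedom for a one-between, one-within mixed ANOVA.

    Returns (numerator_df, denominator_df) for the between-subjects effect,
    the within-subjects effect, and their interaction.
    """
    between_error = n_subjects - n_groups
    within_num = n_within_levels - 1
    within_error = within_num * between_error
    return {
        "group": (n_groups - 1, between_error),
        "within": (within_num, within_error),
        "interaction": ((n_groups - 1) * within_num, within_error),
    }


if __name__ == "__main__":
    # 31 mice, 3 groups (WT, DOR, MOR).
    print("CS period (2 levels):", mixed_anova_df(31, 3, 2)["within"])
    print("Pavlovian sessions (8):", mixed_anova_df(31, 3, 8)["within"])
    print("Instrumental sessions (11):", mixed_anova_df(31, 3, 11)["within"])
```

With these inputs the within-subjects effects come out as (1, 28), (7, 196), and (10, 280), which agree with the values reported for CS period and for Session in the two training phases.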
The transfer tests pitted performance on the lever that, in training, delivered the same outcome as that predicted by the CS (Same) against performance on the lever that delivered a different outcome from that predicted by the CS (Different). To assess the effects of these stimuli, we subtracted baseline performance on the two levers during the tests from performance on the “Same” and “Different” levers during each CS presentation to establish the net pavlovian-instrumental transfer (i.e., the increase in performance during the stimulus over baseline). These data are presented in the left panel of Figure 2 separated by stimulus and by group.
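The net transfer score described above is a simple difference, sketched below for a single animal. The function name `net_pit` and the example rates are hypothetical and are not data from the study. The sketch also assumes one averaged baseline rate per lever, which is one plausible reading of the description.

```python
from statistics import mean


def net_pit(same_rates, different_rates, baseline_rate):
    """Net pavlovian-instrumental transfer for one animal.

    same_rates / different_rates: presses per minute on the Same and
    Different lever during each CS presentation.
    baseline_rate: presses per minute per lever outside the CS
    (assumed here to be a single average across the two levers).
    """
    return {
        "same": mean(same_rates) - baseline_rate,
        "different": mean(different_rates) - baseline_rate,
    }


if __name__ == "__main__":
    # Hypothetical rates, for illustration only.
    scores = net_pit(same_rates=[10, 8], different_rates=[6, 4],
                     baseline_rate=5.0)
    print(scores)
```

A positive score on the Same lever alongside a score near zero on the Different lever is the pattern referred to as outcome-specific transfer.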
It is clear from this figure that performance in WT mice showed a substantial specific-transfer effect; performance on the lever that, in training, delivered the same outcome as that predicted by the stimulus was elevated over baseline relative to the lever that delivered a different outcome. Test performance in Group MOR was similar to the WT controls. However, and more importantly, a clear deficit in this transfer effect was observed in Group DOR. In these mice, performance on the two levers appeared not to differ and the Same stimulus failed to exert an excitatory effect relative to baseline levels of performance.
Again, this description was confirmed by the statistical analysis. ANOVA conducted using factors of Group and of Transfer (separating performance during the Same and Different stimuli) found an effect of Group (F(2,28) = 4.5; p < 0.05), and of Transfer (F(1,28) = 27.0; p < 0.001), and a significant interaction between these factors (F(2,28) = 5.0; p < 0.05). Simple-effects analyses conducted on the significant interaction found significant transfer in both Group WT (F(1,28) = 14.4; p < 0.001) and Group MOR (F(1,28) = 23.1; p < 0.001), but no effect in Group DOR (F < 1). These effects emerged due to the influence of the pavlovian cues and were not present in the baseline data; at no point during the baseline periods during the tests did performance on the levers differ between groups (F(2,28) = 1.5; p = 0.24). Mean lever presses per minute during the tests in the absence of the stimuli (i.e., the baseline performance on test) for the three groups was as follows: Group WT, 5.3; Group MOR, 5.8; Group DOR, 4.5.
The two WT groups combined in Figure 2A performed very similarly on test. During the test, Group WT-MOR responded 4.3 times per minute during the Same and 0.2 times per minute during the Different stimulus, and Group WT-DOR responded 4.2 times per minute during the Same and 1.1 times per minute during the Different stimulus. Nor did these groups differ statistically (all values of F < 1). To confirm the deficit in transfer in the DOR mice described above, however, we also conducted two-way ANOVA comparing Same versus Different for WT-MOR versus KO-MOR and for WT-DOR versus KO-DOR. These tests revealed the following: In the MOR groups, an effect of Transfer (F(1,13) = 23.0; p < 0.05), but neither an effect of Group nor an interaction between Group and Transfer (values of F < 1). In the DOR groups, however, there was an effect of Group (F(1,14) = 11.07), of Transfer (F(1,14) = 11.17; p < 0.05), and an interaction between these factors (F(1,14) = 4.64; p < 0.05). Simple-effects analyses conducted on the interaction found an effect of Transfer in the WT-DOR group (F(1,14) = 12.3; p < 0.05), but not in KO-DOR (F < 1).
Performance during the outcome devaluation choice extinction test is presented in the right-hand panel of Figure 2 separated by Group and by responses on the devalued and the non-devalued lever. Group WT showed a clear outcome-specific devaluation effect, responding markedly less on the lever that, in training, had delivered the now devalued outcome relative to the other action. Despite the deficit in selective transfer, a very similar devaluation effect was also observed in Group DOR. However, in contrast to the selective elevation in performance observed in response to outcome-specific predictions in the transfer test, outcome-specific devaluation was not observed in Group MOR; instead choice between the two levers appeared to be relatively indifferent after the devaluation treatment.
ANOVA, conducted using factors of Group and of Devaluation, found a significant effect of Group (F(2,28) = 3.6; p < 0.05), and of Devaluation (F(1,28) = 20.7; p < 0.001), and a significant interaction between these factors (F(2,28) = 4.6; p < 0.05). Simple-effects analyses conducted on the interaction found a significant effect of Devaluation in Group WT (F(1,28) = 16.4; p < 0.001), and in Group DOR (F(1,28) = 12.7; p < 0.001), but no effect of Devaluation in Group MOR (F < 1). There was no effect of Group in responding on the devalued action (F < 1), but a significant effect of Group on the non-devalued action (F(2,56) = 7.7; p < 0.001). Although this may be taken to suggest that the mice in Group MOR showed evidence of a general devaluation effect, it is difficult to conclude whether the general reduction in performance on both levers reflects a failure to discriminate actions or outcomes (which in any case is countered by the results of the PIT test), or simply the effects of indifference during the choice test, which could promote the effects of response competition and a concomitant loss of responding in the performance of both actions.
The two WT groups combined in Figure 2B performed similarly on test; during the test, Group WT-MOR responded 12.9 times per minute on the devalued and 18.5 times per minute on the non-devalued lever and Group WT-DOR responded 11.2 times per minute on the devalued and 21.2 times per minute on the non-devalued lever. Nor did these groups differ statistically (all values of F < 1). In this case, to confirm the deficit in devaluation in the MOR mice described above, however, we also conducted two-way ANOVA comparing devalued and non-devalued performance for WT-MOR versus KO-MOR and WT-DOR versus KO-DOR. These tests revealed the following: In the MOR groups, an effect of Group (F(1,13) = 6.1; p < 0.05), of Devaluation (F(1,13) = 8.02; p < 0.05), and an interaction between these factors (F(1,13) = 4.7; p < 0.05). Simple-effects analyses revealed an effect of devaluation in Group WT-MOR (F(1,13) = 9.03), but not in Group KO-MOR (F < 1). In the DOR groups, there was a significant devaluation effect (F(1,14) = 20.1; p < 0.05), but neither an effect of Group nor an interaction between Group and Devaluation (values of F < 1).
Generally, therefore, this experiment found evidence of a double dissociation in the involvement of δ- and μ-opioid receptor-related processes in outcome-specific PIT and outcome devaluation (i.e., in the motivational effects that reward-based and stimulus-based decisions have on choice between goal-directed actions). As described above, we hypothesize that this effect of the global knock-out is localized to changes in the usual role that the nucleus accumbens core and shell play during the PIT and devaluation tests, a hypothesis that we tested in Experiment 2.
Figure 3 shows the location of injection cannulae tips for rats bilaterally implanted in either the nucleus accumbens core or shell. A total of 16 rats were excluded because of incorrect placement of the guide cannulae. This yielded the following group sizes: Group Core-NAL (n = 10), Group Core-CTAP (n = 8), Group Shell-NAL (n = 12), and Group Shell-CTAP (n = 6).
Although no treatment was given during training, we first established that the groups were similar during the pavlovian and instrumental training phases. Performance across the counterbalancing conditions in these training phases (i.e., across the two auditory stimuli in the pavlovian phase and the two levers during the instrumental training phase) was averaged for this analysis. During pavlovian conditioning, the rats discriminated between the CS and pre-CS periods and this discrimination grew larger over sessions, but there was no evidence that this discrimination differed among groups (i.e., shell and core). ANOVA revealed an effect of CS period (F(1,575) = 554.0; p < 0.001), of Session (F(7,575) = 10.1; p < 0.001), and a significant CS period by Session interaction (F(7,575) = 12.7; p < 0.001), but neither an effect of Group nor any interactions involving Group as factor (all values of F < 1). By the end of pavlovian training, the core and shell groups were entering the magazine at similar rates during the CS and pre-CS periods, respectively: core cannulated rats, 13.04 and 2.22 per minute; shell cannulated rats, 11.9 and 2.86 per minute. During instrumental training, all of the groups acquired lever press responding, which increased as the ratio parameters increased across days. By the end of training, all groups were responding at a similar rate on the levers: Core cannulated group, 40.1 (±1.8); Shell cannulated group, 35.5 (±2.4). Their rates did not differ: F(1,35) = 2.4; p = 0.134.
After the pavlovian and instrumental training phases, we assessed the effect of the MOR and DOR antagonists infused into the accumbens core and shell on outcome-specific PIT and outcome devaluation. With regard first to the effects in the core, the PIT tests were conducted across 2 consecutive days, each after an infusion of vehicle, naltrindole, or CTAP into the nucleus accumbens core. Figure 4A shows the mean number of lever presses per minute (CS minus baseline) when the CS predicted the same outcome as the response (Same) and when the CS predicted a different outcome from the response (Different). Generally, the drugs appeared to have little if any effect on outcome-specific PIT. A mixed ANOVA, conducted using Drug (vehicle, naltrindole, or CTAP) and CS identity (Same or Different) as factors, revealed a main effect of CS identity (F(1,71) = 34.9; p < 0.001), but no effect of Drug or a Drug by CS identity interaction (values of F < 1). These effects emerged during the CSs; there was no difference during the test in the baseline levels of performance (F(2,35) = 0.6; p = 0.5).
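The PIT measure plotted in Figure 4A (lever presses per minute during the CS minus baseline, split by whether the CS predicted the same or a different outcome from the response) can be sketched as follows. This is a minimal illustration only: the function name `pit_scores`, the record layout, and all numerical values are hypothetical and are not data from this study.

```python
from statistics import mean

def pit_scores(trials):
    """Return the mean CS-minus-baseline press rate for each CS type.

    Each trial is a dict with the (hypothetical) keys:
      'cs_type'       -- 'same' or 'different'
      'cs_rate'       -- lever presses per minute during the CS
      'baseline_rate' -- lever presses per minute before the CS
    """
    scores = {"same": [], "different": []}
    for trial in trials:
        scores[trial["cs_type"]].append(
            trial["cs_rate"] - trial["baseline_rate"]
        )
    return {cs_type: mean(values) for cs_type, values in scores.items()}

# Made-up example values, for illustration only.
trials = [
    {"cs_type": "same", "cs_rate": 12.0, "baseline_rate": 4.0},
    {"cs_type": "same", "cs_rate": 10.0, "baseline_rate": 4.0},
    {"cs_type": "different", "cs_rate": 5.0, "baseline_rate": 4.0},
    {"cs_type": "different", "cs_rate": 3.0, "baseline_rate": 4.0},
]

result = pit_scores(trials)
print(result)
```

An outcome-specific PIT effect corresponds to a Same score well above zero together with a Different score near zero; subtracting baseline ensures that a group difference in overall response rate is not mistaken for a difference in transfer.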
Next, we assessed the effect of vehicle, naltrindole, or CTAP infusion into the core on outcome-specific devaluation. These data are presented in Figure 4B. Although a devaluation effect emerged after infusion of vehicle or naltrindole, importantly, CTAP produced a clear deficit in this effect. A mixed ANOVA was conducted using Drug (vehicle, naltrindole, or CTAP) and Devaluation (Valued or Devalued) as factors. It failed to find an effect of Drug (F < 1) but detected a main effect of Devaluation (F(1,71) = 17.5; p < 0.001) and a Devaluation by Drug interaction (F(2,71) = 3.2; p < 0.05). Simple-effects analysis conducted on the significant interaction revealed that, whereas the rats in Group Core-VEH and Core-NAL showed a significant outcome devaluation effect (F(1,35) = 21.6, p < 0.001; and F(1,19) = 9.4, p < 0.01, respectively), this effect did not emerge in the Group Core-CTAP, which failed to show any evidence of outcome-selective devaluation (F < 0.1).
In the same fashion, after instrumental training, we assessed the role of shell opioid receptors in outcome-specific PIT and outcome devaluation. The results of the PIT test are presented in Figure 5A. Again, Group Shell-VEH showed a selective PIT effect; responding was increased over baseline but only when the CS predicted the same outcome as the action. This effect also emerged in Group Shell-CTAP. However, a clear deficit was observed in Group Shell-NAL. A mixed ANOVA revealed no effect of Drug (F < 2.8), but a main effect of CS identity (Same or Different) (F(1,71) = 37.3; p < 0.001) and a Drug by CS identity interaction (F(2,71) = 6.6; p < 0.01). Simple-effects analysis conducted to investigate the source of this interaction found a selective increase in responding during CS Same in both Group Shell-VEH (F(1,35) = 30.6; p < 0.001) and Group Shell-CTAP (F(1,11) = 27.3; p < 0.001) but no effect in Group Shell-NAL (F(1,23) = 2.7; p = 0.11). Again, these effects emerged during the CSs; there was no difference during the test in the baseline levels of performance between groups (F(2,35) = 0.8; p = 0.47).
Finally, we assessed the role of shell opioid receptors in the sensitivity of instrumental choice performance to outcome devaluation. These data are presented in Figure 5B, which shows evidence of a clear outcome devaluation effect in Group Shell-VEH that was unaffected by infusion of either naltrindole or CTAP into the accumbens shell. ANOVA revealed a main effect of Devaluation (F(1,71) = 39.7; p < 0.001), but neither an effect of Drug nor a Drug by Devaluation interaction (values of F < 0.5).
The results of the current study provide important new information on the involvement of the nucleus accumbens in the motivational processes that influence reward-related actions. These results replicate the findings of previous experiments showing that the effects on choice produced by changes in reward value, during an outcome devaluation test, and by the presentation of reward-related stimuli, during tests of outcome-specific PIT, are dissociable at the level of the nucleus accumbens core and shell (Corbit et al., 2001; Corbit and Balleine, 2011). First, in Experiment 1, we found that, whereas mice lacking the μ-opioid receptor showed reduced sensitivity to outcome devaluation, they showed normal outcome-specific PIT. Conversely, mice lacking the δ-opioid receptor showed a deficit in outcome-specific PIT but normal outcome devaluation. Second, in Experiment 2, we found that rats given an infusion of the selective MOR antagonist CTAP showed reduced sensitivity to outcome devaluation when that infusion was made in the NAc core but no effect when it was made in the shell. However, CTAP had no effect on specific PIT when infused into either core or shell. Furthermore, rats given an infusion of the selective DOR antagonist naltrindole showed reduced selective PIT when the infusion was made in the NAc shell but no effect when it was made in the core, and no effect on outcome devaluation whether it was infused into the core or shell.
The dissociable effects of MOR and DOR manipulations in accumbens core and shell reflect the role of the accumbens within the larger system in which it is placed, particularly that involving the basolateral amygdala. Thus, whereas lesions of NAc core (Corbit et al., 2001) and lesions that disconnected the NAc core from the BLA (Shiflett and Balleine, 2010) were found to abolish the sensitivity of instrumental choice performance to outcome devaluation, these treatments did not affect rats' sensitivity to outcome-related stimuli in tests of outcome-specific PIT. Conversely, lesions (Corbit et al., 2001) or inactivation of NAc shell (Corbit and Balleine, 2011) and disconnection of the NAc shell from the BLA (Shiflett and Balleine, 2010) were found to abolish the effect of outcome-related stimuli on choice without affecting the rats' sensitivity to outcome devaluation. Here, we found evidence of a similar dissociation.
One conclusion from these studies, therefore, is that the effects on instrumental choice performance of changes in reward value, such as those induced by outcome devaluation by specific satiety, and of stimulus-guided decisions, such as those induced by outcome-related stimuli in specific PIT, are doubly dissociable both in terms of the broader system in which the accumbens is positioned and in terms of local circuits within the NAc core and shell. The system- and circuit-level specificity of these two influences on choice points to an important duality in the role that incentive learning-related processes play, generally, in governing goal-directed actions and in the circuitry through which these distinct functions are implemented.
In fact, a MOR-related process in the NAc has long been thought to influence the rewarding effects of palatable food (Bakshi and Kelley, 1993; Zhang et al., 2003), and the current results extend this to a role in reward seeking based on the experienced reward value of instrumental outcomes. However, these findings limit the latter involvement of this MOR process to the NAc core, on the one hand, and to changes in reward value made explicit during consummatory experience, on the other. There was no evidence that MORs are involved in the influence of outcome-related stimuli on choice. Furthermore, although DOR agonists in the NAc core have occasionally been reported to influence the vigor of various reward-related responses (Simmons and Self, 2009; Katsuura and Taha, 2010), the current results suggest that DORs in the core are not involved in the effect of either reward-based or stimulus-based decisions on instrumental choice performance.
Likewise, DOR-related processes in the medial NAc shell have been increasingly implicated in reward, particularly in the context of addiction in which they have been reported to influence the consumption of drugs (Krishnan-Sarin et al., 1995; Nie et al., 2011), and the severity of withdrawal (Ambrose-Lanci et al., 2008; McCarthy et al., 2011). Of course, both DOR (Zhang and Kelley, 1997) and MOR (Katsuura and Taha, 2010) agonists have been argued to elicit feeding when infused in the shell, and the latter have also been argued to affect consummatory reflexes, particularly the fixed action patterns induced by particular tastes, notably sucrose (Peciña and Berridge, 2005). Few studies have assessed the effect of manipulations of DOR or MOR on the way in which specific incentive processes affect instrumental performance and the current experiments appear to be the first to demonstrate a DOR-selective effect in this context. In any case, no effect of CTAP infusions into the shell was observed here in tests assessing either changes in reward value or outcome-related stimuli. These results suggest, therefore, that previously reported effects of MOR agonists and antagonists in the shell may be specific to consummatory reflexes associated with exposure to reward itself, a conclusion consistent with other recent findings (Wassum et al., 2009; Ambroggi et al., 2011). Hence, the current findings suggest that a DOR-related process in the NAc shell is particularly important in the ability of pavlovian cues to influence instrumental performance and bias choice between goal-directed actions.
Within the accumbens, both MOR and DOR are highly expressed and widely distributed in local circuits within both the NAc core and shell (Mansour et al., 1995). Although their exact localization may well differ according to region and subtype, MOR and DOR are thought to be predominantly expressed extrasynaptically on dendrites, dendritic shafts, and soma (Gracy et al., 1997; Svingos et al., 1997; Wang and Pickel, 1998), and to modulate the activity of medium spiny neurons both directly and indirectly through their influence on local acetylcholine and dopamine release (Svingos et al., 1997, 1999; Wang and Pickel, 1998). Some important differences could, however, explain their distinct functional influences; for example, there is evidence that DOR might also act presynaptically in the accumbens in a way that differs from MOR, particularly in the shell (Svingos et al., 1998; Britt and McGehee, 2008; Hipólito et al., 2008). Nevertheless, there is much still to be discovered in this circuitry, and, as there are multiple alternative hypotheses as to the way in which CTAP and naltrindole could act differentially in core and shell to regulate distinct functions, further research is ongoing to establish how this is achieved in the current situation.
A key reason why distinct neural circuits are observed to mediate the effects of reward-related cues versus changes in reward value on instrumental choice performance lies in the different learning processes that contribute to these influences on choice. Changes in reward value are derived from consummatory experience with the outcome after shifts in internal motivational conditions, and result in a change in the rewarding properties assigned to the outcome (Dickinson and Balleine, 1994; Balleine, 2004). In contrast, the influence of outcome-related stimuli on choice is dependent on the information that the stimulus provides about outcome delivery (i.e., it is the predictive validity of the cue with respect to a specific outcome that determines its effects on choice) (Delamater, 1995; Dickinson and Balleine, 2002; Balleine et al., 2008).
Generally, therefore, reward-guided and stimulus-guided decisions appear to be based on distinct forms of incentive learning derived from evaluative learning and predictive learning processes, respectively. It is important to note that, although neither of these incentive learning processes has been ascribed to the nucleus accumbens, the neural structures to which these functions have been ascribed [i.e., regions of amygdala (Ostlund and Balleine, 2008), midbrain (Takahashi et al., 2009), and orbitofrontal cortex (Schoenbaum et al., 2003; Ostlund and Balleine, 2007)] provide some of its major afferents. Hence, consistent with these and other data, we believe that these functional considerations suggest that, rather than being involved in encoding, experiencing, or predicting reward, the accumbens is critically involved in the way these incentive processes influence the performance of instrumental actions (Stuber et al., 2011). For example, we previously reported that naloxone-induced blockade of opioid receptors in the nucleus accumbens shell or core during changes in experienced value attenuated the performance of consummatory taste reactivity reflexes but had no effect on the value of the instrumental outcome. In contrast, naloxone infused into the basolateral amygdala did not alter taste reactivity reflexes but completely blocked changes in reward value (Wassum et al., 2009, 2011).
It appears likely, therefore, that changes in reward value and reward prediction are mediated by the BLA and that these changes influence choice performance through connections between the BLA and the accumbens, consistent with descriptions of the accumbens as the “limbic-motor interface” (Mogenson et al., 1980; Shiflett and Balleine, 2010). A similar claim can be made for predicted reward in which interconnections between BLA, VTA, and orbitofrontal cortex appear to mediate predictive learning, whereas connections with the shell mediate the influence of this learning on performance (Stuber et al., 2011). Indeed, as we have claimed previously (Yin et al., 2008; Corbit and Balleine, 2011), the motivational functions of the accumbens core and shell appear to be well captured by their involvement in modulating motor output rather than, say, in hedonic experience (i.e., in the performance not only of instrumental actions but also of the consummatory conditioned reflexes [CRs] [e.g., licking, chewing, eating reactions, sometimes called “liking” (Peciña and Berridge, 2000)] and preparatory CRs [e.g., conditioned approach, sign tracking (Flagel et al., 2011), sometimes called “wanting” (Berridge, 1996)] that are elicited during pavlovian conditioning). In any case, we do not find, nor do we believe that others have shown, that the accumbens is necessary to encode either the reward value or the predicted value of the instrumental outcome.
This work was supported by NIMH Grant 56446, National Health and Medical Research Council Grant 633267, and Australian Research Council Laureate Fellowship FL0992409 (B.W.B.) as well as Grants DA09359 and DA05010 from the National Institute on Drug Abuse (N.T.M.).