Both the anterior cingulate cortex (ACC) and mesolimbic dopamine, particularly in the nucleus accumbens (NAc), have been implicated in allowing an animal to overcome effort constraints to obtain greater benefits. However, their exact contribution to such decisions has, to date, never been directly compared. To investigate this issue we tested rats on an operant effort-related cost–benefit decision-making task where animals selected between two response alternatives, one of which involved investing effort by lever pressing on a high fixed-ratio (FR) schedule to gain high reward [four food pellets (HR)], whereas the other led to a small amount of food on an FR schedule entailing less energetic cost [two food pellets, low reward (LR)]. All animals initially preferred to put in work to gain the HR. Systemic administration of a D2 antagonist caused a significant switch in choices towards the LR option. Similarly, post-operatively, excitotoxic ACC lesions caused a significant bias away from HR choices compared with sham-lesioned animals. There was no slowing in the speed of lever pressing and no correlation between time to complete the FR requirement and choice performance. Unexpectedly, no such alteration in choice allocation was observed in animals following 6-hydroxydopamine NAc lesions. However, these rats were consistently slower to initiate responding when cued to commence each trial and also showed a reduction in food hoarding on a species-typical foraging task. Taken together, this implies that only ACC lesions, and not 6-hydroxydopamine NAc lesions as performed here, cause a bias away from investing effort for greater reward when choosing between competing options.
In many natural circumstances, animals have to assess the value of investing extra work to obtain greater reward, integrating the expected costs of a choice with the anticipated benefits to decide which response to make. There is a large body of evidence implicating the mesolimbic dopamine (DA) system, particularly within the nucleus accumbens (NAc), in establishing and modulating aspects of instrumental behaviour and in allowing animals to work to obtain reward (Berridge, 1996; Robbins & Everitt, 1996; Ikemoto & Panksepp, 1999; Nicola, 2007; Salamone et al., 2007). In a series of experiments where rats chose between either investing effort (scaling a barrier or making multiple lever presses) for a larger or more palatable reward or selecting to obtain a less valuable reward at less energetic cost, administration of a DA antagonist or NAc DA depletions, particularly to the core region, caused a shift in preference towards the low-effort option (Cousins & Salamone, 1994; Salamone et al., 1994; Denk et al., 2005). By contrast, hyperdopaminergic mice, which have elevated levels of tonic DA, obtained a greater proportion of their food from the high-effort option (Cagniard et al., 2006).
A similar change in choice behaviour away from options requiring investing effort for high reward (HR) has also been observed following excitotoxic lesions of the anterior cingulate cortex (ACC) (Walton et al., 2003; Schweimer & Hauber, 2005; Rudebeck et al., 2006). The ACC in rats sends a major projection to the core of the NAc (Brog et al., 1993) and it has been proposed that the interaction of top-down excitatory projections from the ACC with DA in the NAc may be crucial in guiding such effort-related choices (Walton et al., 2006; Phillips et al., 2007). Although both the ACC and NAc DA are strongly implicated in this type of decision making, their relative roles have never been directly compared using the same task. Moreover, in the majority of these previous studies, it has been difficult to measure and manipulate quantitatively the effort requirements of all of the available options (e.g. scaling a barrier) (see also Floresco et al., 2008).
The aim of the present study was therefore to compare the causal importance of both ACC and NAc core DA on effort-related cost–benefit behaviour. Initially, we validated an operant effort-based decision-making paradigm, in which animals could choose either to obtain a large reward by responding on a lever with a high fixed-ratio (FR) schedule or to select another option with a lower FR and smaller reward (Walton et al., 2006; see also Floresco et al., 2008), by assessing the effects of systemic DA receptor blockade, which has been shown to alter choice behaviour on previous effort-related tasks. Subsequently, the choice performance of animals with either excitotoxic ACC or 6-hydroxydopamine (6-OHDA) NAc lesions on this cost–benefit task was investigated. Finally, these animals were run on a species-typical food-hoarding task, which has previously been shown to be sensitive to both NAc DA depletions and large medial frontal cortex damage (Kolb, 1974; Kelley & Stinus, 1985).
A total of 46 male Lister hooded rats (Harlan UK, Bicester, UK) were used in the present study, 10 for the haloperidol drug administration test and 36 for the lesion experiments. All animals were approximately 2 months of age at the beginning of testing and had not been used in any previous experiments. Rats were housed in groups of three and maintained on a 12-h light/dark cycle (lights on at 07:00 h), with all testing carried out during the light phase, the first cohort of eight animals starting at ~10:30 h each day and the last one finishing no later than 17:30 h, the cohorts always being run at equivalent times each day. During testing animals were placed on a restricted diet, weighed daily prior to testing and maintained at approximately 85% of their free-feeding weight. Any supplemental food to that gained during testing was provided in the home cages at the end of the day, consisting of approximately 8–16 g of standard laboratory chow at least 30 min after the end of the final test session, the amount depending on the animal's recorded weight. Although this was necessarily partially influenced by the number of rewards gained in the task, no explicit calculation was made to ensure that each animal received the same amount of food each day (i.e. increasing the laboratory chow to counter gaining fewer rewards during testing). Water was available ad libitum while animals were in their home cage. All procedures were carried out in accordance with the UK Animals (Scientific Procedures) Act (1986) and its associated guidelines.
All operant testing was carried out in eight operant chambers (30 × 24 × 21 cm; Med Associates, VT, USA) contained within sound-attenuating cubicles. Each cubicle was fitted with a fan providing both air circulation and a constant background noise of approximately 60 dB. One wall of each chamber contained a central food magazine into which 45 mg standard formula Noyes Precision Pellets (Sandown Chemical Limited, Middlesex, UK) could be dispensed. The magazine could detect head entries by means of an infrared beam and could be illuminated by an internal light. Two motor-driven retractable levers were positioned 2 cm above floor level to the left and right of the central food magazine, and a stimulus light was positioned above each lever. Each chamber was illuminated by a 2.8 W house light located in the wall opposite the magazine and levers. Pellet delivery, lever extension/retraction, lever presses and food magazine entries were controlled by a PC running Med-PC software (WMPC version IV, Med Associates).
Habituation and lever press training were performed comparably for both the haloperidol drug administration test and the lesion experiments. On the two nights prior to their introduction to the operant chambers, approximately 25 reward pellets were placed in the animals' home cages. Subsequently, animals were individually habituated to the operant chambers over two sessions during which they had free access to approximately 25 reward pellets placed in the food magazine. The house and magazine lights were illuminated throughout and the rats were removed once all pellets had been consumed or after 15 min had elapsed. The following day, rats underwent a single 30 min magazine training session using a variable interval 60 s schedule during which a single pellet was delivered on average every 60 s (min interval, 10 s; max interval, 110 s) along with simultaneous magazine light illumination for 6 s. The house light was illuminated throughout.
Rats were next trained to lever press for a single pellet delivery using an FR1 schedule on one lever at a time (counterbalanced across animals) until they reached a criterion of making more than 100 presses within a 15 min period. The lever was extended for the entire session and the house light continuously illuminated. Shaping the rats to press the levers was initially aided by affixing a palatable food (Shreddie, Nestle, UK) to the extended lever.
On the following days, the rats underwent a training regime on a simplified version of the full behavioural task. Experimental sessions consisted of 48 trials. Each training trial started with house and magazine light illumination, and with the levers retracted. Once a nose-poke response in the magazine had been made, the magazine light would extinguish and one of the levers would extend (forced trial). If the rat responded, the magazine light was illuminated and one pellet was delivered to the magazine. After 6 s, both the magazine and house light extinguished, marking the end of a trial (although here, and in all subsequent testing, if not collected during the 6 s, the pellets remained available for consumption even after the lights were extinguished). Initially, the inter-trial interval (ITI) was set at 4 s. Following one session at FR1, the response requirement on both levers was increased to an FR2 and then an FR4 across subsequent sessions, and the ITI increased to 20 s. In these sessions, there was no time limit either to make a magazine response at the start of the trial or to make an initial lever press response.
For all subsequent training and testing sessions one of the two levers was designated as being the ‘HR’ lever and the other as the ‘low-reward’ (LR) lever. Selection of the HR resulted in the delivery of four pellets and the LR two pellets (counterbalanced across rats and remaining constant for each animal throughout the study). Also, a 10 s time limit was set for both the magazine response at the start of the trial and for the initial lever press. Failure to respond within this time was scored as an omission (miss) and caused all lights to extinguish and levers, if present, to retract, followed by a 60 s time-out. Rats experienced three sessions of FR4 forced trials with the HR/LR differential. The ITI was increased across these sessions up to 45 s.
In the final training session, rats were given two ‘choice trials’ (both levers were extended at the same time) for every four forced trials, resulting in a total of 16 choice and 32 forced trials per session. In order to equate frequency of reward delivery on the two levers, in these and all subsequent test sessions the ITI was calculated as being 60 s minus the amount of time taken to complete the FR schedule (i.e. if it took 8 s to complete the FR4 response requirement, then the ITI would be 52 s).
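The ITI rule above can be illustrated with a minimal sketch. The function name and the clamping at zero are assumptions made for illustration; the text does not state what happened if FR completion took longer than 60 s.

```python
def inter_trial_interval(fr_completion_s: float, cycle_s: float = 60.0) -> float:
    """Return the ITI (s) so that FR completion time plus ITI equals cycle_s.

    The 60 s cycle length comes from the text. Clamping at zero is an
    assumption: the text does not describe the case where FR completion
    exceeds 60 s.
    """
    if fr_completion_s < 0:
        raise ValueError("FR completion time cannot be negative")
    return max(0.0, cycle_s - fr_completion_s)


# Worked example from the text: 8 s to complete FR4 gives an ITI of 52 s.
print(inter_trial_interval(8.0))
```

Because the ITI shrinks as the FR takes longer, the trial-to-trial period stays constant, which is what equates the frequency of reward delivery across the two levers.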
Testing methods were identical to those described in Walton et al. (2006) (see Fig. 1A). As in the final training session, each session consisted of 48 trials but now rats were given four choice trials after every two forced trials (one to each lever), resulting in a total of 32 choice trials per session. This protocol allowed sufficient assessment of decision-making behaviour while ensuring that rats sampled each option at least once out of every six trials. On choice trials, selection of one lever caused the alternative to retract immediately. Across sessions, the HR was increased to an FR8 and then, two sessions later, to an FR12. In order to reduce uncertainty about the cost on the HR and the need for rapid behavioural flexibility, the FR on the HR and LR always remained fixed within a session. During these test sessions, only one session per animal was ever run on each day, with testing usually occurring on at least 6 days per week.
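The session structure described above (48 trials in repeating blocks of two forced trials followed by four choice trials) can be sketched as follows. This is only an illustration: the function and label names are hypothetical, and the random order of the two forced trials within each pair is an assumption, since the text says only that one forced trial was given to each lever.

```python
import random


def build_session(n_blocks: int = 8, choice_per_block: int = 4, seed=None):
    """Build one test session as a list of trial labels.

    Each block is two forced trials (one per lever) followed by
    choice_per_block choice trials. The order of the two forced trials
    is randomised here, which is an assumption not stated in the text.
    """
    rng = random.Random(seed)
    trials = []
    for _ in range(n_blocks):
        forced = ["forced_HR", "forced_LR"]
        rng.shuffle(forced)  # assumed: order within the forced pair varies
        trials.extend(forced)
        trials.extend(["choice"] * choice_per_block)
    return trials


session = build_session(seed=1)
print(len(session), session.count("choice"))  # 48 trials, 32 of them choice
```

With the default values, each six-trial block contains one forced trial on each lever, so both options are sampled at least once per block regardless of the animal's choices.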
A group of well-trained rats, whose effort-related decision-making behaviour had already previously been characterized (for full details, see Walton et al., 2006), took part in this experiment. Animals were retested with both the HR and LR as FR4 until they were selecting the HR lever on > 90% of choice trials across an entire session. The FR on the HR was subsequently increased to FR8 and animals were tested for four further sessions.
Based on performance in the final two test sessions, rats were divided into two groups, also counterbalanced for left/right orientation of the HR lever. Group 1 received a systemic injection of 0.1 mg/kg haloperidol, Group 2 being injected with the vehicle (saline), 50 min before being run on the operant decision-making task, with the HR set at FR8 for four pellets and the LR at FR4 for two pellets. On the following days, all animals were given one session of all forced trials with the same contingencies (HR, FR8; LR, FR4) and one session of mixed forced and choice trials (four choice to every two forced) to allow performance to return to pre-injection levels. If an animal failed to return to choosing the HR on more than 50% of trials, it was excluded from the experiment. The next day the groups were reversed and the haloperidol manipulation was repeated (Group 1 receiving the vehicle, Group 2 receiving 0.1 mg/kg haloperidol).
The rats in this experiment were experimentally naive prior to habituation and lever press training. As in Experiment 1, initially both the HR (four rewards) and LR (two rewards) were set at FR4 and animals were tested until the mean percentage of HR choice was > 90% over two consecutive sessions. If this criterion was not reached within six sessions then the rats were excluded from the study. Once rats had reached criterion levels of performance, the effort required on the HR increased to FR8 for two sessions and then to FR12. Pre-surgery test blocks consisted of two blocks of two sessions, separated by 7 days, with the HR at FR12 and LR at FR4 (Blocks A and B). Post-surgery, after a minimum 2 week recovery period and once the animals had all reached 85% of their post-surgery free feeding weight, they were retested with the HR at FR12 and LR at FR4 for three blocks of two sessions (Blocks C–E) (Fig. 1B).
In order to test the effect of increasing the effort-related response costs on choice behaviour, first, the FR on the LR lever was increased to FR12 (Blocks F and G) (Fig. 1B). This equated the effort expenditure required to obtain reward on both options. Trials were run until each group attained a mean performance of 80% HR choices in a single session. Second, the FR on both levers was increased to FR24 for two sessions (Block H) and to FR48 for two sessions (Block I).
At 2 months after completion of the operant decision-making experiments, the rats were moved from their home room to be housed individually in wire mesh cages (37 × 24 × 20 cm). They were given 3 days habituation to this new environment before commencement of testing on a species-typical test of choice behaviour, i.e. food hoarding. On test days, a large plastic bucket (30 cm in length), dusted with home cage sawdust and containing 90 g of standard rodent food pellets at the end furthest from the animal's cage, was attached to the front of each cage. The rats were removed 60 min later and the amount of food (i) moved to the home cage (including pellets that had dropped through the bars underneath the cage) or (ii) left in the bucket was separately calculated. The difference between these two figures and 90 g was considered to represent the amount consumed by the animal [food eaten = 90 g − (food hoarded + food left)]. Rats were run in squads of eight, counterbalanced by lesion, and were all tested on two separate days.
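The consumption measure above is a simple difference, sketched here for clarity. The function name and the example weights are hypothetical; only the 90 g starting amount and the formula come from the text.

```python
def food_eaten_g(hoarded_g: float, left_g: float, provided_g: float = 90.0) -> float:
    """Food eaten = provided - (hoarded + left), all in grams."""
    eaten = provided_g - (hoarded_g + left_g)
    if eaten < 0:
        # Hoarded plus left cannot exceed what was provided; this most
        # likely signals a weighing or data-entry error.
        raise ValueError("hoarded + left exceeds the amount provided")
    return eaten


# Hypothetical example: 60 g moved to the home cage, 22 g left in the bucket.
print(food_eaten_g(60.0, 22.0))
```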
Based on pre-operative HR lever choice performance and left/right HR orientation, rats received either bilateral excitotoxic lesions of the ACC (n = 11) (including both Cg1 and Cg2 subfields; Zilles, 1985), bilateral 6-OHDA DA lesions of the NAc core (n = 11), bilateral vehicle injections (0.1% ascorbic acid) into the NAc core (n = 7) or sham surgery (n = 7). Those animals undergoing either 6-OHDA or vehicle infusions received i.p. injections of desipramine hydrochloride (20 mg/mL, 1 mL/kg) at 30 min prior to surgery. All rats were anaesthetized with 5% isoflurane and maintained during surgery with 2–3% isoflurane. They were placed in a stereotaxic frame, with head level between bregma and lambda, an incision was made over the midline and a restricted craniotomy was performed overlying the injection sites. Injections were made using a 10 μL syringe mounted onto the stereotaxic frame with a specially adapted 34-gauge needle. ACC lesions were made by injecting 0.2 μL of quinolinic acid (0.09 M) at the coordinates shown in Table 1. These coordinates and protocol have previously been shown to be sufficient to cause complete neuronal loss within the ACC (Walton et al., 2003; Rudebeck et al., 2006). DA-specific lesions of the NAc core were made by injecting 0.4 μL of 6-OHDA at the coordinates shown in Table 1 based on pilot experiments. Infusions were made manually at a rate of 0.1 μL every 30 s with a 30 s interval between each infusion. On completion of each infusion the needle was left in place for a further 3 min to ensure diffusion away from the injection site. Following surgery rats were allowed at least 2 weeks recovery with food and water available ad libitum. In spite of similar recovery periods, there were some differences in post-operative free feeding weights, with the NAc DA-depleted animals being lighter on average than the other groups (341 g compared with 356 g for both the ACC and control groups; NAc DA vs. ACC, t17 = 2.18, P = 0.044; NAc vs. control, t17 = 1.78, P = 0.092). However, there was no evidence that this directly influenced choice performance post-operatively (see supplementary results in Appendix S1 in Supporting information).
The ACC-lesioned rats and one sham-lesioned animal were anaesthetized with sodium pentobarbitone (200 mg/kg) and perfused transcardially with physiological saline for 6 min and then with 10% formol saline for 10 min. The brains were then removed and placed into a formol saline solution. Subsequently, the brains were placed in a sucrose/formalin solution for 24 h, frozen and then sectioned coronally (50 μm). All sections were mounted and stained with cresyl violet. Histological evaluation was conducted by two experimenters, one of whom was unaware of the rats' behavioural performance.
The 6-OHDA-lesioned and vehicle rats were killed by CO2-induced asphyxiation at 14 days after the final food-hoarding test, their brains rapidly removed and frozen on dry ice. Based on the methods of Dalley et al. (2002) and Palkovits (1973), coronal sections from the frontal pole were cut (250 μm thickness) on a cryostat (−10°C) and mounted onto pre-chilled microscope slides. A stainless steel micropunch (0.75 mm diameter) was used to remove aliquots of tissue bilaterally from the NAc core and the dorsolateral striatum from both the right and left hemispheres (~0.6–1.2 mg extrapolating from previous studies using an equivalent micropunch; Dalley et al., 2002), which were kept separate for all subsequent analyses. Samples (consisting of three punches) were homogenized in 0.06 M perchloric acid using a tissue macerator (1250–2250 g for ~10 s) and centrifuged (20000 g for 15 min). Tissue supernatants were assayed for catecholamines and their metabolites using high-performance liquid chromatography with electrochemical detection. Analytes were separated on a Microsorb C18 reverse-phase column (4.6 × 100 mm) and detected (LC-4B electrochemical detector) using a glassy carbon working electrode (+0.7 V vs. silver/silver chloride). The mobile phase composition (flow rate, 1 mL/min) was as follows: 14.5% methanol, 0.1 M NaH2PO4, 0.8 mM EDTA and 3.2 mM sodium octane sulfonate, pH 3.35 (Sharp & Zetterstrom, 1992).
One pilot NAc DA-depleted animal was anaesthetized as above and then perfused using 4% paraformaldehyde in 0.1 M phosphate buffer for 10 min at 22 days post-surgery. The brain was removed and placed into 4% paraformaldehyde solution for 2 h at 4°C, before being transferred to a cryoprotectant (3% sucrose phosphate) for 1–2 days. Coronal sections (50 μm) were cut, mounted and stained with an antibody to tyrosine hydroxylase to check for the presence and location of remaining DA terminals. A 1 : 4000 concentration of primary antibody was used and a 1 : 100 concentration of secondary antibody. All lesions are described in terms of the nomenclature and classification of cortical areas adopted by Paxinos & Watson (1998).
Post-mortem analyses demonstrated that tissue levels of DA in the NAc were significantly reduced in all animals at 6 months after 6-OHDA injections to ~11% of control levels on average (range 4–27%) (controls were both sham and vehicle animals) (main effect of Group: F1,20 = 32.84, P < 0.001). By contrast, there was only a small reduction in DA in the dorsolateral striatum to ~70% of control levels (Table 2). In spite of injections of the noradrenaline uptake inhibitor, levels of noradrenaline were also similarly affected, with levels being on average 4% and 69% of control levels in the NAc and dorsolateral striatum, respectively. Inspection of the single pilot 6-OHDA animal stained with tyrosine hydroxylase shows that the monoamine depletion was targeted and discrete, removing the majority of fibres in the core of the NAc while leaving the shell region largely intact (Fig. 2, left columns).
The ACC lesions were targeted and reproducible in all cases. There was some sparing of tissue in the most rostral portions of Cg1, with all lesions starting between 2.7 and 2.2 mm anterior to bregma. Posterior to here, there was extensive cell loss throughout the ACC reaching back as far as 1.3 mm posterior to bregma (Fig. 2, middle and right columns). Damage was generally confined to the ACC, although in three animals there was also slight invasion of the lesion into secondary motor cortex. No subjects were excluded on histological grounds.
The mean percentages of HR choices obtained for haloperidol and saline groups as a proportion of all choices (excluding missed trials) on the operant effort-based decision-making task are displayed in Fig. 3A. One rat was excluded from the analysis after choosing the HR on fewer than 20% of trials during retraining following the first test session, leaving a total of nine animals. Systemic antagonism of DA via injection of 0.1 mg/kg haloperidol caused rats to choose the HR significantly less often than after vehicle injections on trials where one of the two levers was chosen (t = 2.78, P = 0.024). It is also possible to examine the effect of the haloperidol on effortful lever pressing independent of choice behaviour by examining reaction times on the 16 forced trials when only one of the levers is available. However, this demonstrated that there was no difference between the groups on either the HR or LR on the latency to complete the FR requirement (Fs < 2, not significant), indicating that the shift away from high-effort choices was not solely caused by increased time taken or difficulty completing the lever pressing in the haloperidol group.
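The choice measure used above, HR choices as a percentage of all choices made with missed trials excluded, can be written out as a short sketch. The function name, the trial labels and the example counts are hypothetical and serve only to show how omissions drop out of the denominator.

```python
from typing import Optional, Sequence


def percent_hr(choice_trials: Sequence[str]) -> Optional[float]:
    """Percentage of HR choices among choice trials on which a lever was chosen.

    Each entry is 'HR', 'LR' or 'miss' (labels are assumptions). Missed
    trials are excluded from the denominator, as described in the text.
    Returns None if the animal made no choice at all.
    """
    made = [c for c in choice_trials if c in ("HR", "LR")]
    if not made:
        return None
    return 100.0 * made.count("HR") / len(made)


# Hypothetical session: 32 choice trials with 20 HR, 8 LR and 4 omissions.
example = ["HR"] * 20 + ["LR"] * 8 + ["miss"] * 4
print(round(percent_hr(example), 1))  # 20 of 28 completed choices
```

Scoring choices this way keeps an increase in omissions, such as that seen in some haloperidol-treated animals, from being counted as a shift towards the LR option.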
As a group, the haloperidol animals also missed more trials by failing either to initiate a trial or to make a lever press response within 10 s of magazine and house light illumination or lever extension, respectively (86% of these were failures to initiate a trial) [haloperidol mean 16.58% (6.86 SEM); vehicle 2.75% (1.09 SEM)], although this was mainly caused by a notable increase in omissions (> 25%) in three of nine animals and so did not quite reach significance, even when analysed with the non-parametric Wilcoxon Signed Ranks test (Z = 1.78, P = 0.075). Moreover, there was no correlation between change in choice behaviour and increase in number of missed trials, suggesting that there was not a direct link between motor retardation and choice behaviour (Pearson correlation coefficient, r = 0.09, not significant). The drug-treated group were nonetheless significantly slower to initiate a trial by making a magazine response (t = 2.50, P = 0.037) and this did correlate with subsequent decision making, with the animals that were slower to initiate a trial showing a greater shift in behaviour towards the LR (Pearson correlation coefficient, r = 0.73, P = 0.024) (Fig. 3B and C).
Four rats did not learn the reward discrimination within six sessions and so were excluded from further testing. The rats were divided into four groups on the basis of their pre-lesion performance. Two rats died post-surgery, one from the NAc DA group and one from the vehicle group, leaving a total of 30 animals (ACC, 10; NAc DA, 9; sham, 6; vehicle, 5). Pre-operatively, all groups showed a preference for choosing the HR at FR12 (Blocks A and B, Fig. 4A left-hand side). There was a small drop in performance across the two pre-operative blocks that was similar in all groups (main effect of block: F1,26 = 19.69, P < 0.001). Consequently, all subsequent analyses of the effects of surgery will be compared with the second pre-operative block (Block B).
There were no significant differences between the performance of the sham and vehicle groups post-surgery (Pre-Block B vs. Post-Block C, no main effect or interactions with group: all Fs < 3, P > 0.1; Post-Blocks C–E, no main effect or interactions with group: all Fs < 2, P > 0.2); therefore, for all further analyses of the operant choice data, the sham and vehicle animals were combined into a single control group. Although the performance of this control group dropped immediately after surgery, particularly on the first day of testing, most of these animals continued to show a preference for the HR in Block C. Following excitotoxic ACC lesions, however, rats showed a marked change in behaviour, with this group now selecting the LR lever on a large majority of trials. By contrast, no such alteration in choices was observed in the NAc DA-depleted rats. This was confirmed by an anova comparing pre- and post-lesion performance (Blocks B vs. C) that showed a significant Group × Testing Block interaction (F2,27 = 4.69, P = 0.018). Further anovas combined with simple main effects comparing each individual group against each other showed that the ACC group made significantly fewer HR choices post-surgery than both the NAc DA-depleted rats (Group × Testing Block: F1,17 = 11.79, P = 0.003) and the control rats in session 2 of Block C (Group × Testing Block × Day: F1,19 = 7.88, P = 0.011). However, there were no differences between the control and NAc DA-depleted rats on any choice measure.
Inspection of Fig. 4A shows that there were some changes in the patterns of responses in all the groups post-surgery as the animals were re-exposed to the choice contingencies. To assess the consistency of choice behaviour after the lesions, a repeated measures anova was run on all post-surgery blocks (C–E). This showed a significant Group × Testing Block interaction (F4,54 = 3.89, P = 0.008) as well as a trend towards a main effect of Group (F2,27 = 2.74, P = 0.083). As can be observed in Fig. 4A, although there was a slight trend towards an increase in HR choices after session 2 in the ACC-lesioned rats, they continued to make significantly more LR choices than either of the other two groups in the second post-lesion testing block (Block D) (main effect of group in Block D: ACC vs. control, F1,19 = 5.53, P = 0.03; ACC vs. NAc DA, F1,17 = 5.43, P = 0.03). However, by the final post-lesion block, although still making numerically fewer HR choices, there were no statistical differences between choice behaviour in the ACC-lesioned rats and the other groups (Fs < 2, not significant). Although the NAc DA-depleted rats made fewer HR choices across post-lesion testing sessions, their choice behaviour was statistically indistinguishable from the control group on every block.
One possible reason for a change in choice behaviour would be an increase in the amount of time taken to complete the effort requirement. There was no relationship pre-surgery between choice behaviour and the amount of time that it took to complete the FR requirement on the HR or the average lever press time (total FR schedule completion time/FR schedule), suggesting that choice behaviour was not simply being guided by temporal costs (both Ps > 0.25). Moreover, although there was a slight increase in the average amount of time to complete the FR12 on forced HR trials in all groups initially after surgery, there were no significant differences between the groups at any point, either when comparing pre-lesion Block B with post-lesion Block C or all post-lesion blocks together (all Fs < 1).
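The average lever press time referred to above is the total FR completion time divided by the FR size. A minimal sketch follows; the function name and the example timing are hypothetical.

```python
def mean_lever_press_time(total_completion_s: float, fr_size: int) -> float:
    """Average time per lever press (s): total FR completion time / FR size."""
    if fr_size <= 0:
        raise ValueError("FR size must be a positive integer")
    return total_completion_s / fr_size


# Hypothetical example: an FR12 completed in 18 s gives 1.5 s per press.
print(mean_lever_press_time(18.0, 12))
```

Normalizing by FR size allows response speed to be compared across schedules with different requirements, such as FR4 and FR12.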
Experiment 1 showed that systemic administration of a DA antagonist caused a significant increase in latencies to initiate a trial with a magazine response as well as a numerical increase in the number of missed trials. The corresponding data for Experiment 2 can be seen in Fig. 4B. Although there was a significant increase in initial magazine latency after surgery (Block B vs. Block C: main effect of block, F1,27 = 60.96, P < 0.001), there was initially no difference between the groups, although the interaction between Group × Session × Day approached significance (F2,27 = 2.78, P = 0.08). However, over all post-lesion testing, there was a significant interaction between Group × Session × Day (F4,54 = 3.50, P = 0.013) and further analyses comparing each group against each other showed that the NAc DA-depleted group were overall significantly slower to initiate trials than the control animals (main effect of group: F1,18 = 4.53, P = 0.047). There was no similar difference apparent in the ACC group when compared with controls (F < 1).
A similar analysis of all omissions (both failing to initiate a trial and failing to make an initial lever press response within the available 10 s, the large majority being the former) demonstrated a significant increase in failures to respond after surgery (Block B vs. Block C: main effect of block, F1,27 = 58.44, P < 0.001) as well as a Group × Testing Block × Day interaction comparing pre- and post-surgery Blocks B and C (F2,27 = 3.81, P = 0.035). Further analyses using simple main effects showed that this was caused by the ACC group having significantly fewer misses than the NAc DA group on the second block of post-lesion testing (P < 0.01). However, although there was a trend for the ACC group to fail to initiate responding less often than the control and NAc DA groups over all post-lesion testing (Blocks C–E, main effect of group, ACC vs. control: F1,19 = 3.91, P = 0.063; ACC vs. NAc DA: F1,17 = 3.05, P = 0.099), performance of the NAc DA-depleted animals was indistinguishable from the controls throughout. Similarly, there were no differences in the latencies with which the NAc DA-depleted or control groups entered the magazine to obtain reward once completing the FR requirement (see supporting Appendix S1).
Equating the effort on both the HR and LR caused all animals to return to choosing the HR on a large majority of trials. In subsequent blocks, the response requirement to receive reward on both levers was raised from FR12 (Block G, i.e. 12 responses required to receive reward) to FR24 (Block H, i.e. 24 responses to receive reward) and then FR48 (Block I, i.e. 48 responses to receive reward). As can be seen in Fig. 5A and B, although extending the effort on both of the levers had little effect on overall patterns of choice behaviour, with all groups continuing to select the HR option when choices were made, the numbers of missed trials increased across blocks (main effect of block: F2,54 = 90.95, P < 0.001). However, there were no differences between the three groups at any effort level. The time taken to make a magazine response to initiate a trial was also investigated. In agreement with the previous measures, the latencies to make this initial response increased as the effort requirement increased (main effect of block: F2,54 = 18.69, P < 0.001). However, as in Experiments 1 and 2, there was also a significant main effect of group (F2,27 = 3.66, P = 0.039), which additional comparisons showed to be caused by the NAc DA-depleted animals being significantly slower across all sessions than the controls to make the initial magazine response (P = 0.012), an effect that was particularly marked when the effort was increased to FR48.
To investigate whether the rate of responding was altered across the three groups as the effort increased irrespective of choice, the average lever press time (total time to complete the FR/FR size) was calculated for all forced trials (Fig. 5C). Again, there was an overall increase in mean response time across blocks, particularly when the effort increased to FR48 (main effect of block: F2,54 = 256.67, P < 0.001), and responses were consistently faster to the HR than LR (main effect of reward: F1,27 = 7.52, P = 0.011). However, as before, the three groups responded at a similar speed throughout and, although there was a trend for an interaction between group and testing block (F4,54 = 2.47, P = 0.056), neither of the lesion groups was ever significantly different to the controls on any block.
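The per-press measure described above (total time to complete the FR divided by the FR size, averaged over forced trials) can be sketched as follows. This is a minimal illustration and not the analysis code used in the study; the function name and the example times are hypothetical.

```python
from statistics import mean


def mean_press_time(completion_times_s, fr_size):
    """Average time per lever press across forced trials.

    completion_times_s: total time (s) to complete the FR on each forced trial
    fr_size: number of presses required by the schedule (e.g. 12, 24 or 48)
    """
    if fr_size <= 0:
        raise ValueError("fr_size must be a positive number of presses")
    if not completion_times_s:
        raise ValueError("at least one forced trial is required")
    # Normalise each trial by the FR size, then average across trials
    return mean(t / fr_size for t in completion_times_s)


# Hypothetical completion times for two forced trials on an FR12 schedule
print(mean_press_time([24.0, 30.0], 12))
```

Dividing by the FR size allows response speed to be compared across blocks that differ in response requirement (FR12, FR24 and FR48).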
All remaining post-surgery rats from the total of 36 animals that started out training in Experiment 2 were run on this task, including the four animals excluded from the operant decision-making task for being unable to learn the reward discrimination, resulting in the following group sizes: ACC, 11; NAc DA, 10; sham, 7 and vehicle, 6. As there were no significant differences between the amount of food hoarded or eaten across the 2 days of testing in the sham-lesioned or vehicle groups (hoarding, no main effect of Group or interactions involving the factor of Group: all Fs < 2.5, not significant; eating, no main effect of or interactions with group: all Fs < 1), these animals were combined into a single control group for subsequent analyses.
Subsequently, a repeated measures anova comparing the amount of food hoarded across the 2 days in the three groups was performed. This showed a main effect of Group (F2,31 = 3.86, P = 0.032) and also a main effect of Day (F1,31 = 4.73, P = 0.037) caused by decreased hoarding in all groups on day 2 of testing. Further analyses comparing each individual group with each other demonstrated that, compared with control animals, which hoarded the majority of their food, rats with NAc DA depletions left significantly more pellets in the bucket (F1,21 = 7.61, P = 0.012) (Fig. 6). However, although ACC-lesioned animals showed a numerical reduction in the amount of food hoarded, there were no statistical differences between the ACC and either of the other groups. A correlation analysis to compare hoarding performance with average post-operative choices across all sessions on the operant effort-related decision task (Experiment 2) failed to show any relationship between behaviour on the two tasks, either when all animals were included or when just the control group was examined (both rs < 0.12).
A similar repeated measures anova for the amount of food eaten also resulted in a main effect of day (F1,31 = 4.30, P = 0.047). However, there was no main effect of, or interactions with, group at any stage, although the difference between the control and NAc DA groups approached significance (F1,21 = 3.97, P = 0.059) with the NAc group eating more food on average than the controls.
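As a consistency check, the degrees of freedom reported for the hoarding analyses follow from the group sizes given above (ACC, 11; NAc DA, 10; combined control, 7 + 6 = 13). The sketch below assumes the standard between-subjects formula (number of groups minus one; total animals minus number of groups); the function name is hypothetical and this is not the authors' analysis code.

```python
def between_group_df(group_sizes):
    """Return (effect df, error df) for a between-subjects group factor."""
    k = len(group_sizes)           # number of groups
    n = sum(group_sizes.values())  # total number of animals
    return k - 1, n - k


# Group sizes from the text; sham (7) and vehicle (6) form one control group
hoarding_groups = {"ACC": 11, "NAc DA": 10, "control": 7 + 6}
print(between_group_df(hoarding_groups))  # (2, 31), as in F2,31

# Pairwise comparison of the control and NAc DA groups
print(between_group_df({"control": 13, "NAc DA": 10}))  # (1, 21), as in F1,21
```

With two days of testing, the within-subject factor of Day has 2 - 1 = 1 degree of freedom and shares the error term of 31, matching the reported F1,31.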
Both the ACC and the DA projection to the NAc core have been implicated in allowing animals to choose to exert effort to receive greater reward. In the present study, using an operant decision-making paradigm where animals had to weigh up whether it was worth engaging with a longer response schedule to receive a large payoff when a smaller one was available at a shorter FR schedule, systemic injections of a DA antagonist (haloperidol) caused animals to be slower to initiate responding and produced a significant bias towards choosing the low-effort option. Surprisingly, however, only ACC lesions, and not NAc core DA depletions, caused a significant choice bias towards low-effort options compared with the control group. This was not caused by the ACC-lesioned animals having an inability or unwillingness to invest effort as they all chose the HR option when the effort was equated on both levers and showed a similar level of performance in terms of trials completed and response latencies as controls when the response requirement was increased markedly on both levers (Experiment 3). Although their choice behaviour did not show the expected bias, the NAc DA group, but not the ACC animals, were consistently slower to initiate responding when the magazine and house light illuminated to signal the start of a trial, particularly when the effort cost was high. This implies that, whereas 6-OHDA NAc lesions may affect the activation and energization of initial cue-related responses in this operant paradigm, only excitotoxic ACC lesions appeared crucial for altering operant effort-related cost–benefit choice allocation.
The ACC has previously been shown to play an important role in guiding effort-related choices in a number of studies using an analogous T-maze barrier decision-making task where animals have to choose whether it is worth scaling a barrier to earn a large reward when a small reward is easily available (Walton et al., 2002, 2003; Schweimer & Hauber, 2005; Rudebeck et al., 2006). As in the present study, the effect was specific to choice trials where animals had to weigh up the costs and benefits of one option against another where both the effort required and reward available were less. Two prominent theories of ACC function have posited that this region is important for monitoring for errors or for when there are conflicting responses. Our results do not easily fit in with the former notion as there are no clear errors in the present task as the animals are rewarded for either option and instead have to weigh up the costs and benefits to decide which option is more valuable. Although such effort-related decisions necessarily engender ‘decision conflict’ when weighing up the costs and benefits of the two options to decide what to do (Pochon et al., 2008), it is notable that ACC lesions do not cause a similar deficit if the cost to be overcome is a delay (Cardinal et al., 2001; Rudebeck et al., 2006). This implies that the ACC is not merely involved whenever it is necessary to evaluate two competing options but instead specifically when evaluating the benefits of exerting extra effort for a better reward in comparison to a less rewarding option that requires less energetic expenditure. Overall, the present finding therefore fits in better with the notion that the ACC plays a pivotal role in deciding whether the benefits of one course of action are worth the costs that will be entailed (Walton et al., 2006; Rushworth et al., 2007). 
We and others have previously shown that rats are extremely sensitive to effort costs caused by repeated lever pressing, adjusting their choice behaviour in accordance with the proportional response requirement and not just the outcome size (Walton et al., 2006; Floresco et al., 2008). Moreover, in monkeys, cells have been found in parts of the ACC that represent the distance from and progression towards reward in terms of the response steps required to obtain that outcome (Shidara & Richmond, 2002), a response that is largely absent in DA cells (Ravel & Richmond, 2006), as well as encoding both the effort cost and reward payoff value of an option (Kennerley et al., 2008).
One immediate question concerns the nature of the cost that the animals are choosing to overcome as the increased effort required to complete a longer FR schedule also takes more time than for a short FR. Several lines of evidence suggest that the response-related work element of the decision is likely to be causing the change in choices in the ACC animals. In all of the animals, there was no correlation between choice performance and either the length of time taken to complete the FR schedule or the average amount of time to make lever presses, suggesting that the rats are not simply making their decisions based on the time taken to gain either the HR or LR. Importantly, the amount of time taken to complete the FR schedule was unchanged in the ACC group after surgery, showing that any alteration in behaviour was not simply caused by retardation in motor function resulting in an alteration of decision making as a consequence of the increased delay between choosing and receiving the HR.
Furthermore, there are numerous indications that effort and delay costs are separable at the neuropsychological and behavioural levels. Rudebeck et al. (2006) demonstrated that ACC lesions only affect decision making on an effort-based T-maze task and not on an analogous delay-based task, a finding supported by the lack of a deficit observed in a separate operant delay-discounting task (Cardinal et al., 2001). In humans, it has been suggested that neuropsychiatric disorders that affect the ACC are more likely to lead to effort than time aversion (Cummings, 1993). A similar dissociation between effort and delay costs has been demonstrated for NAc DA (Salamone et al., 1994; Wakabayashi et al., 2004; Mingote et al., 2005; Winstanley et al., 2005), as well as between different species of monkeys with distinct feeding ecologies (Stevens et al., 2005a,b). Therefore, it seems likely that the alterations in performance seen after ACC lesions in the present experiment were caused by the effortful nature of the lever pressing, or having to persist with actions over time, rather than merely by the temporal components on their own. The cost aversion observed after ACC lesions may be relatively specific to effort-related costs when ascribing value to courses of action (Rudebeck et al., 2006).
However, it is notable that, although the ACC lesion effect is clearly significant, it is neither as pronounced nor as persistent as that previously seen in the T-maze barrier tasks where a complete reversal of behaviour is initially observed. This may partly be a factor of the large number of forced trials that animals experience in the operant task compared with the T-maze, with rats in the present study having to sample each option at least once every six trials (a total of 16 forced trials per session), whereas in the T-maze task they only received one forced trial to each goal arm at the start of each session of 10 trials. In the study by Rudebeck et al. (2006) using the T-maze, it was shown that an initial ACC decision-making deficit was ameliorated by experience of climbing the barrier to gain the HR, suggesting that other interconnected structures known to be involved in such choices, such as the amygdala (Floresco & Ghods-Sharifi, 2007), may be able to represent the cost–benefit contingencies sufficiently to guide optimal decision making in ACC-lesioned animals following repeated experience of the outcome contingencies. Compared with the T-maze barrier task, such experience is easily gained in the current operant paradigm through the large number of forced trials which may, in turn, result in the shift in choice behaviour across sessions. The ACC may be particularly important for guiding decision making when task contingencies are not fully known or when there is a need to evaluate outcomes to guide future choices (Walton & Mars, 2007). In the present and previous studies (Walton et al., 2003; Rudebeck et al., 2006), control performance also dropped immediately following surgery, suggesting that all animals may initially need partially to relearn the costs and benefits of each option.
Another factor that may play a role is that, whereas in both the T-maze barrier task and the free operant paradigms employed by Salamone and colleagues (e.g. Cousins & Salamone, 1994) an animal could choose between two distinct courses of action (either investing effort in climbing a barrier for the HR or simply collecting the LR in the other arm of the T-maze or engaging in effortful lever pressing or collecting freely available laboratory chow in the free operant task), in the present operant paradigm rats were opting between making different amounts of an analogous response (lever pressing), where even the LR was some distance away from the choice point in terms of response steps. It has been demonstrated in studies of temporal discounting that different patterns of behaviour or deficits may emerge as a function of whether only the HR is delayed or whether animals have to wait some time between the choice and reward regardless of which option they take (Green et al., 1994; Kheramin et al., 2002; Mobini et al., 2002). Moreover, although the similarity of the deficits would suggest that the costs involved with repeated lever pressing and with scaling a barrier are similar, whether they are analogous types of effort cost is unclear; the former requires persevering with responses to gain the outcome, whereas the climbing of a barrier involves a punctate burst of energy that takes almost no more time than choosing the easy option.
Nonetheless, the effort costs in both the operant and T-maze tasks do share some common components. In particular, both types of effort require the animal to persist through a sequence of actions, the components of which are themselves unrewarded, to achieve the HR. Moreover, in both tasks, at the choice point animals need to be able to evoke a representation of the more beneficial outcome distant to the high-effort action cost in order to resist choosing the more easily obtained LR option. It has been proposed that the orbitofrontal cortex may predict and represent the value of impending events based on stimulus–outcome associations (Schoenbaum et al., 2006). The results of the effort-based decision-making experiments suggest that the ACC may serve a similar role for future action sequences.
The complete absence of a bias away from high-cost choices on the operant decision-making task following NAc core DA depletions, as well as similar performance when the effort requirement was increased on both levers, was unexpected given the wealth of studies indicating an important role for this monoamine in the NAc in allowing animals to work to receive rewards (Salamone et al., 2007). Previous studies have shown that injections of systemic DA antagonists or depletions of DA in the NAc cause a reduction in operant lever pressing at high schedules while leaving low schedules or primary food intake unaffected (Aberman et al., 1998; Aberman & Salamone, 1999; Salamone et al., 2001, 2006), whereas infusions of amphetamine into the NAc, an indirect DA agonist, increase instrumental responding (Zhang et al., 2003). The lack of an effect in the present study was not caused by the lesion being ineffective as these animals did show a selective slowing in latencies to initiate responding at the start of each trial in both of the operant experiments, while not exhibiting any concomitant slowing in making lever press responses or collecting rewards, and showed a similar reduction in hoarding behaviour as has previously been observed following 6-OHDA NAc lesions (Kelley & Stinus, 1985). Moreover, the extent of the depletion observed in the pilot brain stained for DA fibres, killed at a time-point equivalent to the start of testing, and in the experimental group, which showed on average ~90% depletion in DA several months after surgery, demonstrates that the lesion was complete and durable, consistent with previous research demonstrating a persistent reduction in DA concentrations following 6-OHDA injections (Breese & Traylor, 1970; Uretsky & Iversen, 1970). Such levels of depletion are similar to previous studies showing effort-related deficits following DA depletions (Salamone et al., 2001; Mingote et al., 2005).
Finally, it seems improbable that the lack of an effect on choice behaviour was caused by the paradigm not being sufficiently sensitive as systemic low doses of the dopaminergic antagonist haloperidol did cause a significant shift away from the high-effort HR option, as well as a similar retardation in the speed to initiate responding at the start of each trial.
There are a number of possible reasons for this discrepancy, none of which are mutually exclusive. The animals in our study were tested for the first time on average 18 days after surgery to allow all animals to recover fully (i.e. to be at least at pre-operative free-feeding weight) before being food deprived again to 85% of ad libitum weight. This is notably longer than in the studies on effort-related effects of DA depletions by Salamone et al. (1994, 2001), where testing frequently started as soon as 2–3 days post-surgery and behavioural performance was then tracked across several days or weeks (Sokolowski et al., 1998; Ishiwari et al., 2004; Mingote et al., 2005). In several experiments on the effects of unilateral mesostriatal 6-OHDA lesions on turning behaviour, administration of either amphetamine or apomorphine had differing effects during each of the first 7 days after surgery, which then were largely stable after this first week even up to 2 years later (Ungerstedt, 1971; Schwarting & Huston, 1996). Increasing depletion of DA fibres occurs over the first 5 days following injection of 6-OHDA and is accompanied by a sensitization of DA receptors (Marshall et al., 1989; Gerfen et al., 1990; see Gerfen, 2003). However, it has also been shown that the impairment in locomotor effects often observed following targeted NAc DA depletions (though see Liu et al., 1998), particularly when animals are in a novel context, can sometimes ameliorate entirely at 3–4 weeks post-surgery, an effect that has been shown to correlate with the amount of food gained from lever pressing (Koob et al., 1978; Wolterink et al., 1990; Correa et al., 2002). Therefore, the state of the DA depletion and the associated neurobiological consequences in the current and previous effort-related experiments is unlikely to have been analogous.
Moreover, at 10 days after DA depletions, tonic levels of DA, as measured by microdialysis, are reduced by approximately 60% but recover to near normal levels at 1 month after surgery and even show increased levels 2 months later (Robinson & Whishaw, 1988; Parkinson et al., 2002), probably as a result of compensatory increases in firing rates of remaining DA neurons and/or DA synthesis and release or development of supersensitivity of DA receptor systems (Stachowiak et al., 1987; Wolterink et al., 1990). Hyperdopaminergic mice, which have elevated levels of tonic DA with largely unaltered phasic firing, are more willing to work for palatable food than normal animals, even when laboratory chow is freely available (Cagniard et al., 2006) and, computationally, elevated DA tone has been linked with the vigour with which animals will work to obtain food (Niv et al., 2005).
By contrast, phasic DA signals have been more closely associated with learning about and predicting rewards, guiding behaviour towards such beneficial outcomes and perhaps also allowing animals to explore novel alternatives (Montague et al., 1996; Schultz, 1998; Redgrave & Gurney, 2006; Phillips et al., 2007). Numerous electrophysiological and voltammetric studies have demonstrated phasic DA signals or DA release, respectively, in response to reward-predicting cues (e.g. Ljungberg et al., 1992; Roitman et al., 2004) and inactivation of midbrain DA structures or manipulating the DA projection to the NAc affects how animals respond to incentive cues (see Di Ciano et al., 2001; Dalley et al., 2002; Parkinson et al., 2002; Yun et al., 2004; Calaminus & Hauber, 2007). Although we cannot fully rule out that the slowed responses to the initial trial onset were simply caused by the animals being further from the magazine tray, the deficits observed during the present operant effort-based decision-making tasks would nonetheless be consistent with a loss in phasic DA signals in the NAc core resulting in a failure of the initial conditioned stimulus (the magazine and house light illumination, which is the earliest predictor of reward in a trial) to activate responding as efficiently. However, it may be speculated that a partial recovery in tonic DA levels allowed animals to continue with the decision-making policy learned prior to surgery and hence to show no discernible changes in the amount they are willing to work for reward. Such a hypothesis will require further experiments in which measures of phasic and tonic DA are taken following 6-OHDA lesions and in which DA antagonists are directly infused into the NAc core to assess how tonic and phasic DA release may be involved in these types of behaviour.
It is notable that a global blockade of DA (mainly D2) receptors brought about by injection of an antagonist did cause a change in choice behaviour, without grossly affecting motoric performance, corroborating recent findings using a similar operant effort-related task (Floresco et al., 2008). This bias away from high-effort HR options was accompanied by a similar slowing in response initiation as occurred in the 6-OHDA NAc core depletions. A similar pattern of findings (slowed response initiation and a bias away from a high-effort HR option) was also observed following systemic haloperidol injections in a T-maze effort-based decision-making task (Denk et al., 2005). Again, further studies using local infusions of DA antagonists will be required to determine whether the differing effects on patterns of response choice were caused by local differences in the NAc between the actions of a systemic antagonist and depletion with 6-OHDA some weeks post-surgery or whether this suggests that DA in regions outside the NAc core is also crucial for guiding effort-related choices. Based on the work of Salamone and colleagues, it seems unlikely that the DA projection to other parts of the striatum is important as both dorsomedial and ventrolateral striatal DA depletions fail to cause similar biases in effort-related payoffs as they observed following NAc depletions, where there would be a decrease in lever pressing for palatable food accompanied by an increase in freely-obtained laboratory chow consumption (Cousins et al., 1993). Nonetheless, there is some evidence that DA antagonism in frontal regions may also play some role in effort-related choice behaviour (Schweimer & Hauber, 2006; although also see Walton et al., 2005).
However, it is possible to rule out that any differences between the effects of haloperidol administration and the 6-OHDA depletion were caused by the reduction in choice performance of the control animals in Experiment 2 leading to a floor effect in the lesion group, as performance of the 6-OHDA NAc-lesioned animals, tested ~1 month after pre-lesion sessions, was also very similar to the vehicle-injected animals in Experiment 1, tested immediately after training.
A second difference between the present operant experiments and previous studies investigating the effects of DA manipulations on effort-related tasks where there was more than one rewarded option to choose between is that, in the latter, the less valuable outcome was always a readily available primary reward (either laboratory chow at the back of the operant box in the lever press/chow experiment or a small number of food pellets in the T-maze barrier task), whereas the more beneficial payoff that required energetic expenditure to obtain was at some distance away from the choice point, either requiring lever presses or barrier climbing to obtain. Therefore, in these tasks, as well as evaluating the costs and benefits of the available options, animals have to resist the temptation of the freely available food to engage with the effortful response option. As DA is important for signalling the predictive value of cues, this type of decision between a freely available primary reward and responding to a conditioned stimulus may be particularly difficult following DA manipulations. Moreover, all of the operant tasks employed previously utilized a free-operant procedure where there were no discrete trials and animals could respond whenever they chose to. By contrast, in the present operant decision-making experiments, the task used discrete, cue-initiated trials and both response alternatives required some effort to be invested before reward is available. Either of these task features could have caused DA depletions not to bias selection towards the low-effort LR option in a similar manner to that observed previously, as in the present task both response options may come to have the status of conditioned stimuli and the presence of several discrete cues (house and magazine illumination at trial onset, lever extension with stimulus lights following nose poke, etc.) means that there are minimal requirements for self-initiated responding. 
Voltammetric recordings have demonstrated DA release in the NAc core both to discriminative cues and also just prior to the initiation of a behavioural response (Phillips et al., 2003; Roitman et al., 2004), and DA depletions have been shown to have some effect even on schedules as low as an FR5 (Sokolowski et al., 1998), similar to the smallest effort requirement in the present operant decision-making task.
Some support for such a notion comes from the results of the food-hoarding experiment, a species-typical unlearned foraging behaviour in which animals have to choose between storing laboratory chow back in their home cage or instead immediately consuming the primary reward where it is discovered (Bindra, 1948). Consistent with previous experiments (Kelley & Stinus, 1985), in a task where the primary reward is available and there are no discrete cues to guide behaviour, rats with NAc core DA depletions did show a significant bias away from hoarding. This is unlikely to have been caused by a change in the way that the animals discounted the value of the food according to the delay to consume, as a recent carefully controlled study showed that the effects of a DA antagonist on operant effort-based decision making were the same even when the delay to reward was equated for both options (Floresco et al., 2008). Although less constrained, the hoarding task used in the present study shares some conceptual similarities with previous cost–benefit decision-making tasks in that hungry rats must weigh up the benefits of putting in work to collect the pellets in a single location to gain the later benefits of having food stored in a safe accessible place rather than just eating them where found. However, one should caution against drawing too many parallels between hoarding and other effort-related decision-making tasks as there was no correlation between behaviour on the two tasks and the ACC group, which were previously impaired at operant decision making, failed to show a significant reduction in hoarding (a similar, non-significant decrease in hoarding after discrete ACC lesions has been observed in two previous experiments with analogous group sizes, whereas large medial frontal lesions including ACC, prelimbic and infralimbic regions cause a much greater decrease: Walton et al., unpublished observations).
Again, further experiments comparing analogous decision-making tasks where primary reward either is, or is not, available for one of the options will be necessary to test this account directly.
In spite of pre-treatment with desipramine, a noradrenaline reuptake inhibitor that was intended to protect the noradrenaline neurons from the toxic effects of the 6-OHDA, there was also a substantial depletion of noradrenaline. However, it should be noted that the noradrenaline projection to the NAc core is extremely small (Berridge et al., 1997) and the tissue content of noradrenaline in this region was measured to be ~20% that of DA, with estimates in other studies similarly ranging from 2 to 20% (Garris et al., 1993; Parkinson et al., 2002). Moreover, other studies looking at the effects of psychostimulants have directly implicated DA and not noradrenaline in the NAc in aspects of conditioned reinforcement (Cador et al., 1991). Therefore, it is unlikely that the noradrenaline depletion was a significant contributor to any effects observed here.
In conclusion, for the operant effort-related decision-making task used here, only excitotoxic ACC and not 6-OHDA NAc lesions caused animals to be biased away from overcoming effort constraints to achieve greater benefits in the face of a rewarded alternative requiring less effort. The ACC sends excitatory projections down to the NAc in both rodents and primates (Zahm & Brog, 1992; Haber et al., 2006), synapsing on medium spiny neurons proximal to efferents from midbrain DA cells (Sesack & Pickel, 1992), and dysfunction in this circuit has previously been related to symptoms of apathy (Cummings, 1993; van Reekum et al., 2005; Rosen et al., 2005; Levy & Dubois, 2006). Although both the ACC and NAc DA are involved with effort-based decision making, it would appear that only the ACC, with its widespread connections with limbic and motoric regions, is crucial for providing an evaluation of the costs and benefits of the available competing options when effortful action costs need to be considered. By contrast, NAc DA (perhaps particularly phasic DA release in the NAc, which is the type of DA function most likely to be assessed by the procedures used in the current experiment) appeared unnecessary for this process as animals were able to continue to choose to invest effort for HR. Instead, one important role of phasic NAc DA during effort-based decision making may be to activate initial responding and to resist the temptation of primary reward.
This work was supported by the Medical Research Council and Wellcome Trust Advanced Training and Senior Research Fellowships to M.E.W. and D.B., respectively. We would like to thank Amy Ross-Russell for help with testing, Jeff Dalley for advice on tissue punching, Scott Ng-Evans for programming, Greg Daubney for histology, Nick Rawlins for support with the experiments, and John Salamone, Paul Phillips and Stefan Sandberg for helpful discussions about the data.
Additional supporting information may be found in the online version of this article:
Appendix S1. Supplementary results.