Evaluation of the behavioral “costs,” such as effort expenditure relative to the benefits of obtaining reward, is a major determinant of goal-directed action. Neuroimaging evidence suggests that the human medial orbitofrontal cortex (mOFC) is involved in this calculation and thereby guides goal-directed and choice behavior, but this region's functional significance in rodents is unknown despite extensive work characterizing the role of the lateral OFC in cue-related response inhibition processes. We first tested mice with mOFC lesions in an instrumental reversal task lacking discrete cues signaling reinforcement; here, animals were required to shift responding based on the location of the reinforced aperture within the chamber. Mice with mOFC lesions acquired the reversal but failed to inhibit responding on the previously reinforced aperture, while mice with prelimbic prefrontal cortex lesions were unaffected. When tested on a progressive ratio schedule of reinforcement, mice with prelimbic cortical lesions were unable to maintain responding, resulting in declining response levels. Mice with mOFC lesions, by contrast, escalated responding. Neither lesion affected sensitivity to satiety-specific outcome devaluation or non-reinforcement (i.e., extinction), and neither had effects when placed after animals were trained on a progressive ratio response schedule. Lesions of the ventral hippocampus, which projects to the mOFC, resulted in similar response patterns, while lateral OFC and dorsal hippocampus lesions resulted in response acquisition, though not inhibition, deficits in an instrumental reversal. Our findings thus selectively implicate the rodent mOFC in braking reinforced goal-directed action when reinforcement requires the acquisition of novel response contingencies.
Several distinct brain structures have been implicated in learning causal relationships between behavioral responses and their consequences, and in adjusting behavior based on changes in response contingencies (Balleine & Dickinson, 1998). For example, lesions of the prelimbic prefrontal cortex (PL) in rats impair instrumental response acquisition under conditions of uncertain reward availability (Corbit & Balleine, 2003). Human neuroimaging reports suggest that the medial orbitofrontal cortex (mOFC), which lies ventral to the PL within the frontal pole, also regulates behavioral responding based on action-outcome response contingencies (Valentin et al., 2007; Tanaka et al., 2008). Nonetheless, rodent studies aimed at understanding how the brain guides actions to obtain desired outcomes have historically focused on more dorsal prefrontal structures.
The mOFC comprises part of the medial prefrontal cortex (PFC) network in rodents and primates that sends projections to the striatum and receives projections from downstream structures via the dorsomedial thalamus (Öngür & Price, 2000). In monkeys, the mOFC appears to process reward expectations relative to instrumental response contingencies and outcome costs (Roberts, 2006); this is in contrast to the lateral OFC, famously essential for inhibitory control during stimulus-response reversal learning (Iverson & Mishkin, 1970; McAlonan & Brown, 2003). The rodent mOFC shares some anatomical characteristics with the rostrally-situated ventral PL (Öngür & Price, 2000; Thierry et al., 2000), but recent studies show the rodent mOFC sends projections to selective patches of the dorsomedial striatum, as occurs in the primate mOFC (Schilman et al., 2008). These projections are distinct from those associated with the PL and infralimbic cortices (Berendse et al., 1992). Moreover, the region has distinctive organization and cytoarchitectonic boundaries in the mouse, as in higher organisms (Van de Werd et al., 2010). Site-selective lesions of the mOFC in this species might therefore be expected to have distinct consequences for goal-directed action relative to other medial PFC structures.
Here, we reversed the action-outcome response requirement to acquire food reinforcement in mice with mOFC lesions. Mice with PL lesions served as a comparison, and only mOFC lesions impaired response inhibition. We also tested sensitivity to a progressive ratio schedule of reinforcement; here, mOFC lesion mice escalated responding. Satiety-specific outcome devaluation and post-training lesion experiments suggested this phenotype could be attributed to deficient acquisition of novel response requirements.
Ventral hippocampal projections uniquely excite mOFC neurons (Ishikawa & Nakamura, 2003) and may endow this region with information regarding the emotional and motivational salience of an outcome. We therefore tested mice with ventral hippocampal lesions in the same tasks with the hypothesis that response profiles would resemble those generated by mOFC, and not PL, lateral OFC, or dorsal hippocampal lesions. Our findings reveal correlations between structure and function that largely agree with those established in rats and primates and dissociate major structures within the mouse PFC along both rostral-caudal and medial-lateral axes. Given the increasing utility of transgenic and knockout mice in modeling human behavior, better verification of response inhibition loci in this species may be an essential step in understanding the biological bases of goal-directed action.
Experimental animals were male C57BL/6 mice (initially 10–12 wks old) from Charles River Laboratories (Kingston, NY). Mice were food-restricted (90 min access/day) to maintain ~92% original body weight. Tests were conducted during the light phase of a 12-hour light cycle (0700 on). Procedures were approved by the Yale University Animal Care and Use Committee.
Experimenters used standard operant conditioning chambers for mice (16×14×12.5 cm) controlled by MedPC software (Med-Associates Inc., Georgia, VT). Head entries into 3 nose poke recesses and a food magazine were detected by photocell, and a dispenser delivered grain-based food pellets (20 mg; Bio-Serv, Frenchtown, NJ) upon completion of the response requirement. Mice were initially trained to perform the operant response (nose poke) with ~12 25-min sessions (1/day) during which 1, 2, or 3 responses yielded food reinforcement, i.e., a variable ratio 2 (VR2) schedule of reinforcement. The location of the reinforced aperture (right or left) was counter-balanced, with the center nose poke never reinforced. Mice were required to retrieve pellets before earning more.
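The VR2 contingency described above can be illustrated with a minimal sketch. It assumes the requirement for each pellet is drawn uniformly from 1, 2, or 3 responses (the source states the three values but not how they were sampled), and the function names `vr2_requirements` and `reinforcers_earned` are hypothetical, not part of the MedPC programs used in the study.

```python
import random

def vr2_requirements(n_reinforcers, rng):
    """Draw a response requirement for each reinforcer from {1, 2, 3}.

    Uniform sampling is an assumption; it gives a mean requirement of 2.
    """
    return [rng.choice((1, 2, 3)) for _ in range(n_reinforcers)]

def reinforcers_earned(n_responses, requirements):
    """Count pellets earned when n_responses nose pokes are applied
    sequentially against the list of requirements."""
    earned = 0
    remaining = n_responses
    for req in requirements:
        if remaining < req:
            break
        remaining -= req
        earned += 1
    return earned

rng = random.Random(0)  # fixed seed, for reproducibility of the sketch only
reqs = vr2_requirements(1000, rng)
mean_req = sum(reqs) / len(reqs)  # close to 2 under the uniform assumption
```

The sketch ignores the rule that pellets had to be retrieved before more could be earned, and it does not model responses on the non-reinforced apertures.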
Mice were first trained to obtain food as described above, then administered site-specific N-methyl D-aspartate (NMDA) infusions. For group designations, mice were matched based on reinforcements earned during training. Mice were then anaesthetized with 1:1 2-methyl-2-butanol and tribromoethanol (Sigma Aldrich, St. Louis, MO) diluted 40-fold with saline, or with pentobarbital. The shaved head was placed in a stereotaxic frame (David Kopf Instruments, Tujunga, CA). The scalp was incised, skin retracted, the head leveled based on bregma and lambda, and coordinates were located using the Kopf Instruments digital coordinate system with resolution of 1/100 mm. A single hole was drilled, and NMDA (20 μg/μl; Sigma) or sterile saline was infused over 1 min (0.1 μl/hemisphere) with needles aimed at +2.8 AP, −2.3 DV, ±0.1 ML for mOFC lesions or +2.0 AP, −2.5 DV, ±0.1 ML for PL lesions (Paxinos & Franklin, 2003; Gourley et al., 2008, 2009). Needles remained in the brain for 2 additional min after infusion. Mice were then sutured and allowed at least one week for recovery before food restriction resumed. Before testing, mice were given 2–3 “reminder” sessions identical to training. Here, sham and lesion mice did not differ in the number of reinforced responses performed.
In the instrumental reversal task, the location of the active nose poke aperture was “reversed,” such that the previously non-reinforced aperture on the opposite side of the chamber was reinforced, with no consequences for responding on the originally reinforced aperture. The schedule of reinforcement was VR2, as in training, with one 15-min session/day for 7 consecutive days; the “reversal” occurred on the first day, and the subsequent test sessions served to generate acquisition curves for responding on the newly reinforced aperture and inhibition of the previously reinforced response (i.e., “perseverative” responding). Reinforced and perseverative responses were analyzed by 2-factor (lesion × session) repeated-measures (RM) analysis of variance (ANOVA) with Tukey's post-hoc comparisons. Extinction sessions were conducted in the same animals after mice had acquired the reversal; here, all reinforcement was withheld, regardless of animals' responding, for 5 daily 15-min test sessions. Responses made on the “active” aperture were analyzed by 2-factor (lesion × session) RM-ANOVA.
A separate group of mice was tested on the classic progressive ratio schedule of reinforcement (Hodos, 1961), in which mice initially trained to perform an instrumental response for food reinforcement on a highly reinforcing schedule (VR2 here) are, at test, required to respond progressively more for each subsequent reinforcer. The “break point ratio,” referring to the highest number of responses an animal completes for a single food reinforcer, serves as the dependent variable. Mice were trained as described above, lesions were placed, and then mice were shifted to a progressive ratio schedule with a linearly increasing response:reinforcement requirement (1, 5, 9, x+4 responses per reinforcement), with 1 session/day for 5 consecutive days. Sessions ended when animals executed no responses for 5 consecutive min or reached 2 hours in the chambers. Break point ratios did not differ between the mOFC and PL sham groups, which were therefore combined. Break point ratios were analyzed by 2-factor (lesion × session) RM-ANOVA with Tukey's post-hoc comparisons. In a separate analysis, break point ratios on days 2–5 were normalized to each individual animal's day 1 value to further evaluate whether responding increased or decreased across multiple sessions. These values were arcsine-transformed to ensure normality (Ferguson, 1978), then analyzed by 1-factor RM-ANOVA.
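The schedule and the two dependent measures can be sketched as follows. This is an illustration under simplifying assumptions, not the analysis code used in the study: all responses are treated as counting toward the current ratio, the session-termination rules are ignored, and the function names are hypothetical. The arcsine transformation is omitted because its exact form is not given in the text.

```python
def pr_requirement(k):
    """Response requirement for the k-th reinforcer (k = 1, 2, 3, ...).

    Follows the stated progression 1, 5, 9, ... (previous requirement + 4).
    """
    return 1 + 4 * (k - 1)

def break_point(total_responses):
    """Highest ratio completed for a single reinforcer, given the total
    number of responses made in a session (0 if no ratio was completed)."""
    completed = 0
    remaining = total_responses
    k = 1
    while remaining >= pr_requirement(k):
        remaining -= pr_requirement(k)
        completed = pr_requirement(k)
        k += 1
    return completed

def percent_of_day1(break_points):
    """Express the break points from sessions 2..n as a percentage of the
    same animal's session-1 break point."""
    day1 = break_points[0]
    return [100.0 * bp / day1 for bp in break_points[1:]]
```

For example, an animal making 15 responses in total completes the ratios 1, 5, and 9, so `break_point(15)` returns 9, and `percent_of_day1([20, 27, 30])` returns `[135.0, 150.0]`.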
In a subsequent experiment, mice were trained to nose poke as described and then trained on the progressive ratio schedule for 5 sessions before surgery. Mice were matched based on break point ratios, and lesions were placed as described above. After recovery, 5 more progressive ratio test sessions were conducted. Sham groups did not differ and were combined, and break point ratios were analyzed by 2-factor (lesion × session) RM-ANOVA.
We also tested sensitivity to devaluation of the food outcome using a satiety-specific prefeeding procedure. Here, mice were allowed 30 min access to the reinforcer pellets in a clean cage before a 15-min test session conducted in extinction, as is standard practice. Responding was normalized to a non-devalued, i.e., “valued,” session conducted the following day—a 15-min test session also conducted in extinction, prior to which food pellets were not available. Again, sham groups did not differ and were combined, and groups were compared by ANOVA. The mice used in this experiment were the same as those used in the first progressive ratio study described above.
Mice with vHC, dHC, and lOFC lesions were also generated and tested in the instrumental reversal and progressive ratio tasks for comparison to mice with medial PFC lesions. We used the same surgical methods as described above with the following exceptions: For vHC lesions, 4 holes were drilled in the skull, and NMDA was infused at −3.0 AP, −4.0 DV, ±2.75 ML and −3.4 AP, −4.0 DV, ±3.0 ML in a volume of 0.1 μl/site over 1 min with the needles left in place for 4 additional min. For dHC lesions, 4 holes were drilled, and NMDA was infused in a volume of 0.1 μl/site over 1 min with needles remaining for 2 additional min at −1.3 AP, −2.0 DV, ±1.0 ML and −2.1 AP, −2.2 DV, ±1.5 ML (in mm; from Chowdhury et al., 2005). Note that these mice were used in a water maze experiment for an independent study before instrumental training and testing. For lOFC lesions, 2 holes were drilled, and NMDA was infused at +2.6 AP, −2.8 DV, ±1.2 ML (from Bissonette et al., 2008) in a volume of 0.1 μl/site over 1 min with the needles left in place for 4 additional min.
After behavioral testing, mice were deeply anaesthetized with pentobarbital and transcardially perfused with chilled saline and 4% paraformaldehyde. Brains were stored for 48 hours, then transferred to 30% w/v sucrose, and sliced into 40-μm-thick sections on a microtome (−15 ± 1°C). Every third section was immunostained for NeuN (Millipore, Billerica, MA; Rb; 1:500) and Glial Fibrillary Acidic Protein (GFAP) (Dakocytomation, Carpinteria, CA; Ms; 1:1000). AlexaFluor goat IgGs (Invitrogen, Carlsbad, CA; 1:300) served as secondary antibodies. Slices were imaged, and lesions were graphically transposed onto corresponding mouse brain atlas images (Paxinos & Franklin, 2003).
The medial PFC (specifically, the PL) is associated with action-outcome associative learning (Balleine & Dickinson, 1998; Killcross & Coutureau, 2003), while lateral orbitofrontal cortex lesions retard stimulus-response learning in reversal tasks (e.g., Schoenbaum et al., 2002). Situated at the junction of the ventrolateral orbital and PL cortices in both primates and rodents, the mOFC may be expected to influence behavioral responding in tasks that require updating stimulus-response or action-outcome associations, but little is known about this structure in these contexts. Here, mice were initially trained to respond on a nose poke aperture in the northeastern corner of an operant conditioning chamber, then were required to shift responding to an aperture in the northwestern corner or vice versa for reinforcement. A variable ratio schedule of reinforcement was used, with no discrete cues signaling reinforcement delivery. Thus the task required mice to update action-outcome—as opposed to stimulus-response—associative relationships in order to obtain food pellets, and subsequent sensitivity to outcome devaluation confirmed mice responded based on action-outcome response contingencies (below).
In agreement with a classic report on the effects of mOFC lesions in monkeys (Iverson & Mishkin, 1970), mOFC lesions increased “perseverative” responding—responding on the previously-reinforced aperture despite non-reinforcement [main effect of lesion F(1,21)=5.5, p=0.03; lesion × session F(6,126)=2.1, p=0.06] (fig. 1b)—while acquisition of the newly reinforced response was unaffected [main effect of lesion F<1; lesion × session F(6,126)=1.7, p=0.1] (fig. 1a). Mice with PL lesions also acquired the newly reinforced response [main effect and interaction Fs<1] (fig. 1c), but in this case, perseverative responding was unaffected [effect of lesion F(1,7)=1.2, p=0.3; session × lesion F<1] (fig. 1d), indicating distinct roles for these adjacent medial PFC structures.
mOFC and PL lesions in this and all other experiments were largely separated on both dorso-ventral and rostro-caudal planes, with the typical mOFC lesion at the rostral-most tip of the frontal pole and larger mOFC lesions spreading laterally to include the ventral OFC (fig. 1e). mOFC lesions were rostral enough to avoid the infralimbic cortex, although some GFAP staining in the rostral PL was noted. Approximately 50% of mOFC lesion mice had some degree of GFAP staining along the needle track in at least 1 hemisphere. PL lesions were caudal to mOFC lesions and encompassed the PL and anterior cingulate cortex, with spread to the infralimbic cortex in some mice (fig. 1f). Two mOFC and 6 PL lesions were unilateral, and 2 “mOFC” mice appeared to have lesions only in the infralimbic cortex; these animals were excluded.
To further characterize the role of the mOFC in behavioral inhibitory processes, we generated another group of lesion mice and conducted 5 test sessions in which mice were required to respond on a progressive ratio schedule of reinforcement for food. When break point ratios were analyzed, an interaction between lesion and session was detected [F(8,160)=2.6, p=0.01], and subsequent post-hoc tests indicated mice with mOFC lesions escalated responding, achieving break point ratios that differed from sham mice at a trend level of significance during session 2 (p=0.057), and that were significantly higher than sham levels during subsequent test sessions (ps≤0.03) (fig. 2a).
By contrast, mice with PL lesions differed from sham mice at a trend level of significance during the final session (p=0.07), suggestive of a declining response pattern (fig. 2a). To clarify this possibility, we calculated each animal's break point ratio as a percentage of its day 1 baseline. Both sham and mOFC lesion mice shifted responding upward to 135% and 159%, respectively, of day 1 baseline. By contrast, mice with PL lesions shifted downward, achieving break point ratios that were, on average, 72% of baseline across several sessions [main effect of lesion F(2,40)=4.6, p=0.02; post-hoc ps<0.05] (fig. 2b). Representative GFAP staining in lesion mice from these experiments is shown (fig. 2c); as indicated in fig. 1, mOFC and PL lesions were distinguishable by rostro-caudal position within the mPFC.
To confirm that the effects of PL lesions were not simply attributable to insensitivity to the previously learned action-outcome association, we devalued the food outcome with 30-min prefeeding with the reinforcer pellets used in the task. All mice consumed equivalent amounts of food during this prefeeding period (relative to shams, ps≥0.2; not shown). Subsequently, sham, mOFC, and PL mice showed the expected attenuation of instrumental responding with no difference between groups [F(2,32)=1.4, p=0.3] (fig. 3a). Mice also extinguished responding at equivalent rates when reinforcement was withheld across several sessions [effect of lesion and lesion × session Fs<1] (fig. 3b). These findings are in agreement with the argument that, under normal circumstances, the PL invigorates reinforced instrumental responding by maintaining sensitivity to the motivational value of the outcome (Corbit & Balleine, 2003), but lesions do not impact upon action-outcome associative relationships acquired prior to lesion placement (Ostlund & Balleine, 2005).
Implicit in this interpretation is the idea that if lesions were placed after the progressive ratio response requirements had been learned, responding would not be affected. To address this possibility, we trained another group of mice to acquire reinforcement, then further trained these animals to respond on a progressive ratio schedule of reinforcement with 5 test sessions before mOFC or PL lesion placement. We then tested the same animals on a progressive ratio schedule after recovery (5 sessions). Before lesion placement, responding did not differ by group designation as determined by break point ratio [main effect of group and group × session Fs<1]. After lesions were placed, break points remained unchanged [main effect of group and group × session Fs<1] (fig. 3c). Our findings thus suggest that neither the mOFC nor PL is required for progressive ratio performance once response parameters have been learned. These data also provide the first evidence for mOFC involvement in the acquisition, and not expression, of an instrumental response schedule in rodents.
In addition to its well-established role in spatial learning and memory, the hippocampus regulates motivational sensitivity to food and drug reward. Moreover, stimulation of the ventral sector uniquely excites neurons within the mOFC (Ishikawa & Nakamura, 2003), suggesting this region may provide the mOFC with information regarding the motivational salience of an appetitive outcome and thereby contribute to its regulation of goal-directed behavior. In this case, lesions of the vHC might be expected to result in similar response patterns relative to lesions of the mOFC. Indeed, vHC lesion mice successfully shifted responding to a newly reinforced aperture in an instrumental reversal task [main effect of lesion F<1; lesion × session interaction F(6,90)=1.6, p=0.15] (fig. 4a), but showed an impairment in response inhibition on the previously reinforced aperture, specifically during the initial test sessions [lesion × session interaction F(6,90)=3, p=0.01; sessions 1–3 post-hoc ps<0.05] (fig. 4b). Moreover, as with mOFC lesions, vHC lesions increased progressive ratio break points [main effect of lesion F(1,10)=6.5, p=0.03] (fig. 4c). Histological analyses indicated vHC lesions were largely limited to the ventral 50% of the caudal hippocampus, though some larger lesions spread dorsally, resulting in GFAP staining in the intermediate hippocampus. Lesions tended to be biased towards the rostral extent (e.g., Bregma −2.7) of the vHC or the caudal extent (e.g., Bregma −3.7) (fig. 4d), but this distinction did not appear to affect behavioral responding in our tasks.
A premise of this manuscript is that the mouse mOFC regulates action-outcome response flexibility in a manner that is unique relative to related prefrontal structures. In a final series of experiments, we dissociated the mOFC from lOFC by placing lesions in the lOFC and testing mice in the reversal and progressive ratio tasks. Mice with dHC lesions were also generated, as the dHC has no projections to the OFC (Cenquizca & Swanson, 2007), and recent studies highlight its functional and genetic dissociation from the vHC (Dong et al., 2009; cf., Fanselow & Dong, 2010), so lesions of this region might also be expected to produce distinctive effects in these two tasks. Saline-infused mice did not differ and were combined for representative purposes.
As predicted, lOFC and dHC lesions produced distinctive response patterns in the instrumental reversal task that were dissimilar to mOFC lesion response profiles. First, both lOFC and dHC lesions delayed the acquisition of the reversal [session × lesion interaction F(12,132)=3.9, p<0.001], though in distinct ways: Mice with lOFC lesions responded less than sham mice during session 3 (p=0.02), but not later (fig. 5a). By contrast, mice with dHC lesions appeared to reverse during early sessions, but were unable to achieve optimal responding, as indicated by fewer responses during the final test sessions (sessions 6–7, ps<0.006), perhaps because optimal responding depended on the spatial location of the aperture within the operant conditioning chamber (Mahut, 1971; Whishaw & Tomie, 1997). Somewhat surprisingly, lOFC lesions facilitated the extinction of responding on the previously reinforced aperture [session × lesion interaction F(12,132)=2.6, p=0.004, session 1 p<0.001], thus the medial and lateral OFC compartments were dissociable on both instrumental reversal response and response suppression measures. dHC lesions had no effect on response inhibition (all ps≥0.09) (fig. 5b).
Unlike mOFC lesions, lOFC lesions had no significant effects on break point ratios when mice were required to respond on a progressive ratio schedule of reinforcement, and mice with dHC lesions achieved higher ratios [main effect of lesion F(2,30)=8.5, p<0.001; post-hoc p=0.004] (fig. 5c), as has been previously reported in rats (Schmelzeis & Mittleman, 1996). Histological analyses indicated lOFC infusions resulted in prominent GFAP staining in the lOFC and lateral ventral OFC that spared medial prefrontal structures in all mice (fig. 5d–e). Twenty-eight percent of lOFC infusions resulted in particularly large lesions that spread laterally to affect the dorsolateral orbital cortex (“DLO” in Paxinos & Franklin, 2003). dHC lesions were restricted to the rostral dHC, and most encompassed all major subregions (fig. 5f). In several mice, NMDA spread ventrally such that GFAP staining was detected in the intermediate hippocampus, but the ventral hippocampus was spared. In fact, a subset of animals in both the dHC and vHC groups had prominent GFAP staining within the intermediate hippocampus. Thus, disparate behavioral response patterns in these groups are presumed to be due to cell death within the non-overlapping dorsal and ventral regions, respectively. Two dHC lesions were unexpectedly non-detectable, and 2 were unilateral; these animals were excluded.
It was recently argued that there are no good models of prefrontal function in mice (Bissonette et al., 2008); indeed, few behavioral tasks thought to rely in whole or in part on the PFC—based on lesion studies in rats and non-human primates—have been validated by lesion studies in mice. This is unlike canonically hippocampus-dependent tasks, such as trace conditioning (e.g., Chowdhury et al., 2005) and the Morris water maze (e.g., Pittenger et al., 2002). Moreover, the vast majority of rodent anatomical studies of the PFC use rats. These practices are paradoxically apposed to a growing reliance on transgenic and knockout mice to model psychiatric diseases commonly characterized by disordered goal-directed action and generally, deficits thought to derive from abnormalities in medial prefrontal cytoarchitecture, biochemistry, and/or network activity. The goals of this study were thus twofold: 1) to develop protocols to place anatomically discrete lesions along the medial wall of the mouse PFC, and 2) to compare the effects of PL and ventromedial PFC—i.e., mOFC—lesions on behavioral flexibility based on action-outcome (also termed “response-outcome”), as opposed to stimulus-response, associations.
Most notably, our findings suggest the rodent mOFC facilitates goal-directed response inhibition under circumstances that require the adoption of novel response strategies, with the caveat that lesion effects were detected only in the presence of appetitive reinforcement, i.e., lesions did not affect responding during non-reinforced (extinction) test sessions. Comparable roles for the primate mOFC were recently proposed (Kringelbach, 2005; Roberts, 2006; Tanaka et al., 2008), but evidence for mOFC involvement in inhibitory control in rodents, in general or under specific conditions, is, to date, indirect (Cetin et al., 2004), despite the identification of a medial compartment in both the rat and mouse OFC (Uylings et al., 2003; Van de Werd et al., 2010). It is notable, however, that in previous studies, rats with large medial PFC lesions that included the mOFC showed increased perseverative responding in stimulus-response reversal tasks (Aggleton et al., 1995; Chudasama & Robbins, 2003), while more selective lesions of the cingulate and infralimbic cortices or PL spared (de Bruin et al., 1994; Aggleton et al., 1995; Joel et al., 1997; Ragozzino et al., 1999; Dias & Aggleton, 2000; Boulougouris et al., 2007) or partially spared (Sutherland et al., 1988; Chudasama & Robbins, 2003) responding in a variety of intramodal shifting tasks, as with PL lesions here.
Human neuroimaging reports implicate the mOFC in encoding the value of available actions relative to available reinforcers (Erk et al., 2002; Arana et al., 2003; Paulus & Frank, 2003; O'Doherty et al., 2003; Elliott et al., 2008). A report by Plassmann et al. (2007) showed selective activity in the mOFC during willingness-to-pay calculations, a finding that may be particularly germane to this study, as we argue that, without the mOFC, modestly food-restricted mice were unable to calculate the appropriate “pay”—effort expenditure—relative to the outcome value when responding for food on a progressive ratio schedule of reinforcement. Specifically, mice must choose between performing an action and withholding responding to end the session and non-contingently receive chow upon returning to the home cage. Control mice establish a low, steady pattern of responding, while mOFC mice withhold responding only at higher break points. Mice with progressive ratio schedule experience prior to lesion placement responded appropriately, indicating the effect is selective to acquisition of the progressive ratio response requirements.
In contrast to mOFC lesions, PL lesions reduced break point ratios (see also Gourley et al., 2008), and in monkeys, ventral PL neurons are more active during “self-initiated” response trials—in which animals respond for water reinforcement in the absence of discrete cues—than in cued trials (Bouret & Richmond, 2010). These patterns support the argument put forth in a previous report that the PL serves to motivate instrumental responding when reinforcement is uncertain (Corbit & Balleine, 2003).
Instrumental sensitivity to satiety-specific outcome devaluation was intact in mice with medial prefrontal lesions placed after instrumental training. This finding in PL mice is consistent with a previous report in rats (Ostlund & Balleine, 2005), but whether the mOFC lesion profile is also consistent with previous work is less obvious. Monkeys with broad OFC lesions including the mOFC in an early study were insensitive to prefeeding devaluation, but it is unclear whether the animals were responding for the food outcome or the discrete cues that accompanied reinforcement (Butter et al., 1963). More recently, monkeys with mOFC-inclusive lesions were unable to suppress responding for an object associated with a devalued food, but when asked to perform a response for the food itself, instrumental responding diminished (Baxter et al., 2000; Izquierdo et al., 2004; Izquierdo & Murray, 2004), as here and in a previous study in rats with large OFC lesions (Ostlund & Balleine, 2007). Our results with discrete mOFC lesions thus suggest this region is important for adopting new behavioral strategies based on action-outcome associative relationships when reinforcement requirements change, such as in a spatial reversal, in shifting to a new response schedule, or in detour reaching tasks (in monkeys: Wallis et al., 2001), but not in maintaining a representation of the action-outcome associative relationship itself.
Large lesions of the hippocampus have historically resulted in hyper-sensitivity to food and drug reward and a general increase in appetitive behavior in rats (Jarrard, 1964; Kimble & Kimble, 1965; Whishaw & Mittleman, 1991; Wilkinson et al., 1993; Schmelzeis & Mittleman, 1996; Mittleman et al., 1998; Kelley & Mittleman, 1999), consistent with the conclusion that the hippocampus gates reward sensitivity and with elevated break point ratios in mice with either dHC or vHC lesions here. In other contexts, the hippocampus can be functionally and anatomically dissociated along the dorso-ventral axis: The dorsal sector is classically associated with spatial learning and memory and the ventral with the emotional and motivational salience of outcomes (Fanselow & Dong, 2010). vHC lesions result, for example, in reduced hyponeophagia—i.e., increased willingness to seek food despite novel environmental stimuli (Bannerman et al., 2002, 2003). The vHC also sends direct ipsilateral projections to the mOFC, and stimulation of these sites results in the excitation of single units within the mOFC (Ishikawa & Nakamura, 2003). Excitation of these projections may facilitate mOFC-mediated response inhibition; consistent with this hypothesis, vHC lesions mimicked the effects of mOFC lesions, though it is unclear why mOFC lesions resulted in a delay in escalated responding on the progressive ratio schedule of reinforcement, while vHC lesions did not.
The mOFC and vHC may alternatively regulate the acquisition of novel response contingencies via projections that converge onto single neurons within the nucleus accumbens core (French & Totterdell, 2002). This model is consistent with reports that the acquisition of variable ratio response schedules requires NMDA receptors and downstream protein kinase activity within the core subregion (Baldwin et al., 2000, 2002), and potentially with evidence that the vHC regulates the balance between tonic and phasic activation of dopamine neurons within the nucleus accumbens (Floresco et al., 2001), which would be expected to impact upon an animal's ability to detect and acquire a novel response contingency.
As anticipated based on connectivity patterns and previous studies, neither dHC nor lOFC lesions produced behavioral profiles that were similar to mOFC lesions. For example, both dHC and lOFC mice showed deficits acquiring the “reversed” response contingency: Mice with dHC lesions were unable to fully acquire the new response, presumably due to the spatial component of the response requirement (Whishaw & Tomie, 1997), while mice with lOFC lesions showed acquisition delays.
Recent evidence indicates the rat lOFC encodes reward prediction error and uses this information to guide future choice behavior (Sul et al., 2010). As anticipated by models of reward prediction error (Rescorla & Wagner, 1972), the rat lOFC encodes prediction errors that are both positive, indicating reinforcement for a given action is better than anticipated, and negative, indicating reinforcement is worse than expected. Moreover, recent adaptations of prediction error theory can account for learning from “missing” reward in action-outcome associative settings (Redish et al., 2007). Thus, the delay in response reversal in lOFC mice here may reflect inactivation of a region that enables the acquisition of novel choices based on whether previous choices resulted in reward or no reward. This model cannot, however, obviously account for facilitated extinction of non-reinforced responding in lOFC mice. This effect has also been previously reported in rats (Grakalic et al., 2010), and suggests that, under some circumstances, the lOFC retards the extinction of goal-directed activities, perhaps by maintaining sensitivity to stimulus-response associations that promote habitual instrumental responding, but further studies are necessary.
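The sign convention for prediction errors can be illustrated with a textbook, single-value form of the Rescorla-Wagner update, in which the expected value of an action moves toward the obtained reward by a fraction of the error. This is a generic sketch, not the model fitted in any of the cited studies; the learning rate of 0.2, the reward magnitude of 1, the trial counts, and the name `rw_update` are all assumptions chosen for illustration.

```python
def rw_update(value, reward, alpha=0.2):
    """One Rescorla-Wagner style step.

    Returns (new_value, prediction_error), where the prediction error is
    reward - value: positive when the outcome is better than expected,
    negative when it is worse (for example, when reward is omitted).
    """
    delta = reward - value
    return value + alpha * delta, delta

value = 0.0
history = []
# 20 reinforced trials followed by 20 non-reinforced ("missing" reward) trials
for reward in [1.0] * 20 + [0.0] * 20:
    value, delta = rw_update(value, reward)
    history.append((value, delta))
```

In this sketch the errors are positive throughout the reinforced block and negative throughout the non-reinforced block, and the expected value rises toward 1 and then decays toward 0.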
The rodent medial PFC contains multiple cytoarchitectonically distinct subregions that can be differentiated based on efferent and afferent projection patterns, with dorsal regions—including the dorsal PL—sharing similar functions that differ from those of the ventral medial PFC, which includes the mOFC (Heidbreder & Groenewegen, 2003; Vertes, 2004; Schilman et al., 2008). We show that selective mOFC lesions produce distinctive behavioral effects relative to more dorsally and caudally-situated PL lesions in mice performing food-reinforced instrumental tasks. Specifically, mOFC lesions increased perseverative responding in an “instrumental” reversal task, as well as responding for food reinforcement on a progressive ratio schedule of reinforcement, resulting in effort expenditure that outstripped the value of the reinforcer. Mice trained to respond on a progressive ratio schedule prior to lesion placement were unaffected; we thus propose a role for the rodent mOFC in facilitating goal-directed response inhibition specifically in the presence of appetitive reinforcement and under circumstances that require the acquisition of novel response strategies. Such a model has relevance to psychiatric illnesses commonly characterized by disordered goal-directed action.
The authors thank Dr. Philip Corlett for valuable feedback. This work was supported by the National Institutes of Health [DA011717, MH025642, and MH066172 to JRT; MH079680 to SLG], the Connecticut Department of Mental Health and Addiction Services (JRT, CP), and the Interdisciplinary Research Consortium on Stress, Self-control and Addiction (UL1-DE19586 and the NIH Roadmap for Medical Research/Common Fund, AA017537).