It is now widely recognized that exposure to palatable foods engages reward circuits that promote over-eating and facilitate the development of obesity. While the melanocortin 4 receptor (MC4R) has previously been shown to regulate food intake and energy expenditure, little is known about its role in food reward. We demonstrate that MC4R is co-expressed with the dopamine 1 receptor (D1R) in the ventral striatum. While MC4R-null mice are hyperphagic and obese, they exhibit impairments in acquisition of operant responding for a high fat reinforcement. Restoration of MC4R signaling in D1R neurons normalizes procedural learning without affecting motivation to obtain high fat diet. MC4R signaling in D1R neurons is also required for learning in a non-food-reinforced version of the cued water maze. Finally, MC4R signaling in neostriatal slices increases phosphorylation of the Thr34 residue of DARPP-32, a protein phosphatase-1 inhibitor that regulates synaptic plasticity. These data identify a novel requirement for MC4R signaling in procedural memory learning.
While access to highly palatable, calorically dense food has long been appreciated as a contributing factor to the development of obesity [1, 2], the neural substrates that mediate the desire to eat palatable food (termed here “hedonic drive”) remain poorly understood (see [3–8] for excellent reviews on the topic). Food reward has been conceptualized as consisting of three components [7, 9]: ‘Wanting’ refers to the motivation to obtain palatable food and is also termed “incentive salience”, ‘liking’ represents the enjoyment or pleasure obtained from food, and ‘associative learning’ is the term used for associating neutral cues with predictors of reward.
Several recent studies have begun to dissect the neuroanatomical and molecular basis of these inter-connected processes. Previous work in mice has indicated that distinct regions of the striatum are critical for learning various aspects of food reward . The dorsal striatum has been implicated in the acquisition of stimulus-response/reinforcement instrumental learning and learning of goal-directed instrumental actions, while the ventral striatum primarily affects preparatory conditional responses of Pavlovian conditioning and consummatory conditional responses .
A series of studies by Ann Kelley analyzed the acquisition of instrumental responding for sucrose pellets by rats on the molecular level . Her group found that multiple regions including the nucleus accumbens (NAc) core, basolateral amygdala, and prefrontal cortex are required for instrumental learning [13–16]. Furthermore, pharmacological studies identified glutamate signaling through NMDA receptors and dopamine signaling through the dopamine 1 receptor as critical components of this learning process. Kelley hypothesized that coincident signaling through NMDA and dopamine 1 receptors facilitates neuroplastic mechanisms that promote the association of temporal events with acquisition of a reward.
In the present study we further dissect the neuroanatomical and molecular basis of one form of food reward learning, operant responding for high fat diet, by analyzing the role of melanocortin 4 receptor (MC4R) signaling in dopamine 1 receptor (D1R) expressing neurons. MC4R is a well-known regulator of food intake and body weight in both mice and humans. Approximately 5% of patients with severe obesity have a mutation in the MC4R gene . While patients harboring MC4R mutations exhibit an obesity syndrome characterized by hyperphagia , the role of MC4R signaling in food reward processes remains poorly defined. To directly test this possibility, we utilized a unique mouse line with a loxP-transcriptional/translational blocking cassette-loxP inserted in the MC4R gene (MC4R-TB). This mouse line is functionally null for MC4R activity and displays hyperphagia and obesity similar to human patients with MC4R mutations . Against this null background, Cre-recombinase under control of the D1R was employed to remove the transcriptional/translational blocking cassette and ‘restore’ expression of endogenous levels of MC4R in a neuron specific manner. We then examined the three groups of mice generated from these pairings (wild-type, MC4R-TB, and MC4R expressed only in D1R neurons) in food-reinforced operant responding as well as a non-food-reinforced procedural learning task.
Additionally, we sought to identify the possible intracellular mechanism of MC4R action on learning and memory. The MC4R, like the D1R, is coupled to the Gs/olf-signaling pathway. Multiple studies have established that the protein Dopamine- and cAMP-regulated phosphoprotein, Mr 32 kDa (DARPP-32) is downstream of D1R through cAMP/protein kinase A signaling and serves as an important integrator of multiple signaling pathways to regulate the function of striatal neurons . Furthermore, activity of DARPP-32 affects multiple electrophysiological, transcriptional, and neuroplastic processes and is critically involved in striatum-dependent learning . Therefore, we hypothesized that activation of MC4Rs in the striatum may affect synaptic plasticity through activation of DARPP-32. In order to test this possibility, we utilized a slice pharmacology approach to determine if activation of the MC4R increases phosphorylation of DARPP-32.
Mice in which GFP expression is under control of the MC4R gene promoter (MC4R-GFP) were generated and characterized as reported previously . Mice in which a transcriptional/translational blocking cassette flanked by lox-P sites was inserted into the MC4R gene (MC4R-TB) to create a functionally null allele were generated and characterized as reported previously . Mice expressing Cre-recombinase under the control of the dopamine 1A receptor (Drd1a-Cre) were obtained from Gensat (EY262, http://www.gensat.org/cre.jsp). Mice were back-crossed to achieve >90% genetic background on the C57BL/6 line. All pairings were conducted with male mice heterozygous for MC4R-TB and expressing the Drd1a-Cre transgene mated to female mice heterozygous for MC4R-TB. All experiments were conducted on wild-type and MC4R-TB homozygous littermates. Mice were housed in the University of Texas Southwestern Medical Center (UTSW) vivarium in a temperature-controlled environment (lights on: 06:00–18:00) with ad lib access to water and standard chow (SC) (4% fat diet #7001, Harlan-Teklad, Madison, WI). All animal procedures were performed in accordance with UTSW Institutional Animal Care and Use Committee guidelines. For slice pharmacology experiments, male C57BL/6N mice at 8–12 weeks old were purchased from Japan SLC (Shizuoka, Japan). All mice used in this study were handled in accordance with the Guide for the Care and Use of Laboratory Animals as adopted by the U.S. National Institutes of Health, and the specific protocols were approved by the Institutional Animal Care and Use Committee of Kurume University School of Medicine.
To confirm restoration of MC4R expression, single-label MC4R in situ hybridization was performed in wild-type, MC4R-TB and MC4R-TB/D1R-Cre mouse brain sections as reported previously [19, 21, 22]. A mouse-specific MC4R cDNA probe labeled with 33P was generated using the following primers: F 5′-ATTACCTTGACCATCCTGAT-3′ and R 5′-ATGTCAATTCATAACGCCCA-3′.
In order to measure learning of procedural memories that are not reinforced by food, we selected a cued water maze task previously demonstrated to require function of the striatum with equal efficacy in male and female rodents [23, 24]. Briefly, mice are trained to escape from a circular pool of water by locating an escape platform (12 cm × 12.5 cm) hidden approximately 1 cm under the surface of the water. Three distinct visual cues are used: plastic cylinders (11 cm in length and 2.5 cm in diameter) with interchangeable covers of either neutral gray or sharp black-and-white stripes (1 cm in width) oriented either horizontally or vertically. The first 2–3 days of this task involve shaping to the task, in which the platform is marked with a neutral gray cylinder. Mice are given 4 trials (60 sec/trial) per day in order to swim to the platform. Following successful completion of this portion, the mice are then trained in the two-cue task for 7 days, 4 trials per day. The escape platform is marked with horizontal stripes (the ‘target’ cue), and a second cue at which no platform is present (the ‘lure’ cue) is marked with vertical stripes. The cues are placed in the pool in adjacent quadrants and the locations of the cues are moved every trial such that the target cue is placed in each of the four quadrants each training day. The lure cue moves in relation to the target cue, so that the cues are always in adjacent quadrants; however, the direction needed to swim to reach the target cue constantly changes. Learning was assessed using a 60 sec probe trial on Days 4 and 8 of the training, in which both target and lure cues are present but no platform is available to permit escape from the pool. Because swim speed did not differ between the groups (Fig. 5A and 5B), learning is measured by comparing total time spent in a 12 cm × 12.5 cm area surrounding the ‘target’ and ‘lure’ cues as measured by tracking software (Ethovision, Noldus, Leesburg, VA).
Mice were trained to poke their nose into a lit portal to obtain a 20 mg high fat diet (HFD) pellet reward as previously reported  in standard operant conditioning chambers (Model ENV307A, Med Associates Inc., St Albans, VT) equipped with three nose poke portals. Briefly, mice were rewarded for nose poking in the middle portal only; the side portals were monitored but inactive. The HFD pellets were custom prepared by Bio-serv (product# F06245, Frenchtown, NJ), and provided 4.5 kcal/g of metabolizable energy of which 45.4% of energy comes from fat, 35.0% comes from carbohydrate, and 21.0% comes from protein. The main components of these pellets were casein (233 g/Kg), palm oil (207 g/Kg), dextrates (197 g/Kg), sucrose (197 g/Kg), cellulose (58 g/Kg), and soybean oil (20 g/Kg). During the training period, mice were kept on a restricted feeding (RF) schedule and allowed access to regular chow 4 hours per day (1200–1600). For the training sessions, mice initially received the HFD pellet rewards under a fixed ratio (FR) schedule. In order to pass training, mice had to obtain 30 reinforcements within one hour for FR1 (once), FR3 (twice), and FR5 (three times) before moving on to the progressive ratio schedule. Learning of the task was assessed by the number of days required to pass FR1.
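The learning measure described above (days required to pass FR1) can be illustrated with a minimal sketch. The function name and the daily pellet counts are hypothetical and are not taken from the study; only the criterion of 30 reinforcements within a one-hour session comes from the text.

```python
from typing import Optional, Sequence

# Criterion from the text: 30 reinforcements within a one-hour FR1 session.
CRITERION_PELLETS = 30


def days_to_pass_fr1(pellets_per_session: Sequence[int]) -> Optional[int]:
    """Return the 1-based training day on which the FR1 criterion was first met.

    Each entry is the number of pellets earned in that day's one-hour session.
    Returns None if the criterion was never reached.
    """
    for day, pellets in enumerate(pellets_per_session, start=1):
        if pellets >= CRITERION_PELLETS:
            return day
    return None


# Hypothetical daily pellet counts for one mouse (illustrative only).
example_sessions = [4, 11, 19, 27, 33]
print(days_to_pass_fr1(example_sessions))
```

The sketch covers only the FR1 stage, because the text does not say whether the repeated FR3 and FR5 criteria had to be met on consecutive days.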
Following completion of the training period, the mice were then kept on the restricted feeding schedule and advanced to a progressive ratio schedule where they had to perform increasing numbers of nose-pokes to obtain the pellet according to the following series: 5, 10, 20, 30, 50, 70, 100, 130, etc. A relatively steep progressive ratio was chosen to ensure that only differences in motivation were measured and not satiation. After four days of stable responding was achieved, the mice were allowed free access to chow and effortful responding was assessed for HFD pellets under ad lib conditions. Breakpoint was defined as the last progressive ratio that an animal successfully completed to receive a reinforcement pellet within a 30 min period. The mice were tested for breakpoint on 4 consecutive days on a RF schedule and then tested an additional four days after receiving ad lib (AL) access to regular chow. Total number of nosepokes and pellets earned in that session were also recorded and used for statistical analysis.
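The breakpoint definition can be made concrete with a short sketch. It assumes that the nose-poke counter resets after each pellet and that only pokes in the active (middle) portal count; both are assumptions, as are the function name and the example count. The ratio list stops at 130 because the text continues the series with "etc." and the later steps are not given.

```python
from typing import Sequence, Tuple

# Ratio requirements listed in the text. The source continues the series with
# "etc.", so steps beyond 130 are deliberately not guessed here.
PR_SERIES = (5, 10, 20, 30, 50, 70, 100, 130)


def breakpoint_and_pellets(
    active_pokes: int, series: Sequence[int] = PR_SERIES
) -> Tuple[int, int]:
    """Return (breakpoint, pellets earned) for one 30-min session.

    active_pokes is the total number of active-port nose-pokes made within the
    session. The breakpoint is the last ratio completed (0 if none).
    """
    remaining = active_pokes
    last_completed = 0
    pellets = 0
    for ratio in series:
        if remaining < ratio:
            break  # the animal stopped before completing this ratio
        remaining -= ratio
        last_completed = ratio
        pellets += 1
    return last_completed, pellets


# Hypothetical session: 120 active-port nose-pokes (illustrative only).
print(breakpoint_and_pellets(120))  # (50, 5)
```

With 120 pokes the ratios 5, 10, 20, 30, and 50 are completed (115 pokes in total) and the remaining 5 pokes fall short of the next requirement of 70, so the breakpoint is 50 with 5 pellets earned.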
The effect of MC4R activation on PKA signaling in striatal neurons was investigated by measuring dopamine- and cAMP-regulated phosphoprotein, Mr 32 kDa (DARPP-32) phosphorylation at Thr34 (PKA-site) in mouse neostriatal slices. Male 8–12 week old C57BL/6N mice were sacrificed by decapitation. The brains were rapidly removed and placed in ice-cold, oxygenated Krebs-HCO3− buffer (124 mM NaCl, 4 mM KCl, 26 mM NaHCO3, 1.5 mM CaCl2, 1.25 mM KH2PO4, 1.5 mM MgSO4 and 10 mM D-glucose, pH 7.4). Coronal slices (350 μm) were prepared using a vibrating blade microtome, VT1000S (Leica Microsystems, Nussloch, Germany). Striata were dissected from the slices in ice-cold Krebs-HCO3− buffer. Each slice was placed in a polypropylene incubation tube with 2 ml fresh Krebs-HCO3− buffer containing adenosine deaminase (10 μg/ml). The slices were preincubated at 30°C under constant oxygenation with 95% O2/5% CO2 for 60 min. The buffer was replaced with fresh Krebs-HCO3− buffer after 30 min of preincubation. Adenosine deaminase was included during the first 30 min of preincubation to counter the increase in adenosine levels during slice preparation and to minimize the variability among slices. Slices were treated with melanotan II [Ac-Nle-cyclo(-Asp-His-D-Phe-Arg-Trp-Lys-NH2)] at 10 μM for 15 s to 5 min. After drug treatment, slices were transferred to Eppendorf tubes and frozen on dry ice. The reaction of control slices was terminated without changing Krebs-HCO3− buffer, because incubation of slices with the buffer for 15 s to 5 min did not affect DARPP-32 phosphorylation. Slices were stored at −80°C until assayed.
Frozen tissue samples were sonicated in a solution of boiling 1% sodium dodecyl sulfate (SDS), then boiled for an additional 10 min. Small aliquots of the homogenate were retained for protein determination by the BCA protein assay method (Pierce, Rockford, IL). Equal amounts of protein (40 μg) were separated by 4–12% polyacrylamide Bis-Tris gels (Bio-Rad, Hercules, CA), and transferred to nitrocellulose membranes (0.2 μm) (Schleicher and Schuell, Keene, NH).
The membranes were immunoblotted using a phosphorylation state-specific antibody raised against DARPP-32 phospho-Thr34 peptide, the site phosphorylated by PKA (CC500; 1:500 dilution). A monoclonal antibody generated against DARPP-32 (C24-5a; 1:7,500 dilution), which is not phosphorylation state-specific, was used to determine the total amount of DARPP-32. The membranes were incubated with goat anti-rabbit Alexa 680-linked IgG (1:5,000 dilution) (Molecular Probes, Eugene, OR) and goat anti-mouse IRDyeTM800-linked IgG (1:5,000 dilution) (Rockland Immunochemicals, Gilbertsville, PA). Fluorescence at infrared wavelengths was detected by the Odyssey infrared imaging system (LI-COR Biosciences, Lincoln, NE), and quantified using Odyssey software. In an individual experiment, samples from control and drug-treated slices were analyzed on the same immunoblot. For each experiment, values were normalized to values obtained from control samples. Normalized data from multiple experiments were averaged and statistical analysis was carried out using one-way ANOVA followed by Newman-Keuls test.
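The normalization step described above (each blot scaled to its own control, then averaged across experiments) can be sketched as follows. The function names, condition labels, and signal values are hypothetical. The text does not state whether the phospho-Thr34 signal was first divided by total DARPP-32, so the sketch simply takes one quantified value per lane.

```python
from statistics import mean
from typing import Dict, List


def normalize_to_control(
    blot: Dict[str, float], control_key: str = "control"
) -> Dict[str, float]:
    """Divide every lane value on one immunoblot by that blot's control value."""
    control = blot[control_key]
    if control == 0:
        raise ValueError("control signal must be non-zero")
    return {condition: value / control for condition, value in blot.items()}


def average_across_experiments(blots: List[Dict[str, float]]) -> Dict[str, float]:
    """Normalize each blot to its own control, then average per condition."""
    normalized = [normalize_to_control(blot) for blot in blots]
    conditions = normalized[0].keys()
    return {c: mean(n[c] for n in normalized) for c in conditions}


# Hypothetical signal values (arbitrary units) from two experiments.
experiments = [
    {"control": 200.0, "MT II 30 s": 300.0},
    {"control": 100.0, "MT II 30 s": 170.0},
]
print(average_across_experiments(experiments))
```

Normalizing within each blot before averaging removes blot-to-blot differences in overall signal intensity, which is why control and treated samples are run on the same immunoblot.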
Mice were transcardially perfused with 4% paraformaldehyde (PFA) and cryopreserved in 20% sucrose. Brains were sectioned into 30 micron coronal sections (collecting 1:5 sections) and stored in cryoprotectant at −20 °C until use. Brain sections were blocked and permeabilized with 3% normal donkey serum (NDS; Jackson ImmunoResearch, West Grove, PA), 0.3% Triton X-100 in PBS for 30 minutes at room temperature (RT), rinsed with PBS and incubated with primary antibodies (diluted in 3% NDS, 0.3% Tween-20 in PBS) at RT for 2–3 hours or overnight at 4 °C. Brain sections were rinsed with PBS 3 times for 10 minutes each, incubated with 0.03% SDS in PBS for 10 minutes, blocked in 3% NDS, 0.3% Triton X-100 in PBS for 30 minutes at room temperature and then incubated with primary antibody for 24 hours at room temperature plus an additional 48 hours at 4 °C. Images were first taken by fluorescent microscopy (Nikon Eclipse, 80i) and all possible co-localization was further determined by confocal microscope scanning (LSM510-META, Zeiss, Thornwood NY). All of the primary antibodies used in the present study are commercially available and have previously been tested by different laboratories, as follows: chicken anti-GFP antibody (1:1000, #GFP1020, Aves Lab Inc, Tigard, OR) and rabbit anti-Cre polyclonal antibody (1:1000 dilution, Novagen, Cat# 69050-3). Primary antibodies were detected by Cy2- and Cy3-conjugated secondary antibodies (Jackson ImmunoResearch).
The data are presented as mean ± SEM. GraphPad Prism 5 software (GraphPad Software Inc., San Diego, CA) was used to perform all statistical analyses. No differences between wild-type groups with and without D1R-Cre expression were detected, so both groups were combined into a ‘control’ group to improve statistical power. Learning of the cued water maze task was assessed by comparing time spent in ‘target’ and ‘lure’ areas by Student’s t-tests, as previously described . Learning was further examined by comparing time spent at the ‘target’ of all three groups by one-way ANOVA with Newman-Keuls post-analysis. All other comparisons between groups were made by one-way ANOVA with Newman-Keuls post-analysis or two-way ANOVA with Bonferroni post-analysis as noted in the text. P < 0.05 was considered to be statistically significant.
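For readers checking the reported statistics, the following minimal sketch shows how a one-way ANOVA F statistic and its degrees of freedom are obtained. It is written in plain Python for illustration and is not the GraphPad Prism procedure used in the study; the function name and example values are hypothetical. It does not compute a P value, which requires the F distribution.

```python
from typing import Sequence, Tuple


def one_way_anova(groups: Sequence[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variability of group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variability of individual values around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical values for three small groups (illustrative only).
print(one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))  # (13.0, 2, 6)
```

As a consistency check, three groups of 24, 13, and 16 animals give degrees of freedom of 2 and 50, which matches the F2,50 reported for the cued water maze comparison.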
We were interested in examining the function of MC4R signaling in the ventral striatum given its known role in the regulation of food reward . MC4R has previously been shown to be co-expressed with prodynorphin , a marker of D1R neurons in the striatum. Therefore we mated a mouse line expressing Cre-recombinase specifically in D1R-neurons (D1R-Cre) with a mouse line that expresses GFP under control of the MC4R promoter (MC4R-GFP, ) in order to analyze the neuroanatomical co-localization of these two receptors. The data presented in Figure 1 demonstrate the overlap of MC4R-GFP and D1R-Cre within key regions of the ventral striatum of interest to us including the NAc core and shell and the olfactory tubercle.
After identifying a D1R-Cre line that overlaps with MC4R, we used in situ hybridization to confirm restoration of MC4R expression, which occurred predominantly in the ventral striatum (Fig. 2A, 2B, 2C). Additional areas of restoration include neurons in the paraventricular nucleus of the hypothalamus (Fig. 2D, 2E, 2F) and the lateral olfactory tract (Fig. 2G, 2H, 2I). MC4R expression was not observed in any other brain region including the cortex, dorsal striatum, or hippocampus (data not shown).
The ventral striatum has previously been implicated in the regulation of motivated food consumption . Therefore, we next determined if MC4R signaling in D1R neurons affected the motivation to obtain a palatable food. Starting at six weeks of age, mice were maintained on a time restricted feeding schedule (1200–1600) and trained to nose-poke for HFD pellets. Under ad lib chow feeding conditions, MC4R-TB mice are hyperphagic and diverge in body weight compared to wild-type mice starting at seven weeks of age (wild-type- 21.55 g ± 1.56, MC4R-TB- 29.43 g ± 3.75, MC4R-TB/D1R-Cre- 26.27 ± 2.53, mean ± S.D., n = 17, 18, and 18, respectively). However, the mice used in this study were maintained on a restricted feeding schedule during most of the testing period and only allowed ad lib intake during the final four days of testing. Two-way ANOVA of body weight between group and time demonstrated a primary group X time interaction (F16,408 = 7.15, P < 0.001, n = 27 control, 13 MC4R-TB, and 14 MC4R-TB/D1R-Cre, Fig. 3A) during the testing period. A Bonferroni posttest demonstrated both MC4R groups (i.e. +/− D1R-Cre) did not differ between themselves, but were significantly heavier than wild-type mice during the final 4 days of ad lib feeding (Fig. 3A). Additionally, one-way ANOVA demonstrated no significant effect of group on locomotor activity during either the restricted (F2,27 = 1.557, P = 0.229, n = 12 control, 9 MC4R-TB, and 9 MC4R-TB/D1R-Cre, Fig. 3B) or ad lib (F2,27 = 0.1206, P = 0.887, n = 12 control, 9 MC4R-TB, and 9 MC4R-TB/D1R-Cre, Fig. 3B) feeding time periods. One-way ANOVA demonstrated no significant effect of group on food intake during restricted feeding at 1 hour (F2,27 = 0.2082, P = 0.813, n = 12 control, 9 MC4R-TB, and 9 MC4R-TB/D1R-Cre, Fig. 3C), 2 hours (F2,27 = 0.5067, P = 0.608) and 4 hours (F2,27 = 0.4665, P = 0.632).
Two-way ANOVA of total nosepokes (Fig. 3D) between group and time demonstrates a primary group X time interaction (F14,343 = 2.34, P = 0.004, n = 27 control, 13 MC4R-TB, and 14 MC4R-TB/D1R-Cre). Two-way ANOVA of total rewards earned (Fig. 3E) between group and time demonstrates primary effects of time (F7,343 = 77.57, P < 0.001, n = 27 control, 13 MC4R-TB, and 14 MC4R-TB/D1R-Cre) and group (F2,49 = 4.08, P = 0.023, n = 27 control, 13 MC4R-TB, and 14 MC4R-TB/D1R-Cre). Two-way ANOVA of 30 minute breakpoint (Fig. 3F) between group and time demonstrates primary effects of time (F7,343 = 30.87, P < 0.001) and group (F2,392 = 13.27, P < 0.001; n = 27 control, 13 MC4R-TB, and 14 MC4R-TB/D1R-Cre). Bonferroni post testing found that both MC4R-TB groups displayed fewer total nose-pokes (Fig. 3D), fewer pellets earned (Fig. 3E), and a lower breakpoint (Fig. 3F) on the progressive ratio than wild-type littermates only during days 2–4 of the restricted feeding period. Re-feeding reduced effortful responding in all groups and no differences were noted between any groups during ad lib feeding (P5–8). Thus, even though the loss of MC4R signaling produces hyperphagia, it is associated with decreased effortful responding for palatable food on a steep progressive ratio during restricted feeding.
During the training phase, it was noted that MC4R-TB mice were delayed in learning the operant responding task (Fig. 4A). Signaling through both the D1R and its downstream effector cAMP-dependent protein kinase has been implicated in learning of instrumental responding for reward pellets [13, 15]. Because MC4R is also a Gs-coupled receptor, we analyzed the training phase data to determine if MC4R signaling may also be involved in learning of instrumental responding. In our paradigm, mice are required to nose-poke in the correct port 30 times within one hour on a fixed ratio of 1 (FR1) before advancing to the next stage of training (FR3). One-way ANOVA of days to earn 30 rewards within one hour on an FR1 schedule demonstrated an effect of group (F2,48 = 5.653, P = 0.006, n = 27 control, 13 MC4R-TB, and 14 MC4R-TB/D1R-Cre). A Newman-Keuls post-test demonstrated that restoration of MC4R in D1R neurons rescued this learning deficit despite the fact that there is no difference in overall body weight (Fig. 3A), food intake (Fig. 3C), or effortful responding (Fig. 3D, 3E, 3F) between the two MC4R-TB groups (i.e. +/− D1R-Cre). This finding indicates that while restoration of MC4R expression selectively in D1R neurons does not rescue motivation for food intake, it is required for learning of instrumental responding.
We next utilized a non-food-reinforced task, the cued water maze, as a second measure of procedural learning. After 4 days of training sessions, no preference for the ‘target’ was observed as measured by time at cue (Student’s t-test of target vs. lure, control t = 0.473, df = 46, p = 0.634, MC4R-TB t = 0.776, df = 24, p = 0.445, MC4R-TB/D1R-Cre t = 1.349, df = 30, p = 0.188, n = 24 control, 13 MC4R-TB, and 16 MC4R-TB/D1R-Cre, Fig. 5C). However, after an additional 4 days of training, wild-type mice displayed increased time at the ‘target’ cue (Student’s t-test of target vs. lure, t = 3.483, df = 46, p = 0.001, Fig. 5D), demonstrating effective learning of the task. In contrast, mice lacking the MC4R show impaired learning with no difference between time spent in the ‘target’ and ‘lure’ cue areas (Student’s t-test of target vs. lure, t = 0.587, df = 24, p = 0.563, Fig. 5D). Similar to learning of operant responding, restoration of MC4R expression in D1R neurons significantly increased preference for time at the ‘target’ cue (Student’s t-test of target vs. lure, t = 3.333, df = 30, p = 0.002, Fig. 5D). After comparing each group separately by t-test to determine if mice learned to swim toward the ‘target’ cue, all three groups were compared to determine if there were any group differences in time spent at the ‘target’ cue. One-way ANOVA demonstrates an effect of group (F2,50 = 3.762, P = 0.030, n = 24 control, 13 MC4R-TB, and 16 MC4R-TB/D1R-Cre) with Newman-Keuls post-test showing a significant difference between wild-type and MC4R-TB and between MC4R-TB and MC4R-TB/D1R-Cre groups (Fig. 5B). This finding indicates that restoration of MC4R signaling in D1R neurons was sufficient to restore the ability to learn the cued task and confirms a novel role for the MC4R in learning both food-reinforced and non-food-reinforced procedural memories.
In order to determine if DARPP-32 is an intracellular target of MC4R signaling, neostriatal slices were prepared from adult mice and treated with the melanocortin agonist melanotan II (MT II). Acute treatment with MT II caused a rapid increase in phosphorylation at the Thr34 site on DARPP-32 (One way ANOVA, F5,69 = 3.119, p = 0.0135 followed by Newman-Keuls test, Fig. 6), a modification that converts DARPP-32 into an inhibitor of protein phosphatase-1 .
In the current study we utilized mouse models of striatum-dependent learning to demonstrate a role for the melanocortin 4 receptor in learning of both food-reinforced and non-food reinforced procedural memories. Striatum-dependent learning encompasses a wide range of conditions including the development of behavioral responses to both appetitive and aversive stimuli . Much of the previous work on striatum function has utilized psychopharmacological approaches to define neuroanatomic locations involved in reward processing. For instance, neuropharmacological studies with mu-opioid receptor and AMPA antagonists have identified several ‘hotspots’ or regions within the ventral striatum and ventral pallidum that enhance the hedonic ‘liking’ and incentive salience of sweet tasting foods .
Our data extend these findings using a genetic approach to indicate that MC4R signaling is also required for the acquisition of both appetitive (operant responding task) and aversive (cued water maze) procedural-based memories. Interestingly, MC4Rs and D1Rs are co-expressed in a small set of neurons in the ventral striatum including dorsomedial and lateral aspects of the NAc shell and the NAc core. This observation indicates that MC4R/D1R neurons in the ventral striatum are required for cued water maze learning in contrast to previous work that primarily implicates dorsal striatum function in procedural based learning [24, 29].
Importantly, we also demonstrate that MC4R signaling, like D1R, increases phosphorylation of DARPP-32 at Thr34. This observation suggests a potential molecular mechanism linking MC4R signaling to striatum-dependent learning. DARPP-32 is a well-known regulator of neuronal plasticity in the neostriatum and is an important mediator of the effects of drugs of abuse . One interesting aspect of this finding is the rapid time course of the effect. DARPP-32 is phosphorylated by 30 seconds after exposure to MT II and returns to baseline by one minute. This time frame is consistent with other studies in striatal slices, which demonstrate that dopamine released in response to treatment with nicotine , neurotensin , or cocaine  induces a rapid and transient increase in DARPP-32 phosphorylation. Furthermore, studies in DARPP-32 KO mice revealed that neurotensin increased GluR1 Ser845 phosphorylation via activation of dopamine D1 receptor/PKA signaling and inhibition of PP-1 by P-Thr34 DARPP-32 , demonstrating that the transient inhibition of PP-1 by P-Thr34 DARPP-32 is sufficient to modulate the downstream signaling cascades. In addition, activation of DARPP-32 signaling by G-protein coupled receptors such as adenosine A2A receptors (A. Nishi, unpublished observations) or β2-adrenoceptors  is also rapid and transient.
For these reasons, the short-lived signal of P-Thr34 DARPP-32 induced by MT II presumably plays a critical physiological role in MC4R-mediated signaling events. One possibility is that MC4R signaling may increase levels of phospho-Thr34 DARPP-32, converting it into an inhibitor of protein phosphatase-1. Inhibition of protein phosphatase-1 has been shown to promote signaling through the NMDA receptor, allowing for coincident detection of NMDA and dopamine 1 receptor signaling and association of temporal events with positive or negative reinforcements . This tight temporal association may be an essential feature of learning of procedural tasks and confirms that both MC4R and D1R couple to a common intracellular pathway known to regulate neuroplasticity and learning.
We did not identify a role for MC4R signaling in the motivation to obtain palatable food in the present study. In fact, MC4R-null mice were significantly less motivated to nosepoke for pellets in the food-restricted state than wild-type littermates (Fig. 3D and 3E). This distinction has important clinical implications. While human patients with mutations in the MC4R gene clearly display hyperphagia that diminishes with age , there are no clear reports of increased motivation to obtain food in these patients, in contrast to those observed in other hyperphagia-associated neuropsychiatric disorders such as Prader-Willi Syndrome .
This dissociation of hyperphagia and motivation is further supported by work in MC4R-null mice suggesting that motivation and hyperphagia are regulated by distinct neural substrates. A previous study placed wild-type and MC4R-null mice in a ‘foraging’ paradigm in which they responded on a fixed ratio schedule to obtain food pellets. While MC4R-null mice are hyperphagic during ad libitum conditions, their total food intake, meal size, and meal frequency did not differ from wild-type mice in the foraging paradigm, a situation in which they are forced to work for food . In a follow-up study, MC4R-null mice were again housed in operant chambers and received all of their meals by lever pressing. However, in this study, the food pellets were dispensed under a shallow progressive ratio in which the mice had to press the lever an increasing number of times to receive the reward. Under this condition, MC4R-null mice exhibited increased effort and earned more pellets for consumption . Finally, in a study comparing the effects of MC4R, MC3R and double MC4R/MC3R deletion, mice lacking MC4R demonstrated increased effortful responding on a low fixed ratio schedule (FR 2), but actually earned fewer rewards than wild-type mice at a fixed ratio of 50 . These findings suggest that while MC4R-null mice are hyperphagic, they are very sensitive to the demand costs of food.
While pharmacologic studies have been useful in identifying the striatum as involved in food reward, less is known about the underlying signaling pathways that mediate distinct processes such as ‘wanting’, ‘liking’, and ‘associative learning’. Several studies have now begun to dissect the relevant contributions of D1R vs. D2R signaling in striatum-dependent reward processing. Pharmacologic inhibition of D1R signaling in the NAc-core , medial prefrontal cortex , or amygdala  impairs appetitive instrumental learning. Our studies extend these findings to identify a role for MC4R signaling in D1R neurons in instrumental learning. Furthermore, we demonstrate that MC4R, like D1R, targets DARPP-32 downstream of cAMP/PKA signaling. Therefore, it is tempting to speculate that Gs-coupled signaling in D1R neurons is required for instrumental learning. Indeed, infusion of a PKA inhibitor into the nucleus accumbens  or medial prefrontal cortex  also impairs appetitive instrumental learning, supporting the idea that PKA signaling in D1R neurons is required for learning of a food-reinforced task. While these reports support a role for dopamine in instrumental learning, other groups have reported dopamine-independent appetitive learning, with dopamine serving primarily to mediate learning of ‘incentive salience’ [41–44]. Clearly more work will need to be done to resolve these discrepancies.
In addition to a role in appetitive learning, local infusion of a D1R antagonist also blocks appetitive eating induced by disruption of glutamate signaling in the NAc shell . At this point, our data do not support a role for MC4R signaling in D1R neurons in the striatum in appetitive eating. Consistent with this observation, the MC4R antagonist HS014 stimulated food intake when infused into the paraventricular nucleus of the hypothalamus or amygdala, but did not affect feeding when infused into the NAc .
Based upon our observations, MC4R may also be expressed in D2R neurons (Fig. 1). D2R signaling has been implicated in the regulation of motivational valence and compulsive food intake. For instance, Richard and Berridge recently demonstrated that both D1R and D2R signaling are required for the generation of fear responses induced by disrupting glutamate signaling in the NAc shell . Likewise, Johnson and Kenny recently employed genetic techniques to delineate the underlying signaling pathways that mediate distinct reward behaviors. They found that giving rats extended access to a ‘cafeteria diet’ consisting of bacon, sausage, cheesecake, pound cake, frosting and chocolate resulted in a persistent deficit in reward thresholds and continued food intake even after an aversive conditioned stimulus . Importantly, these behaviors were associated with a reduction in dopamine 2 receptor (D2R) levels within the striatum, and viral-mediated knock-down of D2R accelerated the development of compulsive food intake. Because D2R is Gi-coupled, one could speculate that MC4R signaling would oppose the action of dopamine in D2R neurons. Loss of MC4R signaling might result in increased D2R signaling with enhancement of fear responses and decreased compulsive intake of a ‘cafeteria diet’. This model is not consistent with previous work that demonstrates that loss of MC4R signaling induces hyperphagia [17, 18]; however, it is consistent with the decreased instrumental responding for HFD that we observed in this study (Fig. 3D, 3E, 3F). Additional research is needed to delineate the specific role of dopamine and melanocortin signaling in reward processing.
In conclusion, we describe a novel function of MC4R within D1R neurons in learning of procedural memories. Understanding this pathway may yield important insights into developing novel treatments for obesity and eating disorders.
This work was funded by the following grants: DK081185-01, DK081182-01, MH084058-01A1, Disease Oriented Clinical Scholars Program, and NARSAD Young Investigator Award (ML). CREST program of JST (AN). R01DK53301 and RL1DK081185 (JKE). We would like to thank Brad Lowell, Jeffrey Friedman and Gensat for use of mouse lines. We would like to thank Jeff Long for assistance with statistical analysis.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.