HHS Public Access; Author manuscript; accepted for publication in peer reviewed journal

J Neurosci. Author manuscript; available in PMC 2010 June 2.
Published in final edited form as:
PMCID: PMC2862754
NIHMSID: NIHMS175686

Genetic deletion of A2A adenosine receptors in the striatum selectively impairs habit formation

Abstract

A2A receptors are a major class of G-protein coupled receptors for adenosine. Highly expressed in the striatum, on the projection neurons giving rise to the striatopallidal or ‘indirect’ pathway, they have been implicated in sleep, addiction, and other processes, yet their role in the control of striatal circuits and behavior remains unclear. Using established assays from the instrumental learning paradigm, we showed that mice with striatum-specific deletion of A2A receptors were selectively impaired in habit formation. After training that generated habitual lever pressing in wild-type controls, the performance of striatum-specific A2A KO mice remained goal-directed, being highly sensitive to outcome devaluation and reversal of the action-outcome contingency. These data demonstrate a critical role for A2A receptors on striatopallidal medium spiny projection neurons in shaping behavior and decision making, providing the first instance of a selective alteration in instrumental learning after striatum-specific genetic manipulations.

Keywords: striatum, A2A, adenosine, instrumental learning, habit, basal ganglia

A2A adenosine receptors are highly expressed in the striatum, the input nucleus of the basal ganglia, in a major population of medium spiny projection neurons giving rise to the striatopallidal or indirect pathway, and interact with other receptors that modulate glutamatergic transmission, such as D2 dopamine receptors (Gimenez-Llort et al., 2007; Schiffmann et al., 2007; Ferre et al., 2008; Azdad et al., 2009). They are thus in a position to control the cortico-basal ganglia circuits so critical for motivated and voluntary behavior, but their functional role in behavior is poorly understood.

In the laboratory, striatum-dependent behaviors can be studied using analytical tools from the instrumental learning paradigm. Any instrumental behavior, such as pressing a lever for food, can be controlled by two distinct central processes. At first lever pressing is goal-directed and sensitive to manipulations like outcome devaluation. Under certain conditions, it can become more habitual and impervious to changes in the value of the outcome or to changes in the action-outcome contingency itself. Studies in flies, mice, rats, horses, monkeys, and humans have shown some version of this transition from more flexible and goal-directed behavior to inflexible and habitual behavior (Miyachi et al., 2002; Yin et al., 2004; Hilario et al., 2007; Brembs, 2009; Parker et al., 2009; Tricomi et al., 2009), as the neural substrate controlling behavior switches from the associative cortico-basal ganglia network to the sensorimotor network (Yin and Knowlton, 2006; Yin et al., 2008). This transition is thought to involve synaptic plasticity at glutamatergic synapses in the striatum (Yin et al., 2008; Yin et al., 2009).

A2A receptors are required for long-term potentiation (LTP) of glutamatergic transmission in the hippocampus (Rebola et al., 2008) and in the striatum (Flajolet et al., 2008; Shen et al., 2008b). Blockade of A2A receptors abolished spike-timing dependent LTP in striatopallidal neurons (Shen et al., 2008b). As different forms of synaptic plasticity are thought to be involved in striatum-dependent forms of learning and memory, we hypothesized that striatal A2A receptors are necessary for habit formation. We tested this hypothesis in striatum-specific A2A knockout mice (Shen et al., 2008a) with devaluation and omission, two established behavioral assays for habit learning. In outcome devaluation, the value of the reward earned by the previously trained action (e.g., lever press) is reduced by pre-feeding of the reward just before a short probe test. If the action is goal-directed, then a reduction in the current value of the goal should immediately reduce performance (Dickinson, 1985). If, however, the action is habitual, then devaluation should have no effect on performance, since habits are elicited by antecedent stimuli, which are not affected by devaluation. In omission, the causal relationship between the lever press and the food reward is reversed (Davis and Bitterman, 1971). Instead of earning the reward, the press now prevents reward delivery (Yin et al., 2006). Again, because habitual behavior is not controlled by the action-outcome contingency, it is expected to be less sensitive to the reversal of this contingency.

Methods

Striatum-specific A2A KO mice

The generation of the striatum-specific A2AR KO mice (st-A2AR KO) by cross-breeding floxed A2AR (A2ARflox/flox) mice with Dlx5/6-cre transgenic mice has been described previously (Shen et al., 2008a). Dlx5/6-cre transgenic mice display striatum-specific expression of Cre recombinase owing to the restricted striatal activity of the murine Dlx-5/6 regulatory element during development (Price et al., 1991; Bulfone et al., 1993a; Bulfone et al., 1993b). Dlx5/6-driven, Cre-mediated deletion of the A2AR gene in the striatum (with only minimal effects in hippocampus, cerebral cortex and other brain regions) has been confirmed by PCR analysis of genomic DNA (Shen et al., 2008a). A2AR protein and mRNA in the striatum of the st-A2AR KO mice were shown to be reduced to the background level seen in gb-A2AR KO mice (Shen et al., 2008a). A recent study using the Rosa26-Cre reporter line also confirmed the striatum-specific expression pattern of this Dlx5/6-Cre transgenic line (Ohtsuka et al., 2008).

Instrumental Training

All experiments were conducted in accordance with the Duke University IACUC guidelines. Mice were placed on a food deprivation schedule to reduce their weight to 80–85% of their baseline weight. They were fed 1.5–2 g of home chow each day after training. Water was available at all times in the home cages.

Training and testing took place in 8 Med Associates (St. Albans, VT) operant chambers (21.6 cm L × 17.8 cm W × 12.7 cm H) housed within light-resistant and sound-attenuating walls. Each chamber was equipped with a food magazine that received Bio-Serv 14 mg pellets from a dispenser. Each chamber contained two retractable levers on either side of the magazine and a 3 W, 24 V house light mounted on the wall opposite the levers and magazine. A computer with the Med-PC-IV program was used to control the equipment and record behavior.

Lever press training

At the beginning of each session, the house light was turned on and the lever was inserted. At the end of each session, the house light was turned off and the lever was retracted. Initial lever-press training consisted of 4 consecutive days of continuous reinforcement (CRF), during which the animals received a pellet for each lever press. Sessions ended after 90 minutes or 30 rewards, whichever came first. After CRF, mice were trained with random interval (RI) schedules to generate habitual lever pressing (Dickinson et al., 1983). They were trained for 2 days on RI 30s, with a 0.1 probability of reward availability every 3 seconds contingent upon lever pressing, followed by 6 days on the RI 60s schedule (0.1 probability of reward availability contingent upon lever pressing).
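The random interval contingency described above can be illustrated with a minimal sketch. This is not the Med-PC program used in the study; the function names, the seeded random generator, and the rule that an available reward is held until the next press are assumptions made for illustration. The 6-second check period used for RI 60s in the usage note is likewise an assumption, since the text states only the 0.1 probability for that schedule.

```python
import random


def mean_programmed_interval(check_period_s, p_available):
    """Expected time between reward availabilities: period / probability."""
    return check_period_s / p_available


def simulate_random_interval(press_times, check_period_s, p_available, seed=0):
    """Return the times of rewarded presses under a random interval schedule.

    Every `check_period_s` seconds a reward becomes available with
    probability `p_available`. Assumption: once available, the reward is
    held until the next lever press collects it.
    """
    rng = random.Random(seed)
    rewarded = []
    reward_available = False
    next_check = check_period_s
    for t in sorted(press_times):
        # Run every availability check that fell at or before this press.
        while next_check <= t:
            if not reward_available and rng.random() < p_available:
                reward_available = True
            next_check += check_period_s
        if reward_available:
            rewarded.append(t)
            reward_available = False
    return rewarded


if __name__ == "__main__":
    # RI 30s as described: p = 0.1 every 3 s gives a mean interval near 30 s.
    print(mean_programmed_interval(3, 0.1))
    # Hypothetical mouse pressing once per second for 10 minutes.
    presses = list(range(1, 601))
    print(len(simulate_random_interval(presses, 3, 0.1)))
```

Under this sketch, pressing faster than the programmed interval earns few extra rewards, which is the weak press-to-reward correlation that interval schedules are used to produce. Calling `mean_programmed_interval(6, 0.1)` would give a 60 s mean interval under the assumed 6-second check period.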

Devaluation tests

A specific satiety procedure was used for outcome devaluation. This procedure controls the overall level of satiety and motivational state while altering the current value of a specific reward. Mice were given the same amount of either the grain pellets to which they had been exposed in their home cages (non-devalued condition/control), or the purified pellets they normally earned during lever-press sessions (devalued condition). The grain pellet served as a control for overall level of satiety. Immediately after 1 hour of unlimited exposure to the pellets, the mice received a 5 minute probe test, during which the lever was inserted but no pellet was delivered. This brief extinction test is designed to test whether the acquired lever pressing of the mice was controlled by the action-outcome instrumental contingency or elicited by antecedent stimuli. On the second day of outcome devaluation, the same procedure was used, except that those animals that received control grain pellets on day 1 received the purified pellets on day 2, and vice versa.
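The two-day crossover in this procedure can be sketched as a small assignment routine. The alternating split and the mouse identifiers are hypothetical; the text does not say how animals were allocated to the day 1 conditions, only that conditions were swapped on day 2.

```python
def counterbalance_devaluation(mouse_ids):
    """Assign each mouse a pre-feeding condition for each of two test days.

    Assumption: mice alternate between starting conditions, so roughly half
    start with the devalued (purified) pellets and half with the
    non-devalued (grain) pellets. Every mouse gets the other condition on
    day 2.
    """
    schedule = {}
    for i, mouse in enumerate(mouse_ids):
        first = "devalued" if i % 2 == 0 else "non-devalued"
        second = "non-devalued" if first == "devalued" else "devalued"
        schedule[mouse] = {"day1": first, "day2": second}
    return schedule


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    for mouse, days in counterbalance_devaluation(["m1", "m2", "m3", "m4"]).items():
        print(mouse, days)
```

Because each animal is tested once under each condition, the devalued versus non-devalued comparison is within-subject, and test order is balanced across animals.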

Omission test

After devaluation, all mice were retrained on RI 60s for one day. The next day, the instrumental contingency was reversed in an omission procedure, which tests the sensitivity of the animal to a change in the prevailing causal relationship between lever pressing and food reward. For the omission training, a pellet was delivered every 20 s without lever pressing, but each press would reset the counter and thus delay the food delivery.
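The omission contingency amounts to a resettable timer, as the following minimal sketch shows. The function name, the session length argument, and the handling of a press that coincides exactly with a delivery (the pellet is delivered first) are assumptions for illustration, not details taken from the study's control program.

```python
def omission_pellet_times(press_times, session_length_s, delay_s=20.0):
    """Pellet delivery times when each lever press resets a `delay_s` timer.

    A pellet is delivered whenever `delay_s` seconds pass without a press.
    Each press restarts the timer and so postpones the next pellet.
    """
    presses = sorted(t for t in press_times if 0 <= t <= session_length_s)
    pellets = []
    timer_start = 0.0
    i = 0
    while True:
        due = timer_start + delay_s
        if due > session_length_s:
            break
        if i < len(presses) and presses[i] < due:
            # A press arrived before the pellet was due: reset the timer.
            timer_start = presses[i]
            i += 1
        else:
            # No press intervened: deliver the pellet and restart the timer.
            pellets.append(due)
            timer_start = due
    return pellets


if __name__ == "__main__":
    print(omission_pellet_times([], 60))        # no pressing
    print(omission_pellet_times([10, 25], 60))  # two presses delay delivery
```

An animal that stops pressing collects a pellet every 20 s, whereas one that keeps pressing at least once every 20 s earns nothing, so sensitivity to the reversed contingency shows up as a decline in pressing.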

Results

Initial acquisition

All mice learned to press the lever after 4 sessions of continuous reinforcement (CRF) training, in which each press was reinforced with a food pellet. A two-way mixed ANOVA conducted on the first 8 days of lever press acquisition, with Days and Genotype as factors, showed no main effect of Genotype (F < 1), a main effect of Days (F(7, 98) = 40.7, p < 0.05), and no interaction between these factors (F < 1). All mice, regardless of genotype, increased their rate of lever pressing during initial acquisition.

Devaluation

Planned comparison on lever pressing data from the devaluation test showed that the performance of wild-type controls was habitual, there being no significant difference between the devalued and non-devalued condition (p > 0.05). By contrast, the lever pressing of A2A KO mice remained goal-directed after extended training (p < 0.05).

Omission

When the action-outcome contingency was reversed in an omission procedure, the A2A KO mice more readily reduced their lever pressing. This observation was confirmed by a mixed two-way ANOVA with Time and Genotype as factors, which showed a main effect of Time (F(5, 70) = 15.9, p < 0.05), a main effect of Genotype (F(1, 70) = 5.3, p < 0.05), and no interaction between these two factors (F(5, 70) = 1.4, p > 0.05). Thus, while all mice reduced lever pressing over time, the A2A KO mice more readily reduced their performance.
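As a reading aid for the reported statistics, the sketch below shows how degrees of freedom are conventionally partitioned in a two-way mixed ANOVA with one between-subject factor (Genotype) and one within-subject factor (Days or Time). It assumes a balanced design with two genotype groups and no missing data; the function names are hypothetical, and the implied number of animals is an inference from the reported within-subject degrees of freedom, not a figure stated in the text.

```python
def mixed_anova_dfs(n_subjects, n_groups, n_levels):
    """Degrees of freedom for a mixed ANOVA (one between, one within factor)."""
    df_between = n_groups - 1
    df_between_error = n_subjects - n_groups
    df_within = n_levels - 1
    return {
        "between": df_between,
        "between_error": df_between_error,
        "within": df_within,
        "interaction": df_between * df_within,
        "within_error": df_within * df_between_error,
    }


def implied_n_subjects(df_within, df_within_error, n_groups):
    """Number of subjects implied by the within-subject effect and error dfs."""
    return df_within_error // df_within + n_groups


if __name__ == "__main__":
    # Acquisition: 8 days, reported dfs 7 and 98 for the Days effect.
    print(implied_n_subjects(7, 98, n_groups=2))
    # Omission: reported dfs 5 and 70 for the Time effect.
    print(implied_n_subjects(5, 70, n_groups=2))
```

Under these assumptions, both analyses are consistent with the same total number of animals, which suggests that no mice were dropped between acquisition and omission testing.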

Discussion

Conditions such as overtraining, stress, and exposure to drugs of abuse are known to promote habit formation (Adams, 1982; Nelson and Killcross, 2006; Dias-Ferreira et al., 2009). Although previous work has defined the general circuits involved in goal-directed actions and habit formation, the detailed cellular and molecular mechanisms underlying these processes remain poorly understood (Yin et al., 2004, 2005a; Yin et al., 2005b; Yin and Knowlton, 2006; Yin et al., 2006; Yin et al., 2008; Wassum et al., 2009). Our results demonstrate that striatal A2A receptors are necessary for habit formation. Striatum-specific A2A KO mice did not show any impairments in motor control or motivation, but their lever pressing is more goal-directed and flexible than that of wild-type controls with identical training. This is the first report of a striatum-specific genetic manipulation limited to a specific neuronal population leading to a selective deficit in instrumental learning, revealing a novel molecular mechanism for habit formation.

Because LTP of the glutamatergic input to the striatopallidal pathway is known to require the activation of A2A receptors (Shen et al., 2008b), it could be a critical mechanism for habit formation. A recent study showed that overtraining on a skill learning task results in increased synaptic strength in the sensorimotor striatum, particularly in neurons belonging to the striatopallidal pathway (Yin et al., 2009). It would be interesting to examine, as we have begun to do, the nature of the synaptic plasticity in A2A KO mice, which will shed light on how the lack of A2A receptor can impact transmission in the relevant striatal circuits.

Because the striatal A2A receptors are located postsynaptically on projection neurons of the striatopallidal pathway, the current results clarify the mechanisms of habit formation at both the molecular and the circuit level. At the circuit level, they suggest that the indirect pathway is critical for habit formation. In the traditional neurological literature, this pathway is thought to be critical for the inhibition of behavior, despite the lack of direct evidence. In light of our data, behavioral inhibition may be too simplistic a description of the functional role of the indirect pathway. At the molecular level, the discovery of the importance of A2A receptors suggests intriguing mechanisms for the control of striatal circuits. Recent work has suggested a functional link between CB1 and A2A receptors (Schiffmann et al., 2007). Indeed, a recent study has linked a deficit in habit learning with genetic deletion of the CB1 cannabinoid receptor (Hilario et al., 2007). CB1 receptors are highly expressed in the striatum, specifically the sensorimotor striatum, though the previous data come from global CB1 knockouts, thus making it difficult to define the relative contributions of receptors in different brain regions. The use of striatum-specific A2A KO mice, however, obviates such difficulties with the interpretation of the data.

The differences between CB1 receptors and A2A receptors are striking. A2A receptors are Gs coupled and mainly located on the postsynaptic dendritic spines; CB1 receptors are Gi/o coupled and found on the presynaptic terminals. That genetic deletion of these receptors produces remarkably similar effects confirms a critical insight: signaling pathways considered in isolation are not enough to explain behavior. What is needed is a detailed analysis of how diverse molecular mechanisms are coordinated to control the global states of neural networks, undoubtedly a major challenge for the future. In linking molecular mechanisms to specific neural circuits and operationally defined behavioral phenomena, the present study represents an initial step in this direction.

Figure 1
A. Rate of lever pressing during initial phase of training. CRF, continuous reinforcement (i.e. each press rewarded); RI-30s, random interval-30 seconds; RI-60s, random interval 60 seconds. B. Outcome devaluation test after 4 additional sessions of RI-60s ...

Acknowledgements

This work is supported by NIAAA 018018 and 016991 to HHY and NINDS 41083 and 48995 to JFC. We would like to thank Mona Leblond and Alberto Lopez for their help with the experiments.

References

  • Adams CD. Variations in the sensitivity of instrumental responding to reinforcer devaluation. Quarterly journal of experimental psychology. 1982;33b:109–122.
  • Azdad K, Gall D, Woods AS, Ledent C, Ferre S, Schiffmann SN. Dopamine D2 and adenosine A2A receptors regulate NMDA-mediated excitation in accumbens neurons through A2A-D2 receptor heteromerization. Neuropsychopharmacology. 2009;34:972–986.
  • Brembs B. Mushroom Bodies Regulate Habit Formation in Drosophila. Curr Biol. 2009.
  • Bulfone A, Kim HJ, Puelles L, Porteus MH, Grippo JF, Rubenstein JL. The mouse Dlx-2 (Tes-1) gene is expressed in spatially restricted domains of the forebrain, face and limbs in midgestation mouse embryos. Mech Dev. 1993a;40:129–140.
  • Bulfone A, Puelles L, Porteus MH, Frohman MA, Martin GR, Rubenstein JL. Spatially restricted expression of Dlx-1, Dlx-2 (Tes-1), Gbx-2, and Wnt-3 in the embryonic day 12.5 mouse forebrain defines potential transverse and longitudinal segmental boundaries. J Neurosci. 1993b;13:3155–3172.
  • Davis J, Bitterman ME. Differential reinforcement of other behavior (DRO): A yoked-control comparison. Journal of the Experimental analysis of Behavior. 1971;15:237–241.
  • Dias-Ferreira E, Sousa JC, Melo I, Morgado P, Mesquita AR, Cerqueira JJ, Costa RM, Sousa N. Chronic stress causes frontostriatal reorganization and affects decision-making. Science. 2009;325:621–625.
  • Dickinson A. Actions and habits: the development of behavioural autonomy. Philosophical Transactions of the Royal Society. 1985;B308:67–78.
  • Dickinson A, Nicholas DJ, Adams CD. The effect of the instrumental training contingency on susceptibility to reinforcer devaluation. Quarterly Journal of Experimental Psychology: Comparative & Physiological Psychology. 1983;35:35–51.
  • Ferre S, Quiroz C, Woods AS, Cunha R, Popoli P, Ciruela F, Lluis C, Franco R, Azdad K, Schiffmann SN. An update on adenosine A2A-dopamine D2 receptor interactions: implications for the function of G protein-coupled receptors. Curr Pharm Des. 2008;14:1468–1474.
  • Flajolet M, Wang Z, Futter M, Shen W, Nuangchamnong N, Bendor J, Wallach I, Nairn AC, Surmeier DJ, Greengard P. FGF acts as a co-transmitter through adenosine A(2A) receptor to regulate synaptic plasticity. Nat Neurosci. 2008;11:1402–1409.
  • Gimenez-Llort L, Schiffmann SN, Shmidt T, Canela L, Camon L, Wassholm M, Canals M, Terasmaa A, Fernandez-Teruel A, Tobena A, Popova E, Ferre S, Agnati L, Ciruela F, Martinez E, Scheel-Kruger J, Lluis C, Franco R, Fuxe K, Bader M. Working memory deficits in transgenic rats overexpressing human adenosine A2A receptors in the brain. Neurobiol Learn Mem. 2007;87:42–56.
  • Hilario MRF, Clouse E, Yin HH, Costa RM. Endocannabinoid signaling is critical for habit formation. Frontiers in integrative neuroscience. 2007;1:6.
  • Miyachi S, Hikosaka O, Lu X. Differential activation of monkey striatal neurons in the early and late stages of procedural learning. Exp Brain Res. 2002;146:122–126.
  • Nelson A, Killcross S. Amphetamine exposure enhances habit formation. J Neurosci. 2006;26:3805–3812.
  • Ohtsuka N, Tansky MF, Kuang H, Kourrich S, Thomas MJ, Rubenstein JL, Ekker M, Leeman SE, Tsien JZ. Functional disturbances in the striatum by region-specific ablation of NMDA receptors. Proc Natl Acad Sci U S A. 2008;105:12961–12966.
  • Parker M, McBride SD, Redhead ES, Goodwin D. Differential place and response learning in horses displaying an oral stereotypy. Behav Brain Res. 2009;200:100–105.
  • Price M, Lemaistre M, Pischetola M, Di Lauro R, Duboule D. A mouse gene related to Distal-less shows a restricted expression in the developing forebrain. Nature. 1991;351:748–751.
  • Rebola N, Lujan R, Cunha RA, Mulle C. Adenosine A2A receptors are essential for long-term potentiation of NMDA-EPSCs at hippocampal mossy fiber synapses. Neuron. 2008;57:121–134.
  • Schiffmann SN, Fisone G, Moresco R, Cunha RA, Ferre S. Adenosine A2A receptors and basal ganglia physiology. Prog Neurobiol. 2007;83:277–292.
  • Shen HY, Coelho JE, Ohtsuka N, Canas PM, Day YJ, Huang QY, Rebola N, Yu L, Boison D, Cunha RA, Linden J, Tsien JZ, Chen JF. A critical role of the adenosine A2A receptor in extrastriatal neurons in modulating psychomotor activity as revealed by opposite phenotypes of striatum and forebrain A2A receptor knock-outs. J Neurosci. 2008a;28:2970–2975.
  • Shen W, Flajolet M, Greengard P, Surmeier DJ. Dichotomous dopaminergic control of striatal synaptic plasticity. Science. 2008b;321:848–851.
  • Tricomi E, Balleine BW, O'Doherty JP. A specific role for posterior dorsolateral striatum in human habit learning. Eur J Neurosci. 2009;29:2225–2232.
  • Wassum KM, Cely IC, Maidment NT, Balleine BW. Disruption of endogenous opioid activity during instrumental learning enhances habit acquisition. Neuroscience. 2009.
  • Yin HH, Knowlton BJ. The role of the basal ganglia in habit formation. Nat Rev Neurosci. 2006;7:464–476.
  • Yin HH, Knowlton BJ, Balleine BW. Lesions of dorsolateral striatum preserve outcome expectancy but disrupt habit formation in instrumental learning. Eur J Neurosci. 2004;19:181–189.
  • Yin HH, Knowlton BJ, Balleine BW. Blockade of NMDA receptors in the dorsomedial striatum prevents action-outcome learning in instrumental conditioning. Eur J Neurosci. 2005a;22:505–512.
  • Yin HH, Knowlton BJ, Balleine BW. Inactivation of dorsolateral striatum enhances sensitivity to changes in the action-outcome contingency in instrumental conditioning. Behav Brain Res. 2006;166:189–196.
  • Yin HH, Ostlund SB, Balleine BW. Reward-guided learning beyond dopamine in the nucleus accumbens: the integrative functions of cortico-basal ganglia networks. Eur J Neurosci. 2008;28:1437–1448.
  • Yin HH, Ostlund SB, Knowlton BJ, Balleine BW. The role of the dorsomedial striatum in instrumental conditioning. Eur J Neurosci. 2005b;22:513–523.
  • Yin HH, Mulcare SP, Hilario MR, Clouse E, Holloway T, Davis MI, Hansson AC, Lovinger DM, Costa RM. Dynamic reorganization of striatal circuits during the acquisition and consolidation of a skill. Nat Neurosci. 2009;12:333–341.