Activation of N-methyl-D-aspartate receptors (NMDARs) initiates a cascade of molecular events underlying synaptic plasticity and memory formation. Genetic or pharmacological inactivation of NMDARs prevents the induction of some forms of long-term potentiation (LTP) and long-term depression (LTD) (Malenka and Bear 2004
) and impairs rodent learning on a range of tasks (Nakazawa et al. 2004
; Bannerman et al. 2006).
NMDARs are heteromers composed of an obligatory NR1 subunit, plus varying combinations of NR2 (NR2A–NR2D) and sometimes NR3 subunits (Laube et al. 1998
; Rosenmund et al. 1998
). NR2A and NR2B subunits are differentially expressed over development, with NR2B predominating in the mouse brain until NR2A expression increases from the second postnatal week (Liu et al. 2004
). NMDAR subunits also confer different physiological properties on NMDARs and differentially interact with intracellular postsynaptic scaffolding and signaling molecules (Cull-Candy et al. 2001
; Kohr 2006
). However, the precise role of NR2A and NR2B in learning and memory remains unclear due to a lack of selective pharmacological compounds, particularly for NR2A (Neyton and Paoletti 2006
; Kash and Winder 2007).
Recent work has shown that the relative ratio of NR2A/NR2B is affected by sensory experience and learning, as well as environmental factors that impact learning such as sleep deprivation and stress (Baker and Kim 2002
; Kart-Teke et al. 2006
; Kopp et al. 2007
). The shortening of synaptic NMDAR-mediated currents that correlates with increased developmental NR2A expression can be delayed by sensory deprivation (Carmignoto and Vicini 1992
), while sensory input in previously deprived mice causes rapid synaptic insertion of NMDAR with a high NR2A/NR2B ratio and an increase in the threshold for LTP induction in visual cortex (Kirkwood et al. 1996
; Quinlan et al. 1999
). Along similar lines, successful olfactory discrimination learning in rats correlates with an increase in the NR2A/NR2B ratio, a shortening of NMDAR currents, and an increase in the LTP-induction threshold in cortical slices (Philpot et al. 2003
; Quinlan et al. 2004
; Lebel et al. 2006
). On the basis of these data, Lebel, Quinlan, and colleagues have proposed that a relative increase in NR2A serves to stabilize memories by constraining excessive synaptic plasticity (Quinlan et al. 2004).
Supporting the contribution of NR2A to synaptic and behavioral plasticity, targeted gene knockout (KO) of NR2A or its C-terminal domain is sufficient to impair hippocampal LTP and cause learning deficits on the Morris water maze and certain forms of Pavlovian fear conditioning (Sakimura et al. 1995
; Ito et al. 1996
; Kishimoto et al. 1997
; Kiyama et al. 1998
; Sprengel et al. 1998
). To date, NR2A KO mice have not been tested on instrumental associative learning tasks. These tasks require incremental learning over trials, making them well suited for evaluating the ability of NR2A KO mice to progressively acquire and retain stable associations. Positively reinforced instrumental learning tasks such as the touchscreen method used here also go some way toward circumventing the strong aversive component inherent in the Morris water maze and fear conditioning, an important consideration given the abnormal anxiety-like behavior and stress reactivity of NR2A KO mice (Miyamoto et al. 2001
; Boyce-Rustay and Holmes 2006b
). In the present study, we sought to further elucidate the role of NR2A in mediating associative learning by testing NR2A KO mice on appetitive pairwise visual discrimination and reversal learning tasks and, for comparison, acquisition and extinction of a simple instrumental response that did not require pairwise discrimination.
NR2A mutant mice were generated as previously described (Sakimura et al. 1995
). For the present study, the NR2A null mutation was backcrossed into the C57BL/6J strain for >10 generations to produce a congenic C57BL/6J genetic background. Analysis of 150 SNP markers at ~15–20-Mb intervals across all autosomal chromosomes confirmed >99% C57BL/6J congenicity in the mutant line (JRS Allele Typing Services, The Jackson Laboratory, Bar Harbor, ME) (Boyce-Rustay and Holmes 2006a
). To avoid abnormalities resulting from genotypic differences in neonatal environment (Holmes et al. 2005
), NR2A KO, heterozygous NR2A (HET), and wild-type (WT) mice were generated from HET × HET matings. Mice were bred at The Jackson Laboratory, shipped to the NIH at 7–9 wk old, and tested from 10 wk old. Males and females were used and housed in groups of two to four with same-sex littermates in a temperature- and humidity-controlled vivarium under a 12-h light/dark cycle (lights on at 06:00 h). The number of mice tested is given in the figure legends.
Learning was assessed in a touchscreen-based operant system described previously for rats (Bussey et al. 2001
) and mice (Brigman et al. 2006
; Izquierdo et al. 2006
). The operant chamber, measuring 21.6 × 17.8 × 12.7 cm (model no. ENV-307W; Med Associates), was housed within a sound- and light-attenuating box (Med Associates). The grid floor of the chamber was covered with solid Plexiglas to facilitate ambulation. A pellet dispenser delivered 14-mg dustless pellets (no. F05684; BioServ) into a pellet magazine located at one end of the chamber. At the opposite end of the chamber there was a touch-sensitive screen (Light Industrial Metal Cased TFT LCD Monitor; Craft Data Limited), a house light, and a tone generator. The touchscreen was covered by a black Plexiglas panel that had 2 × 5 cm windows separated by 0.5 cm and located at a height of 6.5 cm from the floor of the chamber. Stimuli presented on the screen were controlled by custom software (MouseCat; L.M. Saksida) and visible through the windows (1 stimulus/window). Nosepokes to the stimuli were detected by the touchscreen and recorded by the MouseCat software.
Body weights were slowly reduced and then maintained at 85% of free-feeding weight. Prior to exposure to the testing apparatus, mice were acclimated to the 14-mg pellet food reward by provision of ~10 pellets per mouse in the home cage for 1–3 d. They were then acclimated to the operant chamber and to taking rewards from the pellet magazine by being placed in the chamber with pellets freely available in the magazine. Apparatus habituation was indicated by eating 10 pellets in 30 min. Mice then underwent Pavlovian autoshaping. Variously shaped stimuli were presented in the touchscreen windows (1 per window) for 10 sec (intertrial interval [ITI] 15 sec). The disappearance of the stimuli coincided with provision of a single-pellet food reward. To further reinforce the conditioned association between the stimuli and reward, pellet delivery was concomitant with illumination of the pellet magazine and the presentation of a 2-sec 65-dB tone. The mouse was required to eat the pellet in the pellet magazine (detected as a single head entry) in order for the next trial to commence. Successful autoshaping was indicated by eating 30 pellets within a 30-min session.
Prior to discrimination learning, mice first underwent three phases of instrumental pre-training (Izquierdo et al. 2006
). Phase 1: To obtain a reward, the mouse was required to respond to a stimulus (variously shaped) that appeared in one of the two windows (spatially pseudorandomized) and remained on the screen until a response was made. Phase 2: This was the same as phase 1, except that the mouse was now required to initiate a new trial by making a head entry into the pellet magazine between trials. Phase 3: This was the same as phase 2 but was introduced to discourage indiscriminate touching of the screen by signaling nosepokes at the blank window with a 5-sec timeout period during which the house light was extinguished. Punishment was followed by correction trials in which the same stimulus and spatial configuration was presented until a correct response was made. Each phase consisted of 30-trial sessions (15 sec ITI) administered 1 session/day. The mouse was required to perform 90% correct responses (excluding correction trials) to progress through each phase and then onto discrimination learning. The effect of genotype on number of sessions to complete habituation, autoshaping, and each pre-training phase was analyzed using analysis of variance (ANOVA).
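As an illustration of the kind of test referred to here, the following is a minimal sketch of a one-way ANOVA F statistic computed across three genotype groups. The function name and the input values are hypothetical and are not data from this study; the sketch returns only F and its degrees of freedom and does not compute a P value or post hoc comparisons.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups: list of lists, one list of scores per genotype
    (e.g., sessions to complete a pre-training phase).
    """
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]

    # Variability of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variability of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up scores for three groups, for illustration only.
f_stat, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"F({df_b},{df_w}) = {f_stat:.2f}")
```

With three genotypes, the between-groups degrees of freedom are always 2, and the within-groups degrees of freedom equal the number of mice minus 3.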
For discrimination learning, two novel equiluminant stimuli (see Bussey et al. 2001
; Izquierdo et al. 2006
) were presented, one per window. Responses at one stimulus (correct) resulted in reward; responses at the other stimulus (incorrect) resulted in a 5-sec timeout with the house light extinguished. Stimuli remained on screen until a response was made. Designation of the correct and incorrect stimuli was counterbalanced across genotypes. Within-session, left versus right spatial presentation of the correct and incorrect stimuli was pseudorandomized, with a given configuration occurring less than four times consecutively. For reversal learning (commencing the day after reaching criterion discrimination), the correct versus incorrect designation of stimuli was reversed (i.e., previously rewarded stimulus now incorrect, and vice versa). Discrimination and reversal sessions consisted of 30 discrete trials (15 sec ITI). A trial is defined as the first presentation of a stimulus pair after a correct response has been made in either a first presentation trial or a correction trial. The criterion was performance at an average of 85% correct responses on first presentation trials over two consecutive sessions (at least 83% on any one session). Mice were offered a maximum of 60 sessions.
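The pseudorandomization constraint and the learning criterion described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the MouseCat implementation: the function names are hypothetical, a "configuration" is taken to mean the side on which the correct stimulus appears, and session scores are assumed to be percent correct on first presentation trials.

```python
import random


def side_sequence(n_trials=30, max_run=3, rng=None):
    """Side ('L' or 'R') of the correct stimulus on each trial.

    The same side never occurs more than max_run times in a row,
    i.e., less than four times consecutively when max_run is 3.
    """
    rng = rng or random.Random()
    seq = []
    for _ in range(n_trials):
        options = ["L", "R"]
        # If the last max_run trials all used one side, force the other.
        if len(seq) >= max_run and len(set(seq[-max_run:])) == 1:
            options.remove(seq[-1])
        seq.append(rng.choice(options))
    return seq


def reached_criterion(pct_correct, mean_min=85.0, session_min=83.0):
    """True if any two consecutive sessions average at least mean_min
    percent correct, with each session at least session_min."""
    for a, b in zip(pct_correct, pct_correct[1:]):
        if (a + b) / 2 >= mean_min and min(a, b) >= session_min:
            return True
    return False


print("".join(side_sequence(rng=random.Random(0))))
print(reached_criterion([70.0, 84.0, 87.0]))
```

Because a 30-trial session yields percentages in steps of about 3.3, the 83% per-session floor is treated here as a plain numeric threshold; how rounded values were handled in the original analysis is not stated.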
ANOVA, followed by Newman-Keuls post hoc tests, was used to analyze the effect of genotype on total (first presentation) trials, total incorrect responses (on first presentation trials), total correction trials, and total omitted (first presentation) trials to attain criterion, as well as average response reaction time and reward retrieval latency. Performance early during reversal when percent correct responses are low is characterized by perseveration at the previously rewarded stimulus (Jones and Mishkin 1972
; Bussey et al. 1997
; Chudasama and Robbins 2003
). Therefore, we categorized reversal sessions according to whether performance was <50% correct or ≥50% correct and calculated the total number of trials committed in each category. Trials committed on <50% correct reflect performance when perseveration is relatively high. Trials committed on ≥50% correct reflect performance when perseveration is relatively low and learning is high. To further examine perseverative responding on the reversal problem, we calculated a “perseveration index”: operationally defined as the ratio of correction trials to incorrect responses (on first presentation trials).
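The session categorization and perseveration index described above can be illustrated with a short sketch. The data layout and function name are hypothetical, and the index is computed here on counts pooled within each category, which is one possible reading; the text does not state whether the ratio was pooled or averaged across sessions.

```python
def summarize_reversal(sessions):
    """Split reversal sessions at 50% correct and summarize each stage.

    sessions: list of dicts with per-session counts of
      'correct' and 'incorrect' (first presentation trials) and
      'correction' (correction trials).
    """
    stages = {
        "below_50": {"trials": 0, "incorrect": 0, "correction": 0},
        "at_or_above_50": {"trials": 0, "incorrect": 0, "correction": 0},
    }
    for s in sessions:
        n = s["correct"] + s["incorrect"]
        if n == 0:
            continue  # nothing to classify if no responses were made
        key = "below_50" if s["correct"] / n < 0.5 else "at_or_above_50"
        stages[key]["trials"] += n
        stages[key]["incorrect"] += s["incorrect"]
        stages[key]["correction"] += s["correction"]

    for stage in stages.values():
        inc = stage["incorrect"]
        # Perseveration index: correction trials per incorrect response.
        stage["perseveration_index"] = stage["correction"] / inc if inc else None
    return stages


# Made-up counts for two sessions, for illustration only.
example = [
    {"correct": 10, "incorrect": 20, "correction": 60},
    {"correct": 24, "incorrect": 6, "correction": 9},
]
print(summarize_reversal(example))
```

A higher index means that each first presentation error was followed by more repeated errors on correction trials, which is the sense in which it reflects perseveration.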
A separate cohort was tested for the acquisition and extinction of a simple instrumental response in which the mouse must learn to respond to stimuli on the screen but is not required to make a discrimination to be rewarded. Mice first underwent habituation, autoshaping, and the first two phases of pre-training as above. They were then required to respond to a stimulus to obtain a pellet: two stimuli were presented (1 × 2.8 cm² white square per window) with a touch at either resulting in reward. Stimuli remained on the screen until a response was made. Sessions consisted of 30 trials (5 sec ITI). The acquisition criterion was defined as performing 30 trials within 12.5 min on each of five consecutive sessions. The response was then extinguished (i.e., no reward for touches) to a criterion of two consecutive sessions of at least 77% trial omissions. The effect of genotype on number of sessions to attain acquisition and extinction criteria, as well as average response reaction time and latency of reward retrieval, was analyzed using ANOVA. There was no significant interaction between genotype and sex, nor any main effect of sex, for any dependent measure, so data were collapsed across sex for analysis.
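The acquisition and extinction criteria above both take the form "a per-session rule met on N consecutive sessions", which can be sketched as below. The function names and example values are hypothetical. The 77% omission threshold is applied literally; if that figure is a rounded value (for example, 23 of 30 trials), the threshold would need to be adjusted accordingly.

```python
def sessions_to_criterion(met, run_length):
    """1-based session number on which run_length consecutive sessions
    first satisfy the per-session rule, or None if never reached."""
    run = 0
    for i, ok in enumerate(met, start=1):
        run = run + 1 if ok else 0
        if run == run_length:
            return i
    return None


def acquisition_session_ok(trials_completed, minutes,
                           n_trials=30, max_minutes=12.5):
    """All trials completed within the time limit."""
    return trials_completed >= n_trials and minutes <= max_minutes


def extinction_session_ok(omissions, n_trials=30, min_fraction=0.77):
    """Proportion of omitted trials at or above the threshold."""
    return omissions / n_trials >= min_fraction


# Made-up sessions as (trials completed, minutes taken).
acq = [(30, 14.0), (30, 12.0), (30, 11.0), (30, 10.0), (30, 12.5), (30, 9.0)]
flags = [acquisition_session_ok(t, m) for t, m in acq]
print(sessions_to_criterion(flags, run_length=5))

# Made-up omission counts per extinction session.
ext_flags = [extinction_session_ok(o) for o in (10, 24, 20, 25, 27)]
print(sessions_to_criterion(ext_flags, run_length=2))
```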
Results showed that genotypes did not differ in the number of sessions taken to complete habituation, autoshaping, or any phase of pre-training, although KO mice showed a trend for progressing more rapidly through pre-training phase 1 than WT or HET mice (note: 1 KO and 3 HET mice were excluded from pre-training due to high omission rates). All mice tested attained the performance criterion for discrimination learning. However, KO mice required significantly more trials to learn the discrimination (ANOVA effect of genotype: F(2,19) = 4.48, P < 0.05; post hoc tests, P < 0.01) and made significantly more incorrect responses in doing so (genotype: F(2,19) = 4.31, P < 0.05; post hocs, P < 0.01), as compared to WT or HET mice. There was a nonsignificant trend for a genotype difference in correction errors (F(2,19) = 3.16, P = 0.07). Genotypes did not significantly differ in the number of trials omitted (WT = 4.6 ± 2.3, HET = 5.5 ± 4.7, KO = 8.0 ± 5.7), response reaction time (WT = 8.0 ± 0.9 sec, HET = 8.9 ± 0.9, KO = 7.0 ± 1.4), or latency to retrieve the reward (WT = 3.5 ± 0.8 sec, HET = 2.6 ± 0.5, KO = 2.7 ± 0.7).
Apparatus habituation, Pavlovian autoshaping, and instrumental pre-training in mice lacking NR2A
Figure 1 Impaired pairwise visual discrimination learning in NR2A KO mice. (A) KO mice committed more trials than WT or HET mice to reach criterion. (B) KO mice committed more incorrect responses than WT or HET mice to reach criterion. (C) KO mice showed a nonsignificant ...
The KO mice were significantly impaired on the reversal task. They committed more trials (genotype: F(2,18) = 8.05, P < 0.01; post hocs, P < 0.01) and made more incorrect responses (genotype: F(2,18) = 6.30, P < 0.05; post hocs, P < 0.01) and correction errors (genotype: F(2,18) = 4.61, P < 0.05; post hocs, P < 0.01) than WT or HET mice. For sessions in which percent correct scores were ≥50%, KO mice committed more trials than WT mice (genotype: F(2,18) = 8.74, P < 0.01; post hocs, P < 0.01), while genotypes were no different during sessions where percent correct scores were <50% (genotype: F(2,18) = 2.57, P = 0.10). The perseveration index was greater during sessions when performance was <50% than when it was ≥50%, but was no different between genotypes (<50%, WT = 4.4 ± 0.8, HET = 3.1 ± 0.5, KO = 2.9 ± 0.9; ≥50%, WT = 1.5 ± 0.1, HET = 1.5 ± 0.1, KO = 1.8 ± 0.1). Two of eight KO mice failed to attain criterion after 60 sessions, and their scores up to 60 sessions were included in the analysis. Genotypes did not differ in the number of trials omitted (WT = 35.1 ± 13.4, HET = 66.8 ± 23.3, KO = 31.8 ± 10.0), response reaction time (WT = 5.7 ± 0.7 sec, HET = 7.9 ± 1.3, KO = 4.6 ± 0.7), or reward retrieval latency (WT = 2.1 ± 0.2 sec, HET = 2.9 ± 0.8, KO = 2.0 ± 0.2).
Figure 2 Impaired pairwise visual discrimination learning during reversal in NR2A KO mice. (A) KO mice committed more trials than WT or HET mice to reach criterion. (B) KO mice committed more incorrect responses than WT or HET mice to reach criterion. (C) KO mice ...
There were no genotype differences in the trials taken to either acquire or extinguish an instrumental behavior that did not require discrimination. Rates of habituation, autoshaping, and instrumental pre-training prior to acquisition on this task were not different between genotypes (data not shown).
Figure 3 Normal acquisition and extinction of an instrumental behavior requiring no pairwise discrimination in NR2A KO mice. (A) Genotypes did not differ in the number of trials to acquire the instrumental behavior. (B) Genotypes did not differ in the number of ...
The major finding of the present study was impaired discrimination learning in NR2A KO mice. Specifically, NR2A KO mice were markedly slower than WT mice to acquire a pairwise visual discrimination, and when the reinforcement contingencies of the learned association were reversed, NR2A KO mice were significantly impaired relative to WT mice. This was due to deficient learning of the new association rather than impaired reversal per se, as genotypes performed at equivalent levels during reversal sessions when performance was low and perseveration high (i.e., <50%) (Jones and Mishkin 1972
; Bussey et al. 1997
; Chudasama and Robbins 2003
), while KO mice committed more trials and errors during sessions when performance was largely learning-related (i.e., ≥50%). However, the possible contribution of increased perseverative responding and a more general deficit in cognitive flexibility in NR2A KO mice cannot be excluded.
Impaired discrimination and reversal learning in NR2A KO mice was not due to nonspecific deficits in motivation or sensorimotor performance, as evidenced by normal trial omissions, response reaction times, and reward retrieval latencies. Moreover, NR2A KO mice were no different from WT controls in the rate of acquisition and extinction of an instrumental touchscreen response that did not require pairwise stimulus discrimination. Finally, NR2A HET mice were unimpaired on discrimination, reversal, and the simple instrumental task, indicating that loss of a single NR2A allele was not sufficient to disrupt learning.
The present findings support and extend previous evidence of impaired spatial reference memory and associative fear learning in NR2A KO mice (Sakimura et al. 1995
; Ito et al. 1996
; Kishimoto et al. 1997
; Kiyama et al. 1998
; Sprengel et al. 1998
). The phenotypic profile of NR2A KO mice in our study is also highly reminiscent of recent pharmacological data showing that systemic administration of the noncompetitive NMDAR antagonist MK-801 significantly retarded acquisition on a pairwise olfactory discrimination test in rats (Lebel et al. 2006
). Interestingly, successful discrimination learning in untreated rats was associated with an increase in the NR2A/NR2B ratio in olfactory cortex, and this increase was blocked by treatment with the NMDAR antagonist MK-801 (Lebel et al. 2001
). The loss of this molecular switch would be a plausible mechanism for the impaired discrimination learning seen in NR2A KO mice in the present study.
NR2A KO in mice and MK-801 treatment in rats each significantly retarded the rate of discrimination learning, but neither manipulation completely prevented learning when animals were offered extended training. Thus, in the absence of an NR2A-mediated mechanism, alternative, albeit less efficient, molecular pathways are able to support learning. This is consistent with a wider literature showing that where NMDARs are important mediators of learning, as in the Morris water maze, alternative mechanisms can mitigate the effects of NMDAR inactivation under certain conditions (e.g., after pre-training) (for discussion, see Nakazawa et al. 2004
; Bannerman et al. 2006
). Furthermore, it is known that while both NR2A- and NR2B-containing NMDARs subserve synaptic plasticity, this only holds for certain forms of plasticity, and NR2A is clearly not obligatory for LTP (Weitlauf et al. 2005