Production and basic characterization of DA neuron selective NR1 knock-out mice
These mice, named “DA-NR1-KO”, were produced by crossing floxed NR1 (fNR1) mice (Tsien et al., 1996) with Slc6a3+/Cre transgenic mice, which express Cre recombinase under the control of the DA transporter promoter (Zhuang et al., 2005). The DA neuron-specific deletion of the NR1 gene was confirmed by both the reporter gene method and immunohistochemistry, which showed that the gene deletion was restricted to dopamine neurons in regions such as the VTA and the substantia nigra. No obvious changes were observed in the expression of tyrosine hydroxylase, a catecholaminergic neuronal marker, suggesting that there was no obvious loss of dopaminergic neurons (Figure S1).
Generation and characterization of DA-NR1-KO mice
DA-NR1-KO mice were born at the expected Mendelian ratios and were visually indistinguishable from the controls. They also performed normally in locomotor activity in a novel open field, in rotarod learning, in an anxiety test using the elevated plus maze, and in novel object recognition tests. These results showed that many behavioral functions that are sensitive to dopamine dysfunction were preserved in the DA-NR1-KO mice.
Basic behavioral characterization of DA-NR1-KO mice
DA neurons in DA-NR1-KO mice show normal tonic firing, reduced phasic firing and reduced responses to the reward-predicting cue
To investigate the impact of NR1 deletion on the cellular properties of DA neurons, we recorded the activity of these neurons in both DA-NR1-KO mice and wild-type control littermates. Movable bundles of 8 tetrodes (32 channels) were implanted into the ventral midbrain, primarily the VTA. Putative DA neurons were identified based on their firing patterns and their sensitivity to the dopamine receptor agonist apomorphine (1 mg/kg, i.p.), given at the end of each recording session.
Burst firing by DA neurons is impaired in KO mice
Fourteen putative DA neurons from 4 mutant mice and 16 from 6 wild-type controls were recorded and analyzed. Phasic firing, or bursting, was defined as a spike train beginning with an inter-spike interval (ISI) smaller than 80 ms and terminating with an ISI greater than 160 ms. Compared with the control neurons, phasic firing was greatly reduced in the NR1 KO DA neurons. The observed median frequency of phasic firing decreased from 0.78±0.09 Hz in the control DA neurons to 0.36±0.09 Hz in KO DA neurons (Mann-Whitney U test, P<0.01). A significant reduction was also observed in the percentage of spikes fired in phasic activity (34.7% in the controls vs. 21.2% in the DA-NR1-KO; Mann-Whitney U test, P<0.01). The total firing rate was also reduced in the mutant DA neurons. This appeared to be correlated with the reduced burst set rate (5.18±0.59 Hz, control, vs. 3.85±0.38 Hz, KO; r=0.7719; Mann-Whitney U test, P<0.01). No significant difference was observed in tonic firing between the mutant and control groups (4.42±0.44 Hz in control vs. 3.29±0.36 Hz in KO; Mann-Whitney U test, P>0.05).
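The burst criterion defined above (onset at an ISI below 80 ms, termination at an ISI above 160 ms) can be sketched in a few lines of Python. This is a minimal illustration, not the analysis software used in this study; the function names, the use of millisecond timestamps and the example spike train are all hypothetical.

```python
def detect_bursts(spike_times_ms, start_isi_ms=80, end_isi_ms=160):
    """Group spike times (ms, ascending) into bursts.

    A burst opens when an ISI is smaller than start_isi_ms and closes at the
    last spike before an ISI greater than end_isi_ms.
    """
    bursts, current = [], []
    for prev, nxt in zip(spike_times_ms, spike_times_ms[1:]):
        isi = nxt - prev
        if not current:
            if isi < start_isi_ms:
                current = [prev, nxt]      # burst onset
        elif isi > end_isi_ms:
            bursts.append(current)         # burst termination
            current = []
        else:
            current.append(nxt)            # spike still within the burst
    if current:
        bursts.append(current)             # recording ended during a burst
    return bursts


def burst_summary(spike_times_ms):
    """Burst rate (Hz) and percentage of spikes fired within bursts."""
    bursts = detect_bursts(spike_times_ms)
    duration_s = (spike_times_ms[-1] - spike_times_ms[0]) / 1000.0
    spikes_in_bursts = sum(len(b) for b in bursts)
    return {
        "burst_rate_hz": len(bursts) / duration_s,
        "percent_spikes_in_bursts": 100.0 * spikes_in_bursts / len(spike_times_ms),
    }


# Hypothetical spike train (ms), for illustration only.
example_spikes = [0, 50, 100, 400, 900, 950, 1200, 2000]
example_bursts = detect_bursts(example_spikes)
example_summary = burst_summary(example_spikes)
```

In this sketch an ISI of exactly 160 ms keeps a burst open, because the criterion requires an ISI greater than 160 ms for termination.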
To further evaluate the response of DA neurons in a learning task, mice were trained for 40 trials per day in a Pavlovian conditioning paradigm in which a 5 kHz tone lasting 1 second immediately preceded the delivery of a food pellet. DA neurons of both genotypes came to respond to the tone with phasic firing, but the conditioned responses were much weaker in the DA-NR1-KO group. Thus, while DA-NR1-KO neurons showed increased firing over the days of training, their responses were significantly reduced compared with the controls on day 1 (19.21±3.24 Hz, control, vs. 9.74±0.30 Hz, KO; p<0.01), day 2 (36.33±4.39 Hz, control, vs. 16.43±4.01 Hz, KO; p<0.01) and day 3 (59.38±3.82 Hz, control, vs. 33.88±4.30 Hz, KO; p<0.01). These data suggested that while NMDAR1 deletion did not prevent DA neurons from developing conditioned responses (bursting) towards reward-predicting cues, it greatly lowered the robustness of the bursting response, a phenomenon that we call DA neuron blunting.
Responses of putative dopamine (DA) neurons in the three-day reward test
Habit learning, but not goal directed learning, was impaired in the operant appetitive conditioning
To assess habit learning, we first tested the mice in a lever-pressing operant conditioning task. In this task, an instrumental action, pressing a lever to obtain food, can transform from a goal-directed to a habitual response after extensive training and become progressively less sensitive to devaluation of the outcome (Dickinson et al., 1983). The decreased sensitivity can thus be measured as a behavioral readout of habit learning. Both mutant and control mice learned to press the lever on an extensive training protocol consisting of four days of continuous reinforcement (CRF), two days of random interval (RI) 30s, and six days of RI 60s schedules (Dickinson et al., 1983). Mice in both groups increased lever press rates during the training (CRF, days 1 through 4; RI 30s, days 5 and 6; RI 60s, days 7 through 12). A two-way repeated-measures ANOVA, with days and genotype as factors, showed no effect of genotype (F(1, 231) = 0.07), a main effect of days (F(11, 231) = 51.4, P<0.01), and no interaction between these factors (F(11, 231) = 0.269). This result suggested that the DA-NR1-KO mice had normal wanting of the pellet reward and exhibited normal goal-directed learning.
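The 12-day training protocol described above can be written out as a small Python sketch. It lists the schedule for each day and simulates which lever presses would be rewarded. The exponential draw of the random intervals, the schedule labels and the function names are assumptions made for illustration; how the intervals were generated is not stated here.

```python
import random

# (schedule, number of days): four days CRF, two days RI 30s, six days RI 60s.
PROTOCOL = [("CRF", 4), ("RI30", 2), ("RI60", 6)]
MEAN_INTERVAL_S = {"CRF": 0, "RI30": 30, "RI60": 60}


def daily_schedule(protocol=PROTOCOL):
    """Return the schedule name for each training day, in order."""
    return [name for name, n_days in protocol for _ in range(n_days)]


def count_rewards(press_times_s, mean_interval_s, rng):
    """Count rewarded presses in one session.

    mean_interval_s == 0 stands for CRF (every press rewarded). Otherwise a
    reward becomes available after a random delay (assumed exponential with
    the given mean) and is collected by the first press after that time.
    """
    if mean_interval_s == 0:
        return len(press_times_s)
    rewards = 0
    available_at = rng.expovariate(1.0 / mean_interval_s)
    for t in sorted(press_times_s):
        if t >= available_at:
            rewards += 1
            available_at = t + rng.expovariate(1.0 / mean_interval_s)
    return rewards


schedule = daily_schedule()
# Hypothetical session: one press every 2 s for an hour on the RI 30s schedule.
ri30_rewards = count_rewards(list(range(0, 3600, 2)), MEAN_INTERVAL_S["RI30"], random.Random(1))
```

Under an interval schedule of this kind, pressing faster earns few additional pellets, which is one reason such schedules are commonly used to encourage habitual responding.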
Habit and goal directed learning test with operant appetitive conditioning
Lever pressing was then tested after outcome devaluation. Mice were pre-fed with either regular mouse chow, to which they had been exposed in their home cages (non-devalued condition/control), or purified high-energy pellets identical to the rewards earned during lever-press sessions (devalued condition). Feeding with mouse chow was used as a control for the overall level of satiety, causing little reduction in the rewarding value of the purified high-energy pellets. Levers were inserted for a 5-minute probe test that immediately followed the hour-long unlimited food exposure (pellets or chow). No pellets were given during the tests. Comparison of the numbers of lever presses during the tests showed that, while no differences were found between the mutant and the control mice in the non-devalued condition (p = 0.94) or between the devalued and non-devalued conditions in the control group (p = 0.153), there was a significant difference in the mutant mice between the devalued and non-devalued conditions (p < 0.01). Furthermore, there was a significant difference between the mutant and control mice in the devalued condition (p < 0.05). A two-way repeated-measures ANOVA, with treatment and genotype as factors, showed an interaction between the two factors (F(1, 21) = 4.98, p<0.05). These results suggested that the conditional knockout mice failed to develop the lever-pressing habit despite extensive training, and that their action stayed goal-directed.
Spatial navigation habit, but not spatial memory, was impaired in the positively reinforced plus maze
Habit learning was then assessed in a navigation-based paradigm using plus maze place/response learning tasks (Devan and White, 1999; Packard, 1999; Packard and McGaugh, 1996). Littermates of the genotypes Slc6a3+/Cre; fNR1/+, Slc6a3+/Cre and wild type served as three control groups for the DA-NR1-KO mice. The maze was built with transparent walls and placed in a room furnished with spatial cues. The training and testing schedules are shown schematically in the accompanying figure. Naïve animals, always starting from the same location in the maze (the “south” arm), were trained to find a fixed target site (in the “east” arm) (training I). To facilitate the development of habit-based navigation, the north and west arms were both closed. It has been shown that under this paradigm, normal mice learn to search for the target using spatial reference memory after moderate training but switch to habitual navigation after extensive training (Packard and McGaugh, 1996). Probe trials, during which the start location was switched from the “south” arm to the “north” arm, were given at different time points to allow dissociation of the spatial and habitual strategies. Thus, mice using the “habit” strategy were predicted to turn right (into the “west” arm), while the “spatial” mice, guided by distal spatial cues, were predicted to go to the “east” arm, where the target resided during training.
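The logic of the probe trial can be made explicit with a short Python sketch that maps a start arm and a body turn onto the arm entered, and then labels a probe choice as “habit” or “spatial”. The function names and the “other” label are hypothetical; the arm layout, start arms and target arm follow the description above.

```python
ARMS = ["north", "east", "south", "west"]  # clockwise order


def arm_after_turn(start_arm, turn):
    """Arm entered after leaving start_arm and turning at the center.

    A mouse leaving an arm faces the opposite arm, so with the clockwise
    ordering above a right turn moves one step back in the list and a left
    turn moves one step forward.
    """
    step = {"right": -1, "left": 1}[turn]
    return ARMS[(ARMS.index(start_arm) + step) % 4]


def classify_probe_choice(chosen_arm, trained_start="south",
                          trained_target="east", probe_start="north"):
    """Label a probe-trial choice as 'spatial', 'habit' or 'other'."""
    # Recover the body turn that was reinforced during training.
    trained_turn = next(t for t in ("left", "right")
                        if arm_after_turn(trained_start, t) == trained_target)
    if chosen_arm == trained_target:
        return "spatial"   # went to the place where the target used to be
    if chosen_arm == arm_after_turn(probe_start, trained_turn):
        return "habit"     # repeated the trained body turn
    return "other"
```

With the default arguments, entering the “west” arm is scored as “habit” and entering the “east” arm as “spatial”, matching the predictions stated above.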
Habit learning analyzed using plus maze
All mice were trained for 10 trials per day for five consecutive days before the first probe trial on day 6 (Probe 1). During this probe trial, the DA-NR1-KO group and control mice showed similar preferences [χ2(3, n = 43) = 0.346, P = 0.951] for the “spatial” strategy, opting to turn left towards the “east” arm, suggesting that they had similarly acquired the spatial memory and that they shared comparable motivation. All mice were then trained for 10 additional days before the second probe trial (Probe 2) on day 17. During this probe trial, no significant differences were found among the three control groups (χ2(2, n = 29) = 0.499, p = 0.779). As a group, control mice opted to “turn right” (into the “west” arm) significantly more on day 17 than on day 6 (χ2(1, n = 29) = 22.587, p = 0.00000201), indicating a learned “habit”-based searching strategy. In contrast, fewer than 10% of the DA-NR1-KO mice (compared with 80% of control mice) opted to turn “right” on day 17 (mutants vs. controls: χ2 = 7.244; p = 0.007), suggesting that they failed to learn the “habit”-based strategy and instead kept using the “spatial” strategy.
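For readers who wish to check how a χ2 statistic relates to its P value, the upper-tail probability has simple closed forms for 1, 2 and 3 degrees of freedom. The Python sketch below uses only the standard library; the function name is hypothetical, and the pairs listed are the statistics and P values reported in the paragraph above (the degrees of freedom of the last comparison are assumed to be 1, as for a two-group, two-outcome table).

```python
import math


def chi2_upper_tail(x, df):
    """Upper-tail probability P(X >= x) of a chi-square variable, df in {1, 2, 3}."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    if df == 3:
        return (math.erfc(math.sqrt(x / 2.0))
                + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))
    raise ValueError("this sketch only handles df = 1, 2 or 3")


# (statistic, degrees of freedom, P value as reported in the text)
reported = [
    (0.346, 3, 0.951),
    (0.499, 2, 0.779),
    (22.587, 1, 0.00000201),
    (7.244, 1, 0.007),   # df assumed to be 1
]

recomputed = [(x, df, chi2_upper_tail(x, df), p) for x, df, p in reported]
```

Each recomputed value should agree with the reported P value to within rounding.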
To confirm that the deficits in the plus maze tasks indeed reflected impaired habit learning, mice were further challenged, right after the second probe trial, in a “re-learn after 90° rotation” procedure (training II), with three trials a day for two days within the exact same maze and surrounding cues. During the training, both the west and south arms were blocked. The start box was placed in the “east” arm and the food rewards in the “north” arm. Mice were tested in a rotation test on day 19, and their accuracy in locating the food was scored. Mice started from the “east” arm with all arms open during the test. For “habit” mice, which had learned to “turn right” during previous training sessions (day 1 through 16), this new learning was simply a re-training, in which the same habit response (turning “right”) would lead them to the new food location. However, for the “spatial” mice, switching the target location from the “east” arm to the “north” arm conflicted with the previously learned spatial relationship and was thus predicted to inhibit new learning. The mutants showed significantly less success (turning “right”, into the “north” arm) (χ2(3, n = 42) = 11.667, P = 0.0006), while no difference was found (χ2(3, n = 42) = 0.73, P = 0.694) among the three control groups. This supported the notion that mutant mice failed to learn the habit strategy, even after extensive training.
Spatial navigational habit learning, but not spatial memory, was impaired in the negatively reinforced plus maze
Since many studies have suggested that dopamine is important for reward pathways, we asked whether the habit learning deficits seen in the DA-NR1-KO mice hinged on the nature of the reinforcement. The aforementioned experiments were replicated in a water-based plus maze, in which the sole escape from the water was for mice to locate and climb onto a hidden platform at the end of one arm. Behavior in this water-based plus maze was driven by the desire to escape from an aversive environment and offered an additional opportunity for comparison with habit learning based on positive reinforcement, such as the seeking of a food reward. All parameters, such as maze dimensions, cues used, starting and target locations, number of trials per day and number of days in training, remained the same as those in the previous food-rewarded experiments.
The first probe trial revealed no significant differences between any two of the four genotypes (χ2(3, n = 43) = 0.346, P = 0.951). The second probe trial showed that over 80% of the control mice had adopted the “habit” strategy, while the mutant mice remained strongly “spatial”. No differences were found among the three control groups (χ2(2, n = 29) = 0.499, p = 0.779). As a group, the control mice opted for the “habit” strategy significantly more on day 17 than on day 6 (χ2(1, n = 29) = 22.587, p = 0.00000201). A significantly lower percentage of DA-NR1-KO mice opted to “turn right” (7.14% vs. 80% in the control mice; χ2(1, n = 43) = 20.904, p = 0.00000483). The deficits in habit learning were further confirmed in the rotation test given after two days of the “re-learn after 90° rotation” challenge learning (Training II). A significantly smaller proportion of the mutant mice (28.6%), in contrast to 80% of the controls, were able to locate the new platform position (one-tailed probability = 0.000388, Fisher’s exact test). These data thus agreed with the findings from the food-rewarded tasks, suggesting that the learning deficits were unlikely to be contingent on the type of reinforcement employed in the training process.
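A one-tailed Fisher’s exact test of the kind used for the rotation test can be sketched with the standard library alone. The function name is hypothetical, and the counts in the example are assumed values chosen only to approximate the reported percentages; they are not the actual data, and no attempt is made here to reproduce the reported probability.

```python
from math import comb


def fisher_one_tailed_lower(a, b, c, d):
    """One-tailed Fisher's exact probability for the 2x2 table [[a, b], [c, d]].

    Rows are groups and columns are (success, failure). Returns the
    probability, with all margins fixed, of a or fewer successes in row 1.
    """
    row1 = a + b
    successes = a + c
    total = a + b + c + d
    failures = total - successes
    lowest = max(0, row1 - failures)   # smallest feasible success count in row 1
    numerator = sum(comb(successes, k) * comb(failures, row1 - k)
                    for k in range(lowest, a + 1))
    return numerator / comb(total, row1)


# Assumed counts, for illustration only: 4 of 14 mutants and 23 of 29 controls
# locating the new platform position.
p_example = fisher_one_tailed_lower(4, 10, 23, 6)
```

The test conditions on the row and column totals, so it remains valid for the small group sizes typical of behavioral experiments, where the χ2 approximation can be unreliable.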
Because spatial learning is significantly involved in the plus maze task, mice were also tested in a spatial version of the plus maze. They were trained for six trials per day for six days to find a hidden platform in the water-filled plus maze. With all four arms open, starting points switched between trials within each day, rotating in a semi-random order among the distal ends of the three arms that did not contain the platform. The platform location remained fixed throughout. A probe test was given on day 10, three days after the training session ended. During the test, with the platform removed, mice were released at the center of the maze and allowed to search for 60 seconds. The time spent by each mouse in each arm was recorded. Mice from all four groups spent significantly more time searching in the target arm [mutants, F(3,32) = 101.292, p < 0.001; Cre, fNR1/+, F(3,28) = 134.996, p < 0.001; Cre, F(3,36) = 147.806, p < 0.001; wild type, F(3,36) = 294.358, p < 0.001; Newman-Keuls post hoc comparison (the target arm compared to all the other arms), P < 0.01 for all genotypes]. No differences were found between the mutant and any of the control groups, suggesting that spatial learning ability was unlikely to be a factor causing the habit learning deficits observed in the DA-NR1-KO mice.
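One way to generate a semi-random order of start arms for the spatial task is sketched below in Python. It assumes that “semi-random” means each of the three non-target arms is used equally often within a day, in shuffled order; this balancing rule, the function name and the example target arm are assumptions, since the exact procedure is not specified here.

```python
import random

ARMS = ("north", "east", "south", "west")


def daily_start_arms(target_arm, trials_per_day=6, rng=None):
    """Return a shuffled, balanced list of start arms for one training day.

    The target arm (which holds the platform) is never used as a start arm.
    Assumes trials_per_day is a multiple of the number of start arms.
    """
    rng = rng or random.Random()
    start_arms = [arm for arm in ARMS if arm != target_arm]
    order = start_arms * (trials_per_day // len(start_arms))
    rng.shuffle(order)
    return order


# Hypothetical example with the platform in the "east" arm.
example_order = daily_start_arms("east", rng=random.Random(0))
```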
Spatial memory test using plus maze and habit learning test using zigzag maze
Habit learning in a nonspatial zigzag maze-based habit task was impaired
Rather than compromising habit learning per se, DA-specific NR1 deletion could have skewed the competition between the “spatial” and “habit” memory systems in the plus maze task. To investigate this possibility, we designed a nonspatial “zigzag maze” task as a more direct measurement of habit learning. The water-filled zigzag maze consisted of eight arms of similar length. Mice were trained to escape onto a hidden platform. Six different starting points were chosen, each paired with its own location of the hidden platform. The platform locations were chosen so that they would be reached after two consecutive right turns from the start point. All mice were trained for 12 trials per day for 10 days. To facilitate the development of the turning habit, some arms were blocked (red lines) so that mice were only allowed the correct turn at each intersection. A probe test was given on day 11, in which mice were placed at a random start location. Some arms in the maze remained blocked (red lines) but, unlike in training, mice were allowed to choose between turning “left” or “right” at two intersections. Mice were scored for whether they completed the two consecutive right turns (counted as “successful”). No differences were found among the three control genotypes (all between 90% and 100%; χ2(2, n = 29) = 1.968, P = 0.374), and they were pooled. The conditional knockout mice showed a significantly lower success rate in making the two consecutive right turns (one-tailed probability = 0.000196, Fisher’s exact test), again suggesting that the DA-NR1-KO mice are defective in developing the navigation habit.
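The scoring rule for the zigzag probe test, in which a trial counts as successful only if both free choices are right turns, can be expressed as a brief Python sketch. The function names and the example choices are hypothetical and do not represent recorded data.

```python
def completed_two_right_turns(turns):
    """True if the first two choices at the open intersections were both 'right'."""
    return tuple(turns[:2]) == ("right", "right")


def success_rate(trials):
    """Percentage of probe trials scored as successful."""
    return 100.0 * sum(completed_two_right_turns(t) for t in trials) / len(trials)


# Hypothetical probe choices for four mice, for illustration only.
example_trials = [
    ("right", "right"),
    ("right", "left"),
    ("left", "right"),
    ("right", "right"),
]
example_rate = success_rate(example_trials)
```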
Habit learning test using zigzag maze