All experiments were approved by the NIAAA ACUC and the Portuguese DGV, and were conducted in accordance with NIH and European guidelines. Male C57BL/6J mice between 3 and 6 months old, purchased from the Jackson Laboratory at 8 weeks of age, were used in the experiments with WT mice. Striatal-specific NMDAR1-knockout (KO) mice and control littermates were generated by crossing RGS9-cre mice with NMDAR1-loxP mice, as previously described in Dang et al.1. The behavioral experiments using striatal NR1-KO mice were performed on 8- to 12-week-old male and female RGS9-cre+ / NMDAR1-loxP homozygous mice, and all controls were their littermates, including RGS9-cre+ / NMDAR1-loxP heterozygous and NMDAR1-loxP homozygous mice. There was no difference between the three control groups, so the data were combined. TH-cre/NR1 flox mice were generated by crossing TH-cre mice (in which Cre expression in the midbrain is localized specifically to dopaminergic neurons)2 with NMDAR1-loxP mice, as described3. A cre-inducible adeno-associated virus (AAV) vector carrying the gene encoding the light-activated cation channel channelrhodopsin-2 (ChR2) and a fluorescent reporter4 was stereotactically delivered into the SNc of TH-cre mice2, enabling specific expression of ChR2 in DA-containing neurons (TH-ChR2)4.
Behavior training and testing took place in operant chambers as described previously5. Briefly, each chamber (21.6 cm L × 17.8 cm W × 12.7 cm H) was housed within a sound-attenuating box (Med-Associates, St. Albans, VT) and equipped with two retractable levers on either side of the food magazine and a house light (3 W, 24 V) mounted on the opposite side of the chamber. Sucrose solution (10 %) was delivered into a metal cup in the magazine through a syringe pump. Magazine entries were recorded using an infrared beam and licks using a contact lickometer. Mice were placed on food restriction throughout training, and fed daily after the training sessions with ~2.5 g of regular food to allow them to maintain a body weight of around 85 % of their baseline weight.
Training started with a 30-minute magazine training session in which the reinforcer was delivered on a random time schedule, on average every 60 seconds (30 reinforcers). The following day, lever-press training started with continuous reinforcement (CRF), in which animals obtained a reinforcer after each lever press. Each session began with the illumination of the house light and insertion of the lever, and ended with the retraction of the lever and the offset of the house light. On the first day of CRF, the session lasted 45 minutes or until mice received five reinforcers; on the second day, 45 minutes or until mice received 10 reinforcers; and on the last day, 45 minutes or until mice received 15 reinforcers. After three days of CRF, animals began training (day 1) on a fixed ratio schedule in which eight presses earned a reinforcer (FR8), without any stimulus signaling when eight presses were completed or when the reinforcer was delivered; this training continued for twelve days. To train animals in a two-lever, two-reward-magnitude task, animals had two single-lever training sessions every day, one for the left lever and another for the right lever. Throughout training, one of the levers delivered a small reward (15 μl of solution) after eight presses, while the other delivered a large reward (50 μl of solution) after the same number of presses, and the order of the daily sessions was counterbalanced. After six days of training, animals were given a choice test in extinction, with both levers presented for 5 minutes without reward (day 7). Starting the subsequent day, the contingency between lever and reward magnitude was reversed for the following six days, and another extinction test was conducted at the end of training (day 13). The animals were trained daily without interruption, and training started at approximately the same time every day.
All timestamps of lever presses, magazine entries and licks for each animal were recorded with a 10 ms resolution. The training chambers and procedures for training in striatal NR1-KO and littermate controls were exactly the same as used for C57BL/6J mice.
The beginning and end of each sequence of lever presses were determined either by the statistics of lever pressing for each animal (either bimodal or Poisson distribution, on average a 20 s pause between sequences), or by a bout of licks interrupting lever pressing. The sequence length and duration were thus calculated for each individual sequence, and the within-sequence press rate was computed as the ratio of sequence length (≥ 2 presses) to the corresponding sequence duration. The inter-sequence interval was defined as the time between two successive sequences. The mean within-sequence inter-press interval was calculated from the inter-press intervals within each individual sequence and averaged over all sequences for each animal per session.
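The sequence parsing described in this paragraph can be illustrated with a short Python sketch. It is a minimal example under stated assumptions, not the original Matlab analysis: the function names are hypothetical, a fixed `pause_s` threshold stands in for the per-animal criterion derived from the press statistics, and the inter-sequence interval is assumed to run from the last press of one sequence to the first press of the next.

```python
from bisect import bisect_left, bisect_right


def split_sequences(press_times, lick_times=(), pause_s=20.0):
    """Group lever-press timestamps (s) into sequences.

    A new sequence starts when the gap since the previous press exceeds
    pause_s (assumed fixed threshold) or when at least one lick falls
    between the two presses.
    """
    licks = sorted(lick_times)
    sequences = []
    for t in sorted(press_times):
        if sequences:
            prev = sequences[-1][-1]
            # True if any lick timestamp lies strictly between prev and t
            licked = bisect_left(licks, t) > bisect_right(licks, prev)
            if t - prev <= pause_s and not licked:
                sequences[-1].append(t)
                continue
        sequences.append([t])
    return sequences


def sequence_measures(sequences):
    """Per-sequence length, duration, press rate, mean IPI and gap to next."""
    rows = []
    for i, seq in enumerate(sequences):
        length = len(seq)
        duration = seq[-1] - seq[0]
        ipis = [b - a for a, b in zip(seq, seq[1:])]
        has_next = i + 1 < len(sequences)
        rows.append({
            "length": length,
            "duration": duration,
            # press rate is only defined for sequences of >= 2 presses
            "press_rate": length / duration if length >= 2 and duration > 0 else None,
            "mean_ipi": sum(ipis) / len(ipis) if ipis else None,
            "interval_to_next": sequences[i + 1][0] - seq[-1] if has_next else None,
        })
    return rows
```

Averaging `mean_ipi` over the rows of one session would give the per-animal, per-session value described in the text.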
Electrophysiological data in the C57BL/6J experiment were collected from eleven mice. Each mouse was implanted ipsilaterally with two electrode arrays, one targeting the dorsal striatum and the other the substantia nigra. The main electrode design used in this study consisted of an array of 2 × 8 platinum-coated tungsten microwire electrodes (35 or 50 μm diameter)6. For the dorsal striatum, tungsten microwire electrodes of 50 μm diameter with 150 μm spacing between microwires and 250 μm spacing between rows were used. The 8 more medial electrodes targeted the more medial area of the dorsal striatum (associative), while the 8 more lateral electrodes targeted the more lateral region of the dorsal striatum (sensorimotor subregion)6. For the substantia nigra, tungsten microwire electrodes of 35 μm diameter with 150 μm spacing between electrodes and 150 μm spacing between rows were used. In some experiments, the array used for the substantia nigra was cut at a 30 to 45 degree angle to better fit the medial-lateral anatomy of the substantia nigra.
The craniotomies were made at the following coordinates: 0.5 mm rostral to bregma and 1.8 mm lateral for the dorsal striatum; 3.4 mm caudal to bregma and 1.0 mm lateral for the substantia nigra. During surgeries, the microwire arrays were gently lowered ~ 2.2 mm from the surface of the brain for the dorsal striatum and ~ 4.2 mm for the substantia nigra, while neural activity was monitored simultaneously. Final placement of the electrodes was monitored online during the surgery based on the neural activity, and then confirmed histologically at the end of the experiment after perfusion with 10 % formalin, brain fixation in a solution of 30 % sucrose and 10 % formalin, cryostat sectioning (coronal slices of 40 – 60 μm), and cresyl violet staining (Supplementary Fig. 2).
In the TH-ChR2 experiment, the virus was injected into the SNc through a glass pipette (using Nanoject II, Warner Instruments) at two sites: 3.4 mm caudal to bregma, 1.0 mm lateral, and ~ 4.1 mm and ~ 4.3 mm below the dura, respectively. We injected 0.3 μl of purified virus per site. A guide cannula terminating 300 μm above the injection/recording site was implanted attached to the electrode array, allowing simultaneous electrophysiological recordings and light stimulation, which was delivered through an optical fiber (200 μm core diameter, 0.37 N.A., Thorlabs Inc., NJ) with a diode-pumped solid-state laser (473 nm, LaserGlow Tech Inc., Canada) controlled by TTL pulses (10 ms). The measured output at the tip of the 200 μm fiber was approximately 60 mW.
Neural recordings during operant learning
The animals with implanted electrodes were allowed to recover for 2 to 3 weeks after surgery before training started. The training procedure was exactly the same as described above for the animals undergoing behavioral testing only. Some animals took longer to acquire the task because of the mechanics of the recording wires. In those cases, day 1 of FR8 training was defined as the first day on which the animal obtained ten or more reinforcers, and days 6 and 12 were the sixth and twelfth days after that. Surgery and electrode array implantation for striatal NR1-KO mice and littermate controls were the same as for C57BL/6J mice, but with only one array per KO animal instead of two (for easier training), targeted to the striatum. Since striatal NR1-KO mice were severely impaired in learning and executing sequences of lever presses, and this impairment was more severe with the headstage and recording cables connected, the neural data of striatal NR1-KO mutants and littermate controls at different training stages were acquired using a between-animal design, i.e. one group was trained without cables and recorded during the first day on which they earned more than 10 reinforcers during FR8, and another group was trained without cables and recorded after 6 consecutive days of earning more than 10 reinforcers. Six NR1-KO mice were recorded during the early training stage and 10 during the late training stage; 10 littermate mice were recorded during early training and 10 during late training.
Neural activity was recorded using the MAP system (Plexon Inc., TX). The spike activity was initially sorted using an online sorting algorithm (Plexon Inc.). Only cells with a clearly identified waveform and relatively high signal-to-noise ratio were used6,7. At the end of the recording, cells were re-sorted using an offline sorting algorithm (Plexon Inc.) to isolate single units6,7. Single units displayed a clear refractory period in the inter-spike interval histogram, with no spikes during the refractory period (larger than 1.3 ms). TTL pulses were sent from a Med-Associates interface board to the MAP recording system through an A/D board (Texas Instruments Inc., TX) so that the animal's behavioral timestamps during operant conditioning were synchronized and recorded together with the neural activity. To determine whether SN neurons were dopaminergic, a D2 receptor agonist (quinpirole, 1–2 mg/kg, Sigma Inc., MO) was injected intraperitoneally at the end of the sessions. Neural activity was recorded for 1 to 2 hours before and after injection for comparison.
Neural recordings in anesthetized mice
Striatal NR1-KO homozygous mice and control littermates were used. Recordings were performed using the Plexon MAP system (Plexon Inc., TX) with the animals under isoflurane anesthesia (1.0 – 1.2 %). The electrode arrays were the same as those used for in vivo freely moving recordings6,7, and a skull screw was used as a ground. The coordinates were the same as those used for striatal recording in behaving mice: 0.5 mm rostral to bregma and 1.8 mm lateral to midline. All the units recorded within a depth of 2.0 – 2.7 mm below the cortical surface were then classified as dorsal striatum MSNs or interneurons based on waveform, firing rate, and activity pattern (Supplementary Fig. 3, also Supplementary Fig. 16). Only stable recordings lasting more than ten minutes were used for further analysis. The burst-like phasic spontaneous activity in striatal MSNs was defined as two or more spikes occurring with an inter-spike interval of less than 125 ms and terminated by an inter-spike interval of more than 280 ms. The spike-triggered average was calculated by averaging the LFP in a time window from 1 s preceding to 1 s following each spike.
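As an illustration of the spike-triggered average described above, the following Python sketch averages LFP segments from 1 s before to 1 s after each spike. It is a sketch under assumptions rather than the original analysis code: the function name, the representation of the LFP as a plain list sampled at `fs` Hz starting at time 0, and the decision to skip spikes whose window would extend past the recording edges are all hypothetical choices.

```python
def spike_triggered_average(lfp, fs, spike_times, half_window_s=1.0):
    """Average the LFP around each spike.

    lfp            : list of samples, first sample at t = 0 s
    fs             : sampling rate in Hz
    spike_times    : spike timestamps in seconds
    half_window_s  : window half-width (1 s before and after the spike)
    Returns a list of 2 * half + 1 averaged samples, or None if no spike
    has a complete window.
    """
    half = int(round(half_window_s * fs))
    n = 2 * half + 1
    acc = [0.0] * n
    used = 0
    for t in spike_times:
        center = int(round(t * fs))
        lo, hi = center - half, center + half + 1
        if lo < 0 or hi > len(lfp):
            continue  # assumed policy: drop spikes too close to the edges
        for i in range(n):
            acc[i] += lfp[lo + i]
        used += 1
    if used == 0:
        return None
    return [a / used for a in acc]
```

The middle element of the returned list corresponds to the spike time itself, so a consistent deflection there indicates spike-LFP coupling.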
Cell type classification
In the dorsal striatum, putative fast-spiking interneurons (FSIs) were identified as having a waveform trough half-width of less than 100 μs and a baseline firing rate of more than 10 Hz, and putative cholinergic interneurons (TANs) were identified as those with a waveform trough half-width of more than 250 μs (Supplementary Fig. 3, also Supplementary Fig. 16). All other units were classified as putative projection neurons (MSNs)7.
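The striatal classification rule stated above can be written as a small decision function. This is a hedged sketch: the function name and argument names are hypothetical, and only the numeric cut-offs (100 μs, 10 Hz, 250 μs) come from the text.

```python
def classify_striatal_unit(trough_half_width_us, baseline_rate_hz):
    """Return a putative cell-type label for a dorsal striatum unit.

    Cut-offs follow the criteria in the text; units matching neither
    interneuron rule are labelled as putative projection neurons.
    """
    if trough_half_width_us < 100 and baseline_rate_hz > 10:
        return "FSI"  # putative fast-spiking interneuron
    if trough_half_width_us > 250:
        return "TAN"  # putative cholinergic interneuron
    return "MSN"      # putative projection neuron
```

Note that a narrow-waveform unit firing at 10 Hz or below falls through to the MSN label under this rule, which mirrors the "all other units" wording.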
In the substantia nigra, putative dopaminergic neurons were classified based on the following criteria8,9: low baseline firing rate (less than 10 Hz), a specific waveform with a wide action potential (half-width more than 350 μs), a low negative-to-positive peak amplitude ratio (less than 0.4), and substantial (≥ 50 %) inhibition by the D2-selective agonist quinpirole; further validation of the classification criteria was performed using genetic and optogenetic tools (Supplementary Figs. 4–6). The rest were classified as putative SN GABA neurons, which are most likely SNr projection neurons, because the percentage of GABAergic interneurons in the SN is rather small10,11. Burst firing in SN DA neurons was defined as two or more spikes occurring with an inter-spike interval of less than 80 ms and terminating with an inter-spike interval larger than 160 ms12. The burst set rate, which measures how many bursts occurred per second, and the percentage of spikes fired in bursts were then calculated for each SN DA neuron. Optical stimulation in TH-ChR2 mice was performed every several minutes with either a train of 60 pulses of 10 ms duration delivered at 11 Hz, or a train of 30 pulses of 10 ms duration delivered at 14 Hz.
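The burst criterion for SN DA neurons can be sketched in Python as follows. This is an illustrative implementation under assumptions, not the authors' code: the function names are hypothetical, inter-spike intervals exactly equal to a threshold are treated as not crossing it, and a burst still open at the end of the spike train is counted.

```python
def detect_bursts(spike_times, start_isi=0.080, end_isi=0.160):
    """Return bursts as lists of spike times (s).

    A burst opens at an inter-spike interval below start_isi and is
    closed by the first interval above end_isi.
    """
    bursts = []
    current = None
    for prev, t in zip(spike_times, spike_times[1:]):
        isi = t - prev
        if current is None:
            if isi < start_isi:
                current = [prev, t]
        elif isi > end_isi:
            bursts.append(current)
            current = None
        else:
            current.append(t)
    if current is not None:
        bursts.append(current)  # assumed: keep a burst cut off by the recording end
    return bursts


def burst_metrics(spike_times, duration_s, start_isi=0.080, end_isi=0.160):
    """Bursts per second and percentage of spikes fired in bursts."""
    bursts = detect_bursts(spike_times, start_isi, end_isi)
    in_bursts = sum(len(b) for b in bursts)
    return {
        "burst_rate_hz": len(bursts) / duration_s,
        "pct_spikes_in_bursts": 100.0 * in_bursts / len(spike_times) if spike_times else 0.0,
    }
```

Calling `detect_bursts(spikes, start_isi=0.125, end_isi=0.280)` would apply the same rule with the thresholds given for striatal MSNs in the anesthetized recordings.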
Lever press-related neurons throughout a session
Neural activity referenced to lever press onset was averaged in 20-ms bins, shifted by 1 ms, and averaged across trials to construct the peri-event time histogram (PETH), which was the basis for analyzing the amplitude and latency of press-related firing activity. Distributions of the PETH from 5000 to 2000 ms before lever press were considered baseline activity. We then determined which 20-ms bins, slid in 1 ms steps during an epoch spanning from 1000 ms before to 1000 ms after the event, met the criteria for task-related activity. A significant increase in firing rate was defined as at least 20 consecutive overlapping bins with a firing rate larger than a threshold of 99 % above baseline activity, and a significant decrease in firing rate was defined as at least 20 consecutive bins with a firing rate smaller than a threshold of 95 % below baseline activity13. The onset of press-related firing rate modulation was defined as the beginning of the first of 20 consecutive significant bins. The modulation period was defined as the time window from the beginning of the first of 20 consecutive significant bins to the last of the consecutive significant bins13. For across-session comparisons, the modulation rate for each press was normalized to the maximal firing rate in the sequence, so the values range from 0 to 1, with larger numbers indicating stronger modulation.
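The consecutive-bin criterion can be illustrated with the Python sketch below, which takes a PETH already computed in 20-ms bins slid in 1-ms steps. It rests on an interpretation that the text does not spell out: the "99 % above" and "95 % below" thresholds are assumed here to be the 99th and 5th percentiles of the baseline bin values. The function names and the nearest-rank percentile are likewise assumptions.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile (pct in 0-100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[min(rank, len(ordered)) - 1]


def first_significant_run(flags, min_len=20):
    """Return (first_index, last_index) of the first run of at least
    min_len consecutive True flags, or None if there is no such run."""
    run = 0
    for i, flag in enumerate(flags):
        run = run + 1 if flag else 0
        if run == min_len:
            start, end = i - min_len + 1, i
            while end + 1 < len(flags) and flags[end + 1]:
                end += 1  # extend to the last consecutive significant bin
            return start, end
    return None


def detect_modulation(baseline_rates, epoch_rates, min_len=20):
    """Find significant increases/decreases of epoch bins versus baseline."""
    upper = percentile(baseline_rates, 99)  # assumed reading of "99 % above"
    lower = percentile(baseline_rates, 5)   # assumed reading of "95 % below"
    return {
        "increase": first_significant_run([r > upper for r in epoch_rates], min_len),
        "decrease": first_significant_run([r < lower for r in epoch_rates], min_len),
    }
```

With 1-ms steps and an epoch starting 1000 ms before the press, a returned start index `k` corresponds to a modulation onset of `k - 1000` ms relative to the press.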
Sequence initiation/termination related neurons
To determine whether a task-related neuron was sequence start/stop related, we generated six firing-rate distributions, each based on the PETH of the rate modulation period for a specific press within the sequence: namely the first, second and third press, or the third-to-final, second-to-final and final press of a sequence. Sequence start/stop related neurons were defined as those in which the mean peak (or trough) firing-rate modulation of the first press (start), the final press (stop), or both was significantly different from the peak/trough of the within-sequence presses. Sequence middle-press selective neurons were determined in the same way, by looking for neurons that showed significantly different rate modulation for any of the middle presses within the sequence. All data analyses were conducted in Matlab with custom-written programs (The MathWorks Inc., MA).
The statistics were performed (and averaged) on the values for each animal per session, except for SN DA neurons because of the low number of neurons recorded per session/animal (in this case, the average represents the neurons recorded from all animals for each session). One-way ANOVA and repeated measures ANOVA were used to investigate general main effects, and paired or unpaired t-tests were used in all planned and post-hoc comparisons, except for SN DA neurons, where a chi-square test was used. Statistical analyses were conducted in Matlab using the statistics toolbox (The MathWorks Inc., MA) and GraphPad Prism 4 (GraphPad Software Inc., CA).