Studies have shown that disruption of cannabinoid receptor (CB1R) signaling reduces operant responses for rewards; yet it is unknown whether changes in neural activity at dopamine terminal regions such as the nucleus accumbens (NAc) underlie these behavioral effects. To study the neural correlates that accompany the disruption of endocannabinoid (eCB) signaling in a food-motivated task, we recorded neural activity and local field potentials (LFPs) from the core and shell subregions of the NAc. A within-subject design was used for recordings as rats pressed a lever for chocolate-flavored sucrose pellets delivered under a progressive ratio (PR) schedule of reinforcement. Rats were food restricted to 85 ± 5% of their free-feeding body weight and trained under the PR schedule until a stable breakpoint was observed (12 ± 3 sessions). Once performance was stable, recordings were made under baseline and vehicle conditions and following administration of the cannabinoid inverse agonist rimonabant (150 µg/kg, i.v.). NAc neurons encoded reward-predictive cues as well as food reward delivery. As previously reported, rimonabant administration robustly reduced breakpoints in all rats tested. We found that this reduction is accompanied by a profound attenuation in the strength and coordination of specific event-related spiking activity. Moreover, rimonabant decreased LFP gamma power at 80 Hz (high gamma) at reward delivery and gamma power at 50 Hz (low gamma) at cue onset. Taken together, the present results indicate that the eCB system sculpts neural activity patterns that accompany PR performance and reward consumption.
Endogenous cannabinoids (eCBs) and in particular CB1 receptors (CB1Rs) play a key role in the modulation of reinforcement processing (Cota et al., 2003; Le Foll, 2004; Melis et al., 2004; Robbe, 2002; Oleson et al., 2012). eCBs are released “on demand” and act in a retrograde manner. They activate CB1Rs located presynaptically on excitatory and inhibitory terminals (Alger, 2002; Wilson and Nicoll, 2002). CB1Rs negatively modulate transduction pathways through coupling to G proteins; their activation initiates several transduction mechanisms, including the activation of potassium channels and of MAP kinase, as well as the inhibition of voltage-dependent calcium channels and of adenylyl cyclase, all leading to a decrease in the probability of neurotransmitter release (Bidaut-Russell et al., 1990; Henry & Chavkin, 1995; Twitchell et al., 1997; Hoffman & Lupica, 2000). These mechanisms play a regulatory role in the excitability of neurons located in several nuclei of brain reward pathways, including those that project to the ventral tegmental area (VTA) and the NAc (Domenici, 2006; Oleson et al., 2012). Activation of CB1Rs in the VTA modulates dopamine neuronal activity (Cheer et al., 2000, 2003; Lupica & Riegel, 2005; Szabo, Siemes, & Wallmichrath, 2002), as well as drug- and cue-evoked dopamine release in the NAc (Cheer et al., 2007; Li et al., 2009; Oleson et al., 2012).
At the behavioral level, the modulation of reinforcement processing by eCBs has been demonstrated using multiple behavioral paradigms and rewards (Valjent et al., 2002; de Vry et al., 2004; Soria et al., 2005; Filip et al., 2006; Shoaib, 2008). A schedule of reinforcement that has been consistently used is the progressive ratio (PR); here, the requirement to obtain a single reward is exponentially increased within a single session until responding ceases. The ratio at which this occurs is called the “breakpoint” (Hodos, 1961). When exogenous as well as endogenous cannabinoid agonists are tested under this schedule, increased breakpoints are observed (Gallate et al., 1999; Higgs et al., 2005; Solinas et al., 2005; Ward and Dykstra, 2005; Oleson et al., 2012), whereas CB1R antagonists have the opposite effect (Solinas and Goldberg, 2005; Ward and Dykstra, 2005; Xi et al., 2007; Rasmussen and Huskinson, 2008).
In spite of abundant evidence supporting the role of CB1Rs in the modulation of reward processing, as well as electrophysiological evidence that manipulation of CB1Rs alters neural activity in different reward circuit nuclei, little is known regarding how these behavioral changes relate to real-time neural activity in the NAc, a limbic-motor interface (Mogenson et al., 1980). Different lines of evidence point towards specific roles played by NAc neurons in different aspects of reward processing (Cardinal et al., 2002; Berridge, 2007) and motivational control of behavior (Salamone and Correa, 2002). These processes are assumed to bias behavioral output and action selection towards the highest immediate subjective value (Nicola et al., 2004a; Samejima et al., 2005; Kim et al., 2009). Here we identified electrophysiological changes that occur in the NAc of rats engaged in a PR schedule, using food as a reward, while eCB signaling was pharmacologically impeded with a CB1R antagonist.
Six (6) male Sprague-Dawley rats (Charles River Laboratories, Wilmington, MA) with indwelling jugular vein catheters were used. Rats were individually housed in a temperature- and humidity-controlled room with a 12-h light-dark cycle (lights on at 07:00 h). Under isoflurane anesthesia, animals were stereotaxically implanted with two 16-microwire electrode arrays made of Teflon®-insulated stainless steel (0.25 mm interelectrode space, 0.5 mm inter-row space; Micro Probe, Gaithersburg, MD). The arrays were aimed at the NAc, with the center of each array lowered to the following coordinates relative to bregma: 1.7 mm AP, 1.5 mm ML, and −8.0 mm DV from dura. Electrodes were fixed to the skull with acrylic cement secured with stainless steel bone screws. A stainless steel wire from each array served as a ground electrode and was inserted caudal to the arrays in the midbrain/cerebellum interface (a region where gamma oscillations were not observed). Rats were allowed to recover for at least 10 days, during which time they received food and water ad libitum. After that period, rats were food restricted to 85% ± 5% of their free-feeding weight before initial training and were maintained around this weight throughout the experiment. All procedures were carried out in accordance with established practices as described in the NIH Guide for Care and Use of Laboratory Animals. In addition, all procedures were reviewed and approved by the Animal Care and Use Committee of the University of Maryland School of Medicine.
SR141716A (rimonabant) was provided by the National Institute on Drug Abuse Drug Supply Program (Research Triangle Park, Raleigh, NC), dissolved in a 1:1:18 solution of ethanol, Emulphor (Rhodia, Cranbury, NJ), and saline, and injected intravenously (i.v.) at 0 and 150 µg/kg, a dose selected for its lack of effect on locomotor activity (Cippitelli et al., 2005).
Experiments were conducted in rat operant conditioning chambers (12.5" L × 13.5" W × 13.5" H; Med Associates, Georgia, VT) located within ventilated sound attenuation chambers. The operant boxes were equipped with a house light, two cue lights above two retractable levers (Coulbourn Instruments, Whitehall, PA), a modular pellet dispenser and receptacle, a Sonalert module that delivers a 2900 Hz tone, and a white noise amplifier. Rats were initially trained under a fixed-ratio 1 (FR1) schedule with an inter-trial interval of 10 seconds. During the trials and throughout the experiment both retractable levers were present, but only one was associated with an illuminated cue light and reward delivery (active lever). Responses on the other lever (inactive lever) were recorded but did not have any scheduled consequences. A trial began with illumination of the cue light above the active lever and of the house light, and the extension of the active and inactive levers. Once the rat pressed the active lever, both levers retracted, a 45 mg chocolate-flavored sucrose pellet (Bio-Serv, Frenchtown, NJ) was delivered, the cue light and house light were extinguished, and a 2900 Hz tone started; these stimuli will be referred to as reward-associated cues. At the end of the 10-second inter-trial interval, the tone was turned off and a new trial began. Once rats reliably pressed the lever on 3 consecutive days (earning more than 280 pellets in a one-hour session) they were considered lever-trained. A PR schedule was then implemented in which the number of responses required to obtain a food pellet increased for successive reinforcers. The progression was derived from the equation $\text{response ratio} = 5e^{0.2 \times \text{reinforcer number}} - 5$, and yielded ratios of 1, 2, 4, 6, 9, 12, 15, 20, 25, 32, 40, 50, 62, 77, 95, 118, etc., following rounding to the closest integer. A trial was defined by the start of a new response ratio, and it ended when the reward was delivered upon completion of the response ratio in effect.
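The progression can be checked numerically. The following is a minimal sketch, not the software used to run the sessions; the function names `pr_ratio` and `pr_schedule` are hypothetical, and rounding half up is an assumption about how "rounding to the closest integer" was implemented (no value in this range falls exactly on a half, so the choice does not affect the result).

```python
import math


def pr_ratio(reinforcer_number: int) -> int:
    """Responses required for the n-th reinforcer: 5 * e^(0.2 * n) - 5, rounded."""
    raw = 5.0 * math.exp(0.2 * reinforcer_number) - 5.0
    # Assumption: round half up to the closest integer.
    return math.floor(raw + 0.5)


def pr_schedule(n_reinforcers: int) -> list:
    """Response requirements for reinforcers 1..n_reinforcers."""
    return [pr_ratio(n) for n in range(1, n_reinforcers + 1)]


if __name__ == "__main__":
    # Reproduces the first 16 ratios listed in the text.
    print(pr_schedule(16))
```

Under this equation, a rat that earns 14 pellets has completed a ratio of 77 presses for the last pellet, which gives a sense of scale for the breakpoints reported in the results.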
As with the FR1 schedule, a trial began with illumination of the cue light above the active lever and of the house light, and the extension of the active and inactive levers. Once the reward was delivered, the cue light and house light were extinguished and a 2900 Hz tone started. At the end of the 10-second inter-trial interval, the tone was turned off and a new trial began. Sessions lasted until a period of 20 min had elapsed without the rat earning a food pellet. The breakpoint was defined as the highest completed ratio; the number of food pellets earned before each breakpoint and the total number of responses made were also recorded. Testing began when breakpoints did not vary by more than 1 ratio under or over the previous session for at least 3 sessions. Once rats had stable breakpoints, sequential progressive ratio schedules were run on a single day. The rats were treated with vehicle and rimonabant (150 µg/kg i.v.), each injected 1 minute before the start of the schedule and of the neural recording for the corresponding condition.
Neural activity was recorded using commercially available hardware and software, including headstage and programmable amplifiers, filters and multichannel spike sorting software (Plexon Instruments, Dallas, TX). Discrimination of individual units was performed off-line using principal component analysis of waveform shape, followed by manual confirmation of sorting validity. Signals were routed to a differential preamplifier (fixed 50× gain; Plexon Inc.) and relayed to a Multineuron Acquisition Processor (Plexon Inc.), which allows for computer-controlled, channel-specific signal amplification (gain steps 1–30, total gain 1000× to 32000×), filtering (second-order 500 Hz low cut, 5 kHz high cut) and analog-to-digital conversion (32 simultaneous sampling 12-bit converters, 75 kHz). Single units were identified by constancy of waveform shape, autocorrelograms and interspike interval. The spiking frequency, the spike amplitude (measured as the maximal peak-valley difference) and the full width at half maximum of the valley were used to help classify units as fast-spiking interneurons (FSI) or medium spiny neurons (MSN) (Burkhardt, Jin & Costa, 2009).
Cross-correlograms were constructed using 160 bins (bin width: 0.005 sec) within a time range of ±0.4 sec, using the same number of spikes per condition for bias correction. Peaks and troughs in cross-correlation histograms were initially identified through visual inspection and then confirmed using Z scores of the peaks and troughs; the data contained in the 0.1 sec before the histogram shoulder were used for the background calculation. Peaks and troughs with Z scores above 2.58 or below −2.58 (p = 0.01) were selected for analysis as a reliable indicator that a non-flat cross-correlation existed.
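The binning and Z-score steps above can be sketched as follows. This is an illustrative NumPy sketch rather than the NeuroExplorer procedure used in the study; the function names are hypothetical, and taking the first 0.1 sec of the correlogram window as the background is an assumption, since the exact position of the "shoulder" is determined per histogram.

```python
import numpy as np


def cross_correlogram(ref, target, bin_width=0.005, window=0.4):
    """Histogram of (target - ref) spike-time lags within +/- window seconds."""
    ref = np.asarray(ref, dtype=float)
    target = np.asarray(target, dtype=float)
    n_bins = int(round(2 * window / bin_width))  # 160 bins for the defaults
    edges = np.linspace(-window, window, n_bins + 1)
    # All pairwise lags; adequate for a sketch, memory-hungry for long trains.
    lags = (target[None, :] - ref[:, None]).ravel()
    lags = lags[np.abs(lags) <= window]
    counts, _ = np.histogram(lags, bins=edges)
    return counts, edges


def correlogram_z(counts, edges, background=(-0.4, -0.3)):
    """Z-score every bin against a background segment (assumed location)."""
    centers = (edges[:-1] + edges[1:]) / 2
    mask = (centers >= background[0]) & (centers < background[1])
    mu = counts[mask].mean()
    sd = counts[mask].std(ddof=1)
    if sd == 0:
        return np.full(counts.shape, np.nan)
    return (counts - mu) / sd


def is_significant(z_value, threshold=2.58):
    """Peak or trough criterion used in the text (|Z| > 2.58, p = 0.01)."""
    return abs(z_value) > threshold
```

A pair would then be scored by reading the Z value at the bin of interest (for example the bin at zero lag) and passing it to `is_significant`.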
LFPs were recorded simultaneously with single unit activity from the same micro-wire arrays and acquired at 1 kHz. LFP signals were referenced to ground on a skull screw in the contralateral occipital bone. All raw LFP voltage traces and whole-session power spectral densities (PSDs) were visually inspected to confirm recording quality prior to further analysis. Sorted files were analyzed using NeuroExplorer (Plexon Instruments, Dallas, TX) and custom-developed code. Neural activity was initially characterized via perievent rasters. A change in firing rate was considered significant if the neural response changed by more than 2 standard deviations from the baseline firing rate. For LFP analysis, the multitaper method was used to estimate frequency spectra (Pesaran et al., 2002). Spectrograms were constructed by plotting spectral power during a series of overlapping constant-width time windows. The data contained in the 2 sec before the peak or trough shoulder were used for Z score background calculations. Spike-field coherence was computed using the Sigtool routine (Lidierth, 2009). A bootstrapping sampling procedure (Efron, 1979), implemented in Matlab (The MathWorks, Natick, MA), was used to generate 1000 samples with replacement of the spike-field coherence values and compute 95% confidence intervals. Figures were prepared in Origin (OriginLab Corporation, Northampton, MA) or Matlab (The MathWorks, Natick, MA).
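The bootstrap step can be illustrated with a short sketch. The original analysis was run in Matlab; the Python version below is only an approximation under stated assumptions: the function name `bootstrap_ci` is hypothetical, the resampled statistic is assumed to be the mean coherence, and the interval is assumed to be a percentile interval, none of which is specified in the text.

```python
import numpy as np


def bootstrap_ci(values, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    # Draw n_boot resamples with replacement, each the size of the data.
    idx = rng.integers(0, values.size, size=(n_boot, values.size))
    boot_means = values[idx].mean(axis=1)
    lower, upper = np.percentile(
        boot_means, [100 * alpha / 2, 100 * (1 - alpha / 2)]
    )
    return float(lower), float(upper)
```

With `n_boot=1000` and `alpha=0.05` this matches the 1000 resamples and 95% confidence level stated above.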
Repeated measures analysis of variance (ANOVA) was used to determine statistical significance for the change in breakpoint and Tukey’s honestly significant difference (HSD) post-hoc test was used to evaluate any difference between the treatments. The firing rate of each neuron was normalized using Z scores, in which the firing rate of the 7 seconds before the event of interest was used to obtain the average and standard deviation of the firing rate. The normalized data was used for the statistical analysis of the neuronal firing rates.
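The normalization described above can be sketched in a few lines. This is an illustration, not the authors' code: the function name, the 0.1 sec bin width and the 7 sec post-event window are assumptions; only the 7 sec pre-event baseline comes from the text.

```python
import numpy as np


def perievent_zscore(spike_times, event_times, pre=7.0, post=7.0, bin_width=0.1):
    """Z-scored perievent firing rate, using the `pre` seconds as baseline."""
    spike_times = np.asarray(spike_times, dtype=float)
    n_pre = int(round(pre / bin_width))
    n_post = int(round(post / bin_width))
    edges = np.arange(-n_pre, n_post + 1) * bin_width
    counts = np.zeros(n_pre + n_post)
    for t in event_times:
        # Spike times relative to this event, binned around it.
        c, _ = np.histogram(spike_times - t, bins=edges)
        counts += c
    rate = counts / (len(event_times) * bin_width)  # mean rate in Hz
    baseline = rate[:n_pre]
    sd = baseline.std(ddof=1)
    if sd == 0:
        return np.full(rate.shape, np.nan), edges
    return (rate - baseline.mean()) / sd, edges
```

Bins whose Z score exceeds 2 in absolute value would correspond to the 2 standard deviation criterion used to flag a significant change in firing rate.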
Neurons were classified as lever press excited (LPE) or lever press inhibited (LPI) using integrated firing rate histograms, following the classification of Nicola and Deadwyler (2000). We divided the firing rate and lever press responses into 180-second windows and, for each lever press bout or pause, contrasted the average firing rate in that window against that of the previous window, so that the firing rate could be classified as increased or reduced. This produced a 2 × 2 table for which a χ² test was calculated. Significant χ² values (p ≤ 0.05) were taken as an indication of a non-random distribution between the variables, and for those tables the coefficient of association was calculated. Neurons with a positive coefficient were classified as LPE, whereas those with a negative coefficient were classified as LPI.
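The classification rule can be sketched as below. This is a hedged illustration: the text does not say which coefficient of association was used, so the phi coefficient is assumed here, and the table layout (rows: bout, pause; columns: firing increased, firing reduced) and the function name are hypothetical. The χ² statistic is computed without continuity correction, and its p value for one degree of freedom is obtained from the complementary error function.

```python
import math


def classify_lever_press_neuron(table, alpha=0.05):
    """Classify a neuron from a 2 x 2 table of window counts.

    table = [[bout_increased, bout_reduced],
             [pause_increased, pause_reduced]]   (assumed layout)
    Returns (label, phi, p_value).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return "unclassified", float("nan"), float("nan")
    phi = (a * d - b * c) / math.sqrt(denom)  # assumed coefficient of association
    chi2 = n * phi ** 2                       # Pearson chi-square, 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))  # survival function for 1 df
    if p_value > alpha:
        return "unclassified", phi, p_value
    return ("LPE" if phi > 0 else "LPI"), phi, p_value
```

For example, a table of `[[18, 4], [5, 17]]` (firing mostly increased during bouts and mostly reduced during pauses) yields a positive coefficient and an LPE label under these assumptions.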
For perievent raster and local field potential analyses paired Student’s t-tests were used to contrast peaks and troughs against vehicle and baseline conditions. All statistical tests were calculated using Statistica (Statsoft, Tulsa, OK).
During baseline and vehicle conditions rats reached breakpoints following an average of 14 (S.E.M. = 0.45) and 14 (S.E.M. = 0.87) pellets, respectively. These correspond to an average breakpoint, or last response ratio completed, of 78.2 (S.E.M. = 7.39) during baseline and 89.8 (S.E.M. = 16.85) during vehicle. Rimonabant (150 µg/kg, i.v.) robustly reduced the breakpoint in all animals tested, to an average of 18.4 (S.E.M. = 6.30), with a corresponding reduction in the number of pellets earned [6.8 (S.E.M. = 1.62)]. Repeated measures ANOVA showed a significant effect of rimonabant across conditions (F(2,10) = 12.47, p < 0.001). Post-hoc comparisons showed that breakpoints obtained under rimonabant were significantly lower than those obtained during baseline and vehicle (Fig. 1).
The NAc is composed mostly of medium spiny neurons (MSNs), which constitute ~95% of the total neurons in this structure (Kemp and Powell, 1971), with the rest of the neuronal population consisting mainly of cholinergic interneurons (Armstrong et al., 1983), persistent low-threshold spiking interneurons (Bevan et al., 1998), and fast-spiking interneurons (FSIs) (Berke et al., 2004; Lansink et al., 2010; Morra et al., 2010). In behaving animals most of the neurons that show a correlation with operant tasks are believed to be MSNs, whereas FSIs do not show precise spike synchronization with different events in operant tasks (Berke, 2008) but may control LFP oscillatory power (Berke et al., 2004; Berke, 2005; van der Meer et al., 2010). Given these differential characteristics between MSNs and FSIs, it was necessary to distinguish these populations of neurons. A total of 133 neurons (mean number of neurons per rat = 26.6; SEM = 4.03) were recorded in the NAc during the progressive ratio task; of those neurons, 7 (5.26%) were classified as FSI, 125 (93.98%) as MSN and 1 (0.75%) as other. Neuronal types were characterized based on waveform shape (Fig. 2a, Table 1); putative FSIs were identified as having a valley full width at half maximum (FWHM) of 150 µs or less and a firing rate of more than 7.5 Hz. The rest were treated as putative medium spiny neurons. The neuron with a valley width of more than 500 µs was classified as other. Representative FSI and MSN waveform shapes are presented in Figures 2b and 2c, respectively.
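The waveform criteria above amount to a simple decision rule, sketched here for illustration. The function name and argument names are hypothetical, and the order in which the two criteria are checked is an assumption; the thresholds are the ones stated in the text.

```python
def classify_unit(valley_fwhm_us: float, firing_rate_hz: float) -> str:
    """Putative cell type from valley width (microseconds) and firing rate (Hz)."""
    if valley_fwhm_us > 500:
        return "other"  # unusually wide valley
    if valley_fwhm_us <= 150 and firing_rate_hz > 7.5:
        return "FSI"    # narrow valley and high firing rate
    return "MSN"        # everything else is treated as a putative MSN


if __name__ == "__main__":
    # Proportions reported in the text, recomputed from the counts.
    total = 133
    for label, count in (("FSI", 7), ("MSN", 125), ("other", 1)):
        print(label, round(100 * count / total, 2))
```

Note that a narrow-valley unit firing at 7.5 Hz or below is treated as a putative MSN under this rule, since both FSI criteria must hold.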
Nicola and Deadwyler (2000) showed that the firing rate of neurons in the NAc can be broadly classified into lever press-excited (LPE) and lever press-inhibited (LPI) when subjects are working for cocaine under a progressive ratio schedule. We aimed to verify whether this classification applied to rats working for food pellets and whether these distinctive firing rates encompassed a CB1R-mediated component. Analysis of integrated firing rate histograms identified 21 neurons (16%) as LPE and 14 neurons (11%) as LPI, whereas the remaining neurons did not show any recognizable firing patterns during the task, consistent with previous studies (Carelli and Deadwyler, 1994; Carelli et al., 2000). None of the neurons showing LPE or LPI firing patterns were identified as FSIs. The firing rates of LPE cells increased gradually across trials, with increases in firing before the onset of each bout of lever pressing (Fig. 3). The highest overall firing rate was observed at breakpoint. During vehicle, peak firing rates for LPE neurons averaged 1.99 Hz (S.E.M. = 0.59); following rimonabant this peak did not change significantly (2.16 Hz; S.E.M. = 0.78; t(20) = −0.61; p = 0.54). In contrast, the firing rates of LPI cells decreased across trials. Specifically, these neurons showed a decrease in activity at the beginning of the session which was sustained for the duration of responding (Fig. 4). Rimonabant significantly reduced activity prior to the beginning of the session compared to vehicle, from 1.43 Hz (S.E.M. = 0.29) to 1.08 Hz (S.E.M. = 0.24) (t(13) = −3.42; p = 0.004).
In order to analyze time-locked firing patterns within the task, we examined activity at reward delivery and its associated cues, and at cue onset at the beginning of each trial. Table 2 shows the percentage of neurons showing different response types to these events. Of the 133 neurons recorded during the task, 78 (58.64%) exhibited no patterned activity. The remaining 55 cells (41.35%), all MSNs, exhibited excitatory or inhibitory firing patterns relative to cue presentation or reward delivery. Specifically, 36 (27%) neurons exhibited patterned activity relative to reward delivery; of these, 27 (20%) showed a transient increase in firing rate, 5 (3.75%) showed a sustained increase, and 4 (3%) showed a sustained inhibition. Nineteen (14.3%) neurons exhibited patterned activity relative to cue presentation; of these, 10 (7.5%) showed a transient increase in firing rate, 4 (3%) showed a sustained increase, and 5 (3.75%) showed a sustained inhibition.
A three dimensional representation of the perievent rasters of representative neurons responsive to reward delivery and its associated cues following vehicle and rimonabant administration, along with the normalized population activity graph are shown in figure 5. Neural activity is depicted 8 seconds before and after reward delivery, which is set at 0 across all trials. The translucent rectangle marks the injection of rimonabant. Figure 5a depicts a representative neuron that showed a transient increase in firing during vehicle at reward delivery and a profound decrease in this transient activity was observed following rimonabant. The decrease in transient activity is evident when the mean normalized firing rate from the 27 neurons that displayed phasic excitation upon reward delivery is plotted (Fig. 5b). Rimonabant significantly attenuated the peak observed when contrasted against vehicle (t(26) =3.510; p= 0.0008).
Figure 5c shows a representative neuron that displayed a sustained increase in firing rate after reward delivery and its associated cues, and the effect of rimonabant on this particular type of patterned discharge (Fig. 5d). Paired t-tests carried out on the collapsed data before and after reward delivery show a significant difference in firing rate before versus after the reward under vehicle (t(4) = −9.460; p = 0.0006), whereas following rimonabant this difference was not significant (t(4) = −2.262; p = 0.086). Moreover, when firing rates after reward delivery are compared between treatments, a statistical difference (t(4) = 10.441; p = 0.0004) is observed, confirming that rimonabant significantly reduced the post-reinforcement excitation.
Finally, a subset of neurons showed a time-locked decrease in firing at reward delivery and its associated cues. In these neurons rimonabant failed to significantly alter the observed decrease (Fig. 5e shows a representative neuron). The firing rate after reward delivery for this population of cells was not significantly different between vehicle and rimonabant (Fig. 5f; t(3) = −2.177; p = 0.117).
Three-dimensional representations of the perievent rasters of representative neurons responsive to the cue that indicates the beginning of the trial during vehicle and rimonabant treatment, along with graphs of the normalized (Z-score) population neural activity, are shown in Figure 6. As before, the translucent rectangle in the 3D graphs marks the injection of rimonabant. Figure 6a shows a representative neuron displaying a brief increase in firing at cue onset. CB1R blockade significantly attenuated this pattern compared to vehicle (Fig. 6b; t(9) = 6.010; p = 0.0002).
CB1R blockade also disrupted neurons showing a sustained increase in firing rate after cue onset (Figs. 6c and 6d; t(3) = 3.365; p = 0.0435). This significant effect indicates that the sustained firing rate obtained under rimonabant is lower than that observed following vehicle administration. Lastly, rimonabant treatment altered the firing rate decrease observed after cue onset compared to vehicle, such that firing activity no longer differed before and after cue onset (t(4) = −1.310; p = 0.130). This effect is evident in the representative example (Fig. 6e) as well as in the mean normalized firing rate graph for the pooled population data (t(4) = −13.4737; p = 0.0001; Fig. 6f).
Functional connectivity among simultaneously recorded neurons was analyzed via pair-wise cross-correlations (Fig. 7). For each subject, each of the recorded neurons was used as a reference and cross-correlated with the rest of the neurons (all cell-pair combinations were analyzed). During vehicle there were 29 cross-correlograms with a peak (mean Z score peak= 4.85; SEM= 0.57) and 6 with a trough at time 0 (mean Z score trough = −4.38; SEM= 0.42). Following rimonabant administration there was a significant reduction in the peak of the population cross-correlogram (mean Z score peak = 2.54 SEM= 0.37; Fig. 7a; t(28)= 4.55 p= 0.0001). A reduction in the population trough was also observed following rimonabant (mean Z score trough = −2.43 SEM= 1.031; Fig. 7b); but was not statistically reliable (t(5)= 0.73; p = 0.49). None of the cross-correlograms analyzed that utilized FSI neurons as a reference yielded significant cross-correlations with either MSNs or FSIs.
LFP oscillations are believed to reflect the organization and synchronized activity between different brain areas related to different cognitive and motor processes (Buzsáki, 2006; Sanes and Donoghue, 1993). Here we measured LFPs with the same electrodes used to record action potentials. We constructed session-wide spectrograms to analyze changes in LFPs before and after the administration of rimonabant (Fig. 8a). Rimonabant changed the frequency at which gamma power peaked and attenuated overall power in the gamma band. Specifically, the peak of the gamma band shifted from 64.11 Hz (S.E.M.= 2.38) during vehicle to 55.84 Hz (S.E.M.= 1.49) following rimonabant (t(4)= 2.99 p=0.0402). Figure 8b shows the distribution of power across all spectral bandwidths during the task obtained after vehicle (blue) and rimonabant (red) administration. The only significant changes were observed in the gamma bandwidth (55–100 Hz; t(4)= 2.93 p=0.0424).
Gamma oscillations are hypothesized to couple to FSIs in the striatum (Berke, 2005; Kalenscher et al., 2008; van der Meer, 2009). We calculated spike-field coherence for these neurons to determine whether LFPs were partially affected by the CB1R antagonist. We found, as previously reported (Berke, 2009; van der Meer and Redish, 2009), that there were two groups of FSIs: one changed its firing rate differentially with “gamma-50” (45–55 Hz) power and the other with “gamma-80” (70–85 Hz) power (Figs. 8c–d). Following rimonabant administration, FSIs lost their synchronous firing to these frequency bands and became coherent with multiple frequencies.
Perievent spectrograms showing the frequencies of the low and high gamma spectral bands were constructed around cue presentation (Figs. 9a–d) and reward delivery (Figs. 10a–d). The different bands were studied at cue onset for either vehicle (Fig. 9a) or rimonabant (Fig. 9c). Baseline-subtracted spectrograms depict time-locked changes in power at the different spectral bands during vehicle (Fig. 9b) or rimonabant (Fig. 9d). Mean “gamma-50” (45–55 Hz) power Z-scores increased slightly following cue onset during vehicle. Under rimonabant this increase disappears and instead a trough in “gamma-50” power is observed (Fig. 9e). This decrease is confirmed when the Z values of the troughs observed after cue onset, from the baseline-subtracted powers, are contrasted (t(4) = 3.34; p = 0.002).
Figure 10a depicts spectrograms obtained during vehicle administration at reward delivery and its associated cues. A significant increase in power is observed around high gamma immediately following reward delivery (t(4)=3.20 p=0.03; Fig. 10b). Rimonabant administration (Fig. 10c), significantly reduced high gamma after reward delivery and its associated cues (t(4)=2.31 p=0.02, Fig. 10d). Mean “gamma-80” (70–85 Hz) power Z-scores observed are transiently elevated following vehicle and this increase is completely abolished by rimonabant administration (Fig. 10e). Analysis of gamma power peaks, obtained from the baseline subtracted powers, confirms a highly significant statistical difference between vehicle and rimonabant (t(4)= 3.80; p=0.01).
The present research replicates and extends knowledge from previous studies showing that blockade of CB1Rs potently reduces breakpoints in the PR schedule (Ward and Dykstra, 2005; Maccioni et al., 2008). We further find that this behavioral effect is accompanied by profound changes in neural encoding related to food delivery, its associated cues and to cue onset in the NAc.
Our results confirm previous categorizations of session-wide NAc activity during PR (LPE or LPI cells; Nicola and Deadwyler, 2000). The reduction in breakpoints observed following CB1R blockade was accompanied by a modest effect on LPE cells and a reduction of the initial peak of LPI cells. Nicola and Deadwyler (2000) suggested that the increase in firing rate observed in LPE neurons is correlated with the abolishment of the operant response. Their interpretation fits our data in that the breakpoint was followed by the highest increase in firing rate, during which responding was absent. If the firing increase leads to response cessation, then rimonabant should not affect this firing increase, as our data demonstrate. Nicola and Deadwyler (2000) proposed that the firing rate increase observed in LPE neurons is controlled by DA release within the NAc due to fluctuating intracerebral cocaine concentrations. However, our recordings suggest that LPE patterns can emerge in the absence of cocaine. The intrinsic increase in the inter-reward interval brought about by the PR schedule may be sufficient to produce an increase in DA levels, as has recently been shown in voltammetric experiments (Wanat et al., 2010); this would allow LPE cells to change their activity and ultimately lead to decreased responding. It is likely that the decrease in breakpoint observed was in part a product of the inhibitory effect that CB1R blockade has on DA release (Cheer et al., 2007; Melis et al., 2007; Trujillo-Pisanty et al., 2011; Oleson et al., 2012). CB1R antagonists can decrease DA release by affecting the binding of eCBs to CB1Rs present on glutamatergic terminals located in several nuclei that send glutamatergic projections to the VTA (Geisler, Derst, Veh and Zahm, 2007) and control its bursting activity (Riegel and Lupica, 2004), or by interfering with eCB binding to presynaptic CB1Rs on GABA terminals in the VTA (Cohen et al., 2002; Szabo et al., 2002; Lupica and Riegel, 2005).
The significant reduction observed in the initial peak on LPI cells after rimonabant administration could underlie the decreased reward density that is observed at the beginning of the session, and this could be tied to a reduction of DA release following reward delivery.
We observed patterned activity to specific events in the task (reward delivery and the cue that signals the beginning of the trial) as previously reported (Nicola and Deadwyler, 2000). In particular, transient and long lasting excitations as well as inhibitions that coincided with reward delivery and cue onset were observed. Such patterns of activation have been described before in rodents performing different appetitively motivated tasks (Hollander et al., 2002; Nicola et al., 2004b, 2004a; Roesch et al., 2009; Cacciapaglia et al., 2011). The neural excitation observed following reward delivery is believed to encode the unconditional properties of the reward; in particular short phasic responses have been associated with reward characteristics such as palatability and magnitude (Cromwell and Schultz, 2003; Taha and Fields, 2005). Functionally, this excitation has been suggested to provide information about the motivational significance of reward to other circuits (Apicella, Ljungberg, Scarnati, & Schultz, 1991). On the other hand, it has been suggested that the inhibition observed during reward delivery facilitates disinhibition of target neurons in motor areas related to consummatory behaviors (Roitman et al., 2005; Nicola et al., 2004b; Taha and Fields, 2006; Krause et al., 2010). The firing changes correlated with cue presentation have also been linked to different functional aspects of reward-directed behavior. The transient increase in firing rate observed with cue onset is hypothesized to encode the learned significance of the reward and the potential consequences of the behavior (Hassani et al., 2001; Cromwell and Schultz, 2003; Roesch et al., 2009). Therefore, these neurons are thought to encode incentive information (Smith et al., 2011; Nicola et al., 2004a). Sustained increases in firing rate have been suggested to serve in maintaining a representation of the anticipated reward (Roesch et al., 2009). 
Time-locked inhibitions, such as those reported here, have been proposed to gate goal-directed behaviors. Specifically, the activity of a subpopulation of NAc neurons is believed to tonically inhibit behavior and suppression of activity in this particular population is necessary for operant behavior to occur (Taha and Fields, 2006; Krause et al., 2010).
Blockade of CB1Rs produced profound changes in the firing patterns associated with each of these functional roles. Rimonabant decreased the excitation peaks related to the unconditional properties of the reward as well as of the magnitude of peaks related to incentive salience. It also affected neural activation related to the representation of the anticipated reward, and produced a change in inhibitory signaling related to aspects of consummatory and reward directed-behavior. The blockade of eCB signaling by rimonabant was not restricted to individual neural firing patterns. It significantly weakened the connection strength between neighboring neurons within the NAc, specifically those showing positive cross-correlations. The reduction in the strength of functional connectivity could underlie the alteration of firing patterns, probably through an increase in neural noise among NAc ensembles. The lack of negative cross-correlation between FSI and MSNs has been previously reported (Berke, 2010) and this could mean that not all recorded neurons are connected (this may also be a result of our limited data set for FSIs).
Gamma oscillations are believed to serve as temporal structures that allow the alignment of spike trains within and between brain areas (Buzsáki, 2006; Kalenscher et al., 2010) and thus facilitate the integration of different afferents into NAc firing patterns (Kalenscher et al., 2010). LFPs can be the product of electrical field changes produced in brain areas other than the one in which the electrodes are located (Berke, 2005). However, and in accordance with prior studies, our data show that LFPs recorded from the NAc are in part controlled by FSIs (Berke, 2005; van der Meer and Redish, 2009; Kalenscher et al., 2010; Morra et al., 2012). “Gamma-80” has been previously linked to reward delivery (Berke, 2004; van der Meer and Redish, 2009), whereas “gamma-50” has been linked, among other functions, to movement initiation (Masimore et al., 2004; for a review see van der Meer et al., 2010). CB1R blockade altered the synchronicity between FSIs and these bands, expanding the range of frequencies at which they entrained. Consequently, the communication among different brain structures and neural ensembles that is coordinated by these gamma bands, and that possibly facilitates the organization and control of behavior, was disrupted. This disruption was evident in that rimonabant, along with reducing breakpoints, produced a decrease in gamma-80 coincident with reward delivery and a decrease in gamma-50 coincident with the cue signaling the start of the trial.
The changes in activation patterns and connection strength shown here reveal the heterogeneity of neural ensembles contained in the NAc and shed light on the surprising uniformity of effects that the blockade of CB1Rs has on those ensembles. eCBs work as retrograde messengers and contribute to short-term and long-term modulation of synaptic transmission via presynaptic mechanisms. Blockade of CB1Rs could have affected neural encoding via perturbations of glutamatergic or GABAergic signaling. Glutamatergic transmission at NAc synapses formed by afferents from the amygdala, the cortex, and the hippocampus may have been compromised when responding occurred under rimonabant (Robbe, 2002; Domenici, 2006). Likewise, blockade of CB1Rs could have prevented eCB-mediated reduction of GABA release in the NAc (Hoffman and Lupica, 2001). The eCB-mediated reduction of GABAergic inhibition is believed to facilitate the responsiveness to excitatory cortical inputs and increase NAc output pathways (Hoffman and Lupica, 2001). Interference with eCB signaling by rimonabant would probably have disrupted this mechanism, making NAc neurons less responsive to cortical inputs. Indirectly, the blockade of CB1R signaling on excitatory and inhibitory synapses alters the excitability of DA neurons in the VTA and DA release in the NAc, which in turn affects the synaptic efficacy of MSNs (O'Donnell and Grace, 1995, 1998). Accordingly, previous research has shown that inactivation of VTA DA cell bodies or injections of D1 or D2 receptor antagonists into the NAc significantly reduced the peak and inhibition firing patterns associated with incentive cues (Yun et al., 2004; Cheer et al., 2007) as well as rewards (Cacciapaglia et al., 2011).
The present results demonstrate that the eCB system plays a fundamental role in the encoding of reward and incentive cues at the level of the NAc by allowing neural synchrony and rhythmicity patterns to emerge during reinforcement processing. Furthermore, our findings suggest that eCBs are a highly conserved regulatory system that fine-tunes the function of limbic and motor inputs to organize motivated behaviors.
This research was supported by NIH grants DA022340 and DA025890 to JFC and a FRSQ fellowship to GH. The authors would like to thank Mr. Mike Beckert for expert technical assistance and Drs Joshua Berke, David Redish, Matthijs Van Der Meer and Carien Lansink for valuable comments.
Conflict of interest: none.