We analyzed the activity of DRN neurons using two tasks with biased reward schedules: a memory-guided saccade task (1DR-MGS; 17 neurons from monkey E, 67 from monkey L) and a visually-guided saccade task (1DR-VGS; 96 neurons from monkey E, 71 from monkey L). Because the biased reward schedule was introduced in blocks, on each trial the animal could predict the reward value based on the location of the target cue. Indeed, saccadic reaction times were significantly shorter for large-reward than small-reward trials in both monkeys in both tasks (Supplementary Table 1).
The electrode was directed to the DRN through a recording chamber which was implanted over the midline of the parietal cortex. During the initial survey of DRN, the following brain structures were identified and used as landmarks: superior colliculus with receptive fields in the upper visual field with large eccentricities, inferior colliculus with auditory responses, mesencephalic trigeminal nucleus with responses to mouth movements, the locus coeruleus with phasic responses to salient sensory stimuli, and trochlear nucleus with increased firing during downward eye movements. We analyzed neurons located 0–2 mm anterior to the trochlear nucleus.
Traditionally, it has been accepted that serotonin neurons fire broad spikes spontaneously in a slow and regular ‘clock-like’ firing pattern (Aghajanian et al., 1978; Sawyer et al., 1985; Jacobs and Fornal, 1991; Hajos et al., 1998). Therefore, we computed the baseline firing rate, spike duration, and regularity of sampled neurons (see Methods). The baseline firing rate across neurons ranged from 0 to 22 spikes/s with a mean of 4.9 spikes/s (SD 4.3, median 4.0). The spike duration ranged from 1.0 ms to 3.7 ms (mean 2.2 ms, SD 0.58 ms). Different methods have been used to quantify the regularity of neuronal firing (Shinomoto et al., 2003). In this paper we used the irregularity metric ‘IR’, which was the median value of the differences between adjacent inter-spike intervals during the whole task period (Davies et al., 2006; see Methods). Smaller IR values indicate more regular firing. There was no significant difference in IR value between 1DR-MGS and 1DR-VGS (Wilcoxon signed rank test, p=0.79, Supplementary Fig 1A). The IR values for the DRN neurons we sampled were significantly smaller (i.e., more regular) than those for putative projection neurons in the caudate nucleus (p<0.0001) and putative dopamine neurons in the substantia nigra pars compacta (p=0.02, Supplementary Fig 1B). Among DRN neurons, there was no significant correlation between IR values and spike duration (p=0.4, Spearman rank correlation) or baseline firing rate (p=0.05).
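The IR metric described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: it assumes that the "difference" between adjacent inter-spike intervals is taken on a logarithmic scale, |log(ISI_i / ISI_{i+1})|, and the function name and the two spike trains are hypothetical. The Methods define the exact form.

```python
import math
from statistics import median

def irregularity(spike_times):
    """Median of |log(ISI_i / ISI_{i+1})| over adjacent inter-spike intervals.

    Assumption: the log-ratio form of the difference; spike times are in
    seconds and strictly increasing.
    """
    isis = [t1 - t0 for t0, t1 in zip(spike_times, spike_times[1:])]
    if len(isis) < 2:
        raise ValueError("need at least three spikes to compare adjacent ISIs")
    diffs = [abs(math.log(a / b)) for a, b in zip(isis, isis[1:])]
    return median(diffs)

# Hypothetical spike trains (seconds): one clock-like, one bursty.
regular = [0.2 * i for i in range(20)]
bursty = [0.0, 0.05, 0.1, 0.9, 0.95, 1.0, 1.8, 1.85, 1.9]

ir_regular = irregularity(regular)  # close to 0 for clock-like firing
ir_bursty = irregularity(bursty)    # larger for irregular firing
```

Under this definition a perfectly periodic train gives an IR near zero, which matches the statement that smaller IR values indicate more regular firing.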
Reward-dependent modulations in DRN neuronal activity
DRN neurons exhibited task-related modulations with distinctive features during the performance of the 1DR-MGS. Most notably, DRN neurons often showed reward-dependent modulations in activity after reward onset. One representative neuron was characterized by long spike duration (2.76 ms), low baseline activity (2 Hz), and regular firing (median IR 0.31). The neuron exhibited an increase in activity after the onset of the fixation point (FPon) followed by regular and tonic firing until reward onset. The activity further increased after the onset of a large reward but ceased after the onset of a small reward. This modulation occurred regardless of the direction of the saccade, and lasted for 860 ms after reward onset (permutation test, p<0.05, see Methods). Such reward-dependent modulations during the post-reward period lasted longer for other DRN neurons. For example, a second neuron was also characterized by long spike duration (2.6 ms), low baseline activity (6 Hz), and regular firing pattern (median IR 0.50). For both saccade directions, there was a long-lasting decrease in activity starting 400 ms after the onset of large reward (permutation test, p<0.05). The activity of a third neuron (baseline firing rate 3 Hz, spike duration 1.9 ms, IR=0.47) was significantly stronger for large- than small-reward trials from 800 ms to 1500 ms after reward onset. A fourth neuron (baseline firing rate of 10 Hz, spike duration 1.4 ms, IR=0.48) also exhibited a long-lasting reward effect starting around the time of reward offset. Note that in all of these examples the post-reward modulations of activity disappeared before the next trial started (Supplementary Fig 2).
Fig. 2 Activity of four neurons (A–D) in the dorsal raphe nucleus (DRN) in 1DR-MGS task. For each neuron, action potentials are shown by raster plots in chronological order of trials, separately for leftward and rightward saccades.
In some neurons reward-dependent modulations were also observed before reward onset during the delay period. One neuron exhibited stronger activity on small-reward than large-reward trials (p=0.8×10⁻⁶). Another neuron also exhibited stronger activity on small- than large-reward trials, but only when leftward saccades were required (two-way ANOVA, reward effect, p=0.005; interaction, p=0.02). Such direction selectivity, however, was relatively rare among DRN neurons.
Reward-dependent modulations in activity during the delay and the post-reward periods, as seen in the example neurons, were commonly observed in the population of DRN neurons. We examined the time course of these modulations using receiver operating characteristic (ROC) analysis, comparing each neuron’s firing rate for each task condition to the baseline activity during the 400 ms before fixation onset. During the delay and post-reward periods of the task, many DRN neurons had tonic increases in activity (shown in warm colors) or decreases in activity (cool colors).
We examined the time course of reward selectivity using ROC analysis to compare each neuron’s activity between large- and small-reward trials, and applied a similar analysis to direction selectivity, comparing contraversive- and ipsiversive-saccade trials. The reward effect was present in many neurons both before reward (mainly during the delay period) and after reward, while direction effects were uncommon.
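The ROC comparison between two sets of trials can be sketched as below. This is an illustrative computation only: the area under the ROC curve is obtained here from its pairwise-comparison form (the probability that a rate drawn from one condition exceeds a rate drawn from the other, counting ties as one half), and the function name and firing rates are hypothetical.

```python
def roc_area(rates_a, rates_b):
    """Area under the ROC curve for condition A versus condition B.

    0.5 means no discrimination; values above 0.5 mean rates in A tend to
    be higher than in B, values below 0.5 the opposite.
    """
    greater = 0
    ties = 0
    for a in rates_a:
        for b in rates_b:
            if a > b:
                greater += 1
            elif a == b:
                ties += 1
    return (greater + 0.5 * ties) / (len(rates_a) * len(rates_b))

# Hypothetical single-trial firing rates (spikes/s) in one time window.
large_reward = [8.0, 10.0, 7.5, 9.0]
small_reward = [3.0, 5.0, 7.5, 2.0]

area = roc_area(large_reward, small_reward)       # 0.96875
area_reversed = roc_area(small_reward, large_reward)  # 0.03125
```

Applying the same function in successive time windows, with the pre-fixation baseline as one of the two conditions, would give a time course of the kind described in the text.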
The data reveal a notable difference in the reward-dependent modulations between the pre-reward period and the post-reward period. For each neuron, the changes in activity during the pre-reward period, compared with the baseline activity, tended to be in the same direction on both large- and small-reward trials. In contrast, the changes in activity during the post-reward period, compared with the baseline activity, tended to be in opposite directions. For example, in one of the example neurons, the pre-reward activity increased compared with the baseline on both large- and small-reward trials, whereas the post-reward activity increased on large-reward trials but was inhibited on small-reward trials.
The main cause of the reward effect during the pre-reward period was that the changes in activity tended to be stronger on large-reward trials than on small-reward trials, which is illustrated by the greater intensity of colors for large-reward trials than for small-reward trials. To quantify the trend, we computed the pre-reward activity as the firing rate during the 400 ms after target onset minus the baseline firing rate. Among 22 neurons (22/84, 26%) that showed significant reward effects during the pre-reward period, 20 neurons exhibited significant activity changes on large-reward trials whereas only 10 neurons did on small-reward trials. This tendency is illustrated by a wider distribution of the pre-reward activity on large-reward trials than on small-reward trials (marginal histograms). When the firing rate in the pre-reward period was compared between the reward conditions, 16 neurons showed higher firing rates on the large-reward trials than on the small-reward trials; the other 6 neurons showed the opposite pattern (two-way ANOVA, p<0.01).
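The pre-reward activity measure (firing rate in the 400 ms after target onset minus the baseline rate in the 400 ms before fixation onset) can be sketched for a single trial as follows. The function names, event times, and spike times are hypothetical, and in practice the value would be averaged across trials of each reward condition.

```python
def firing_rate(spike_times, start, duration):
    """Spikes per second in the window [start, start + duration)."""
    count = sum(1 for t in spike_times if start <= t < start + duration)
    return count / duration

def pre_reward_activity(spike_times, fixation_on, target_on, window=0.4):
    """Rate in the window after target onset minus the pre-fixation baseline."""
    baseline = firing_rate(spike_times, fixation_on - window, window)
    response = firing_rate(spike_times, target_on, window)
    return response - baseline

# Hypothetical trial: times in seconds from trial start.
spikes = [0.7, 0.9, 2.55, 2.6, 2.7, 2.8, 2.85, 3.2]
change = pre_reward_activity(spikes, fixation_on=1.0, target_on=2.5)
# baseline: 2 spikes / 0.4 s = 5 spikes/s; response: 5 spikes / 0.4 s = 12.5 spikes/s
```

A positive value indicates an increase from baseline and a negative value a decrease, which is the quantity plotted for each reward condition in the scatter plot.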
Fig. 4 Comparison of DRN neuronal activity between large-reward trials and small-reward trials. In the scatter plot, the activity of each neuron in the two reward conditions is expressed as the change in firing rate from the pre-fixation period (duration: 400 ms)
Reward-dependent modulations were clearer and more prevalent in post-reward activity. Among 42 neurons (42/84, 50%) that showed significant reward effects during the post-reward period, 24 neurons showed changes in activity in opposite directions between large- and small-reward trials (data points in the upper-left and lower-right quadrants of the scatter plot). When post-reward activity was compared between the reward conditions, 18 neurons showed a large-reward preference (i.e., higher firing rates on large-reward trials than on small-reward trials); the other 24 neurons showed a small-reward preference (two-way ANOVA, p<0.01).
Some DRN neurons also exhibited changes in activity (1) after fixation onset: increases for 23/84 (27.4%) or decreases for 12/84 (14.3%) neurons (comparison between activity during 400 ms before and 200 ms after fixation onset, Mann-Whitney U test, p<0.01), and (2) during the later fixation period: increases for 17/84 (20.2%) or decreases for 20/84 (23.8%) neurons (comparison between activity during 400 ms before fixation onset and 800–400 ms before target onset, p<0.01).
Comparison of reward-dependent modulations between DRN and dopamine neurons
To understand the functional significance of the reward-related activity of DRN neurons, we compared it to the activity of dopamine neurons in the same two monkeys. For this purpose, we used a visually-guided version of the biased-reward saccade task (‘1DR-VGS’). We recorded from 167 DRN neurons (96 from monkey E, 71 from monkey L) and 64 dopamine neurons (20 from monkey E, 44 from monkey L).
The characteristics of the reward-dependent modulations in the activity of DRN neurons in 1DR-VGS were similar to those found in 1DR-MGS. Thus, many DRN neurons exhibited increases or decreases in tonic activity (usually increases) after the onset of the fixation point. These changes became more evident during the pre-reward period, after the onset of the saccade target, which indicated the size of the upcoming reward. As in 1DR-MGS, changes in pre-reward activity occurred in the same direction on both large- and small-reward trials, but tended to be greater on large-reward trials, thus leading to differences in activity between the two reward conditions. Among 44 neurons (44/167, 26%) that showed significant reward effects during the pre-reward period, 34 exhibited significant activity changes on large-reward trials (29 increases and 5 decreases) whereas only 15 did on small-reward trials (13 increases and 2 decreases).
Comparison of reward-dependent activity between DRN neurons and dopamine neurons. Data were obtained from 167 DRN neurons and 64 dopamine neurons using 1DR-VGS.
Fig. 6 Contrasting effects of expected and received rewards on DRN neurons and dopamine neurons. (E and F) Relationship of reward preference between the pre-reward and the post-reward periods.
In the post-reward period, the same DRN neurons tended to exhibit opposite changes in activity. Among 74 neurons (74/167, 44%) that showed significant reward effects, 40 neurons changed their activity in opposite directions on large- and small-reward trials. About half (n=36) showed a large-reward preference, while the other 38 neurons showed a small-reward preference (two-way ANOVA, p<0.01). The direction of the reward preference was not always the same between the pre- and post-reward periods.
The activity pattern of dopamine neurons was distinctly different from that of DRN neurons. Dopamine neurons exhibited a phasic increase in activity after fixation onset, as reported for 1DR-MGS (Takikawa et al., 2004). They also exhibited a phasic increase in activity after the onset of the target indicating an upcoming large reward and a phasic decrease in activity after the onset of the target indicating an upcoming small reward, leading to a strong and transient large-reward preference in the pre-reward period.
In contrast to the pre-reward period, changes in the post-reward period were less clear in dopamine neurons. Small increases in activity were observed in some neurons after a large reward, leading to weak reward effects. Whereas 53 of 167 DRN neurons (31.7%) exhibited significant activity modulation long after reward (600–1000 ms after reward onset, sign test, p<0.01), only 5 of 64 dopamine neurons (7.8%) did so. Thus the duration of the post-reward activity in dopamine neurons was shorter than that in DRN neurons (chi-square test, p<0.0001). Overall, most dopamine neurons showed a large-reward preference in the pre-reward period and some did so in the post-reward period.
We compared the proportions of neurons that exhibited significant reward and direction effects between DRN and dopamine neurons. Statistical significance was determined using a two-way ANOVA for each task period (p<0.01). In both DRN and dopamine neurons, reward effects were more prevalent than direction effects. For DRN neurons, the large-reward preference was more common than the small-reward preference in the pre-reward period, while the two preferences were equally common in the post-reward period. The reward effect was more robust among dopamine neurons. They predominantly showed the large-reward preference in the pre-reward period and less commonly in the post-reward period. The ratio of large- vs. small-reward preference was significantly different between DRN neurons and dopamine neurons (chi-square test, p<0.0001 for both pre- and post-reward periods).
Fig. 7 Similarities and differences between DRN neurons and dopamine neurons. (A) Proportions of DRN neurons that showed significant reward-dependent modulations in activity during the pre-reward and post-reward periods in 1DR-VGS (two-way ANOVA, p<0.01).
Changes of pre- and post-reward activity after the reversal of position-reward contingency
In both of our tasks, the contingency between target position and reward value was fixed during one block of trials, but was then reversed with no external cue. This allowed us to examine how the monkey’s performance and neuronal activity changed adaptively to the new position-reward contingency. As in previous studies from our laboratory, the saccadic reaction time changed quickly after the reversal of the position-reward contingency (Lauwereyns et al., 2002; Watanabe and Hikosaka, 2005).
We therefore examined the time course of the changes in the activity of DRN and dopamine neurons. We computed the mean normalized firing rates for the pre-reward period (0–400 ms after target onset) and the post-reward period (400–800 ms after reward onset for DRN neurons; 0–400 ms after reward onset for dopamine neurons) as a function of the trial number after the reversal. To assess the speed of activity change after the reversal, we tested whether the neuronal activity on each trial number was significantly different from the mean activity on the last five trials of the new block (Mann-Whitney U test, p<0.01). This analysis was restricted to neurons whose firing rates were significantly modulated by reward value (two-way ANOVA, p<0.01), and was performed separately for the pre- and post-reward periods.
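The alignment step of this analysis, averaging activity as a function of trial number after the reversal and obtaining the late-block reference level, can be sketched as follows. The data layout (one list of normalized rates per block, starting at the first post-reversal trial), the function names, and the values are hypothetical, and the Mann-Whitney U test itself is omitted.

```python
from statistics import mean

def rate_by_trial_after_reversal(blocks, n_trials):
    """Mean normalized rate for trial 1..n_trials after a reversal.

    Assumption: every block contains at least n_trials trials.
    """
    return [mean(block[k] for block in blocks) for k in range(n_trials)]

def late_block_reference(blocks, n_last=5):
    """Mean rate over the last n_last trials of each block, pooled."""
    return mean(rate for block in blocks for rate in block[-n_last:])

# Hypothetical large-to-small reversals: rates fall over the first few trials.
blocks = [
    [1.0, 0.4, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2],
    [0.8, 0.6, 0.2, 0.0, 0.2, 0.2, 0.2, 0.2],
]
time_course = rate_by_trial_after_reversal(blocks, n_trials=4)
reference = late_block_reference(blocks)
```

Comparing each entry of the time course against the late-block reference gives the trial at which activity first reaches its new steady level, which is the quantity the statistical test addresses.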
The changes in pre-reward activity after the contingency reversal were qualitatively similar for DRN neurons and dopamine neurons. In both DRN neurons and dopamine neurons, the activity on the first trial after the contingency reversal was not different from that on the last trial of the block before the reversal. This is not surprising because the changed reward had not yet been delivered when the activity occurred. Interestingly, however, the change in activity of DRN neurons was delayed by one trial after the reversal from large rewards to small rewards, unlike that of dopamine neurons.
The difference between DRN neurons and dopamine neurons was clearer in the post-reward period. Unlike in the pre-reward period, the changed reward had already been delivered on the first trial after the contingency reversal. The activity of DRN neurons followed the size of the reward faithfully. In contrast, the activity of dopamine neurons only changed transiently on the first trial, and thereafter returned to a level close to baseline activity. Specifically, dopamine neurons decreased their activity on large-to-small reward reversals and increased their activity on small-to-large reversals. These transient changes in activity represent the ‘reward prediction error’, which is the difference between the expected reward value (e.g., small reward) and the actual reward value (e.g., large reward). This pattern of dopamine neuron activity has been shown previously using other tasks (Hollerman and Schultz, 1998; Takikawa et al., 2004). The results thus indicate that DRN neurons encode the actual reward value, not the reward prediction error.
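The distinction between a signal that follows the actual reward value and one that follows the reward prediction error can be illustrated with a toy calculation. This is a conceptual sketch only: the reward magnitudes are arbitrary, and the assumption that the expectation is fully updated after a single trial is ours, made to keep the example minimal.

```python
def simulate_after_reversal(rewards, initial_expectation):
    """Return (value signal, prediction-error signal) across trials.

    Assumption: the expected reward is replaced by the reward just
    received, i.e. learning is complete after one trial.
    """
    expected = initial_expectation
    value_signal = []
    error_signal = []
    for reward in rewards:
        value_signal.append(reward)             # tracks the delivered reward
        error_signal.append(reward - expected)  # actual minus expected
        expected = reward
    return value_signal, error_signal

# Small-to-large reversal: the animal expects 0.2 but now receives 1.0.
value, error = simulate_after_reversal([1.0, 1.0, 1.0, 1.0], initial_expectation=0.2)
```

In this toy case the value signal stays at the new reward size on every trial, whereas the prediction-error signal is non-zero only on the first trial after the reversal, mirroring the sustained versus transient post-reward changes described for DRN and dopamine neurons.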
Relationship between the firing pattern and the reward-effect of DRN neurons
In the present experiment we studied all well-isolated neurons in the DRN whose activity changed during saccade tasks. It has traditionally been accepted that serotonin neurons in the DRN show slow and regular firing with broad spikes (Aghajanian et al., 1978; Sawyer et al., 1985; Jacobs and Fornal, 1991; Hajos et al., 1998), although recent studies may not agree with this characterization (Allers and Sharp, 2003; Kocsis et al., 2006). To examine whether such electrophysiological properties were correlated with reward-related modulation, we first grouped 71 DRN neurons (whose spike shapes were successfully recorded) based on their spike durations (shorter or longer than 2 ms) and baseline firing rates (higher or lower than 3 Hz). These criteria were chosen based on a previous study reporting that the mean spike duration of immunohistochemically identified serotonin neurons was 2.17 ms (range, 1.67–3.5) and the mean baseline firing rate was 1.67 Hz (range, 0.37–3.0) (Allers and Sharp, 2003). During both the pre-reward and post-reward periods, there was no tendency for neurons in specific categories to show specific types of reward modulation (chi-square test, p>0.5).
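The grouping by spike duration and baseline firing rate can be sketched as below. The cutoffs (2 ms and 3 Hz) are those stated in the text, and the four value pairs are the example neurons described earlier; the function name, the category labels, and the handling of values that fall exactly on a cutoff are assumptions for illustration.

```python
from collections import Counter

def classify(spike_duration_ms, baseline_hz, duration_cut=2.0, rate_cut=3.0):
    """Label a neuron by spike width and baseline rate.

    Assumption: values exactly on a cutoff fall in the 'narrow' or 'fast'
    group; the text does not specify the boundary rule.
    """
    width = "broad" if spike_duration_ms > duration_cut else "narrow"
    rate = "slow" if baseline_hz < rate_cut else "fast"
    return f"{width}/{rate}"

# (spike duration in ms, baseline rate in Hz) for the four example neurons.
neurons = [(2.76, 2.0), (2.6, 6.0), (1.9, 3.0), (1.4, 10.0)]
counts = Counter(classify(duration, rate) for duration, rate in neurons)
```

Cross-tabulating these categories against each neuron's reward preference yields the contingency table on which a chi-square test of the kind reported above would be run.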
Number of neurons with reward-dependent modulation
We further examined whether the reward-related features of DRN neurons were correlated with any combination of the electrophysiological properties. There was no significant difference between large- and small-reward preferring neurons in baseline firing rate, spike duration, or irregularity (Kruskal-Wallis test, p>0.05). Furthermore, multiple regression analysis indicated that reward effects in ROC values could not be significantly predicted by any linear combination of these three variables (pre-reward, p=0.17; post-reward, p=0.68).
Fig. 9 Electrophysiological properties of different groups of DRN neurons. Neurons are marked by their reward-dependent modulation (large>small, red stars; small>large, blue circles; no significant change, black dots). Panels A–C show the pre-reward period.