|Home | About | Journals | Submit | Contact Us | Français|
Although positive reward prediction error, a key element in learning that is signaled by dopamine cells, has been extensively studied, little is known about negative reward prediction errors in humans. Detailed animal electrophysiology shows that the habenula, an integrative region involved in many processes including learning, reproduction, and stress responses, also encodes negative reward-related events such as negative reward prediction error signals. In humans, however, the habenula's extremely small size has prevented direct assessments of its function. We developed a method to functionally locate and study the habenula in humans using fMRI, based on the expected reward-dependent response phenomenology of habenula and striatum and, we provide conclusive evidence for activation in human habenula to negative reward prediction errors.
Learning from mistakes is a critical survival skill for all mobile creatures and requires the nervous system to generate predictions about plausible future outcomes within a given behavioral context. Detailed electrophysiological work has identified networks of neurons in the midbrain and diencephalon responsible for predicting the timing and valence (aversive or rewarding) of near-term outcomes during reward-dependent learning tasks (Montague et al., 2004; Daw and Doya, 2006). This prediction circuitry is complex and poorly understood, but it is now known to include midbrain dopamine neurons that encode reward prediction error signals (Montague et al., 1996; Schultz et al., 1997; Hollerman and Schultz, 1998; Bayer and Glimcher, 2005; Tobler et al., 2005) and neurons in the lateral habenula that activate to negative prediction errors (Matsumoto and Hikosaka, 2007, 2009). The lateral habenular responses include activations when expected reward is not delivered or when unexpected punishment is received. Such signals may be one important source of negative reward prediction error inputs to the ventral tegmental area (VTA) since activity increases in lateral habenular neurons inactivate dopamine neurons in the VTA (Matsumoto and Hikosaka, 2007, 2009). In addition to signaling negative prediction errors, lateral habenular neurons also inactivate upon positive prediction error signals (Matsumoto and Hikosaka, 2007), possibly providing a permissive signal for dopamine cells to increase firing during positive prediction error events. These exciting new findings in animal models have not been extended to human subjects primarily because of the difficulty of identifying the habenula in human subjects using existing imaging techniques.
Besides modulating reward, the habenula can influence sexual and feeding behavior, drug withdrawal, pain, and sleep (Klemm, 2004; Hikosaka et al., 2008; Salas et al., 2009). Functional studies of the habenula in healthy humans can be performed only with non-invasive methods. Functional MRI (fMRI) and positron emission tomography (PET) imaging are available techniques but they are typically not suitable for the imaging of such small brain regions. Only two fMRI studies have explicitly reported habenular activation: Ullsperger and von Cramon (2003) showed that a region that includes the habenula is activated by negative feedback in a goal prediction task and Shepard et al. (2006) demonstrated a similar concept. Despite these successes, there are sizable variations in the exact location of the habenula; consequently, region-of-interest (ROI) analyses with ROIs defined using ordinary techniques cannot conclusively identify habenular responses. There are two main reasons why this claim is valid: (i) as a region of approximately 3 ×3 ×6mm (e.g. see Brainmaps, www.brainmaps.org), the habenula occupies only a handful of voxels in typical fMRI scans using 3Tesla scanners, which use voxels that yield either 3 ×3mm or 4×4mm in-plane resolution. Therefore, the signal accessible from the habenula via fMRI is small and contaminated by nearby areas. (ii) A second complicating feature is that group analyses across multiple brains necessitate normalizing images to a brain template; a maneuver that leads to additional blurring of the signal from distinct areas.
In this paper, we arrive at a proper ROI study of the habenula by (i) imaging a smaller region of the brain at higher resolution, (ii) avoiding the signal-blurring steps of normalization and smoothing, and (iii) exploiting our knowledge of the response phenomenology expected of other easier-to-resolve brain areas. We show the first demonstration of habenular activity in humans during negative prediction error events, using a novel technique that can be adapted to study not only the habenula, but other small areas of the brain.
Our classical conditioning experiment is outlined in Figure Figure1.1. Normal events used a 1-s duration yellow light followed 6s later by 0.8ml juice delivery (Figure (Figure1A).1A). Catch events delayed juice delivery by 4s. Functional imaging was restricted to a 4.4-cm slab as indicated in Figure Figure1B1B using 2mm by 2mm in plane resolution. The experiment was divided into four separate sessions as indicated in Figure Figure1C.1C. This stimulus arrangement allowed us to identify habenula voxels in one session (run 1) and probe them in separate sessions (runs 3 and 4). Three important response features should occur during run 1 (Schultz et al., 1997; Hollerman and Schultz, 1998; Montague et al., 2004; Bayer and Glimcher, 2005). First, habenula neurons should give a negative-going response to the unexpected juice delivery. Second, dopamine neurons should give an activation response to the initially unexpected juice delivery. Third, there should be a learning effect where these two responses to the juice should begin to return to baseline and similar response profiles should develop at the time of the predictive cue (here the yellow light).
The Baylor College of Medicine Institutional Review Board approved this experiment for human participants, which were recruited from the community. Written consent was obtained from all participants. Subjects were told that we were studying the way the brain process reward, and that they would see icons on a computer screen and receive juice in their mouth, but had no information about the cue-juice relationship. Fifty control subjects (all right handed, 18 males) were scanned in 3T Siemens Trio MR scanners. Subjects were first scanned with a 1×1×1mm voxel structural sequence TE =3.93ms, TR=2500ms, flip angle =12°, 256×256 matrix, 160 1mm axial slices, followed by four functional sessions. Functional scans were restricted to a 4.4-cm slab aligned along the anterior commissure-posterior commissure line using isotropic voxels (2×2×2mm; TE =40ms, TR=2s, flip angle =90°). During runs 3 and 4, that contained 12 normal and 6 catch events, the order of events was randomized for all subjects. Between sessions, the subject was queried through headphones about whether they were comfortable to continue. During these rest periods, functional imaging was halted, but the subjects did not move.
Events were either “normal” (juice delivered 6s after cue disappeared) or “catch” (juice delivered 10s after cue disappeared). For a detailed explanation of event schedule see Figure Figure1.1. Subjects were given a choice of juice flavor. Juice was delivered using a Harvard Apparatus pump connected to a computer, which pushed 0.8ml boluses of juice into the subject's mouth (juice delivery lasted 1s each time). The tubes were attached to a pacifier for comfort.
To define right and left habenular coordinates in functional images, we first looked at the structural image of each subject (1 ×1 ×1mm voxels, T1 images in Figure Figure2A)2A) and recorded the position of the habenula (black arrows in inset panel A). In this example, two landmarks are shown: the bottom of the corpus callosum (white line) and the right ventricle (white arrow). These and other landmarks (which were different for each subject based on best visibility) were identified in functional images (2×2×2mm voxel size, T2* images in Figure Figure2B),2B), which helped to manually co-register the structural and functional images. To identify the exact placement of the habenula in functional images that had been manually co-registered, we used the triangular white matter tract by the habenula (black arrows in inset, panel B). Two cubic (3 ×3 ×3 voxels) ROIs were placed centered on the habenula coordinates of each subject and voxels in those cubes were used in the anti-correlation approach (as discussed in results below). To verify that the manual placement of habenular ROIs was necessary, we studied the distance from the habenular coordinates to a fixed landmark (bottom of corpus callosum at the midline). We found that the standard deviation in habenular coordinates was about 3mm, similar to the size of the habenula. Therefore, we believe that automatic co-registration (which uses cortical landmarks far from the midbrain) would blur the signal from habenular tissue if subjects were pooled (Napadow et al., 2006).
We used an ROI to see striatal activity upon unexpected juice delivery during run 1. To obtain this ROI we used the SPM2 software and analyzed the time course of activity in the whole slab around the time of juice delivery. By changing the threshold for the p value we obtained a 34 voxel ROI. We chose this size because it is very similar to the 27 voxel cubic ROI used for the habenula.
Functional images were preprocessed only by slice timing and motion correction (verification of motion correction was done on each subject to ensure that the habenula was in the same coordinate in the first and the last functional image). In summary, we avoided automatic co-registration, normalization and smoothing of our functional images for habenular ROI.
We identified habenula-containing voxels by a combination of anatomical and functional criteria (Figure (Figure3).3). In step 1, we manually placed 3 by 3 by 3 voxel (6mm by 6mm by 6mm) bounding boxes around regions known to include the habenula. This step produces 54 voxels from each subject that may overlap the habenula. In step 2, we identified habenula-containing voxels in each subject by computing correlations between each of the 54 time series from the bounding boxes and the striatal time series that occurs near unexpected juice delivery during run 1 (Figures (Figures3B,C).3B,C). The contrast in Figure Figure3C3C is unexpected juice delivered versus baseline (see legend). In step 3, we placed all 2700 correlations (54 per subject ×50 subjects) in a distribution (right panel of Figure Figure3A)3A) and selected the voxels that possessed negative correlations (values ranged from −0.89 to 0.92). This choice exploits the expected phenomenology cartooned in Figure Figure3B,3B, and, since the entire time series was included, this procedure would also capture any other correlations that exist or develop during run 1 from the time of the light cue up until the time of juice delivery. It must be noted that what we termed “negative correlation” does not necessarily mean voxels with negative correlation coefficients. In order to include the entire time series for each event, we used 22s around the event to classify the voxels. In fact, only during some of those seconds we expect the habenula and the putamen activities to anti-correlate, while they may positively correlate during the time before or after the event. Therefore, the voxels that “negative correlate” are actually the voxels that correlate less, within the voxels contained in the bounding box. Using shorter times would increase the value of the anti-correlations, but the data would be noisier. We divided the voxels into “habenula” and “not habenula” (see Figure Figure2A,2A, right panel) approximately at the median of the correlations histogram (corr coeff 0.1). There are two main reasons why we chose this threshold. First, according to the anatomy, we expect about half of our binding box to be habenular tissue. Second, we did not want to choose only those voxels with very negative correlations because that would increase the percentage of spurious anti-correlations, or false positive habenular voxels. Other similar thresholds render comparable results. We then plotted the time course of the voxels identified as habenula during normal events in run 1. As expected, those voxels anti-correlate to the putamen signal at those events, especially during putamen activation after juice delivery (Figure (Figure3C,3C, red trace habenula, black trace putamen).
Figure Figure4A4A shows the predicted response profiles (habenula= red; striatum= black) for habenula and striatum during a catch event. Figure Figure4B4B displays the measured response of the functionally identified habenula voxels to the unexpected omission of juice delivery and unexpected delivery of juice at 10s after the cue light. As anticipated from electrophysiological response phenomenology (Matsumoto and Hikosaka, 2007), the functionally-defined habenula voxels show a positive-going response to unexpected juice omission (red asterisk). As a consistency check for our procedures, the striatal voxels show the anticipated response profiles – a negative-going response to unexpected juice omission (single black asterisk) and a positive-going response to unexpected juice delivery (double black asterisk) (Montague et al., 2004). In our experiment, habenular voxels do not show the expected negative-going response to unexpected juice delivery. To answer this discrepancy, we studied the average time course of the whole habenular bounding box (Figure (Figure5A5A black trace). We found that there are two major peaks during this time course: one peak at t =12, resembling the peak we identified as true habenula, and a later, long lasting increase in activity that loosely correlates with the activity in the putamen. The nature of the second peak eluded our analysis. It could be either an artifact of the experiment (the habenula lies by the third ventricle) or perhaps a feedback signal from striatal areas to some specific part of the habenula or nearby tissue. Whatever the nature of the second peak, it is likely that because of the inherent smoothness of the fMRI signal, our habenular voxels are contaminated by activity from these other voxels. Since the lowering of the habenular activity upon positive prediction error events is expected to be very small, it is not surprising that it was masked by contamination from a bigger, nearby positive-going signal (even after the anti-correlation approach was used). To verify that the second peak is not a true habenular response to a negative prediction error, we performed a regression analysis in which we subtracted the activity of the habenular bounding boxes during catch events minus the activity of those same voxels at normal events, shifted 4s to align juice delivery (Figure (Figure5A5A red trace, regression coefficient 1.3). We found that this approach effectively diminished the size of the second peak. Therefore, although we still don't know the exact nature of the second peak, we believe the first peak is a due to the non-delivery of expected reward. Finally, to verify that our approach was actually needed to measure activity in the habenula, we performed a traditional GLM analysis of our data, using complete pre-processing (slice timing, realignment, automatic co-registration, segmentation, normalization to a template and smoothing using a 4-mm kernel). Using this approach, we found activity at the contrast “juice expected but not delivered minus juice neither expected nor delivered” (during catch events) in an area near the habenula (Figures (Figures5B,C).5B,C). Although this is encouraging and agrees with our more sophisticated ROI data, we believe that it would not be possible to confidently call this signal “habenular activity” had we not done the anti-correlational approach using our subject-by-subject analysis.
It is currently believed that dopaminergic neuron firing encodes reward prediction errors. In fact, the reward prediction error theory of dopamine function proposes that dopamine release modulates reinforcement learning. Electrophysiological recordings in non-human primates provided support for this theory (Montague et al., 1996; Schultz et al., 1997). For several years, it was unclear which brain regions may be involved in the generation of the signals that dopamine cells must receive upon negative prediction error events. Recently, it was shown that the primate lateral habenula is a major source of negative reward-related signals in dopamine neurons (Matsumoto and Hikosaka, 2007). Whether the human habenula behaves the same way remained to be shown and was the focus of our studies.
Broadly, the logic of our approach had two basic components. (1) Identify habenular voxels by seeking anti-correlations with striatal responses to unexpected reward delivery during early conditioning (run 1). This identifies the habenula using a negative-going BOLD response in regions large enough to contain the habenula. (2) In a separate session, probe the habenular voxels using the unexpected non-delivery of juice. Based on single unit electrophysiology (Matsumoto and Hikosaka, 2007, 2009), the habenular neurons should give an activation response to the unexpected juice omission. This fact gives a nice internal control: we identify the habenula voxels based on a negative-going signal (Figures (Figures11 and and2)2) and we probe these voxels with a stimulus that should yield a positive-going response (Figure (Figure33).
The habenula is a major player in the reward signal pathway (Matsumoto and Hikosaka, 2007), may be a critical structure mediating the effects of drugs of abuse (Matsumoto and Hikosaka, 2007), and is implicated in a series of important behaviors such as stress management and decision making (Klemm, 2004). We have shown that by using a large number of subjects, small voxel size and a manual co-registration technique coupled to a correlational approach to identify the habenula, we can assess the activity of the habenula in humans undergoing a passive learning task. Our ROI analysis relied on both the positioning of the habenula in each subject and on the correlations of voxel activity between habenula and putamen voxels in run 1. In that sense, we carefully avoided a circular type of analysis discussed by Kriegeskorte et al. (2009) by using data from run 1 to identify voxels and probing these voxels’ response in later, separate sessions (runs 3 and 4). These results expose the feasibility of human studies of habenular function under conditions modulated by the habenula, such as tobacco withdrawal (Salas et al., 2009), alcohol abuse (Taraschenko et al., 2007; Wang et al., 2009), depression (Morris et al., 1999; Shumake and Gonzalez-Lima, 2003) or schizophrenia (Lecourtier et al., 2004; Shepard et al., 2006). In fact, a recent study has shown that deep brain stimulation of the habenula may be an effective therapy for treatment-resistant depression (Sartorius et al., 2010). In addition, our approach may provide a way to image activity in analogously small brain structures. Of course, only areas whose activity correlates in a systematic fashion with large easier-to-image brain regions will be amenable to this type of analysis. Whether this approach can be used only to confirm prior knowledge, or also to gather novel insights in the function of the structure of interest is an open question.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
This work was supported by Baylor College of Medicine and grants R01 DA11723, R01 NS045790 (PRM), RO1 DA 017173 (MDB), and The Kane Family Foundation (PRM). We thank Dr. Jorge Albin and Ms. Krystle Bartley, Ms. Carrie Howard, Ms. Christine Cortelyou, Ms. Elyse Aurbach, Mr. Mark Ross, Dr Henning Voss and Dr. Stephen LaConte.