The neurobehavioral underpinnings of pathological gambling are not well understood. Insight might be gained by understanding pharmacological effects on the reward system in patients with Parkinson’s disease (PD). Treatment with dopamine agonists (DAs) has been associated with pathological gambling in PD patients. However, how DAs are involved in the development of this form of addiction is unknown. We tested the hypothesis that tonic stimulation of dopamine receptors specifically desensitizes the dopaminergic reward system by preventing the decreases in dopaminergic transmission that occur with negative feedback. Using functional magnetic resonance imaging, we studied PD patients during three sessions of a probabilistic reward task in random order: off medication, after levodopa (LD) treatment, and after an equivalent dose of DA (pramipexole). For each trial, a reward prediction error value was computed using outcome, stake, and probability. Pramipexole specifically changed activity of the orbitofrontal cortex (OFC) in two ways that were both associated with increased risk taking in an out-of-magnet task. Outcome-induced activations were generally higher with pramipexole compared with LD or off medication. In addition, only pramipexole greatly diminished trial-by-trial correlation with reward prediction error values. Further analysis showed that this resulted mainly from impaired deactivation in trials with negative errors in reward prediction. We propose that DAs prevent pauses in dopamine transmission and thereby impair the negative reinforcing effect of losing. Our findings raise the question of whether pathological gambling may in part stem from an impaired capacity of the OFC to guide behavior when facing negative consequences.
Gambling, a harmless pastime for most people, can become an addictive and harmful behavior in pathological gambling (PG). Similar to drug addiction, PG has features of tolerance, withdrawal, or preoccupation (American Psychiatric Association, 1994) and is often referred to as a ‘behavioral addiction’ (Potenza, 2008). Although PG, similar to drug addiction, has been linked to alterations in the dopaminergic reward system, value representation, and feedback processing (Reuter et al, 2005; Steeves et al, 2009; Volkow et al, 2009), the neurobehavioral underpinnings of PG remain poorly understood. On the roadmap to understanding PG, a clearer appreciation of pharmacological effects on the reward system in patients with Parkinson’s disease (PD) may be an important landmark. Loss of striatal dopaminergic transmission in PD is associated with below-average risk-taking behavior (Ragonese et al, 2003; Tomer and Aharon-Peretz, 2004). However, the initiation of dopamine replacement therapy has been associated with the development of PG (Seedat et al, 2000; Driver-Dunckley et al, 2003). Although, so far, insufficient longitudinal data are available to suggest a particular therapeutic approach (for a review see Galpern and Stacy, 2007), recent studies indicate that the risk of developing PG is specifically increased in patients treated with dopamine agonists (DAs) compared with those treated without DAs (Voon et al, 2006; Pontone et al, 2006; Weintraub et al, 2008). Paradoxically, a dose effect has not been found across patient populations, whereas in the individual patient with PG, a dose threshold can be evident (Voon et al, 2006; Weintraub et al, 2008). Although the causality has yet to be determined, we assume that, to develop PG, a generic pharmacological trigger interacts with an intrinsic trait in the individual patient. This study focuses on a potential generic pharmacological trigger by studying DA-driven abnormalities in reward processing in PD patients.
In computational models of reward processing, the reward-prediction error (RPE) represents the difference between expected and actually obtained rewards (Sutton and Barto, 1998). Dopamine release of mesolimbic neurons reflects RPE values remarkably well. Positive errors in reward prediction (ie ‘better-than-expected’) are conveyed by phasic bursts of dopamine neuron firing (Hollerman et al, 1998; Waelti et al, 2001). Conversely, negative errors in reward prediction (ie ‘worse-than-expected’) lead to phasic pauses in dopamine neuron firing (Schultz, 2002; Bayer et al, 2007). As DAs, in contrast to levodopa (LD), tonically stimulate dopamine receptors, we propose that DAs may prevent pauses in dopamine transmission and thereby impair the negative reinforcing effect of losing. Although this neurobehavioral effect may well increase the risk of developing PG, direct evidence for this relationship is lacking.
Here, we studied PD patients without dopamine replacement therapy (OFF), after LD, and after DA treatment while they performed a ‘roulette’ game during functional magnetic resonance imaging (fMRI). Using similar tasks, earlier fMRI studies successfully modeled activity in the dopaminergic reward system by using RPE values as a regressor (Knutson et al, 2001; Yacubian et al, 2006). We were interested in (i) mean activity change following feedback, and (ii) trial-by-trial correlation with RPE values—as an indicator of local reward processing. Avoiding confounding behavioral effects during fMRI, we assessed risk-taking behavior offline.
On the basis of the hypothesis that DAs prevent decreased dopaminergic transmission with negative RPE values, we predicted that in contrast to OFF and LD, DAs would relatively increase mean feedback-induced activation and desensitize the reward system toward RPE. We further hypothesized that reward desensitization would be associated with increased risk-taking behavior in the offline task.
Eight male right-handed patients (age, mean±SD: 56±9 years) with early stage PD (disease duration, mean±SD: 4±3 years) were enrolled in the study. Their anti-Parkinsonian medications included a combination of LD (daily dose, mean±SD: 594±290 mg) and pramipexole (daily dose, mean±SD: 2.3±1.1 mg). We selected patients without history of overt neuropsychiatric conditions (including depression, dementia, or any impulse control disorder). The Beck Depression Inventory II (mean±SD: 7±5), the Montreal Cognitive Assessment (mean±SD: 27±2), and the Barratt Impulsivity Scale-11 (mean±SD: 71±10) were used to assess covert depression, cognitive impairment, and individual impulsivity, respectively. All subjects provided written informed consent to participate. The study was approved by the Research Ethics Committee for the University Health Network, Toronto.
Patients were studied in three sessions on different evenings (1–3 weeks apart). Dopamine replacement therapy was withheld for at least 12 h before each session. In counterbalanced order, patients were studied off medication (OFF), after oral administration of LD (100 mg LD + 25 mg benserazide), or after an equivalent dose of DA (1 mg pramipexole) (Figure 1a). Patients underwent a risk-taking task 37±7 min after drug administration. The motor section of the Unified PD Rating Scale was assessed 21±5 min later by a neurologist specializing in movement disorders, and the probabilistic financial reward task was performed during event-related fMRI a further 13±2 min later.
The Balloon Analogue Risk-Taking task is a theoretical empirical measure of individual risk-taking behavior in which participants can win or lose money (White et al, 2008). Participants pump up a balloon presented on a screen by clicking a computer mouse. For each pump, a counter on the screen increases by 5 cents. After an unpredictable number of pumps, the balloon may explode, resulting in a loss of the money accumulated in the counter. Participants who emitted more pumps (average adjusted pumps) were considered more inclined to take risks (Lejuez et al, 2002). We tested for effects of medication in an analysis of variance (ANOVA) using STATISTICA for Windows 6.0 (www.statsoft.com).
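As an illustration of the scoring, the following minimal sketch computes an average adjusted pumps score. It assumes the commonly used definition in which the score is the mean number of pumps on balloons that did not explode; the `Balloon` record and the example session are hypothetical and are not data from this study.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Balloon:
    """One balloon of the task (hypothetical record layout)."""
    pumps: int       # number of pumps the participant made
    exploded: bool   # True if the balloon burst and the counter was lost


def average_adjusted_pumps(balloons: Iterable[Balloon]) -> float:
    """Mean pumps over unexploded balloons (assumed definition of the score)."""
    kept = [b.pumps for b in balloons if not b.exploded]
    if not kept:
        raise ValueError("no unexploded balloons to score")
    return sum(kept) / len(kept)


# Hypothetical session, for illustration only.
session = [
    Balloon(pumps=30, exploded=False),
    Balloon(pumps=52, exploded=True),   # excluded from the adjusted score
    Balloon(pumps=40, exploded=False),
    Balloon(pumps=44, exploded=False),
]
score = average_adjusted_pumps(session)
print(score)
```

Excluding exploded balloons is the usual reason for calling the score "adjusted": on those balloons the explosion, not the participant, determined the number of pumps.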
This computerized task resembles a roulette game (Figure 1b). After running around the circumference of a stationary roulette wheel, a ball slowed down and stopped in 1 of 16 colored pockets (4 of each: yellow, red, blue, green). The participant had to guess the color of the pocket the ball would stop in by choosing one of four options: In half of the trials, he had to choose between four single winning colors (winning probability, 0.25); in the other half, he had to choose between four triplets of winning colors (winning probability, 0.75). The stake in a given trial was either 1 or 5 Canadian dollars. The computer program produced a pseudo-randomized sequence of these trial categories (three different preprogramed sequences were used in a random order). The only trial-by-trial decision of the participant was the option to choose. If the ball stopped in a pocket painted in one of the winning colors, the stake was won. Otherwise, it was lost. To rule out variability because of chance, the sequence of winning and losing was also preprogramed and included in the script for that session (the program made the ball stop in a particular pocket). The initial balance was $20. The first frame of a trial presented the stake (either a $1 coin or a $5 bill) and the options for 2 s (Figure 1b, top). The decision had to be made within the following 3 s (indicated by a countdown bar). If no button was pressed during that time, the program randomly picked one option. The program stopped if that happened three times in a row. The second frame of a trial featured the roulette wheel (Figure 1b, 2nd from top). 
While the ball was running around (8 s), the stake was displayed in the center of the wheel; the chosen option and the balance were displayed below the wheel. The outcome was displayed in the center of the wheel for 3 s, starting 0.5 s after the ball stopped (algebraic sign and amount; green ink for winning; red ink for losing), and the balance changed accordingly (Figure 1b, 3rd from top). The final balance was paid out in cash.
Patients played the game (Java 2 Platform Standard Edition 5.0; Sun Microsystems Inc, Santa Clara, CA) during fMRI wearing video goggles and indicating decisions by pressing buttons on response boxes placed under each hand (boxes and goggles, Resonance Technology, Los Angeles, CA, USA). With a preprogramed sequence of 280 trials, the balance never went below $0 and the final balance was $8, $10, or $12 (counterbalanced over sessions). To avoid fatigue, we split the game into nine runs, each lasting 9 min. Alertness was assessed by recording response times and response omissions.
In fMRI studies of reward processing, RPE values have been used to model fMRI data (O’Doherty et al, 2003; Yacubian et al, 2006), assuming a linear relationship between RPE values and local blood–oxygen level-dependent (BOLD) signal in reward processing areas of the brain. Using a task with fixed, explicit probabilities and stakes, we can express the reward prediction value as the arithmetic product of stake and probability of winning. The RPE value represents the difference between the outcome value and the reward prediction value (RPE value = outcome value − reward prediction value = outcome value − (stake × probability of winning)) (Figure 1c).
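The definition above can be sketched for the eight trial types of the task (two stakes, two winning probabilities, win or loss). This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, and it assumes that a loss enters as an outcome value of minus the stake.

```python
from itertools import product

STAKES = (1, 5)                    # Canadian dollars, from the task description
WIN_PROBABILITIES = (0.25, 0.75)   # single colour vs. triplet of colours


def reward_prediction(stake: float, p_win: float) -> float:
    """Reward prediction value: stake x probability of winning."""
    return stake * p_win


def reward_prediction_error(stake: float, p_win: float, won: bool) -> float:
    """RPE = outcome value - reward prediction value."""
    outcome = stake if won else -stake   # assumption: a loss counts as -stake
    return outcome - reward_prediction(stake, p_win)


for stake, p_win, won in product(STAKES, WIN_PROBABILITIES, (True, False)):
    rpe = reward_prediction_error(stake, p_win, won)
    label = "win" if won else "loss"
    print(f"stake ${stake}, p(win)={p_win:.2f}, {label}: RPE = {rpe:+.2f}")
```

Under these assumptions, an unlikely win of $5 gives an RPE of +3.75, whereas losing a likely $5 bet gives −8.75, so the regressor spans both signs and several magnitudes.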
Using a 3 T GE MRI scanner, echo planar T2*-weighted images with BOLD contrast were acquired every 2.23 s in nine runs with 245 volumes. The field-of-view was designed to cover the frontal brain, the striatum, and the midbrain. Volumes contained 30 oblique slices (3 mm, no gap), in-plane voxel dimensions were 2 mm × 2 mm. Images were processed and analyzed using SPM5 software (http://www.fil.ion.ucl.ac.uk/spm). The first two scans of each run were discarded to allow for steady-state magnetization. The remaining images were realigned to the first image and spatially normalized to a standard template (MNI 305). The normalized images were spatially smoothed with a Gaussian kernel of 8 mm at full-width half-maximum to reduce intersubject differences in anatomy and enable the application of the Gaussian random field theory.
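Two quantities implied by these acquisition parameters can be worked out directly. The sketch below uses only the numbers reported above together with the standard relation between the full-width at half-maximum of a Gaussian and its standard deviation; the variable and function names are illustrative.

```python
import math

TR_S = 2.23             # seconds between volumes
VOLUMES_PER_RUN = 245   # volumes acquired per run
DISCARDED_SCANS = 2     # initial scans dropped per run
FWHM_MM = 8.0           # smoothing kernel, full-width at half-maximum


def fwhm_to_sigma(fwhm: float) -> float:
    """Standard deviation of a Gaussian with the given FWHM."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


sigma_mm = fwhm_to_sigma(FWHM_MM)                      # about 3.4 mm
run_minutes = VOLUMES_PER_RUN * TR_S / 60.0            # about 9.1 min per run
volumes_analyzed = VOLUMES_PER_RUN - DISCARDED_SCANS   # 243 volumes per run

print(f"sigma = {sigma_mm:.2f} mm, run = {run_minutes:.1f} min, "
      f"{volumes_analyzed} volumes analyzed per run")
```

The run duration obtained this way agrees with the roughly 9 min per run stated in the task description.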
First-level analyses were performed separately for each subject and each medication state based on the general linear model (Friston et al, 1995). Local relative BOLD-signal change was modeled using separate regressors for the onsets (convolved with a hemodynamic response function) of each of the following events: presentation of stake and options; button press; start of the ball; outcome. As an additional column in the design matrix, mean corrected RPE values were introduced as a separate regressor to explain BOLD-signal change during outcome. Single contrast images (per subject, medication state, and session) for the linear contrasts reflecting plain outcome induced BOLD changes (one on event regressor) and correlation of this change with RPE value (one on RPE regressor) entered separate repeated-measures ANOVAs with the factors ‘subject’ (8 levels) and ‘medication’ (3 levels; OFF, LD, DA) to perform a voxel-wise comparison of local BOLD-signal change. We considered a statistical threshold of p<0.05 (after false discovery rate correction) as being significant (Genovese et al, 2002).
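The structure of such a first-level model can be sketched with NumPy for a single synthetic voxel. This is a simplified illustration under stated assumptions, not SPM5's implementation: only the outcome event and its RPE modulator are modeled, the double-gamma response shape is an illustrative stand-in for the canonical hemodynamic response function, and the onsets, RPE sequence, and signal are hypothetical.

```python
import math
import numpy as np

TR = 2.23        # seconds per volume (from the acquisition description)
N_SCANS = 243    # 245 volumes minus the 2 discarded scans


def hrf(tr: float, duration: float = 32.0) -> np.ndarray:
    """Illustrative double-gamma response sampled every `tr` seconds (assumed shape)."""
    t = np.arange(0.0, duration, tr)

    def gamma_pdf(x: np.ndarray, shape: float) -> np.ndarray:
        return x ** (shape - 1) * np.exp(-x) / math.gamma(shape)

    h = gamma_pdf(t, 6.0) - gamma_pdf(t, 16.0) / 6.0
    return h / h.sum()


def outcome_design(onsets_s, rpe, n_scans: int = N_SCANS, tr: float = TR) -> np.ndarray:
    """Design matrix columns: outcome event, mean-corrected RPE modulator, constant."""
    rpe = np.asarray(rpe, dtype=float)
    centered = rpe - rpe.mean()                    # mean-corrected RPE values
    idx = np.round(np.asarray(onsets_s) / tr).astype(int)
    event = np.zeros(n_scans)
    modulated = np.zeros(n_scans)
    event[idx] = 1.0                               # stick function at outcome onsets
    modulated[idx] = centered                      # same sticks, scaled by RPE
    h = hrf(tr)
    cols = [np.convolve(x, h)[:n_scans] for x in (event, modulated)]
    return np.column_stack(cols + [np.ones(n_scans)])


rng = np.random.default_rng(0)
onsets = 10.0 + 16.0 * np.arange(20)               # hypothetical outcome onsets (s)
rpe = rng.choice([0.75, -1.25, 0.25, -1.75, 3.75, -6.25, 1.25, -8.75], size=20)
X = outcome_design(onsets, rpe)

# Synthetic voxel: known effects plus a little noise, then ordinary least squares.
true_beta = np.array([2.0, 0.5, 10.0])
y = X @ true_beta + rng.normal(0.0, 0.01, N_SCANS)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)   # estimates for [outcome, RPE modulator, constant]
```

In this layout, the first estimate corresponds to the plain outcome-induced change and the second to the trial-by-trial scaling with RPE. Mean-correcting the modulator keeps the two columns from sharing the average outcome response.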
Furthermore, we explored a potential behavioral relevance of the effects seen in the above-mentioned analyses. In particular, we wanted to see whether the putative DA effects correlate with increased out-of-magnet risk-taking behavior in the Balloon analogue risk-taking task. To this end, we introduced an individual score in the out-of-magnet risk-taking task (average adjusted pumps) as a covariate of activation in both ANOVAs (one covariate per analysis, interaction with the factor ‘medication’).
As expected, motor scores of the Unified PD Rating Scale improved both with LD (19.6±7.9) and DA (21.5±9.2) compared with OFF (27.5±9.9) (paired t-tests: DA vs. OFF p<0.01; LD vs. OFF p<0.01; DA vs. LD p=0.16). Medication did not influence measures of alertness in the fMRI task. Response times (mean±SD: OFF 1270±300 ms; LD 1329±419 ms; DA 1250±349 ms) and response omissions (mean±SD: OFF 9.75±5.2; LD 9.25±5.6; DA 9.75±3.1) did not differ between conditions (response times: F(2, 21)=0.12, p=0.90; response omissions: F(2, 21)=0.03, p=0.97). Medication also did not significantly influence risk-taking scores in the Balloon Analogue Risk-Taking task (F(2, 21)=0.2, p=0.98; mean average adjusted pumps±SD: OFF 37.6±11.4; LD 38.1±14.4; DA 38.8±10.8).
The presentation of outcomes per se elicited changes in BOLD signal in several networks. Increases were observed in a bilateral visuo-motor network (visual cortex: x= −18/18, y= −93, z=6/0 mm; cerebellum: x= −30/30, y= −66/−57, z= −27/−21 mm; putamen: x= −21/24, y= −3/6, z= −3/0 mm; cingulate motor area: x= −12/12, y=6/8, z=45/44 mm; ventral premotor cortex: x= −55/45, y=3/6, z=45/36 mm). Decreases were found in the anterior cingulate cortex at the genu of the corpus callosum (x=0, y=39, z=0 mm) and the medial prefrontal cortex (x=0, y=57, z=0 mm).
When looking at the effect of medication, a significant effect on feedback-induced BOLD-signal change was only found in the left lateral orbitofrontal cortex (OFC) (Table 1). T-tests showed that the average BOLD signal after outcomes was higher in the DA condition than in the LD or OFF condition (Table 1). In the covariance analysis, the DA condition significantly strengthened a positive correlation between the average number of adjusted pumps and plain outcome-induced BOLD-signal changes in the left lateral OFC (Table 1).
Strong positive correlation with trial-by-trial RPE values was found in the main target areas of the mesolimbic dopaminergic system (Figure 2a and b; Table 2). In the ventral striatum, both dopaminergic medications (LD/DA) equally diminished local reward processing compared to OFF (Figure 3a and b; Table 2). In the OFC, however, only DAs greatly diminished local reward processing (Figure 3c and d; Table 2). The covariance analysis with offline risk-taking scores showed that the DA condition significantly strengthened a negative correlation between the average number of adjusted pumps and local reward processing in the left lateral OFC (Table 2).
Taking both OFC findings together (augmented mean response following feedback and abolished correlation with RPE values), one may conclude that the magnitude of DA-related increase in OFC activation depended on the RPE value. In trials with negative RPE values, DAs may have increased OFC activation to a greater extent than in trials with positive RPE values. To confirm this notion, we further explored mean outcome-induced responses in relation to RPE values in a categorical fashion. However, as the coordinates of the greatest difference in both comparisons did not completely overlap (outcome-induced activation: z= −18; reward processing: z= −3), we extracted mean values from a 10 mm sphere, centered in between the two maxima (x= −24, y=42, z= −10). Relative to OFF, DA specifically increased orbitofrontal activation in trials with negative RPE values (Figure 4).
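Extracting a mean value from a spherical region can be sketched as follows. This is an illustration only: the 2 mm grid is hypothetical, the helper names are not from any imaging package, and the text does not say whether 10 mm refers to the radius or the diameter of the sphere, so the radius used here is an assumption.

```python
import numpy as np

CENTER_MM = np.array([-24.0, 42.0, -10.0])   # sphere centre reported in the text


def sphere_mask(coords_mm: np.ndarray, center_mm: np.ndarray, radius_mm: float) -> np.ndarray:
    """Boolean mask of voxels whose centres lie within `radius_mm` of `center_mm`."""
    return np.linalg.norm(coords_mm - center_mm, axis=-1) <= radius_mm


def roi_mean(values: np.ndarray, mask: np.ndarray) -> float:
    """Mean of a voxel-wise map (for example a contrast image) inside the mask."""
    return float(values[mask].mean())


# Hypothetical 2 mm grid of voxel-centre coordinates around the sphere centre.
axes = [np.arange(c - 20.0, c + 20.0 + 1e-9, 2.0) for c in CENTER_MM]
grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)   # shape (21, 21, 21, 3)

mask = sphere_mask(grid, CENTER_MM, radius_mm=10.0)   # assumption: 10 mm is the radius

# Synthetic map standing in for one subject's contrast image.
contrast = np.full(mask.shape, 2.0)
print(mask.sum(), roi_mean(contrast, mask))
```

Averaging over a sphere placed between the two maxima gives one value per subject and condition, which can then be compared across RPE categories as in Figure 4.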
The main finding of our study is that tonic dopaminergic stimulation with DAs in PD patients specifically diminished reward processing in the lateral OFC by relatively increasing activity during negative errors of reward prediction. To our knowledge, this represents the first empirical evidence that DAs may abate negative reinforcement in feedback-based learning by preventing the phasic decreases in synaptic activity that occur with negative errors of reward prediction. Critically, this finding was drug specific, as it was not observed after LD administration, which instead is believed to enhance pulsatile stimulation of dopaminergic receptors. This notion concurs with a specifically increased risk of developing PG in DA-treated PD patients (Voon et al, 2006; Pontone et al, 2006; Weintraub et al, 2008).
Our observation is in line with current theoretical models and empirical data of dopamine-dependent reinforcement learning (Frank et al, 2004, 2007; Cools et al, 2006). Unmedicated PD patients showed impaired feedback-based learning in various tasks (Frank et al, 2004; Shohamy et al, 2004; Cools et al, 2006). Although some findings indicate that unmedicated patients may be specifically impaired in learning from positive feedback (Frank et al, 2004; Cools et al, 2006), empirical evidence for a detrimental effect of dopamine replacement therapy in negative feedback learning seems more consistent (Cools et al, 2006; Frank et al, 2007). According to the computational model proposed by Frank and colleagues, phasic bursts of dopamine after unexpected rewards exert a positive reinforcing effect by stimulating D1 receptors (Frank et al, 2004). Conversely, unexpected punishments or withheld rewards lead to negative reinforcement by transient reduction in D2 signaling. Persisting tonic stimulation of dopamine receptors—as with DA medication—could therefore enhance D1-mediated effects (eg positive reinforcement). On the other hand, it could prevent pauses in D2 signaling and consequently impair negative feedback learning. Our results point toward a greater effect of the latter, which may well be explained by the D2/D3 selectivity of pramipexole (Seeman, 2007). In fact, outcome-induced activation in the OFC was higher with DA and the boosting effect seemed greater for unexpected losses than for unexpected gains, thereby diminishing correlation with RPE values. However, the fact that our paradigm is different from the one used in the studies of Frank and coworkers represents an important caveat (Frank et al, 2004, 2007). Moreover, an alternative theoretical consideration is that tonic stimulation of presynaptic autoreceptors may reduce correlation with RPE values by suppressing firing of midbrain dopaminergic neurons.
Our results point toward a relative preservation of reward processing in unmedicated PD patients, whereas LD and DA both diminished reward processing in the ventral striatum and OFC. This corroborates the view that with dopamine replacement therapy, restoration of dopamine levels in the motor part of the striatum (dorsal putamen) might also come with detrimental overdosing of more cognitive (dorso-medial caudate) and limbic (ventral striatum, nucleus accumbens) parts (Swainson et al, 2000; Cools et al, 2001; Cools, 2006).
Could neuronal activity before the outcome have influenced neuronal processing of the RPE values in different medication states? In young healthy subjects, one would indeed expect a relationship of ventral striatal activity during anticipation and reward prediction value. It should be noted, however, that this effect is much more subtle than the relationship with RPE (Yacubian et al, 2006). In a preliminary analysis of our data, we could not find such a relationship in any of the pharmacological conditions (OFF, LD, DA). In fact, one might not assume this relationship to be maintained in PD. A recent neuroimaging study in PD patients after withdrawal of medication, elderly and young healthy controls showed that though RPE processing seems relatively preserved, PD patients and elderly controls show a markedly impaired reward prediction signal (Schott et al, 2007). Given the subtle nature of this relationship in young participants, the relative loss of this relationship in elderly and PD patients, and the lack of such a relationship in our study, we assume that a putative influence can only be of negligible quantity.
This study may also bear important implications for pathological gamblers without PD. Reuter et al (2005) found that the difference in ventral striatal activation after positive vs negative financial feedback was diminished in pathological gamblers relative to healthy controls. As the authors pointed out, it remains to be elucidated how much this finding stems from blunted responses to gains or from augmented responses to losses. Our findings raise the question of whether PG may be associated with an impaired capacity of the OFC to guide behavior when facing negative consequences.
As outlined in the introduction, there are two main reasons to compare our findings with those in drug addiction. First, current diagnostic criteria of PG and drug addiction overlap (American Psychiatric Association, 1994). Second, several recent functional imaging studies on substance addiction have underlined the critical role of mesolimbic dopaminergic pathways (Garavan et al, 2000; Volkow et al, 2004; Goldstein et al, 2007). In the addict, the value that is attributed to certain events or cues seems to be altered (Garavan et al, 2000; Ahmed et al, 2002; Grigson and Twining, 2002). There is substantial evidence that the OFC mediates subjective value attribution and is an integral part in adaptive decision making (Tremblay and Schultz, 1999; Knutson et al, 2000; Breiter et al, 2001; Elliott et al, 2003; Valentin et al, 2007). Indeed, a recent activation study in cocaine users confirmed the involvement of the lateral OFC in deficient attribution of feedback values (Goldstein et al, 2007). Control subjects valued high wins more than low wins, whereas over half of the cocaine-addicted subjects valued all wins equally. This finding was significantly correlated with high, unmodulated activations to money in the lateral OFC. Our results suggest that DAs in PD patients shift the lateral OFC toward high, unmodulated activations after financial feedback—a finding that strikingly resembles those made in cocaine addicts.
Although DA-mediated effects on lateral OFC function were associated with relative changes in risk taking in the offline task, pramipexole administration had no measurable direct effect on behavior, replicating earlier findings in young healthy volunteers (Hamidovic et al, 2008). In other words, neuronal effects of DAs may not be strong enough to actually alter behavior in every individual. But what happens if this pharmacological trigger interacts with an individual vulnerability? Reduced availability of striatal D2 receptors is a trait that has been associated with drug addiction (Volkow et al, 1997). Interestingly, we recently found that reduced availability of striatal D2 receptors also distinguishes PD patients with PG from PD patients without PG (Steeves et al, 2009). One may speculate that in individuals with reduced D2 receptor density, the interference of DAs with D2-mediated negative feedback learning could be amplified. However, one cannot rule out that the individual vulnerability to develop behavioral addictions also stems from neurobehavioral mechanisms that are not related to mesolimbic dopamine. In the absence of an external task (ie freely fluctuating brain activity), PD patients experiencing heavy PG symptoms at the time of study showed increased brain perfusion in dopaminergic mesolimbic structures, but also in the insula, the hippocampus, and the amygdala (Cilia et al, 2008). More studies are needed in this area to distinguish traits that predict vulnerability from an abnormal neurobehavioral pattern that may evolve once PG consolidates as a behavior.
In sum, we provide some evidence that tonic stimulation of frontal dopamine receptors may impair physiologic (specifically negative) reinforcement value attribution by preventing the decreases in cortical synaptic activity that occur with negative feedback. Our findings raise the question of whether PG may in part stem from an impaired capacity of the OFC to guide behavior when facing negative consequences.
However, there are several limitations of our study that may challenge our conclusion. First, given that the findings in our study represent a generic pharmacological mechanism, it may not be the only trigger for PG in vulnerable patients with PD. Second, with fMRI, we measured change in blood oxygenation. Although this may serve as an index of synaptic activity, this study does not investigate frontal dopamine receptors directly (eg through use of radioligands targeting dopamine receptors) and therefore, we cannot draw any specific conclusion on the neurotransmitters involved. Third, we investigated performance-independent feedback processing. Although we were able to indirectly link findings with offline risk-taking scores, we did not gather any more direct evidence of the behavioral importance of DA-induced lateral OFC dysfunction. Further limitations are the relatively small sample size and the risk of circular relationships with potentially nonindependent measures (Kriegeskorte et al, 2009). Future studies may be able to directly elucidate the role of frontal dopaminergic transmission in negative feedback learning and to assess pharmacological interference with DAs or specific deficits in pathological gamblers.
We thank the staff of the medical imaging department (especially Adrian Crawley) and Movement Disorders center (especially Rosalind Chuang, MD and Thomas Steeves, MD) of the Toronto Western Hospital for their assistance in carrying out the study. This work was partially supported by a grant from the Canadian Institutes of Health Research (MOP-64423 to APS) and the Safra Foundation. APS is supported by the Canadian Institutes of Health Research New Investigator Research Award.
The authors declare no conflict of interest.