In our sample of young healthy volunteers, influx constant Ki values varied between 0.018 and 0.027, falling well within the range of ‘normal’ values observed previously (Eberling et al., 2007). Subjects performed well on the reversal learning task, with an average accuracy rate on trials after the unexpected outcomes greater than 90% (Supplemental Table 2).
First, we analyzed the data from the placebo session. A repeated measures ANOVA was conducted with valence as the within-subject factor and synthesis capacity and acquisition delay as covariates. Consistent with neurophysiological evidence from nonhuman primates (Hollerman and Schultz, 1998), this analysis revealed a highly significant interaction between valence and synthesis capacity (F1,8 = 19.0, P = 0.002) and no effects of acquisition delay. This interaction reflected a positive correlation between dopamine synthesis capacity in the striatum and reversal learning from reward relative to punishment under placebo. It was present across the entire striatum (averaged across right and left caudate nucleus and putamen; r8 = 0.84, P = 0.004), and also within striatal subregions (bilateral caudate nucleus: r8 = 0.8, P = 0.007; bilateral putamen: r8 = 0.87, P = 0.001). The correlation between dopamine synthesis capacity and relative performance on non-switch trials (trials requiring reward-prediction minus trials requiring punishment-prediction) was also positive, albeit non-significant (entire striatum: r8 = 0.54, P = 0.1).
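As a consistency check on the statistics above, a single-degree-of-freedom F test and the magnitude of the corresponding (partial) correlation are related by r² = F / (F + df_error). The sketch below applies that identity; the function name `f_to_abs_r` is hypothetical, and treating the reported r values as partial correlations of this kind is an assumption rather than something stated in the text.

```python
import math


def f_to_abs_r(f_value, df_error):
    """Return |r| implied by an F statistic with 1 numerator df.

    Uses r^2 = F / (F + df_error). The sign of r is not recoverable
    from F and has to be taken from the direction of the effect.
    """
    if f_value < 0 or df_error <= 0:
        raise ValueError("f_value must be >= 0 and df_error must be > 0")
    return math.sqrt(f_value / (f_value + df_error))


# Valence by synthesis capacity interaction under placebo: F(1,8) = 19.0
print(round(f_to_abs_r(19.0, 8), 2))  # 0.84
```

The result agrees with the whole-striatum correlation reported for the placebo session (r8 = 0.84).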
The positive correlation between synthesis capacity and relative reversal learning scores (i.e. the difference between reward and punishment) was driven by a positive correlation between synthesis capacity and reward-based reversal under placebo (accuracy: r8 = 0.79, P = 0.007), indicating that greater dopamine synthesis capacity predicted better reward-based reversal. Conversely, punishment-based reversal under placebo did not depend on baseline dopamine synthesis capacity (accuracy: r8 = -0.2, P = 0.6). This finding is remarkably consistent with neurophysiological evidence for associations between phasic dopamine burst firing and reward prediction error (Hollerman and Schultz, 1998), given that synthesis capacity likely influences the efficacy of impulse-dependent phasic dopamine release. The finding that the effect did not extend to learning from unexpected punishment suggests that synthesis capacity did not influence the efficacy of the impulse-dependent pause in dopamine firing that accompanies unexpected reward omission (Hollerman and Schultz, 1998).
Next we assessed whether baseline striatal dopamine synthesis capacity also predicted the effects of the dopamine D2 receptor agonist bromocriptine on reversal learning. To this end, we conducted a repeated measures ANOVA with drug and valence as within-subject factors and dopamine synthesis capacity and acquisition delay as covariates. As predicted, this analysis revealed highly significant two-way drug by valence (F1,7 = 17.4, P = 0.004) and three-way drug by valence by synthesis capacity interactions (F1,7 = 29.4, P = 0.001). There were no main effects (acquisition delay: F1,7 = 0.2, P = 0.7; synthesis capacity: F1,7 = 3.0, P = 0.1; valence: F1,7 = 0.8, P = 0.4; drug: F1,7 = 0.03, P = 0.9) and no other interaction effects (valence by synthesis capacity: F1,7 = 0.6, P = 0.5; valence by delay: F1,7 = 0.3, P = 0.6; drug by synthesis capacity: F1,7 = 0.2, P = 0.7; drug by delay: F1,7 = 0.003, P = 0.96; drug by valence by delay: F1,7 = 2.4, P = 0.2). The significant three-way interaction reflected a significant negative correlation between synthesis capacity and drug-induced improvement on relative reversal learning scores (r7 = -0.9, P = 0.001). Consistent with an ‘inverted u’-shaped dose-response curve, bromocriptine improved reward-based reversal relative to punishment-based reversal in subjects with low baseline levels of striatal dopamine synthesis capacity, but had the reverse effect in subjects with high baseline levels. Again the effect extended across striatal subregions (bilateral caudate nucleus: r7 = -0.89, P = 0.001; bilateral putamen: r7 = -0.89, P = 0.001). The correlation with (relative) performance on non-switch trials was not significant (r7 = -0.45, P = 0.2).
Breakdown of the three-way interaction effect into simple interaction effects for reward and punishment separately revealed a significant interaction between drug and synthesis capacity for punishment-based reversal (F1,7 = 14.2, P = 0.007), as well as a near-significant interaction between drug and synthesis capacity for reward-based reversal (F1,8 = 3.4, P = 0.1). These interactions reflected a highly significant positive correlation between striatal dopamine synthesis and drug-induced improvement in punishment-based reversal (r7 = 0.8, P = 0.007), while the negative correlation between dopamine synthesis and drug effects on reward-based reversal was less convincing (r8 = -0.55, P = 0.10).
In supplementary analyses, we aimed to disentangle two alternative hypotheses regarding dopaminergic modulation. Specifically, to establish whether the effects described here reflect a modulation of learning or of switching, we applied computational reinforcement learning algorithms to fit each subject's trial-by-trial sequence of choices (Sutton and Barto, 1998; Frank et al., 2007b). These algorithms allowed us to estimate learning-rate parameters (separately for reward and punishment) that were not directly observable in the behavioural data. Detailed methods and results are presented in the Supplementary Materials. Critically, a significant relationship was obtained between dopamine synthesis and the drug effect on reward learning rate (r8 = -0.71, P = 0.02), as well as between dopamine synthesis and the drug effect on punishment learning rate (r10 = 0.78, P = 0.01) (Supplementary Figure and Table 3).
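To make the idea of separate learning rates concrete, the following is a minimal sketch of a generic reinforcement learning model of the kind described by Sutton and Barto (1998). It is not the model reported in the Supplementary Materials. The two-option choice format, the softmax choice rule, the ±1 outcome coding, and all names (`negative_log_likelihood`, `alpha_reward`, `alpha_punish`, `beta`) are assumptions made for illustration only.

```python
import math


def negative_log_likelihood(choices, outcomes, alpha_reward, alpha_punish, beta):
    """Negative log-likelihood of a choice sequence under a simple
    value-learning model with outcome-specific learning rates.

    choices:  sequence of 0/1, the option chosen on each trial (assumed format)
    outcomes: sequence of +1 (reward) or -1 (punishment) (assumed coding)
    alpha_reward, alpha_punish: learning rates in [0, 1]
    beta: softmax inverse temperature (assumed choice rule)
    """
    q = [0.0, 0.0]  # value estimates for the two options, initialised at zero
    nll = 0.0
    for choice, outcome in zip(choices, outcomes):
        # Softmax probability of the option that was actually chosen
        diff = q[choice] - q[1 - choice]
        p_choice = 1.0 / (1.0 + math.exp(-beta * diff))
        nll -= math.log(p_choice)

        # Prediction error, weighted by a learning rate that depends on
        # whether the outcome was a reward or a punishment
        delta = outcome - q[choice]
        alpha = alpha_reward if outcome > 0 else alpha_punish
        q[choice] += alpha * delta
    return nll
```

In a sketch like this, per-subject learning rates would be obtained by minimising the function over `alpha_reward`, `alpha_punish`, and `beta` for each session; a drug effect on a learning rate is then the difference between the estimates from the drug and placebo sessions.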
In summary, higher dopamine synthesis capacity in the striatum was associated with better reward-based reversal learning under placebo. Furthermore, bromocriptine improved reward-based reversal learning in subjects with low synthesis capacity, while impairing it in subjects with high synthesis capacity. Conversely, the same drug dose impaired punishment-based reversal learning in subjects with low synthesis capacity, while improving it in subjects with high synthesis capacity.