This study shows that the two basic economic decision parameters, expected value and uncertainty of reward, were coded in distinct structures of the human brain. The coding of expected value involved the striatum and, to a lesser extent, parts of frontal cortex. The responses covaried with expected value irrespective of different combinations of magnitude and probability, although some regions of striatum and frontal cortex coded specifically only magnitude or probability. These activations were unrelated to reward uncertainty. By contrast, the coding of reward uncertainty as measured by variance involved regions in the orbitofrontal cortex. Uncertainty responses correlated with individual risk attitudes without reflecting reward value. Although expected value and uncertainty appear to be coded mostly separately from each other, some prefrontal regions showed value-related activations that covaried with uncertainty depending on individual risk attitudes. Taken together the data suggest that crucial parameters for reward-directed decision-making were coded in the prime reward structures of the human brain.
The coding of expected value in some striatal regions occurred irrespective of different multiplicative combinations of reward magnitude and probability. This was unlikely to be due to insensitivity of these regions to magnitude or probability, as these regions showed increasing responses when these parameters varied independently. Nor was expected value coding due to simple coincidence or conjunction of magnitude and probability coding. Achieving expected value coding irrespective of magnitude-probability combinations would require closely matching response gains, so that response reduction with one parameter is compensated by response augmentation with the other parameter. Unmatched gains for magnitude and probability responses would not lead to unchanged brain responses when a decrease in one parameter together with an increase in the other results in the same expected value. The required matching of response gains for magnitude and probability in regions in which both variables are processed makes the coding of expected value a remarkable achievement of neural coding.
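The gain-matching argument can be illustrated with a toy calculation. The sketch below is not the analysis used in this study: it assumes a hypothetical log-linear response function (`toy_response`) with separate gains for magnitude and probability, and the magnitude-probability pairs are made-up values chosen only so that all three share the same expected value.

```python
import math

def toy_response(magnitude, probability, gain_m, gain_p):
    """Hypothetical log-linear response to a reward-predicting cue.

    This functional form is an assumption made for illustration only;
    it is not a model fitted to the data of the study.
    """
    return gain_m * math.log(magnitude) + gain_p * math.log(probability)

# Made-up magnitude-probability pairs that all have expected value 50.
combos = [(200.0, 0.25), (100.0, 0.5), (50.0, 1.0)]

# Matched gains: the response depends only on log(magnitude * probability),
# so it is identical for every pair with the same expected value.
matched = [toy_response(m, p, gain_m=1.0, gain_p=1.0) for m, p in combos]

# Unmatched gains: the lower probability gain no longer compensates for
# the change in magnitude, so equal expected values give unequal responses.
unmatched = [toy_response(m, p, gain_m=1.0, gain_p=0.5) for m, p in combos]

print("matched gains:  ", [round(r, 3) for r in matched])
print("unmatched gains:", [round(r, 3) for r in unmatched])
```

Under these assumptions, only the matched-gain case yields a response that is constant across the three combinations, which is the pattern that would be read as expected value coding.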
Apart from the activations reflecting expected value, we confirmed previous results indicating separate, regionally distinct relationships of striatal activations to reward magnitude (Breiter et al. 2001; Delgado et al. 2003; O'Doherty et al. 2001) and probability (Dreher et al. 2006), with the exception of a block design study that failed to find magnitude relationships (Elliott et al. 2003). A previous study found covariations with magnitude in nucleus accumbens but not with probability or expected value (Knutson et al. 2005). However, that study used an anticipatory delay between cues and outcomes in a contingent action-outcome design including loss trials, which may preclude direct comparisons with the present study. Thus it had been unclear until now whether these separate reward parameters might be coded in combination as expected value in parts of the human brain and specifically in the striatum. The present results suggest that fMRI activations reflecting reward magnitude, probability, and expected value occur in separate striatal regions and are well separated from uncertainty coding.
Activations in ventromedial prefrontal regions increased with reward probability. Previous imaging studies also found no relation of medial prefrontal responses to variations in reward magnitude, irrespective of whether probability was kept constant or varied (Knutson et al. 2003). These results together suggest a preferential relation of ventromedial prefrontal activation with reward probability rather than magnitude. The preferential ventromedial prefrontal coding of reward probability contrasts with the distinct relationships to both reward magnitude and probability in the striatum. Thus our findings confirm that some reward structures process the basic reward components of magnitude and probability separately. It would be interesting to ask what the function of such independent coding might be. In the St. Petersburg Paradox, individuals typically refuse to pay all their finite possessions for options associated with infinite magnitude and expected value, but at near-zero probability (Bernoulli 1954). Thus they remain sensitive to independent variations in the components of expected value, and the presently observed separate coding of probability and magnitude may support such sensitivity.
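The structure of the St. Petersburg Paradox can be made concrete with a short numerical sketch. It assumes one common formulation of the gamble, in which a payoff of 2^k occurs with probability 2^-k for k = 1, 2, ..., and it uses a logarithmic utility function of the kind Bernoulli proposed; the function names are illustrative and no wealth offset is included.

```python
import math

def truncated_expected_value(n_rounds):
    """Expected value of the gamble when it is cut off after n_rounds.

    Assumed formulation: payoff 2**k with probability 2**-k, k = 1..n_rounds.
    Every term contributes exactly 1, so the sum grows without bound.
    """
    return sum((0.5 ** k) * (2 ** k) for k in range(1, n_rounds + 1))

def truncated_expected_log_utility(n_rounds):
    """Expected logarithmic utility of the same truncated gamble."""
    return sum((0.5 ** k) * math.log(2 ** k) for k in range(1, n_rounds + 1))

for n in (10, 20, 40):
    print(n, truncated_expected_value(n), round(truncated_expected_log_utility(n), 6))
```

In this formulation the truncated expected value equals the number of rounds and therefore diverges, whereas the expected log utility converges to 2 ln 2. The divergence is driven entirely by ever larger magnitudes at ever smaller probabilities, which is why sensitivity to the separate components matters.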
The short trial duration of 1.5 s might have compromised the separation of activations in relation to the cues and rewards. However, we analyzed rewarded trials separately from unrewarded trials and found comparable results. The separations suggest that the observed relationships to reward magnitude, probability, and expected value reflect predominantly responses to the specific cues rather than the rewards. The similar activations in rewarded and unrewarded trials would rule out major contributions of reward prediction error coding that should differ across the different degrees of positive and negative reward prediction errors in probability schedules (McClure et al. 2003; O'Doherty et al. 2003). Despite the motivating influences of expected value on behavioral reaction times, we found no correlation of expected value coding with this behavioral parameter, suggesting that the activations did not reflect simple motivational factors suggested to play a role in reward processing in monkey premotor cortex (Roesch and Olson 2003). Although penalty and perception of outcome control can influence striatal reward processes (Tricomi et al. 2004), our experiments held these variables constant and the described activations should not be due to them.
Phasic responses of dopamine neurons are consistently stronger to stimuli associated with higher reward magnitude, probability, and expected value (Fiorillo et al. 2003; Tobler et al. 2005). Conversely, striatal output neurons show equal proportions of both increasing and decreasing responses during expectation and receipt of increasing reward magnitudes (Cromwell and Schultz 2003), although probability and expected value remain to be investigated. The striatum forms the primary target region of dopamine projections (e.g., Lynd-Balta and Haber 1994), and hemodynamic responses measured by fMRI primarily reflect input activity (Logothetis et al. 2001). Accordingly, the presently observed increasing magnitude-related striatal activations resemble more closely possible inputs from dopamine neurons rather than local striatal activity. Moreover, the similarity between the currently observed striatal activations and phasic dopamine responses extends to probability and expected value. It is thus conceivable that the observed striatal activations are partly driven by dopaminergic inputs, although dilatory effects on the vascular system (e.g., Hughes et al. 1986) cannot be entirely excluded. In addition, nondopaminergic inputs to the striatum or intrinsic computations within the striatum might be responsible for the nonhomogeneous, differential coding of magnitude and probability separate from expected value.
A major current finding consists of separate regions in the striatum and lateral orbitofrontal cortex that show distinct activations with expected value and uncertainty. Expected value and uncertainty of choice options are important parameters that determine behavioral preferences. They often vary independently when individual risk attitudes change over situations and time (Caraco et al. 1980; Stephens and Krebs 1986). It is therefore advantageous for agents to have an independent neuronal representation of both to choose according to individual risk preference while retaining sensitivity to variations in expected value. Thus by independently representing expected value and uncertainty, the currently observed striatal and orbitofrontal activations could make independent contributions to decisions involving risky choices.
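How expected value and variance can dissociate is easy to see in a simple case. The sketch below assumes an all-or-nothing reward (a fixed magnitude delivered with probability p, nothing otherwise); the magnitude of 100 and the probability levels are illustrative values, not the stimulus parameters of this study.

```python
def expected_value(magnitude, probability):
    """Expected value of an all-or-nothing reward."""
    return magnitude * probability

def variance(magnitude, probability):
    """Variance of an all-or-nothing reward (outcomes: magnitude or 0)."""
    ev = expected_value(magnitude, probability)
    return (probability * (magnitude - ev) ** 2
            + (1 - probability) * (0 - ev) ** 2)

magnitude = 100.0                             # illustrative value
probabilities = [0.0, 0.25, 0.5, 0.75, 1.0]   # illustrative levels

evs = [expected_value(magnitude, p) for p in probabilities]
variances = [variance(magnitude, p) for p in probabilities]

for p, ev, var in zip(probabilities, evs, variances):
    print(f"p={p:.2f}  expected value={ev:6.1f}  variance={var:7.1f}")
```

With these assumptions, expected value rises monotonically with probability, while variance follows an inverted U that peaks at p = 0.5 and vanishes at p = 0 and p = 1. A signal tracking one quantity therefore cannot be read off from the other.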
The lateral orbitofrontal cortex showed stronger uncertainty-related activity with increasing individual risk aversion, whereas medial orbitofrontal activations correlated with increasing risk-seeking. Thus uncertainty responses are differentially modulated by individual risk attitudes in the two orbitofrontal regions. Individual risk attitudes are crucial in determining the utility of uncertain rewards. Expected utility theory postulates that the utility of a reward decreases with increasing uncertainty for a risk-averse individual but increases for a risk seeker. The negative and positive influences of uncertainty increase with increasing degrees of individual risk-avoiding and -seeking behavior, respectively. The differential orbitofrontal relationships of uncertainty coding to individual risk attitudes may contribute to the varying influences of uncertain rewards on utility for the individual decision maker.
Different prefrontal regions showed different forms of combined coding of expected value and variance. Taylor series expansion suggests that the expected utility of an option can be approximated by its mean and variance (and additional higher moments) (Huang and Litzenberger 1988; Stephens and Krebs 1986). As a consequence, expected value and uncertainty can separately influence the expected utility of an outcome. Risk-averse individuals aim to maximize expected reward value as well as minimize variance, whereas risk-seekers tend to maximize both expected value and variance. A variety of species, such as bumblebees (Real et al. 1982) and juncos (Caraco and Lima 1985), are sensitive to both expected value and variance. The present activations directly reflect the influence of individual risk attitude on uncertainty coding in voxels that also show expected value coding, both for risk averters and risk seekers. Although these activations may involve separate individual neurons, the close proximity of value and uncertainty coding may suggest an involvement of prefrontal cortex in the computation of an integrated expected utility signal. The selective influence of individual risk aversion on decreasing uncertainty coding contrasts with the selective influence of risk seeking on positive uncertainty coding in a different prefrontal region and may suggest that activations in different prefrontal regions underlie the pronounced differences between risk averters and risk seekers in choice preferences involving uncertain outcomes.
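The mean-variance approximation mentioned above can be written out as a standard second-order expansion. The symbols are introduced here for illustration only: $u$ is a twice-differentiable utility function, $x$ the uncertain reward, $\mu$ its expected value, and $\sigma^2$ its variance.

```latex
\begin{align}
u(x) &\approx u(\mu) + u'(\mu)\,(x-\mu) + \tfrac{1}{2}\,u''(\mu)\,(x-\mu)^2 \\
\mathbb{E}[u(x)] &\approx u(\mu) + \tfrac{1}{2}\,u''(\mu)\,\sigma^2
\end{align}
```

The linear term drops out in expectation because $\mathbb{E}[x-\mu]=0$, and $\mathbb{E}[(x-\mu)^2]=\sigma^2$ by definition. For a concave utility function ($u''<0$, risk aversion) the variance term lowers expected utility, and for a convex one ($u''>0$, risk seeking) it raises it, consistent with the opposite signs of the uncertainty-related effects described for risk averters and risk seekers.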
In conclusion, we show that reward structures of the human brain separately encode basic microeconomic reward parameters. Specifically, the striatum carries rather distinct representations of reward magnitude, probability, and expected value. Separate activations in the orbitofrontal cortex increase with reward uncertainty and correlate with individual risk attitudes. The data suggest largely distinct contributions of reward structures to the coding of value and uncertainty of reward-predicting stimuli. The particular prefrontal activations combining expected value and uncertainty into a single response may provide the basis for an expected utility signal. Thereby the presently observed activations may serve as a basis for economic decision-making.