The tendency to make unhealthy choices is hypothesized to be related to an individual's temporal discount rate, the theoretical rate at which they devalue delayed rewards. Furthermore, a particular form of temporal discounting, hyperbolic discounting, has been proposed to explain why unhealthy behavior can occur despite healthy intentions. We examine these two hypotheses in turn. We first systematically review studies which investigate whether discount rates can predict unhealthy behavior. These studies reveal that high discount rates for money (and in some instances food or drug rewards) are associated with several unhealthy behaviors and markers of health status, establishing discounting as a promising predictive measure. We secondly examine whether intention-incongruent unhealthy actions are consistent with hyperbolic discounting. We conclude that intention-incongruent actions are often triggered by environmental cues or changes in motivational state, whose effects are not parameterized by hyperbolic discounting. We propose a framework for understanding these state-based effects in terms of the interplay of two distinct reinforcement learning mechanisms: a “model-based” (or goal-directed) system and a “model-free” (or habitual) system. Under this framework, while discounting of delayed health may contribute to the initiation of unhealthy behavior, with repetition, many unhealthy behaviors become habitual; if health goals then change, habitual behavior can still arise in response to environmental cues. We propose that the burgeoning development of computational models of these processes will permit further identification of health decision-making phenotypes.
discounting; health; addiction; model-based; model-free; habit; preference reversal; hyperbolic
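As an illustration of why hyperbolic discounting is invoked to explain intention-incongruent behavior, the following minimal sketch compares hyperbolic and exponential discounting on a choice between a smaller-sooner and a larger-later reward. The functional forms are the standard textbook ones; the amounts, delays, and discount rates are illustrative assumptions, not values taken from the reviewed studies.

```python
import math


def hyperbolic_value(amount, delay, k):
    """Hyperbolic discounting: V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay)


def exponential_value(amount, delay, k):
    """Exponential discounting: V = A * exp(-k * D)."""
    return amount * math.exp(-k * delay)


def prefers_larger_later(value_fn, k, front_delay):
    """Compare 50 units at front_delay with 100 units 10 time steps later.

    The amounts and the 10-step gap are hypothetical illustration values.
    """
    smaller_sooner = value_fn(50.0, front_delay, k)
    larger_later = value_fn(100.0, front_delay + 10.0, k)
    return larger_later > smaller_sooner


# Hyperbolic chooser: prefers the larger reward when both options are
# distant, but switches to the smaller one when it becomes immediate.
hyperbolic_far = prefers_larger_later(hyperbolic_value, 0.2, 30.0)
hyperbolic_near = prefers_larger_later(hyperbolic_value, 0.2, 0.0)

# Exponential chooser: the preference does not depend on the front delay.
exponential_far = prefers_larger_later(exponential_value, 0.1, 30.0)
exponential_near = prefers_larger_later(exponential_value, 0.1, 0.0)
```

Under these assumed parameters only the hyperbolic chooser reverses preference as the sooner reward approaches. The sketch does not capture the cue-driven and state-driven effects that the abstract argues fall outside hyperbolic discounting.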
Standard theories of decision-making involving delayed outcomes predict that people should defer a punishment, whilst advancing a reward. In some cases, such as pain, people seem to prefer to expedite punishment, implying that its anticipation carries a cost, often conceptualized as ‘dread’. Despite empirical support for the existence of dread, whether and how it depends on prospective delay is unknown. Furthermore, it is unclear whether dread represents a stable component of value, or is modulated by biases such as framing effects. Here, we examine choices made between different numbers of painful shocks to be delivered faithfully at different time points up to 15 minutes in the future, as well as choices between hypothetical painful dental appointments at time points of up to approximately eight months in the future, to test alternative models for how future pain is disvalued. We show that future pain initially becomes increasingly aversive with increasing delay, but does so at a decreasing rate. This is consistent with a value model in which moment-by-moment dread increases up to the time of expected pain, such that dread becomes equivalent to the discounted expectation of pain. For a minority of individuals pain has maximum negative value at intermediate delay, suggesting that the dread function may itself be prospectively discounted in time. Framing an outcome as relief reduces the overall preference to expedite pain, which can be parameterized by reducing the rate of the dread-discounting function. Our data support an account of disvaluation for primary punishments such as pain, which differs fundamentally from existing models applied to financial punishments, in which dread exerts a powerful but time-dependent influence over choice.
People often prefer to ‘get pain out of the way’, treating pain in the future as more significant than pain now. One explanation, termed ‘dread’, is that anticipating pain is unpleasant or disadvantageous, rather like pain itself. Human brain imaging studies support the existence of dread, though it is unknown whether and how dread depends on the timing of future pain. We address this question by offering people decisions between moderately painful stimuli, and separately between imagined painful dental appointments occurring at different time points in the future, and use their choices to estimate dread. We show that future pain initially becomes more unpleasant when it is delayed, but as pain is moved further into the future, the effect of delay decreases. This is consistent with dread increasing as anticipated pain draws nearer, which is then combined with a general (and opposing) tendency to down-weight the significance of future events. We also show that dread can be attenuated by describing pain in terms of relief from an imagined even more severe pain. These observations reveal important principles about how people estimate the value of anticipated pain – relevant to a diverse range of human emotion and behavior.
Predictions about sensory input exert a dominant effect on what we perceive, and this is particularly true for the experience of pain. However, it remains unclear what component of prediction, from an information-theoretic perspective, controls this effect. We used a vicarious pain observation paradigm to study how the underlying statistics of predictive information modulate experience. Subjects observed judgments that a group of people made to a painful thermal stimulus, before receiving the same stimulus themselves. We show that the mean observed rating exerted a strong assimilative effect on subjective pain. In addition, we show that observed uncertainty had a specific and potent hyperalgesic effect. Using computational functional magnetic resonance imaging, we found that this effect correlated with activity in the periaqueductal grey. Our results provide evidence for a novel form of cognitive hyperalgesia relating to perceptual uncertainty, induced here by vicarious observation, with control mediated by the brainstem pain modulatory system.
punishment; reward; decision making; dopamine; serotonin
Bradykinesia is a cardinal feature of Parkinson’s disease (PD). Despite its disabling impact, the precise cause of this symptom remains elusive. Recent thinking suggests that bradykinesia may be more than simply a manifestation of motor slowness, and may in part reflect a specific deficit in the operation of motivational vigour in the striatum. In this paper we test the hypothesis that movement time in PD can be modulated by the specific nature of the motivational salience of possible action-outcomes.
We developed a novel movement time paradigm involving winnable rewards and avoidable painful electrical stimuli. The faster the subjects performed an action the more likely they were to win money (in appetitive blocks) or to avoid a painful shock (in aversive blocks). We compared PD patients when OFF dopaminergic medication with controls. Our key finding is that PD patients OFF dopaminergic medication move faster to avoid aversive outcomes (painful electric shocks) than to reap rewarding outcomes (winning money) and, unlike controls, do not speed up in the current trial having failed to win money in the previous one. We also demonstrate that sensitivity to distracting stimuli is valence specific.
We suggest this pattern of results can be explained in terms of low dopamine levels in the Parkinsonian state leading to an insensitivity to appetitive outcomes, and thus an inability to modulate movement speed in the face of rewards. By comparison, sensitivity to aversive stimuli is relatively spared. Our findings point to a rarely described property of bradykinesia in PD, namely its selective regulation by everyday outcomes.
The role dopamine plays in decision-making has important theoretical, empirical and clinical implications. Here, we examined its precise contribution by exploiting the lesion deficit model afforded by Parkinson’s disease. We studied patients in a two-stage reinforcement learning task, while they were ON and OFF dopamine replacement medication. Contrary to expectation, we found that dopaminergic drug state (ON or OFF) did not impact learning. Instead, the critical factor was drug state during the performance phase, with patients ON medication choosing correctly significantly more frequently than those OFF medication. This effect was independent of drug state during initial learning and appears to reflect a facilitation of generalization for learnt information. This inference is bolstered by our observation that neural activity in nucleus accumbens and ventromedial prefrontal cortex, measured during simultaneously acquired functional magnetic resonance imaging, represented learnt stimulus values during performance. This effect was expressed solely during the ON state with activity in these regions correlating with better performance. Our data indicate that dopamine modulation of nucleus accumbens and ventromedial prefrontal cortex exerts a specific effect on choice behaviour distinct from pure learning. The findings are in keeping with the substantial other evidence that certain aspects of learning are unaffected by dopamine lesions or depletion, and that dopamine plays a key role in performance that may be distinct from its role in learning.
Parkinson’s disease; learning; functional MRI; dopamine
The mesostriatal dopamine system is prominently implicated in model-free reinforcement learning, with fMRI BOLD signals in ventral striatum notably covarying with model-free prediction errors. However, latent learning and devaluation studies show that behavior also shows hallmarks of model-based planning, and the interaction between model-based and model-free values, prediction errors and preferences is underexplored. We designed a multistep decision task in which model-based and model-free influences on human choice behavior could be distinguished. By showing that choices reflected both influences we could then test the purity of the ventral striatal BOLD signal as a model-free report. Contrary to expectations, the signal reflected both model-free and model-based predictions in proportions matching those that best explained choice behavior. These results challenge the notion of a separate model-free learner and suggest a more integrated computational architecture for high-level human decision-making.
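The distinction between model-free and model-based values can be made concrete with a small sketch. This is a generic illustration under stated assumptions, not the model fitted in the study: the weighting parameter `w`, the learning rate `alpha`, and the function names are hypothetical.

```python
def model_free_update(q_value, reward, alpha):
    """One temporal-difference style update from experienced reward.

    Returns the updated value and the prediction error that drove it.
    """
    prediction_error = reward - q_value
    return q_value + alpha * prediction_error, prediction_error


def model_based_value(transition_probs, second_stage_values):
    """Value of a first-stage action computed by looking ahead through
    a known transition model to the values of the second-stage states."""
    return sum(p * v for p, v in zip(transition_probs, second_stage_values))


def hybrid_value(q_model_based, q_model_free, w):
    """Weighted mixture; w = 1 is purely model-based, w = 0 purely model-free."""
    return w * q_model_based + (1.0 - w) * q_model_free


# Illustrative numbers only.
q_mf, delta = model_free_update(q_value=0.0, reward=1.0, alpha=0.5)
q_mb = model_based_value([0.7, 0.3], [1.0, 0.0])
q_net = hybrid_value(q_mb, q_mf, w=0.5)
```

In a scheme of this kind, a single mixing weight estimated from choices can be compared with the mixture expressed in a neural signal, which is the sort of comparison the abstract describes.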
Disordered dopamine neurotransmission is implicated in mediating impulsiveness across a range of behaviors and disorders including addiction, compulsive gambling, attention-deficit/hyperactivity disorder, and dopamine dysregulation syndrome. Whereas existing theories of dopamine function highlight mechanisms based on aberrant reward learning or behavioral disinhibition, they do not offer an adequate account of the pathological hypersensitivity to temporal delay that forms a crucial behavioral phenotype seen in these disorders. Here we provide evidence that a role for dopamine in controlling the relationship between the timing of future rewards and their subjective value can bridge this explanatory gap. Using an intertemporal choice task, we demonstrate that pharmacologically enhancing dopamine activity increases impulsivity by enhancing the diminutive influence of increasing delay on reward value (temporal discounting) and its corresponding neural representation in the striatum. This leads to a state of excessive discounting of temporally distant, relative to sooner, rewards. Thus our findings reveal a novel mechanism by which dopamine influences human decision-making that can account for behavioral aberrations associated with a hyperfunctioning dopamine system.
Humans have the arguably unique ability to understand the mental representations of others. For success in both competitive and cooperative interactions, however, this ability must be extended to include representations of others' beliefs about our intentions, their model of our beliefs about their intentions, and so on. We developed a ‘Stag-hunt’ game in which human subjects interacted with a computerized agent using different degrees of sophistication (recursive inferences) and applied an ecologically-valid computational model of dynamic belief inference. We show that rostral medial prefrontal (paracingulate) cortex, a brain region consistently identified in psychological tasks requiring mentalizing, has a specific role in encoding the uncertainty of inference about the other’s strategy. In contrast, dorsolateral prefrontal cortex encodes the depth of recursion of the strategy being used, an index of executive sophistication. These findings reveal putative computational representations within prefrontal cortex regions supporting the maintenance of cooperation in complex social decision-making.
Game theory; medial prefrontal cortex; belief inference; model-based fMRI; social decision-making; Bayesian model
Studies of human decision making emerge from two dominant traditions: learning theorists [1–3] study choices in which options are evaluated on the basis of experience, whereas behavioral economists and financial decision theorists study choices in which the key decision variables are explicitly stated. Growing behavioral evidence suggests that valuation based on these different classes of information involves separable mechanisms [4–8], but the relevant neuronal substrates are unknown. This is important for understanding the all-too-common situation in which choices must be made between alternatives that involve one or another kind of information. We studied behavior and brain activity while subjects made decisions between risky financial options, in which the associated utilities were either learned or explicitly described. We show a characteristic effect in subjects' behavior when comparing information acquired from experience with that acquired from description, suggesting that these kinds of information are treated differently. This behavioral effect was reflected neurally, and we show differential sensitivity to learned and described value and risk in brain regions commonly associated with reward processing. Our data indicate that, during decision making under risk, both behavior and the neural encoding of key decision variables are strongly influenced by the manner in which value information is presented.
► Learned and explicitly described value and risk have different effects on behavior
► Learned and described value and risk have separable neural correlates
► Learned and described value are traded off in several brain regions
► Activity in the orbitofrontal cortex predicts bias toward learned options
A pernicious paradox in human motivation is the occasional reduced performance associated with tasks and situations that involve larger-than-average rewards. Three broad explanations that might account for such performance decrements are attentional competition (distraction theories), inhibition by conscious processes (explicit-monitoring theories), and excessive drive and arousal (overmotivation theories). Here, we report incentive-dependent performance decrements in humans in a reward-pursuit task; subjects were less successful in capturing a more valuable reward in a computerized maze. Concurrent functional magnetic resonance imaging revealed that increased activity in ventral midbrain, a brain area associated with incentive motivation and basic reward responding, correlated with both reduced number of captures and increased number of near-misses associated with imminent high rewards. These data cast light on the neurobiological basis of choking under pressure and are consistent with overmotivation accounts.
Post-encounter and circa-strike defensive contexts represent two adaptive responses to potential and imminent danger. In the context of a predator, the post-encounter reflects the initial detection of the potential threat, whilst the circa-strike is associated with direct predatory attack. We used fMRI to investigate the neural organization of anticipation and avoidance of artificial predators with high or low probability of capturing the subject across analogous post-encounter and circa-strike contexts of threat. Consistent with defense systems models, post-encounter threat elicited activity in forebrain areas including subgenual anterior cingulate cortex (sgACC), hippocampus and amygdala. Conversely, active avoidance during circa-strike threat increased activity in mid-dorsal ACC and midbrain areas. During the circa-strike condition, subjects showed increased coupling between the midbrain and mid-dorsal ACC and decreased coupling with the sgACC, amygdala and hippocampus. Greater activity was observed in the right pregenual ACC for high compared to low probability of capture during circa-strike threat. This region showed decreased coupling with the amygdala, insula and ventromedial prefrontal cortex. Finally, we found that locomotor errors correlated with subjective reports of panic for the high compared to low probability of capture during the circa-strike threat and these panic-related locomotor errors were correlated with midbrain activity. These findings support models suggesting that higher forebrain areas are involved in early threat responses, including the assignment and control of fear, whereas imminent danger results in fast, likely “hard-wired”, defensive reactions mediated by the midbrain.
Fear; fMRI; Midbrain; Anxiety; Defense; Pain
Estimating the financial value of pain informs issues as diverse as the market price of analgesics, the cost-effectiveness of clinical treatments, compensation for injury, and the response to public hazards. Such valuations are assumed to reflect a stable trade-off between relief of discomfort and money. Here, using an auction-based health-market experiment, we show that the price people pay for relief of pain is strongly determined by the local context of the market, that is, by recent intensities of pain or immediately disposable income (but not overall wealth). The absence of a stable valuation metric suggests that the dynamic behavior of health markets is not predictable from the static behavior of individuals. We conclude that the results follow the dynamics of habit-formation models of economic theory, and thus, this study provides the first scientific basis for this type of preference modeling.
Marginal utility theory prescribes the relationship between the objective property of the magnitude of rewards and their subjective value. Despite its pervasive influence, however, there is remarkably little direct empirical evidence for such a theory of value, let alone of its neurobiological basis. We show that human preferences in an inter-temporal choice task are best described by a model that integrates marginally diminishing utility with temporal discounting. Using functional magnetic resonance imaging (fMRI), we show that activity in the dorsal striatum encodes both the marginal utility of rewards, over and above that which can be described by their magnitude alone, and the discounting associated with increasing time. In addition, our data show that dorsal striatum may be involved in integrating subjective valuation systems inherent to time and magnitude, thereby providing an overall metric of value used to guide choice behaviour. Furthermore, during choice we show that anterior cingulate activity correlates with the degree of difficulty associated with dissonance between value and time. Our data support an integrative architecture for decision-making, revealing the neural representation of distinct subcomponents of value that may contribute to impulsivity and decisiveness.
Utility; Intertemporal; fMRI; striatum; decision-making; impulsivity
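To make the idea of integrating marginally diminishing utility with temporal discounting concrete, here is a minimal sketch. The specific forms used (an exponential concave utility and a hyperbolic discount function) and all parameter values are assumptions for illustration; the abstract does not state which functional forms were fitted.

```python
import math


def utility(magnitude, r):
    """Concave utility with curvature r; r = 0 reduces to linear utility.

    This functional form is an assumption chosen for illustration.
    """
    if r == 0:
        return magnitude
    return (1.0 - math.exp(-r * magnitude)) / r


def discounted_utility(magnitude, delay, k, r):
    """Utility of the reward scaled by a hyperbolic discount factor."""
    return utility(magnitude, r) / (1.0 + k * delay)


def chooses_later(sooner, later, k, r):
    """Each option is a (magnitude, delay) pair; True if the later one wins."""
    return discounted_utility(*later, k, r) > discounted_utility(*sooner, k, r)


# Hypothetical options: 10 units now versus 20 units after 8 time steps.
sooner_option = (10.0, 0.0)
later_option = (20.0, 8.0)

linear_choice = chooses_later(sooner_option, later_option, k=0.1, r=0.0)
concave_choice = chooses_later(sooner_option, later_option, k=0.1, r=0.05)
```

With the same discount rate, the chooser with concave utility picks the sooner option while the chooser with linear utility waits. This illustrates why curvature of utility and discounting of time need to be estimated jointly if they are to be separated as contributors to impulsive choice.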
The human orbitofrontal cortex is strongly implicated in appetitive valuation. Whether its role extends to support comparative valuation necessary to explain probabilistic choice patterns for incommensurable goods is unknown. Using a binary choice paradigm we derived the subjective values of different bundles of goods, under conditions of both gain and loss. We demonstrate that orbitofrontal activation reflects the difference in subjective value between available options, an effect evident across valuation for both gains and losses. By contrast activation in dorsal striatum and supplementary motor areas reflects subjects’ choice probabilities. These findings indicate that orbitofrontal cortex plays a pivotal role in valuation for incommensurable goods, a critical component process in human decision making.
orbitofrontal cortex; dorsal striatum; basal ganglia; neuroeconomics; decision making; reward
Genetic variation at the serotonin transporter-linked polymorphic region (5-HTTLPR) is associated with altered amygdala reactivity and lack of prefrontal regulatory control. Similar regions mediate decision-making biases driven by contextual cues and ambiguity, for example the “framing effect.” We hypothesized that individuals homozygous for the short (s) allele at the 5-HTTLPR would be more susceptible to framing. Participants, selected as homozygous for either the long (la) or s allele, performed a decision-making task where they made choices between receiving an amount of money for certain and taking a gamble. A strong bias was evident toward choosing the certain option when the option was phrased in terms of gains and toward gambling when the decision was phrased in terms of losses (the framing effect). Critically, this bias was significantly greater in the ss group compared with the lala group. In simultaneously acquired functional magnetic resonance imaging data, the ss group showed greater amygdala activity during choices made in accord with, compared with those made counter to, the frame, an effect not seen in the lala group. These differences were also mirrored by differences in anterior cingulate–amygdala coupling between the genotype groups during decision making. Specifically, lala participants showed increased coupling during choices made counter to, relative to those made in accord with, the frame, with no such effect evident in ss participants. These data suggest that genetically mediated differences in prefrontal-amygdala interactions underpin interindividual differences in economic decision making.
Reward processing is linked to specific neuromodulatory systems with a dopaminergic contribution to reward learning and motivational drive being well established. Neuromodulatory influences on hedonic responses to actual receipt of reward, or punishment, referred to as experienced utility are less well characterized, although a link to the endogenous opioid system is suggested. Here, in a combined functional magnetic resonance imaging–psychopharmacological investigation, we used naloxone to block central opioid function while subjects performed a gambling task associated with rewards and losses of different magnitudes, in which the mean expected value was always zero. A graded influence of naloxone on reward outcome was evident in an attenuation of pleasure ratings for larger reward outcomes, an effect mirrored in attenuation of brain activity to increasing reward magnitude in rostral anterior cingulate cortex. A more striking effect was seen for losses such that under naloxone all levels of negative outcome were rated as more unpleasant. This hedonic effect was associated with enhanced activity in anterior insula and caudal anterior cingulate cortex, areas implicated in aversive processing. Our data indicate that a central opioid system contributes to both reward and loss processing in humans and directly modulates the hedonic experience of outcomes.
naloxone; opioid; reward; fMRI; cingulate; insula; human
In economic decision making, outcomes are described in terms of risk (uncertain outcomes with certain probabilities) and ambiguity (uncertain outcomes with uncertain probabilities). Humans are more averse to ambiguity compared to risk with a distinct neural system suggested as mediating this effect. However, there has been no clear disambiguation of activity related to decisions themselves from perceptual processing of ambiguity. In a functional magnetic resonance imaging (fMRI) experiment we contrasted ambiguity, defined as a lack of information about outcome probabilities, to risk where outcome probabilities are known or ignorance where outcomes are completely unknown and unknowable. We modified previously learned Pavlovian CS+ stimuli such that they became an ambiguous cue and contrasted evoked brain activity both with an unmodified predictive CS+ (risky cue), and a cue that conveyed no information about outcome probabilities (ignorance cue). Compared to risk, ambiguous cues elicited activity in posterior inferior frontal gyrus and posterior parietal cortex during outcome anticipation. Furthermore, a similar set of regions was activated when ambiguous cues were compared with ignorance cues. Thus, regions previously shown to be engaged by decisions about ambiguous rewarding outcomes are also engaged by ambiguous outcome prediction in the context of aversive outcomes. Moreover, activation in these regions was seen even when no actual decision is made. Our findings suggest that these regions subserve a general function of contextual analysis when search for hidden information during outcome anticipation is both necessary and meaningful.
ambiguity; risk; uncertainty; probability distribution; probabilistic outcome prediction; Pavlovian conditioning; fear conditioning; fMRI; BOLD
Humans, like other animals, alter their behavior depending on whether a threat is close or distant. We investigated spatial imminence of threat by developing an active avoidance paradigm in which volunteers were pursued through a maze by a virtual predator endowed with an ability to chase, capture, and inflict pain. Using functional magnetic resonance imaging, we found that as the virtual predator grew closer, brain activity shifted from the ventromedial prefrontal cortex to the periaqueductal gray. This shift showed maximal expression when a high degree of pain was anticipated. Moreover, imminence-driven periaqueductal gray activity correlated with increased subjective degree of dread and decreased confidence of escape. Our findings cast light on the neural dynamics of threat anticipation and have implications for the neurobiology of human anxiety-related disorders.
The vigor with which a participant performs actions that produce valuable outcomes is subject to a complex set of motivational influences. Many of these are believed to involve the amygdala and the nucleus accumbens, which act as an interface between limbic and motor systems. One prominent class of influences is called pavlovian–instrumental transfer (PIT), in which the motivational characteristics of a predictor influence the vigor of an action with respect to which it is formally completely independent. We provide a demonstration of behavioral PIT in humans, with an audiovisual predictor of the noncontingent delivery of money inducing participants to perform more avidly an action involving squeezing a handgrip to earn money. Furthermore, using functional magnetic resonance imaging, we show that this enhanced motivation was associated with a trial-by-trial correlation with the blood oxygenation level-dependent (BOLD) signal in the nucleus accumbens and a subject-by-subject correlation with the BOLD signal in the amygdala. Our data dovetail well with the animal literature and shed light on the neural control of vigor.
motivation; learning; reward; pavlovian conditioning; reinforcement; decision
The neural processes underlying empathy are a subject of intense interest within the social neurosciences [1–3]. However, very little is known about how brain empathic responses are modulated by the affective link between individuals. We show here that empathic responses are modulated by learned preferences, a result consistent with economic models of social preferences [4–7]. We engaged male and female volunteers in an economic game, in which two confederates played fairly or unfairly, and then measured brain activity with functional magnetic resonance imaging while these same volunteers observed the confederates receiving pain. Both sexes exhibited empathy-related activation in pain-related brain areas (fronto-insular and anterior cingulate cortices) towards fair players. However, these empathy-related responses were significantly reduced in males when observing an unfair person receiving pain. This effect was accompanied by increased activation in reward-related areas, correlated with an expressed desire for revenge. We conclude that in men (at least) empathic responses are shaped by valuation of other people's social behaviour, such that they empathize with fair opponents while favouring the physical punishment of unfair opponents, a finding that echoes recent evidence for altruistic punishment.
Decision making in an uncertain environment poses a conflict between the opposing demands of gathering and exploiting information. In a classic illustration of this ‘exploration–exploitation’ dilemma [1], a gambler choosing between multiple slot machines balances the desire to select what seems, on the basis of accumulated experience, the richest option, against the desire to choose a less familiar option that might turn out more advantageous (and thereby provide information for improving future decisions). Far from representing idle curiosity, such exploration is often critical for organisms to discover how best to harvest resources such as food and water. In appetitive choice, substantial experimental evidence, underpinned by computational reinforcement learning [2] (RL) theory, indicates that a dopaminergic [3,4], striatal [5–9] and medial prefrontal network mediates learning to exploit. In contrast, although exploration has been well studied from both theoretical [1] and ethological [10] perspectives, its neural substrates are much less clear. Here we show, in a gambling task, that human subjects' choices can be characterized by a computationally well-regarded strategy for addressing the explore/exploit dilemma. Furthermore, using this characterization to classify decisions as exploratory or exploitative, we employ functional magnetic resonance imaging to show that the frontopolar cortex and intraparietal sulcus are preferentially active during exploratory decisions. In contrast, regions of striatum and ventromedial prefrontal cortex exhibit activity characteristic of an involvement in value-based exploitative decision making. The results suggest a model of action selection under uncertainty that involves switching between exploratory and exploitative behavioural modes, and provide a computationally precise characterization of the contribution of key decision-related brain systems to each of these functions.
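One way to picture how choices can be classified as exploratory or exploitative is sketched below. The abstract does not name the strategy it refers to; a softmax choice rule is assumed here purely for illustration, and the inverse temperature `beta`, the option values, and the function names are hypothetical.

```python
import math


def softmax_probabilities(values, beta):
    """Choice probabilities under a softmax rule.

    Higher beta concentrates choice on the best-valued option (exploitation);
    beta = 0 gives uniform random choice.
    """
    highest = max(values)
    # Subtracting the maximum keeps the exponentials numerically stable.
    weights = [math.exp(beta * (v - highest)) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def is_exploratory(values, chosen_index):
    """Label a choice exploratory if it was not the highest-valued option."""
    return values[chosen_index] < max(values)


# Hypothetical current value estimates for four slot machines.
estimated_values = [1.0, 3.0, 2.0, 0.5]
uniform_probs = softmax_probabilities(estimated_values, beta=0.0)
greedy_probs = softmax_probabilities(estimated_values, beta=5.0)
```

Labeling each trial in this way, from values estimated by a fitted learning model, is the kind of classification that could then be related to brain activity on exploratory versus exploitative trials.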