Reward information is represented by many subcortical areas and neuron types, which constitute a complex network. Its output is usually mediated by the basal ganglia where behaviors leading to rewards are disinhibited and behaviors leading to no reward are suppressed. Midbrain dopamine neurons modulate these basal ganglia neurons differentially using signals related to reward-prediction error. Recent studies suggest that other types of subcortical neurons assist, instruct, or work in parallel with dopamine neurons. Such reward-related neurons are found in the areas which have been associated with stress, pain, mood, emotion, memory, and arousal. These results suggest that reward needs to be understood in a larger framework of animal behavior.
Animals can maintain their body states only by actively foraging for food and water and can ensure the survival of their own species only by actively acquiring appropriate mates. This suggests that many animals share common neural mechanisms of such reward-directed behaviors. Over the course of evolution many new brain areas have emerged, notably the cerebral cortex. However, it is likely that phylogenetically older structures (collectively called subcortical structures) have retained fundamental mechanisms for reward-directed behavior. Indeed, lesions in subcortical structures such as the hypothalamus and the basal ganglia render animals unable to control goal-directed behaviors even when their basic sensory and motor functions appear normal. On the other hand, mammals whose cerebral cortex was removed in infancy could perform many reward-directed behaviors normally. Supporting these earlier discoveries, recent studies have provided evidence that many subcortical areas represent reward information. In this review we first characterize the nature of reward representation in each area and then discuss the possible subcortical mechanisms of reward-directed behavior.
Neurons in the dorsal part of the striatum are activated both by preparing and executing actions, and by anticipating and receiving rewards. Thus the dorsal striatum is well-positioned to guide motivated behavior, since its neural information could be used to select the action whose reward value is greatest. Indeed, although some striatal neurons act as pure reward predictors, others anticipate the reward value of specific cues and actions [1–3]. These signals are probably computed within the dorsal striatum, as they are different from reward representations in striatum-projecting regions of frontal cortex [4•,5]. Interestingly, some neurons track the value of specific actions, regardless of whether the animal chooses to perform them; but other neurons track values in terms of choice, signaling the ‘chosen value’ or ‘non-chosen value’ [6•]. Also, although many neurons link actions to their expected rewards, fewer neurons appear to link actions to their resulting actual rewards [7•]. This is consistent with recent findings that electrical stimulation of the caudate nucleus can act much like a reinforcer, biasing animals to prepare actions that were ‘rewarded’ with stimulation [8•,9•], but that stimulation is equally reinforcing for both contralateral and ipsilateral actions [9•].
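The distinction between action-value and chosen-value coding can be made concrete with a minimal sketch. The function name, the action labels, and the numerical values below are hypothetical and are not taken from the cited recordings; the sketch only illustrates which quantity each type of signal would track.

```python
def value_signals(action_values, chosen_action):
    """Return the three kinds of value signal described in the text.

    action_values: hypothetical mapping from action name to its reward value.
    chosen_action: the action the animal actually selects.
    """
    chosen_value = action_values[chosen_action]
    non_chosen_values = [
        value for action, value in action_values.items()
        if action != chosen_action
    ]
    return {
        # Action values do not depend on which action is chosen.
        "action_values": dict(action_values),
        # Chosen and non-chosen values change with the choice.
        "chosen_value": chosen_value,
        "non_chosen_values": non_chosen_values,
    }


# Illustrative numbers only: the animal picks the less valuable action.
values = {"left": 0.8, "right": 0.2}
signals = value_signals(values, "right")
```

In this toy case the action-value signals stay at 0.8 and 0.2 whichever action is taken, whereas the chosen-value signal follows the selected action and would switch to 0.8 if the animal chose "left" instead.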
Compared to the dorsal striatum, the ventral striatum has been linked more clearly with stimulus-outcome learning than with action selection. At the level of single cells, however, neurons in both regions encode a similar mixture of signals, responding to both actions and rewards. The different functions of dorsal and ventral striatum may emerge more clearly when viewed at the level of neural populations. Functional imaging studies in humans have found striatal activity to correlate with an astounding variety of reward-related variables [11–13]. These studies have led to theories that different parts of the striatum predict rewards at different timescales [14•]; or preferentially encode predictions vs. deviations from predictions; or have distinct selectivity for the outcomes of freely chosen actions. Functional segregation within the striatum has also been investigated in lesion and inactivation studies in rats, suggesting that different parts of the dorsal striatum have distinct roles in learning stimulus-action vs. action-outcome associations [17•], and that dorsal and ventral striatum may make different contributions to skill learning and expression [18•].
A series of pioneering studies by Schultz and colleagues has shown that midbrain dopamine neurons carry a ‘reward-prediction error’ signal: they fire a burst of spikes when the reward value is higher than expected, but are inhibited when the value is lower. These dopaminergic value signals combine information about several aspects of rewards, including their probability, magnitude, and delay [20,22••].
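As a rough illustration, the sign of a reward-prediction error can be sketched as the difference between the received and the expected reward value. The exact way probability, magnitude, and delay are combined is not specified in the text; the product with exponential delay discounting used here, the `discount` parameter, and all function names are assumptions made for the example.

```python
def expected_value(probability, magnitude, delay, discount=0.9):
    """Hypothetical expected reward value.

    Assumes value = probability x magnitude, discounted exponentially
    by `discount` for each unit of delay. This combination rule is an
    assumption of the sketch, not a result from the cited studies.
    """
    return probability * magnitude * discount ** delay


def prediction_error(received, expected):
    """Positive when the outcome is better than expected, negative when worse."""
    return received - expected


# A cue predicting a unit reward with 50% probability and no delay.
expected = expected_value(probability=0.5, magnitude=1.0, delay=0)

burst = prediction_error(received=1.0, expected=expected)  # reward delivered
pause = prediction_error(received=0.0, expected=expected)  # reward omitted
```

With these illustrative numbers, reward delivery gives a positive error (corresponding to a burst) and reward omission gives a negative error of the same size (corresponding to inhibition); a reward that exactly matches the expectation gives an error of zero.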
Yet if dopamine neurons are to guide decision-making, they must signal not only the value of individual reward outcomes, but also the value of an opportunity to choose between them. A recent investigation found that when monkeys are presented with a pair of potential actions, dopamine neurons signal the value of the monkey’s chosen action, even if it is the less-valuable option [21••]. However, a second study in rats reported a very different result: that dopamine neurons initially signal the value of the better action [22••]. One possible reason for this difference is that the two studies recorded from different populations of dopamine neurons [22••]: one in the substantia nigra pars compacta (SNc), and the other in the ventral tegmental area (VTA).
Indeed, there is evidence that the SNc and VTA have distinct roles in reward learning. For example, rats learn and perform orienting to reward-predictive cues via a pathway from the central amygdala to the SNc to the dorsolateral striatum. A similar circuit may exist in monkeys, where reward-motivated orienting depends on striatal dopamine transmission, with distinct contributions from different dopamine receptors [24••,25]. In contrast to the SNc, the VTA is not a part of this pathway, but VTA lesions reduce the potency of reward-associated cues to drive reward-seeking actions [26,23].
The lateral habenula has been implicated in many emotional and cognitive functions including anxiety, stress, pain, learning and attention. In addition, recent studies reported that the lateral habenula also plays a crucial role in reward processing, especially in relation to midbrain dopamine neurons. Matsumoto and Hikosaka [28••] found that lateral habenula neurons in monkeys responded to rewards and reward-predicting sensory stimuli. They were excited by non-reward-predicting stimuli and inhibited by reward-predicting stimuli. In addition, lateral habenula neurons were excited when the expected reward was omitted, and inhibited when a reward was given unexpectedly. All of these responses were opposite to those of dopamine neurons.
The opposite response pattern led to the hypothesis that the responses of lateral habenula neurons are causally related to the responses of dopamine neurons. On unrewarded trials, the excitation of lateral habenula neurons started earlier than the inhibition of dopamine neurons [28••]. Electrical stimulation of the lateral habenula inhibits dopamine neurons [28••,29•]. These observations suggest that the excitation of lateral habenula neurons can trigger the inhibition of dopamine neurons. Thus, the lateral habenula is likely to be a major source of negative reward-related signals in dopamine neurons, and perhaps in other subcortical areas as well.
Damage to the hypothalamus, especially the lateral hypothalamus and the mediodorsal hypothalamus, disrupts feeding behavior. Earlier studies showed that neurons in the lateral hypothalamus become active in anticipation of food rewards, responding to the sight of foods or to arbitrary sensory cues that predict upcoming food rewards. It was subsequently discovered that a group of lateral hypothalamic neurons contain orexin (hypocretin) and serve both to maintain arousal level and to promote feeding. Recent studies have revealed that reward-seeking behavior is, at least partly, mediated by the orexin neuron-induced activation of VTA dopamine neurons projecting to the nucleus accumbens [31,32•]. Mice lacking the orexin precursor gene showed no morphine-induced place preference [32•]. Orexin also mediates rewarding effects of sexual behavior. In rats, orexin neurons were activated during copulation, which in turn increased the dopamine level in the nucleus accumbens [33•].
Although previous research on the amygdala tended to focus on the influence of emotions on perception and cognition, recent studies by Salzman and his colleagues highlighted the value representation of the amygdala. Paton et al. [34••] examined the value representation while monkeys were conditioned in a Pavlovian procedure in which the monkeys formed associations between conditioned stimuli and reward or aversive airpuffs. They found that separate populations of neurons in the amygdala represent the positive and negative values assigned to the conditioned stimuli. Belova et al. [35•] examined the response of amygdala neurons to reward and airpuffs themselves under two conditions: one in which the outcomes occurred predictably, the other in which the outcomes occurred unpredictably. They found that many amygdala neurons responded differently to reward and airpuffs, and that these responses were frequently modulated by the prediction. The reward representation in the amygdala may serve to consolidate memory formation. Paz et al. [37••] demonstrated that reward-dependent activation of basolateral amygdala neurons facilitates impulse transmission from perirhinal to entorhinal neurons. Because the rhinal cortices constitute the main route for impulse traffic into and out of the hippocampus, the perirhinal and entorhinal interaction seems likely to be linked to memory formation. Indeed, the strength of the interaction was tightly correlated with animals’ associative learning. This finding may explain how animals form more vivid memories of emotionally charged events.
Serotonin is involved in many functions, ranging from the development of the brain to social behaviors. There is no consensus so far on the exact roles and mechanisms of serotonin function. Some of the recognized theories include (1) defense mechanisms, (2) temporal discounting of reward value, and (3) a negative reward signal that acts as an opponent of dopamine signals. The last theory postulates that the phasic discharge of serotonin acts as a negative prediction-error signal. However, pure opponency seems too simple, considering the fact that serotonin and dopamine systems interact at various levels. Recent studies seem to support the temporal discounting theory.
Despite these many experiments and theories, it was unclear whether serotonin neurons carried reward information. Using reward-biased saccade tasks Nakamura et al. [43••] clarified that neurons in the monkey dorsal raphe, presumably including serotonin neurons, changed their activity differentially depending on the value of the expected reward, as well as the received reward. In striking contrast to dopamine neurons, the response to the reward was invariant whether or not it was expected.
The pallidum is divided into the dorsal pallidum (internal and external globus pallidus) and the ventral pallidum. A series of studies from Berridge and colleagues has suggested that reward information is strongly represented in the ventral pallidum. Their recent work has teased apart the ‘liking’ (hedonic feelings) and ‘wanting’ (motivation) systems in the limbic part of the basal ganglia [44•,45]. They concluded that while nucleus accumbens and ventral pallidum acted together to represent ‘liking’, nucleus accumbens alone could represent ‘wanting’, independent of the ventral pallidum. One tempting conclusion drawn from recent experiments is that the opioid system is necessary for hedonic experience, ‘liking’, whereas the dopamine system is important for motivation, ‘wanting’.
While less well known, reward signals also reach the dorsal pallidum. The dorsal pallidal neurons may inherit reward information from the dorsal striatum, where neurons are known to be modulated quite strongly by expected rewards [1,15]. It should be noted that a portion of neurons in the internal segment of the globus pallidus (GPi) project to the lateral habenula. A recent study has shown that the lateral habenula-projecting GPi neurons encode strong reward prediction signals similar to those of lateral habenula neurons (S Hong and O Hikosaka, abstract in Soc Neurosci Abstr 2007, 749.25). Moreover, the latency of this reward-related modulation was shorter in these neurons than in lateral habenula neurons, suggesting an excitatory connection. These results suggest that GPi may initiate reward-related signals through its effects on the lateral habenula, which then influences the dopaminergic and serotonergic systems.
Recent studies suggest that a number of subcortical areas and neuron types represent reward information and constitute complex networks (Figure 1). As theoretical studies have suggested, different types of neurons appear to contribute to different aspects of reward-based learning and decision-making. Unlike dopamine neurons and lateral habenula neurons, dorsal raphe neurons (including serotonin neurons) do not represent reward-prediction error, and amygdala neurons do so only partially. Unlike neurons in the striatum and the amygdala, dopamine and serotonin neurons as well as lateral habenula neurons do not encode specific sensorimotor signals, such as target direction. Unique among subcortical areas (as tested so far), dorsal striatal neurons have activity related to the value of specific actions, goal positions or object identities [1–3]. These signals would create a bias in the basal ganglia network so that the action preferred by the striatal neurons is more likely to occur. This may be the ultimate mechanism of action for subcortical reward-directed activity.
Interestingly, the new players in the subcortical reward network have traditionally been associated with other functions: serotonin neurons with mood, stress, and sleep; the amygdala with emotion and memory; the lateral habenula with circadian rhythm, pain, and stress; orexin neurons with arousal. This implies that reward needs to be understood in a larger framework of animal behaviors. For example, omission of an expected reward is similar to punishment, which animals would want to avoid. Arousal is a state in which motor behaviors in general are activated, and reward-seeking behaviors draw on a large part of that motor repertoire, including locomotion and orienting. It is thus plausible that the reward network and the arousal (or circadian) network have evolved by sharing the same mechanism.
This work was supported by the intramural research program of the National Eye Institute.
Papers of particular interest, published within the period of review, have been highlighted as:
• of special interest
•• of outstanding interest