One aspect of EBS [electrical brain stimulation] that made it appear most artificial was that there was no external embodiment of the incentive object from which the pleasurable sensations arose. No taste, smell, or tactile sources of positive affect, and no consummatory behavior associated with it. Another feature of EBS that puzzled many was that there appeared to be no motive for taking it, no naturally occurring drive for it, and that, paradoxically, it could induce motivated behaviors such as eating and predatory attack, as well as act as a rewarding event. It soon became clear, however, that animals would become attracted to objects and places associated with the delivery of EBS. They would run alleys and mazes to get to a place where it was normally available and would choose to be in places associated with its delivery.
A similar state of affairs holds in the case of self-administration of stimulant and opiate drugs. (P. 252,
Stewart et al., 1984)
The purpose of this section is to provide a conceptual framework for understanding how the meso-ventromedial and ventrolateral striatal dopamine systems play roles in motivated behavior, especially drug reward* (asterisks indicate technical terms that are described in ). The present theoretical framework is a product of synthesis of previous incentive motivation* hypotheses (
Bindra, 1968;
Trowill et al., 1969;
Bolles, 1972;
Panksepp, 1982;
Stewart et al., 1984;
Fibiger and Phillips, 1986) and related perspectives (
Hebb, 1955;
Schneirla, 1959;
Glickman and Schiff, 1967). Specifically, the present paper adopts the perspective that the meso-striatal dopamine systems are part of the set of coordinated neuronal mechanisms that allows organisms to find biologically important stimuli from the environment, to promote and sustain life (SEEKING or expectancy/foraging system) (
Panksepp, 1982;
1998). This perspective has been applied in a review of the functional roles of nucleus accumbens dopamine (
Ikemoto and Panksepp, 1999).
4.1. Behavioral variation and selection
This section discusses historical background followed by a hypothesis to address how the meso-striatal dopamine systems, particularly the ventromedial and ventrolateral striatal components, participate in goal-directed learning such as drug self-administration.
Scientific explanations of goal-directed learning began with seminal studies by
Thorndike (1898;
1911). He introduced the notion of “the law of effect” to explain how animals learn to acquire adaptive responses when they are confronted with a new environmental situation. For example, a hungry rat placed in a novel, unthreatening environment would explore the environment; if the rat finds food, it would consume food. When this procedure is repeated, the rat would display seeking behavior ever more efficiently to obtain food in that environment.
Thorndike (1911) proposed that “Of several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the animal will, other things being equal, be more firmly connected with the situation, so that, when it recurs, they will be more likely to recur” (p. 244). In other words, when animals are challenged by a new situation, they display an “impulse to act” in a trial-and-error fashion, and some actions are followed closely in time by adaptive consequences which are then strengthened or stamped in, thus occurring more often in the future. It is not difficult to recognize a remarkable conceptual parallel between learning by Thorndike’s law of effect and evolution by Darwin’s variation-selection. However, this parallel largely escaped the attention of dominant learning theorists (e.g.,
Hull, 1943) for decades to come. This state of affairs undermined early investigation of how, in the beginning of learning sessions, a variety of responses occur at high frequencies, so that some responses would be “strengthened” by consequences. Instead, goal-directed learning had been explained with the concept of reinforcement* as the major mechanism supplemented by motivational concepts such as drive.
Skinner (1938), who played a key role in establishing instrumental (operant) conditioning as a scientific paradigm, proposed that “If the occurrence of an operant [instrumental response] is followed by presentation of a reinforcing stimulus, the strength is increased” and “If the occurrence of an operant already strengthened through conditioning is not followed by the reinforcing stimulus, the strength is decreased” (p. 21).
Decades later,
Skinner (1981) suggested reinforcement as Darwinian selection operating at the level of behavior. He wrote, “Selection by consequences is a causal mode found only in living things, or in machines made by living things. It was first recognized in natural selection, but it also accounts for the shaping and maintenance of the behavior of the individual and the evolution of cultures” (p. 501). But Skinner left out variation, which must come before selection.
Staddon and Simmelhag (1971) proposed, 10 years before Skinner, a concept that unifies motivation and learning, stating that “both evolution and learning can be regarded as the outcome of two independent processes: a process of variation that generates either phenotypes, in the case of evolution, or behavior, in the case of learning; and a process of selection that acts within the limits set by the first process” (p. 19). Staddon and Simmelhag’s variation* and selection* hypothesis of goal-directed learning provides a more seamless explanation than the concept of reinforcement for learning, including otherwise anomalous phenomena such as “superstition” and “autoshaping”.
In his influential study,
Skinner (1948) argued that response-independent deliveries of food to hungry animals are sufficient to induce instrumental conditioning, a phenomenon he called “superstition”. In his study, pigeons were maintained at 75% of their original body weights and, thus, were presumably highly motivated for food at the time of testing. They received a piece of food every 15 seconds, and food deliveries were independent of their behavior. Over time, they developed certain specific responses that were displayed just before food delivery. Skinner accounted for this with the idea of reinforcement, arguing that the animals adventitiously made the association between their responses and food delivery and, hence, increased responses. His explanation may partly account for this phenomenon, but is at best incomplete because it does not explain how this procedure so reliably induced adjunctive behavior. After all, the animals did not have to do anything to get food. Staddon and Simmelhag’s variation-selection hypothesis explains that intermittent scheduled delivery of food elicited responses via the variation process. Some of those responses were then selected or reinforced via the selection process.
Another behavioral phenomenon as intriguing as superstition is “autoshaping” or “sign-tracking” (
Brown and Jenkins, 1968;
Hearst and Jenkins, 1974), initially discovered in pigeons. When a key is illuminated just before food is delivered to food-restricted, experimentally naïve pigeons, they quickly learn to peck at the illuminated key, even though food delivery is response-independent. This phenomenon cannot be readily explained by reinforcement, because animals in an autoshaping procedure keep responding at the illuminated key even when responding at the key actually prevents food delivery (
Williams and Williams, 1969). According to Staddon and Simmelhag’s variation-selection hypothesis, intermittent delivery of food in hungry animals triggers various responses via the variation process. In addition, the salient cue illumination that predicted food delivery guided (or selected) state-appropriate responses, in this case pecking, via the selection process. Thus, the variation-selection hypothesis offers an explanation for increased responding in situations in which the delivery of reward does not depend on response-contingency*.
Behavioral variation concerns diversity in behavior. It is not appropriate, however, to surmise that environmental demands generate any variation in physical movements, because behavior of organisms is constrained by phylogeny (e.g.,
Breland and Breland, 1961), just like evolutionary variation. It would be more appropriate to think of variation as increased episodes of unconditioned responding that are elicited by the perception of novelty or uncertainty in the environment. As a result of variation, some behavioral episodes that lead to significant consequences will be strengthened, and others that do not lead to adaptive consequences will be inhibited by selection mechanisms, leaving adaptive ones to recur.
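The variation-selection account described above can be illustrated with a toy simulation. The following is only a minimal sketch under stated assumptions, not a model proposed in the literature cited here: the response repertoire, the function name `simulate_variation_selection`, and the numerical increments are all hypothetical and chosen for illustration. Variation is represented by sampling a response in proportion to its current strength; selection is represented by strengthening a response that is followed by an adaptive consequence and weakening those that are not.

```python
import random


def simulate_variation_selection(n_trials=200, seed=0):
    """Toy sketch of variation followed by selection (illustrative only)."""
    rng = random.Random(seed)
    # Hypothetical repertoire of unconditioned responses.
    responses = ["rear", "sniff", "turn", "approach_food_site"]
    # All responses start with the same tendency to occur.
    strength = {r: 1.0 for r in responses}
    # Assumption: only this response is followed by food.
    rewarded = "approach_food_site"

    for _ in range(n_trials):
        # Variation: emit one response, sampled in proportion to strength.
        weights = [strength[r] for r in responses]
        choice = rng.choices(responses, weights=weights)[0]
        # Selection: strengthen the response if it leads to the adaptive
        # consequence; otherwise weaken it (arbitrary step sizes, floor 0.1).
        if choice == rewarded:
            strength[choice] += 0.5
        else:
            strength[choice] = max(0.1, strength[choice] - 0.05)
    return strength


if __name__ == "__main__":
    for response, value in simulate_variation_selection().items():
        print(f"{response}: {value:.2f}")
```

In this sketch the rewarded response comes to dominate only because the variation step keeps producing candidate responses for selection to act on; if no responses were emitted, there would be nothing to select.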
4.1.1. Variation-selection hypothesis of striatal functional organization
The meso-dorsal striatal (i.e., nigro-striatal) dopamine system has long been thought to control behavioral selection (
Robbins, 1976;
Cools, 1980;
Wickens, 1993;
Mink, 1996;
Yin and Knowlton, 2006). Based on the hierarchical organization summarized in , the variation-selection hypothesis of goal-directed learning (
Staddon and Simmelhag, 1971), and behavioral data reviewed below, it is argued that the striatal complex plays a key role in both behavioral variation and selection. In particular, the meso-ventromedial striatal dopamine system appears to participate in the variation process, generating unconditioned responding. The meso-ventrolateral striatal dopamine system, in addition to the meso-dorsal striatal dopamine system, appears to participate in behavioral selection by modulating associative learning. The present proposal of the variation-selection hypothesis is a further elaboration of the concept of flexible and habit response systems generated to explain functional roles of nucleus accumbens and dorsal striatal dopamine (
Ikemoto and Panksepp, 1999).
Discussions of the roles played by the dopamine systems in variation and selection are preceded by discussion of the methodological issues involved in investigating behavioral functions and functional organization of the nucleus accumbens-olfactory tubercle complex.
4.2. Methodological notes
Here, popular behavioral methods are discussed with respect to their advantages and disadvantages and their merit for studying phasic or tonic dopamine functions. Two valuable tools to investigate neuronal mechanisms of behavior in animals are permanent destruction of neuronal populations by local administration of toxins and temporary modulation of neuronal communication by local administration of drugs. Neurotoxins such as excitatory amino acids (or excitotoxins) and 6-OHDA permanently damage neurons and are useful for investigating loss of function following lesions. Excitotoxins have the advantage of damaging cell bodies without damaging fibers of passage, and the extent of damaged areas can be determined relatively easily following experiments using histological procedures. The major advantage of 6-OHDA lesions is their selectivity for dopamine neurons if additional pharmacological tools are sensibly co-applied. However, it is more difficult to determine the exact extent and consequences of damage induced by 6-OHDA lesions. Especially when small lesions of dopaminergic terminals confined to regions such as the accumbens core or medial shell are made, it is unclear whether such lesions result in a decrease or increase in extracellular dopamine levels (
Parkinson et al., 2002;
Ikegami et al., 2006). Dopamine could diffuse from adjacent dopamine-rich regions with intact terminals to lesioned sites, which have no terminals for dopamine uptake. Because behavioral tests are typically given following recovery periods of a week or more, permanent lesions induced by these toxins most likely involve functional compensation, a mechanism that makes interpretation of data difficult. It is especially difficult to interpret lesion data when behavioral tests detect no functional loss.
The compensation issue is minimal in temporary neuronal modulation induced by microinjection of drugs, because behavioral tests typically follow immediately after microinjections. Microinjection procedures can modulate selective neuronal communication when ligands of selective receptors are used. Disadvantages of this technique (for extended discussion, see
Ikemoto and Wise, 2004) include the difficulty of identifying the exact regions of drug action, because drugs diffuse as soon as they are microinjected into a site. Thus, it is essential to address the issue of anatomical specificity. The most effective approach in freely moving animals may be to perform control injections into neighboring regions when functions of selective regions are being investigated. In addition, effects induced by microinjections of drugs are transient, typically lasting for only a few minutes. Whether this property of the method is helpful depends on the task and the research question.
Because dopamine signals fluctuate phasically and tonically (see in
Ikemoto and Panksepp, 1999), it is reasonable to hypothesize that phasic dopamine signals have different functional roles than tonic signals. No consensus has been reached as to how exactly phasic and tonic dopamine signals should be defined; in this paper, a phasic dopamine signal is defined as a brief increase (lasting up to 2 sec) in dopamine concentration (or “transient”) in terminal regions, which is thought to result from an episode of burst firing of a neuron (temporal summation) (
Gonon, 1988;
Wightman and Zimmerman, 1990;
Suaud-Chagny et al., 1992) or simultaneous firing of multiple cells projecting to the same target neurons (spatial summation). A tonic signal is defined as a slow change in concentration, lasting from tens of seconds to hours to days or longer (for another definition, see
Grace, 1995;
Goto and Grace, 2005). Phasic changes, which are temporally and spatially selective, may play a critical role in learning, because in most cases associative learning depends on temporal contiguity*. Tonic changes may play a critical role in motivation and affect*, because motivation and affect concern tonic states. Indeed, tonic signals in the ventral striatum as measured by microdialysis procedures appear to correlate with hyper-activity effects as indicated by locomotion (
Sharp et al., 1987;
Steinpreis and Salamone, 1993). The current literature on function does not offer sufficient information to address phasic and tonic issues; the last section of this paper will offer some suggestions for future investigation.
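The working definitions above can be restated as a simple decision rule based on the duration of a change in dopamine concentration. This is a hedged sketch only: the function name `classify_signal` is hypothetical, and reading “tens of seconds” as a lower bound of 10 s is an assumption made here for illustration; the text does not define the interval between 2 s and tens of seconds, so such durations are returned as unclassified.

```python
PHASIC_MAX_S = 2.0   # from the text: a phasic increase lasts up to 2 sec
TONIC_MIN_S = 10.0   # assumption: "tens of seconds" read as 10 s or longer


def classify_signal(duration_s: float) -> str:
    """Label a change in dopamine concentration by its duration in seconds."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    if duration_s <= PHASIC_MAX_S:
        return "phasic"        # brief transient, e.g. after burst firing
    if duration_s >= TONIC_MIN_S:
        return "tonic"         # slow change, seconds to hours or longer
    return "unclassified"      # not covered by the working definitions


if __name__ == "__main__":
    for seconds in (0.5, 5.0, 3600.0):
        print(seconds, classify_signal(seconds))
```

A duration-only rule ignores the spatial selectivity that the text also attributes to phasic signals, so it should be read as a summary of the definitions rather than as an analysis method.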
Table 3 summarizes popular research tools and their effectiveness for the investigation of functional roles of phasic and tonic dopamine signals. Although not neurotransmitter-selective, brief electrical stimulation administered at a frequency of 10 Hz or higher could mimic an episode of burst firing of dopamine neurons, which results in transients (
Gonon, 1988;
Wightman and Zimmerman, 1990;
Suaud-Chagny et al., 1992;
Marinelli et al., 2006). Injections of dopaminergic agents such as amphetamine and cocaine may mimic phasic increases in dopamine levels. Microdialysis techniques are useful for detecting tonic, but not phasic, changes, whereas fast-scan cyclic voltammetry and chronoamperometry are useful for detecting phasic changes in dopamine levels (
Wightman and Robinson, 2002).
Table 3. Research tools and their capacities to study phasic or tonic changes
4.3. Functional organization of the nucleus accumbens-olfactory tubercle complex
Limited behavioral data suggest that functions of the medial shell and medial tubercle are similar to each other and different from those of the ventrolateral striatum. To date, much of the functional analysis of the nucleus accumbens has focused on the differences between the accumbens core and shell, and many studies found that the core and shell are involved in different functions (
Di Chiara, 2002;
Kelley, 2004;
Everitt and Robbins, 2005). However, the shell is shaped like a crescent, lying medial and ventral to the core (), and it is not possible to selectively manipulate the entire shell with a single application of drugs or toxins. Because most functional studies have compared the core with the medial, but not lateral, shell, the differences in function should be attributed to the differences between the core and medial portion of the shell.
Microinjections of cocaine into the medial shell elicit more robust forward locomotion and rearing than cocaine injections into the lateral part of the accumbens shell (
Ikemoto, 2002). As mentioned above (section 1.2), rats readily learn to self-administer cocaine or amphetamine into the medial shell, but not the lateral shell (
Ikemoto, 2003;
Ikemoto et al., 2005). As for the olfactory tubercle, microinjections of cocaine into the medial olfactory tubercle elicit locomotion and rearing more effectively than those into the lateral tubercle (
Ikemoto, 2002), and rats readily learn to self-administer cocaine or amphetamine into the medial tubercle, but not the lateral tubercle (
Ikemoto, 2003;
Ikemoto et al., 2005). These behavioral data suggest functional differences between the medial shell and lateral shell and between the medial tubercle and lateral tubercle, and functional similarity between the medial shell and medial tubercle. The data are therefore consistent with the two dopamine systems proposed here (). The following review focuses on functional studies of the nucleus accumbens, but not the olfactory tubercle, because the behavioral functions of the olfactory tubercle have not received much research attention.
4.4. Functional roles of the nucleus accumbens
Nucleus accumbens dopamine has been implicated in two major roles: invigoration of approach and incentive learning* (
Ikemoto and Panksepp, 1999). The former conclusion is based on many studies such as those carried out by Robbins and his colleagues on conditioned reinforcement*. Their typical protocol (
Robbins and Everitt, 1982) involves thirsty rats that are trained to lick water from a dipper hidden behind a panel when a light signal and a mechanical noise (conditioned stimuli*) signify the availability of the water dipper. Panel pushing following these stimuli always produces access to water. Once the rats have learned the relationship between the conditioned stimuli and water availability, two levers are introduced to the test chamber. The unconditioned stimulus* (i.e., water) is no longer available in this phase. Normal rats learn to lever-press for presentation of the conditioned stimuli. Such conditioned reinforcement, as indicated by lever-pressing, is amplified by systemic injections (
Hill, 1970;
Robbins, 1975;
Robbins, 1976) or intra-accumbens injections of dopaminergic drugs such as amphetamine and pipradrol (
Taylor and Robbins, 1984;
Cador et al., 1991;
Kelley and Delfs, 1991;
Wolterink et al., 1993), which increase extracellular levels of dopamine in the accumbens. On the other hand, injections of these drugs into other dopamine terminal regions are not effective (
Cador et al., 1991;
Kelley and Delfs, 1991). The arousing effects of dopaminergic drugs on conditioned reinforcement cannot be readily attributed to general arousal or hyperactivity, because dopaminergic treatments selectively increase responding on the active lever* over the inactive lever* and because the presentation of control stimuli, which have not been paired with the unconditioned stimulus, does not support such instrumental behavior (
Taylor and Robbins, 1984;
Cador et al., 1991;
Kelley and Delfs, 1991). Furthermore, intra-accumbens injections of dopamine receptor antagonists or 6-OHDA treatments abolish conditioned reinforcement (
Robbins and Everitt, 1982) and attenuate amphetamine-enhanced conditioned reinforcement (
Taylor and Robbins, 1986;
Wolterink et al., 1993).
Fibiger and Phillips (1986) suggested a close relationship between conditioned reinforcement and incentive motivation. They stated “Incentive motivational stimuli therefore have both activational and directional or cue features” and “It is possible that the mesolimbic DA system is concerned with mediating the activational or energizing properties of such stimuli” (p. 667).
In addition to such energizing effects, particularly with the presence of incentive stimuli*, dopamine in the accumbens plays an important role in incentive learning (
Ikemoto and Panksepp, 1999). For example,
McFarland and Ettenberg (1995) provided particularly strong evidence for dopamine’s role in incentive learning using the dopamine antagonist haloperidol (although this study does not indicate brain regions where the dopamine antagonist acts). Rats were trained to run down a straight runway to a goal box in which they received intravenous heroin reward. Olfactory cues in the runway predicted whether responding would be rewarded with the drug. Rats learned to run fast when a cue paired with the reward was present and to run slowly when another cue paired with no reward was present. Pretreatment with a moderate dose of haloperidol in these rats did not significantly change running speeds, i.e., the motivation for the drug. However, when these rats were tested again 24 hours later without haloperidol treatment, the rats that had received the reward cue and the drug reward under the influence of haloperidol ran significantly more slowly even in the presence of the reward cue. On the other hand, rats that had received the no-reward cue and no drug reward under the influence of haloperidol ran fast with the reward cue. These data strongly suggest that dopaminergic activity at the time of reward consumption is involved in incentive learning (for more evidence see
Beninger, 1983); a predictive stimulus will no longer energize conditioned responding if dopaminergic activity was disrupted during previous consumption of the reward. In other words, dopamine is involved in memory consolidation of incentive representation. The experience of approaching and consuming the reward appears to make incentive representation of environmental knowledge labile. Activity levels of dopamine transmission at the time of or just after such experience appear to determine how incentive representation will be consolidated.
Dopamine in the nucleus accumbens is thought to play such a role, because blockade of dopamine receptors in the nucleus accumbens results in a similar deficit ( of
Ikemoto and Panksepp, 1999). Also, dopamine receptor blockade in the accumbens disrupts a random foraging task: haloperidol-treated rats increase re-entries (errors) in an eight arm maze as they collect bait from four randomly baited arms. On the other hand, if rats can use information that they learned prior to receiving intra-accumbens haloperidol, no impairment is found in collecting bait from the eight arm maze (
Floresco et al., 1996).
Since our review (
Ikemoto and Panksepp, 1999), many studies have been conducted providing rich data indicating differential functions between the medial portion of the nucleus accumbens shell and the core in appetitive tasks (see
Di Chiara, 2002;
Kelley, 2004;
Everitt and Robbins, 2005). The present review will focus on new studies that have addressed functional differences within the ventral striatum, and will try to address how the meso-ventromedial and ventrolateral striatal dopamine systems participate in invigoration of approach, incentive learning and drug reward.
4.5. Roles of the meso-ventromedial striatal dopamine system in the regulation of states and behavioral variation
Before the role of the meso-ventromedial striatal dopamine system in goal-directed learning is considered, more general roles of this system will be discussed. The meso-ventromedial striatal dopamine system appears to play a major role in the regulation of arousal or states involving reciprocal interaction between the mind and the body (hereafter referred to as “states of mind/body interaction”). A heightened state of mind/body interaction allows animals to actively interact with the environment, possibly leading to procurement of biologically important stimuli or to avoidance of or escape from danger. This functional role of the dopamine system is characterized in part by the notion of “affect” in the sense that
Young (1959) defined. The activation of the meso-ventromedial striatal dopamine system appears to lead to positive affect, a state that results in approach learning (or positive reinforcement). Thus, positive affect is synonymous with reward, and the data consistent with this notion were discussed in section 1.2. Its inhibition, on the other hand, appears to lead to negative affect, a state that results in avoidance learning (or negative reinforcement). Evidence for this will be discussed later.
The functional role of the meso-ventromedial striatal dopamine system is also characterized in part by the notion of “general drive state” that
Hebb (1955) conceived. By drive, Hebb meant “an energizer, but not a guide; an engine, but not steering gear” (p. 249). In the present paper, such affective and drive states of mind/body interaction modulated by the meso-ventromedial striatal dopamine system are referred to as “action-arousal*”. The meso-ventromedial striatal dopamine system conducts signals from the limbic system, which detects significant changes in the environment with respect to self-preservation and procreation (
MacLean, 1990). This dopamine system is sensitized by regulatory imbalances (e.g., hunger) and activated when animals detect incentive stimuli and especially when procurement of reward is uncertain.
4.5.1. Action-arousal in unconditioned contexts: behavior
Behavioral variation, a process that generates unconditioned responding, is normally triggered by the perception of novelty, opportunity, danger or uncertainty. The induction of this process appears to be mimicked by the activation of the meso-ventromedial striatal dopamine system. Administration of drugs into the posteromedial VTA or the central linear nucleus, which activates dopaminergic projections to the ventromedial striatum, elicits heightened locomotion and rearing, behaviors that enable organisms to interact with the environment. Locomotor activity is elicited by rewarding treatments such as microinjections of the cholinergic receptor agonists carbachol (
Ikemoto et al., 2003) or cytisine (
Museo and Wise, 1990), NMDA receptor agonists (
Ikemoto, 2004), μ-opiate receptor agonists (
Joyce et al., 1981;
Zangen et al., 2002) or the cannabinoid receptor agonist delta-9 THC (
Zangen et al., 2006) into the posteromedial VTA, or the GABA(A) receptor agonist muscimol into the central linear nucleus (
Klitenick and Wirtshafter, 1988). Thus, these observations are consistent with the role of the meso-ventromedial striatal dopamine system in action-arousal and behavioral variation. The significance of these findings is that the activation of the meso-ventromedial striatal dopamine system elicits unconditioned responses.
Administration of dopaminergic drugs such as cocaine and amphetamine into the medial accumbens shell or the medial olfactory tubercle elicits forward locomotion and rearing in rats (
Ikemoto, 2002;
Ikemoto and Witkin, 2003). Ventrolateral striatal dopamine may also participate in locomotion, although the data regarding this are not so consistent. Microinjections of dopamine or amphetamine into the accumbens core can induce heightened locomotion comparable to that induced by microinjections of the same substances into the medial shell (
Johnson et al., 1996;
Swanson et al., 1997;
Ikemoto, 2002;
Ikemoto and Witkin, 2003). Interestingly, 6-OHDA lesions of dopaminergic terminals in the core, but not the medial shell, disrupt locomotor activity induced by intravenous psychomotor stimulants (
Boye et al., 2001;
Sellings and Clarke, 2003;
Sellings et al., 2006a;
2006b). On the other hand, excitotoxic lesions of the core enhance locomotor activity induced by systemic amphetamine or cocaine (
Parkinson et al., 1999;
Ito et al., 2004), suggesting an inhibitory role of the core in psychomotor stimulant-induced locomotion. Several studies showed that mixtures of D1- and D2-like dopamine receptor agonists microinjected into the medial shell or medial tubercle, but not the core, increased locomotor activity (
Swanson et al., 1997;
Choi et al., 2000;
Ikemoto, 2002). These data raise a question as to how exactly core dopamine is involved in locomotion. Nevertheless, heightened activity of dopamine in ventromedial striatum appears to lead to forward locomotion and rearing in rats, observations that are thought to reflect an action-arousal state or general drive state. The next section reviews the evidence that ventromedial striatal dopamine also energizes conditioned responses*.
4.5.2. Action-arousal in conditioned contexts: behavior
Activation of the meso-ventromedial striatal dopamine system appears to energize conditioned responses in the environment where extensive conditioning has taken place. These effects have been shown using cocaine reinstatement models. In this procedure, rats are first trained to lever-press for intravenous cocaine administration. After the animals have learned to lever-press for cocaine over a few weeks, lever-pressing is extinguished over several days by substituting saline infusions for cocaine (i.e., extinction). It should be noted that the extinction of conditioned responses is not erasure of what was previously learned, but is new learning. In other words, in this procedure animals first acquire excitatory conditioning of a selective response and then inhibitory conditioning of the same response, leading to no expression of conditioned responding. The balance between excitatory and inhibitory conditioning can be tipped by certain manipulations. For instance, in animals that have extinguished conditioned responding (lever-pressing), systemic administration of cocaine reinstates the conditioned responding, and this cocaine-induced reinstatement is blocked by the administration of dopamine receptor antagonists into the medial shell, but not the core (
McFarland and Kalivas, 2001;
Anderson et al., 2003;
Bachtell et al., 2005;
Anderson et al., 2006). Furthermore, microinjections of cocaine or dopamine receptor agonists directly into the medial shell, but not the core, reinstate lever-pressing (
Schmidt et al., 2006). Reinstatement of responding is selective, because these animals only reinstate responding on the active lever, but not the inactive lever. Thus, these results suggest that the medial accumbens shell, but not the core, exerts excitatory control over conditioned responding. Findings from a conditioned reinforcement study are also consistent with microinjection data. The vigor of conditioned reinforcement induced by intra-accumbens administration of amphetamine is diminished following selective excitotoxic lesions of the medial shell, but not the core (
Parkinson et al., 1999).
However, these findings from instrumental tasks allow an alternative interpretation: increased activity of ventromedial striatal dopamine merely energizes learned motor-habits, but not motivation or action-arousal, because conditioned stimuli are learned through instrumental tasks. This motor habit hypothesis is no longer viable because of the data obtained by
Wyvell and Berridge (2000). They trained rats in two distinct learning procedures: first an instrumental conditioning* task in which the rats had to lever-press to receive sucrose pellets, and then a Pavlovian conditioning* task in which a light cue was paired with the delivery of sucrose pellets. Thus, the conditioned stimulus (light cue) was learned in a Pavlovian task, not an instrumental task. The test session was carried out in an extinction* procedure in which lever-presses did not deliver the reward. Amphetamine microinjections into the medial shell resulted in heightened conditioned responding (active lever-pressing but not inactive lever-pressing) upon the presentation of a Pavlovian conditioned stimulus, but not upon presentation of a control stimulus or in the absence of a stimulus. This and related studies (
Wyvell and Berridge, 2000;
2001;
Pecina et al., 2006) demonstrate that increased activity of intra-shell dopamine does not merely enhance motor-habits, but also enhances motivation for rewards, a finding consistent with the notion of action-arousal or drive.
Considering the role of ventromedial striatal dopamine in unconditioned contexts, acute heightened activity of ventromedial striatal dopamine appears to energize responding without directing it (Hebb’s general drive state). The goal-directed nature of conditioned responding appears to be exerted by other selection mechanisms such as the accumbens core and dorsal striatum; this point is further elaborated in sections 4.6.3 and 4.9.5. This hypothesis is consistent with the suggestion by Everitt and his colleagues that “The NAc shell can thus serve to amplify the expression in behavior of information flowing through the NAc core” (p. 395,
Ito et al., 2004).
4.5.3. Action-arousal in unconditioned contexts: vocalization
Heightened dopaminergic activity in the ventromedial striatum not only energizes somatic motor expressions, but also appears to induce an emotional state. Microinjections of amphetamine into the medial shell, more readily than the core, elicit high frequency (around 50-kHz) ultrasonic vocalizations (
Burgdorf et al., 2001;
Thompson et al., 2006), which have been implicated in positive motivational states in rats (
Knutson et al., 2002).
4.5.4. Action-arousal in unconditioned contexts: physiological measures
The activation of the meso-limbic dopamine system also leads to physiological responses consistent with the notion of action-arousal. Overall, such physiological responses resemble those triggered by mild stress. Electrical brain stimulation at the medial forebrain bundle/VTA is rewarding and triggers dopamine release in the ventral striatum (
Fiorino et al., 1993;
Garris et al., 1999;
Cheer et al., 2005;
Hernandez et al., 2006) and investigatory responses of sniffing (
Clarke and Trowill, 1971;
Ikemoto and Panksepp, 1994). Such electrical brain stimulation also increases blood pressure, an effect that is blocked by pretreatment with dopamine antagonists or 6-OHDA (
Spring and Winkelmuller, 1975;
Tan et al., 1983;
Burgess et al., 1993;
Cornish and van den Buuse, 1994), suggesting dopamine mediation. This electrical brain stimulation not only elicits reward, investigatory responses and heightened blood pressure, but also heightened norepinephrine, epinephrine and glucocorticoid levels in the plasma of rats (
Burgess et al., 1993), characterized as a set of sympathetic arousal responses. Increased blood pressure is also observed after microinjections of the substance P analog DiMe-C7 into the VTA and is abolished by pretreatment with systemic dopamine antagonists (
Cornish and van den Buuse, 1995). These physiological responses appear to be more readily triggered by the activation of the meso-ventromedial striatal dopamine system rather than other dopamine systems, because microinjections of cocaine or dopamine receptor agonists into the ventromedial striatum increase plasma glucocorticoid levels more effectively than microinjections of the same drugs into the medial prefrontal cortex or dorsal striatum (
Ikemoto and Goeders, 1998).
Consideration of the natural environment in which wild mammals live may be helpful for understanding the relationship between two apparently distinct effects, sympathetic arousal and reward, triggered by the activation of the meso-ventromedial striatal dopamine system. For example, think of hungry lions that have just spotted a young zebra in the distance. Heightened blood pressure and circulatory levels of “stress” hormones enable lions to maintain energy homeostasis in the event of vigorous physical activity and, thereby, ensure the completion of predatory pursuit leading to the procurement of their prey.
Glickman and Schiff (1967) suggested that the engagement of approach such as predatory behavior or the activation of neuronal mechanisms underlying such approach is rewarding. From this perspective, it is not surprising that common neuronal mechanisms underlie both stress responses, such as those of lions getting ready for predatory action, and reward.
Stress-related physiological responses, stimulated by the activation of the meso-ventromedial striatal dopamine system, appear to regulate the dopamine systems (see
Marinelli, 2007). Certain stressful stimuli appear to selectively activate the meso-ventromedial striatal dopamine system. Exposure to foot shock increases extracellular dopamine levels in the medial shell, but not the core (
Kalivas and Duffy, 1995). Interestingly, extracellular basal dopamine levels in the medial shell, but not the core, decrease after adrenalectomy, which diminishes plasma glucocorticoids (
Barrot et al., 2000). Moreover, diminishment of glucocorticoids induced by adrenalectomy blunts increased dopamine concentration in the medial shell, but not the core, induced by morphine or cocaine administration (
Barrot et al., 2000). Indeed, previous data suggest that stress hormones play a critical role in the reinstatement of drug seeking (
Shalev et al., 2002) and the rewarding effects of psychomotor stimulants (
Sarnyai et al., 2001;
Goeders, 2002;
Marinelli and Piazza, 2002). These data suggest that dopamine in the ventromedial striatum is more responsive to stressful stimuli than dopamine in the ventrolateral striatum. Overall, the meso-ventromedial striatal dopamine system and some stress responses appear to interact reciprocally. Such relationships are consistent with the notion of states of mind/body interaction; both heightened ventromedial striatal dopamine and stress-related physiological responses are coordinated to help organisms to actively interact with the environment for survival.
4.5.5. Action-arousal: possible mechanisms
These coordinated effects at behavioral, psychological and physiological levels induced by the activation of the meso-ventromedial striatal dopamine system appear to be mediated, in part, by the medial ventral pallidum, the major recipient of the outputs from the ventromedial striatum (section 3.1). Heightened locomotion induced by intra-accumbens dopamine is facilitated by microinjections of the GABA-A receptor antagonist picrotoxin and attenuated by GABA injections into the vicinity of the medial ventral pallidum (
Jones and Mogenson, 1980). In addition, hyper-sensitive locomotion to systemic administration of the dopamine receptor agonist apomorphine following 6-OHDA lesions of the nucleus accumbens is attenuated by injections of the GABA-A receptor agonist muscimol into the vicinity of the medial ventral pallidum (
Swerdlow and Koob, 1984) or excitotoxic or electrolytic lesions of the region (
Swerdlow et al., 1984a;
1984b). Moreover, excitotoxic lesions of the vicinity of the medial ventral pallidum appear to attenuate the rewarding effects of intravenous administration of cocaine or heroin (
Hubner and Koob, 1990). Therefore, these data are consistent with the idea that the medial ventral pallidum plays a major role in mediating the outputs from the ventromedial striatum.
The medial ventral pallidum sends its efferents to the medial mediodorsal thalamic nucleus, the lateral hypothalamic area (
Zahm, 1989;
Groenewegen et al., 1993;
O’Donnell et al., 1997) and the midbrain extrapyramidal area, dorsomedial to the pedunculopontine tegmental nucleus (
Swanson et al., 1984;
Rye et al., 1987). Although data from some studies suggest that one region is more important in locomotion than others, all of these regions appear to be involved in locomotion and possibly action-arousal. Lesions of the medial mediodorsal thalamic nucleus, but not the midbrain extrapyramidal area, attenuate hyper-sensitive locomotion to apomorphine following 6-OHDA accumbens lesions (
Swerdlow and Koob, 1987). In addition, microinjections of GABA receptor agonists muscimol or baclofen into the mediodorsal thalamic nucleus elicit locomotion in rats (
Churchill et al., 1996). However, microinjections of the local anesthetic procaine into the midbrain extrapyramidal area, but not the medial mediodorsal thalamic nucleus, attenuate heightened locomotion induced by picrotoxin injections into the ventral pallidum (
Mogenson and Wu, 1988). Moreover, heightened locomotion induced by amphetamine injections into the nucleus accumbens is also attenuated by procaine injections or excitotoxic lesions of the midbrain extrapyramidal area (
Brudzynski and Mogenson, 1985).
Additional data confirm that both the mediodorsal thalamic nucleus and the midbrain extrapyramidal area are involved in locomotion and suggest that distinct mechanisms within the ventral pallidum, projecting to these respective regions, mediate locomotion. Microinjections of the μ-opiate-receptor agonist DAMGO or the glutamate receptor agonist AMPA into the ventral pallidum elicit locomotion. Locomotion induced by intra-pallidal DAMGO is blocked by procaine microinjections into the mediodorsal thalamic nucleus, but not the midbrain extrapyramidal area, whereas locomotion induced by intra-pallidal AMPA is blocked by procaine microinjections into the midbrain extrapyramidal area, but not the mediodorsal thalamic nucleus (
Churchill and Kalivas, 1999). The medial mediodorsal thalamic nucleus may send arousal signals to the prelimbic and infralimbic prefrontal cortices, which then relay them back to the ventromedial striatum and to the lateral hypothalamic area. The midbrain extrapyramidal area, which may conduct arousal information, sends its efferents to the spinal cord (
Swanson et al., 1984) (apparently to modulate motor processes), the VTA (
Klitenick and Kalivas, 1994) (completing a circuit) and the lateral hypothalamic area.
The action-arousal state may be regulated by a network of nuclei or cell assemblies localized from the spinal cord to the telencephalon. A key structure of the network may be the lateral hypothalamic area, which receives innervation from the medial ventral pallidum and ventromedial striatum and from the infralimbic prefrontal cortex and midbrain extrapyramidal area. Recent data suggest that the lateral hypothalamic area contains hypocretin/orexin (
de Lecea et al., 1998;
Sakurai et al., 1998) neurons (
Peyron et al., 1998), which appear to play a critical role in arousal (
Siegel, 2004;
Saper et al., 2005) and probably reward (
Harris et al., 2005;
Harris and Aston-Jones, 2006).
The lateral hypothalamic area belongs to the phylogenetically old brain structure referred to as the “isodendritic core” (or “reticular formation”), as do the ventral pallidum and VTA (
Leontovich and Zhukova, 1963;
Ramon-Moliner and Nauta, 1966;
Geisler and Zahm, 2005). The isodendritic core is a type of neuronal tissue that consists of “isodendritic” neurons, characterized by long, straight, thick, poorly ramified dendrites and long axons with many collaterals and poorly ramified terminals. The dendritic field of an isodendritic cell is extensive and overlaps with those of other isodendritic cells, forming a continuum localized in the core of the central nervous system from the spinal cord to the telencephalon (
Leontovich and Zhukova, 1963;
Ramon-Moliner and Nauta, 1966). The isodendritic neurons’ morphological features enable them to receive extensive afferents and, thus, appear to be optimal for integrating a variety of inputs, including strong inputs from protopathic somato-visceral sensibility (
Leontovich and Zhukova, 1963). In other words, the lateral hypothalamic area is reciprocally connected with the brain regions that conduct information on visceral and somatic outputs characterized as emotional (
Bandler et al., 1991;
Holstege, 1991;
Nieuwenhuys, 1996;
Saper, 2002). Electrical brain stimulation along the pathways of these structures elicits sniffing in anesthetized rats (
Ikemoto and Panksepp, 1994) and a set of physiological responses characterized as a heightened sympathetic state (
Hilton, 1982;
Yardley and Hilton, 1986), effects that resemble those elicited by the stimulation of the VTA, a recipient of extensive afferents from other isodendritic core regions (
Geisler and Zahm, 2005). Thus, the lateral hypothalamic area may control the autonomic nervous system via the connections to the nucleus of the solitary tract and parabrachial nucleus (
Saper, 2002;
2004), endocrine secretion via the connection to the periventricular hypothalamus (
Swanson, 1987), and emotional movements via the connection to the periaqueductal gray and the midbrain extrapyramidal area (
Bandler et al., 1991;
Holstege, 1991;
Nieuwenhuys, 1996).
Therefore, an extensive network of brainstem regions may mediate action-arousal modulated by the meso-ventromedial striatal dopamine system. Such a network organization would be consistent with observations that the rewarding effects of electrical brain stimulation at the medial forebrain bundle/lateral hypothalamic area are not readily abolished by lesions of the tissues just anterior or posterior to the stimulating electrodes (e.g.,
Janas and Stellar, 1987). Although the precise circuitry regulating the action-arousal state is not clear at this time, the lateral hypothalamic area and its associated brainstem isodendritic core regions are tentatively referred to as the “action-arousal system”. The activation of this global system leads not only to motor arousal but also to emotional and cognitive arousal or vigilance (via the activation of the thalamus and cortices), which enables the organism to learn about the environment in relation to itself (
Hebb, 1955).
4.6. Basic processes and popular tasks in goal-directed associative learning
Action-arousal, regulated by the meso-ventromedial striatal dopamine system, may play an essential role in Pavlovian and instrumental conditioning tasks. Historically, instrumental (or operant) conditioning emerged as a learning process distinct from Pavlovian conditioning (
Miller and Konorski, 1928/1969;
Skinner, 1938). More recent behavioral analyses, however, suggest that Pavlovian conditioning plays a pivotal role in instrumental tasks (
Rescorla and Solomon, 1967;
Dickinson and Balleine, 1994). Some saw the distinction between Pavlovian conditioning and instrumental conditioning as merely procedural and argued that the same basic processes underlie both Pavlovian and instrumental learning (
Bindra, 1972;
Hearst and Jenkins, 1974), although some basic processes are more intimately involved in instrumental tasks than Pavlovian tasks (
Bolles, 1972).
4.6.1. Action-outcome and stimulus-response association
Instrumental learning tasks have been found to depend on two basic processes (
Yin and Knowlton, 2006), in addition to those typically involved in Pavlovian learning. Action-outcome* learning consists of encoding in memory the relationship between animals’ actions and the value of their outcomes (
Dickinson and Balleine, 1994). As a result of this learning, rats can initiate an action while anticipating the outcomes of the action from previous experience. This learning process is thought to be partly mediated by the dorsomedial striatum (or the caudate) (
Yin et al., 2005a;
2005b). The other process is stimulus-response* or habit learning, which involves encoding the relationship between environmental stimuli and responses. Over repeated trials of learning, stimuli alone automatically elicit fixed, adaptive patterns of responses, i.e., goal-directed habit. This learning process appears to be mediated, in part, by the dorsolateral striatum (or the putamen) (
Knowlton et al., 1996;
White, 1997;
Graybiel, 1998;
Hikosaka, 1998;
Yin and Knowlton, 2006). After weeks or months of drug self-administration experience, drug self-administration in rats becomes increasingly persistent and extinction-resistant (
Deroche-Gamonet et al., 2004;
Vanderschuren and Everitt, 2004). These observations are thought to be, in part, a consequence of abnormal stimulus-response learning (
Everitt and Robbins, 2005). Such functional divisions of the dorsal striatum are consistent with recent analyses of the connectivity of the striatum (
Haber, 2003;
Voorn et al., 2004).
4.6.2. Stimulus-outcome association and its performance
Pavlovian conditioning involves repeated pairings between predictive stimuli and outcomes. As a result, predictive stimuli acquire the affective properties of outcomes (
Rescorla, 1988). Hence, Pavlovian conditioning depends in part on stimulus-outcome association*, which enables animals to retrieve the representation of an outcome upon the presentation of a predictive stimulus. The formation of stimulus-outcome association is evident when the presentation of stimuli that have previously elicited no response now elicits responses (i.e., conditioned responses). Therefore, Pavlovian conditioning also involves the execution of actions upon the presentation of conditioned stimuli. The evidence indicating different neural mechanisms for these two processes of Pavlovian conditioning will be discussed later.
To better understand dopamine’s role in Pavlovian conditioning and drug reward, two points concerning the nature of unconditioned stimuli and conditioning are elaborated here. The nutritional content of food is an affective outcome (or unconditioned stimulus), which triggers association with stimuli. The sight, smell and other sensory properties of food may not be affective outcomes until they acquire the affective properties of the nutrients contained in food over repeated experience of consuming it (the notion of “cathexis” by
Tolman, 1949). For example, the sight of a banana may be used as an unconditioned stimulus, but will not trigger affective responses unless the animal has consumed it before. Unlike the consumption of nutrients via food, drug administration could bypass the five senses when administered to animals via the intravenous or intracranial routes (for discussion, see
Wise, 2002). Thus, such drug administration lacks intrinsic sensory properties. However, environmental stimuli associated with drug administration will acquire the affective properties of drugs.
In addition, Pavlovian conditioning appears to be a collective name for a set of heterogeneous phenomena. For example, conditioned autonomic responses such as heart rate change appear to be learned via different mechanisms than those for conditioned somatic responses such as eye-blinks (
Kao and Powell, 1988). Unlike most types of Pavlovian learning, in which temporal contiguity plays an important role, conditioned taste aversion (the association of illness with food-intake) can be established even though a predictive stimulus associated with food intake occurs hours before the aversive effects of an unconditioned stimulus (
Rozin and Kalat, 1971). In addition, the same unconditioned stimulus can elicit conditioned responses via different mechanisms under some circumstances. Cocaine administration appears to induce conditioned responses via different mechanisms, depending on how the drug is delivered (intravenous vs. intraperitoneal). These points will be elaborated later in section 4.9.2. The focus of the present paper is Pavlovian conditioning mediated via dopaminergic mechanisms.
The conditioned place preference (or conditioned cue preference) procedure, a variant of Pavlovian procedures, is commonly used to measure the rewarding properties of drugs in rats and other animals. Conditioned place preference induced by drugs or other stimuli involves two phases. The first, or acquisition, phase involves pairing of a compartment (place cues) with rewards like drug administration and another compartment with no reward. Following such pairings, animals are thought to associate cues present in the reward-paired compartment with some affective properties of rewards (
Carr et al., 1989). The testing or performance phase involves choice behavior between the compartments. The reward-paired compartment is thought to attract animals more than the non-drug compartment, because of its affective properties acquired through association. Thus, conditioned place preference performance involves retrieval of affective information for action and has been found to involve stimulus-outcome association (
Perks and Clifton, 1997;
Yin and Knowlton, 2002). Because the performance phase is conducted during extinction, conditioned place preference performance does not involve other types of association such as action-outcome and stimulus-response associations.
Because its conditioning and performance phases take place separately, conditioned place preference tests allow experimenters to distinguish neuronal mechanisms involved in stimulus-outcome learning from those involved in performance guided by stimulus-outcome association. Indeed, it has been shown that the acquisition and performance phases of conditioned place preference procedures are affected by different neuronal manipulations (e.g.,
Hiroi and White, 1991a;
1991b). On the other hand, most Pavlovian procedural tasks typically involve both conditioning and performance phases occurring simultaneously and, thereby, do not allow experimenters to examine them separately or pinpoint which basic process is affected by particular treatments. For example, when presentation of a light cue precedes the delivery of food by several seconds in a Pavlovian procedure, hungry rats learn to exhibit conditioned responding to the light cue. Because the conditioning phase in this typical Pavlovian task is not separated from the performance phase, it is difficult to investigate mechanisms involved in the stimulus-outcome component separately from those involved in the performance component. Moreover, one cannot be sure whether the stimulus-outcome association really controls conditioned responses in this procedure. Even if outcomes are not dependent on any responses, organisms could develop “anomalous beliefs” that their action produces outcomes, that is, incidental action-outcome or stimulus-response associations, which could maintain the performance (
Hearst and Jenkins, 1974).
4.6.3. Hypothesis on associative roles of striatal regions in selection
 summarizes the roles of striatal regions in associative processes. Although evidence is scant at this time, to facilitate research, explicit, easily falsifiable functions are suggested. The ventromedial striatum and its dopamine are suggested to be involved in stimulus-outcome learning, while the ventrolateral striatum, particularly that of the nucleus accumbens core, is involved in selection of stimulus-appropriate responding, a process that is referred to as stimulus-action* association. Here, a distinction is made between stimulus-action and stimulus-response associations. The term action signifies that performance depends on a stimulus-outcome association, while the term response signifies that performance does not depend on such association. Core dopamine may be important for the acquisition of stimulus-action association, i.e., conditioned response learning dependent on stimulus-outcome association.
These associative processes participate in behavioral selection. Stimulus-outcome association is proposed to be the foundation for selection via stimulus-action, action-outcome and stimulus-response associative mechanisms through the hierarchical organization of the striatal complex described in . The variation-selection processes are envisioned to generate adaptive responding, from unconditioned to stimulus-dependent to outcome-dependent to habit responding, as organisms adapt to a new environment. It should be emphasized, however, that these associative processes critically depend on the interactions of striatal regions with various other structures such as the limbic (amygdala, hippocampus, prefrontal) cortices (
Everitt et al., 1999;
Everitt and Robbins, 2005;
Goto and Grace, 2005), associative and sensory cortices (
Haber, 2003;
Yin and Knowlton, 2006).
4.7. Roles of the ventral striatum in incentive learning
4.7.1. Nucleus accumbens medial shell
Limited evidence suggests that the meso-ventromedial striatal dopamine system plays a critical role in incentive learning, particularly that of stimulus-outcome association. Ventromedial striatal dopamine modulates the formation of incentive representation of the environment such that stimuli associated with an increase in ventromedial striatal dopamine gain an incentive motivational property. Microinjections of dopamine receptor antagonists into the medial accumbens shell, but not the core, during the acquisition phase disrupt conditioned place preference induced by systemic administration of nicotine or opiates (
Fenu et al., 2006;
Spina et al., 2006). Consistent with this finding, researchers have found that microinjections of amphetamine into the medial accumbens shell, but not the core, enhanced the acquisition of Pavlovian conditioned responding induced by conditioned stimuli signaling food deliveries (
Phillips et al., 2003), and that 6-OHDA lesions of the medial shell or medial tubercle, but not the core, disrupted conditioned place preference involving psychomotor stimulants (
Sellings and Clarke, 2003;
Sellings et al., 2006a and
2006b). In addition, microinjections of the D2 dopamine receptor agonist quinpirole into the posteromedial VTA just before the acquisition phase diminish conditioned place preference induced by food. This treatment inhibits dopamine activity and dopamine release in the ventromedial striatum. Microinjections of the same drug into the anterolateral VTA or substantia nigra, which respectively project to the ventrolateral striatum and dorsal striatum, are much less effective (
Liu and Ikemoto, 2006). Importantly, the doses of quinpirole that diminish conditioned place preference for food do not influence food consumption and cannot alone induce conditioned place avoidance. Moreover, the same manipulation disrupts the acquisition of conditioned place avoidance induced by systemic administration of naloxone (
Liu and Ikemoto, 2006). Therefore, these findings are consistent with the idea that the meso-ventromedial striatal dopamine system plays an important role in incentive learning, particularly learning linked to stimulus-outcome association.
Acute, moderate disruption of ventromedial striatal dopamine transmission, which impairs incentive learning, does not appear to interfere with performance based on a stimulus-outcome association. The blockade of dopamine receptors in the medial shell just before preference testing does not change conditioned place preference induced by nicotine or morphine administration, even though the same treatments during the acquisition phase disrupt the acquisition of conditioned place preference (
Fenu et al., 2006;
Spina et al., 2006). Therefore, once learning takes place, moderate blockade of ventromedial striatal dopamine receptors does not appear to influence performance based on stimulus-outcome learning or the retrieval of memories of stimulus-outcome association. This is not to say that ventromedial striatal dopamine is only involved in acquisition of stimulus-outcome association. Although adequate evidence is lacking at this time, it is reasonable to postulate that ventromedial striatal dopamine is involved in the re-organization of incentive representation or reconsolidation of incentive memory in stimulus-outcome association (section 4.9.4).
These findings are not always consistent with others, however. Selective excitotoxic lesions of the medial shell did not disrupt Pavlovian procedural learning in which approach responses were elicited by the presentation of conditioned stimuli paired with food (
Parkinson et al., 2000;
Corbit et al., 2001;
Hall et al., 2001). Such data may be explained by compensatory mechanisms after lesioning. The loss of medial shell function may be compensated during the “recovery” period (typically a week or more) from surgery by remaining mechanisms including the medial olfactory tubercle, which serves similar functions to the medial shell.
4.7.2. Nucleus accumbens core
The accumbens core appears to be involved in generating conditioned responding (or selection of adaptive responding) based on stimulus-outcome association. Dopamine in the nucleus accumbens core may be involved in learning such conditioned responding. The acquisition of instrumental tasks to obtain food is severely retarded when rats receive microinjections of D1 receptor antagonists into the core immediately before learning sessions (
Smith-Roe and Kelley, 2000). The same is true when rats receive the protein synthesis inhibitor anisomycin into the core, but not the shell or dorsolateral striatum, immediately after each learning session (
Hernandez et al., 2002). The latter finding in particular is consistent with the idea that the accumbens core is involved in consolidation of memory concerning conditioned responding. These instrumental studies, however, do not pinpoint which basic processes are disrupted by the core manipulations.
Findings from excitotoxic lesion studies shed some light on this issue, and more specifically suggest that the core plays an important role in conditioned responding that is guided and reinforced by conditioned stimuli. Some inconsistency in the findings should be mentioned first. Excitotoxic lesion studies (
Corbit et al., 2001;
Ito et al., 2004) found that core lesions have no detectable effect on the acquisition of a simple instrumental task; this lack of effects of core lesions appears to contradict the findings mentioned above (
Smith-Roe and Kelley, 2000;
Hernandez et al., 2002). This discrepancy may be explained by the difference in recovery periods given following permanent lesions, some of which may have resulted in some compensation. In any case, the core lesions do disrupt instrumental performance that is maintained by the presentation of cocaine-paired conditioned stimuli (
Ito et al., 2004). Similarly, core-lesioned rats do not learn to perform instrumental tasks for food as effectively as sham-lesioned controls when they have to extend responding guided by a conditioned stimulus (
Corbit et al., 2001). Moreover, microinjections of dopamine receptor antagonists into the core disrupt cocaine or food seeking maintained by conditioned stimuli (
Bari and Pierce, 2005) and inactivation of the core by microinjections of GABA receptor agonists disrupts the reinstatement of cocaine seeking triggered by the presentation of a conditioned stimulus (
Fuchs et al., 2004).
In addition,
Corbit et al. (2001) showed that core lesions disrupt the so-called devaluation effect. When rats are trained to perform one task for a food and another task for a different food, and are then pre-fed with one of the foods, intact rats will seek the unfed outcome more than the fed outcome. When core-lesioned rats, which had learned to perform these tasks, were pre-fed with one of the foods, they showed decreased responding for both tasks; that is, they showed no selective responding. In contrast, after rats received extinction (no food) sessions instead of devaluation, core-lesioned rats showed selective responding with decreased responses for the extinguished task. These results suggest that the deficit induced by core lesions in the devaluation test is not due to impaired action-outcome association, but rather to the inability to select adaptive responding based on conditioned stimuli informing values of the outcome.
Studies using Pavlovian procedural tasks suggest that the core is involved in consolidation and retrieval of information concerning selection of action.
Dalley et al. (2005) injected the D1 receptor antagonist SCH 23390 into the lateral nucleus accumbens (the core and lateral shell) immediately after a Pavlovian approach task with food in hungry rats, and found that this disrupted the acquisition of conditioned responding. SCH 23390 administration into the accumbens core does not disrupt stimulus-outcome association in conditioned place preference procedures (
Fenu et al., 2006;
Spina et al., 2006). Taken together, these data suggest that dopamine in the lateral nucleus accumbens is involved in memory consolidation concerning conditioned responses or stimulus-action association.
Using a conditioned place preference procedure,
Miller and Marshall (2005) showed that performance of conditioned place preference induced by intravenous cocaine administration results in activation of specific molecular signaling (ERK, CREB, Elk-1 and Fos) in the accumbens core, but not the shell. Furthermore, pharmacological blockade of this molecular signaling cascade in the core disrupted conditioned place preference performance immediately after the injection and up to 14 days later. Although it is not clear how this core manipulation may have affected stimulus-outcome association, these data are consistent with the hypothesis that the core plays a role in generating responses based on stimulus-outcome association. Although additional research is needed to substantiate these findings, the overall pattern of recent data suggests that the core is important for the acquisition and the selection of conditioned responding based on stimulus-outcome association.
4.8. Role of ventromedial striatal dopamine in drug self-administration
The preceding sections reviewed functions of ventromedial and ventrolateral dopamine and described how dopamine in these regions may play roles in goal-directed behavior. An experiment was conducted to demonstrate that sophisticated hypotheses such as the variation-selection hypothesis of striatal functional organization are needed to explain how rats learn instrumental responses such as drug self-administration. It is generally assumed that the behavioral response leading to drug delivery is
reinforced by the direct pharmacological actions of the drug, and contingent drug delivery is thought to be essential for learning self-administration. This principle was also applied to explain intracranial self-administration such as that shown in . In this case, the lever-press leading to cocaine delivery is
reinforced by the dopaminergic action of cocaine in the medial olfactory tubercle. Indeed, co-administration of dopamine receptor antagonists with cocaine diminishes self-administration (
Ikemoto, 2003). Here we show that the reinforcement concept is not sufficient for explaining drug-associated responding. In particular, the data suggest that rats learn to lever-press even if cocaine is delivered in a response-independent manner (see section 6.4 for methods). To be maximally comparable with the data obtained with a response-dependent procedure (), all experimental conditions, except the relationship between lever-pressing and cocaine delivery (i.e., instrumental contingency), were identical to those described in the
Ikemoto (2003) study. In that study, rats received a cocaine infusion into the medial olfactory tubercle upon a lever-press (a response-dependent procedure) that also illuminated a light stimulus just above the lever for 1 sec. The present study presented a light stimulus above the lever for 1 sec upon a lever-press, and cocaine was delivered with fixed-interval schedules in a response-independent manner. The delivery schedules of cocaine were derived from median infusion rates of sessions from the
Ikemoto (2003) study; as in the previous study, the rats received 60 mM cocaine in sessions 2–4; 200 mM cocaine in sessions 6 and 7, and vehicle in sessions 1, 5, and 8.
The response-independent cocaine administration into the medial olfactory tubercle significantly increased lever-pressing (). Comparison of the present data with those from the self-administration study (), which involved virtually identical procedures with the same equipment, suggests that the levels of lever-pressing obtained with the response-independent schedule are strikingly similar to those obtained with the response-dependent schedule. These data suggest that intermittent injections of cocaine into the medial olfactory tubercle have marked arousal effects in rats and lead to heightened lever-pressing. As discussed in sections 4.5.1 and 4.5.2, cocaine administration into the medial tubercle appears to increase a general drive state or action-arousal, as it elicits locomotor activity and rearing (
Ikemoto, 2002). In addition, high extracellular levels of ventromedial striatal dopamine appear to energize instrumental responding controlled by incentive stimuli. Taken together, these data are consistent with the explanation generated by the variation-selection hypothesis that administration of cocaine into the medial tubercle energized responses (variation) reinforced by the presentation of light signals upon lever-pressing (selection by action-outcome association). Although this explanation must be substantiated by additional experiments, the present experiment demonstrates that response-independent delivery of cocaine into the medial olfactory tubercle leads to heightened lever-pressing, an effect that the reinforcement concept alone does not readily explain. Cocaine’s capacity to enhance the incentive effects of other rewards such as brain stimulation reward or conditioned stimuli has been documented (
Phillips and Fibiger, 1990). Recently, Caggiula and his colleagues (
Chaudhri et al., 2006) have shown that non-contingent nicotine administration increases lever-pressing for the presentation of a light signal. Light stimulus presentation appears to be rewarding to rats especially when they are food-restricted (
Stewart and Hurwitz, 1958). The rats in the studies by Caggiula and his colleagues were food-restricted and responded moderately on the lever that delivered light signals. Response-dependent, but not response-independent, administration of intravenous nicotine also supported moderate lever-pressing. Combining the light signal and nicotine administration synergistically increased lever-pressing even when nicotine was delivered in a response-independent procedure (
Donny et al., 2003). I should remind readers that cocaine administration into this region has a dual role: it not only energizes responding elicited by incentive stimuli, but also elicits a positive affective effect, because it induces conditioned place preference without the presence of incentive stimuli (
Ikemoto, 2003;
Ikemoto and Donahue, 2005).
The present data also have implications for schedule-induced adjunctive behaviors such as autoshaping induced by food in hungry animals (section 4.1.1). These behaviors may be partly mediated by the striatal complex and basal ganglia. The meso-ventromedial striatal dopamine system may play a major role in energizing such behaviors. Indeed, intermittent delivery of small pieces of food in food-restricted rats more effectively increases extracellular dopamine levels in the nucleus accumbens than single deliveries of large amounts of food (
McCullough and Salamone, 1992). Lesions of the meso-limbic dopamine system by 6-OHDA disrupt adjunctive effects induced by intermittent delivery of food to hungry rats, including heightened water consumption (schedule-induced polydipsia), wheel-running and plasma glucocorticoid levels (
Robbins and Koob, 1980;
Wallace et al., 1983). The variation-selection hypothesis suggests that inactivation of the meso-ventromedial striatal dopamine system disrupts a variety of schedule-induced adjunctive behaviors because of attenuated action-arousal, leading to low response levels, whereas inactivation of the meso-ventrolateral or dorsal striatal dopamine systems may not disrupt arousing effects but instead disrupts conditioned responding (selection of a certain response over others). In addition, food-restriction procedures used in schedule-induced adjunctive behaviors may potentiate the meso-ventromedial striatal dopamine system or its upstream or downstream systems. Indeed,
Carr (2002) suggested that food-restriction “sensitizes” the meso-limbic dopamine system.
4.9. Previous hypotheses on dopamine functions
Many hypotheses have been offered over the years to address dopamine’s functions in behavior. These hypotheses are not necessarily mutually exclusive; each focuses on certain functional issues. Influential hypotheses are considered below to further elaborate the present conceptual framework, which is based on previously undefined ventral striatal dopamine systems, and to offer insights on the issues that previous hypotheses addressed.
4.9.1. The anhedonia hypothesis and subjective effect issues
Research in human subjects has documented that administration of drugs of abuse such as cocaine and amphetamine elicits subjective effects characterized as “euphoria” and “high”. Such an effect is likely mediated, in part, via dopamine.
Volkow and her colleagues (1999;
2002) have shown that the greater the velocity in ventral striatal dopamine receptor binding induced by the administration of psychomotor stimulants, the greater the subjective effect of “high”. Based on animal research,
Wise (1982) offered a hypothesis that the blockade of dopamine receptors takes away the pleasurable experience of rewards. In his words, “all of life’s pleasures – the pleasures of primary reinforcement and the pleasures of their associated stimuli – lose the ability to arouse the animal” (
Wise, 1982), after the treatment with dopamine receptor antagonists.
To better understand subjective effect issues, two points must be recognized. First, subjective experience of pleasure does not necessarily dictate animals’ behavior. It is not clear how subjective experience interacts with ongoing subconscious brain processes to modify behavioral outputs, and to what extent it controls animals’ actions. This point is further elaborated in .
In addition, pleasure is not a unitary phenomenon and, thus, dopamine may only be involved in a certain kind of pleasure. At least two different types of pleasure have been suggested. One is associated with the consumption of rewards and is referred to as sensory pleasure, which does not appear to depend on dopamine. For example, feeding is a pleasurable experience. A series of recent studies using dopamine deficient mice (
Zhou and Palmiter, 1995) make compelling points on related issues. Without dopamine, hungry mice are hypoactive and starve to death, even though food and water are readily available, literally in front of their noses. However, these mice still have the capacity to consume food or water if these are directly delivered to their mouths (
Szczypka et al., 1999;
2001). In addition, dopamine deficient mice prefer sweet solutions over plain water (
Cannon and Palmiter, 2003). These data suggest that dopamine does not appear to be essential for feeding or experiencing some pleasure out of it, while it is critical for seeking out and delivering food into the mouth. Other lines of research are consistent with this notion (for a recent review, see
Baldo and Kelley, 2007).
Ikemoto and Panksepp (1996) showed a dissociation between reward-seeking and reward-consumption in that blockade of dopamine receptors in the nucleus accumbens disrupts seeking for a sucrose reward, but not sucrose consumption, in rats.
Berridge and Robinson (1998) showed that almost complete depletion of striatal dopamine does not disrupt “facial pleasure expression” induced by sucrose delivered into rats’ mouths.
The other type of pleasure is the anticipation of rewards, referred to as emotional pleasure. In his influential book,
The Expression of the Emotions in Man and Animals,
Charles Darwin (1872) described this type of pleasure as being accompanied by physical movements.
Under a transport of Joy or of vivid Pleasure, there is a strong tendency to various purposeless movements, and to the utterance of various sounds. We see this in our young children, in their loud laughter, clapping of hands, and jumping for joy; in the bounding and barking of a dog when going out to walk with his master; and in the frisking of a horse when turned out into an open field. Joy quickens the circulation, and this stimulates the brain, which again reacts on the whole body. The above purposeless movements and increased heart-action may be attributed in chief part to the excited state of the sensorium, and to the consequent undirected overflow, as Mr. Herbert Spencer insists, of nerve-force. It deserves notice, that it is chiefly the anticipation of a pleasure, and not its actual enjoyment, which leads to purposeless and extravagant movements of the body, and to the utterance of various sounds. We see this in our children when they expect any great pleasure or treat; and dogs, which have been bounding about at the sight of a plate of food, when they get it do not show delight by any outward sign, not even by wagging their tails. Now, with animals of all kinds the acquirement of almost all their pleasures, with the exception of those of warmth and rest, has long been associated with active movements, as in the hunting or search for food, and in their courtship. (p. 76–77,
Darwin, 1872/1965)
As suggested by
Panksepp (1982;
1998), this type of pleasure is likely mediated, in part, by dopamine. The activation of the meso-ventromedial striatal dopamine system is hypothesized to be particularly important. As discussed above, extracellular tonic increases of dopamine in the ventromedial striatum elicit unconditioned physical movements and vocalization in rats, consistent with Darwin’s description. Therefore, ventromedial striatal dopamine may play an important role in emotional pleasure of reward expectancy, but not sensory pleasure. However, it is unclear to what extent the subjective experience of such pleasure controls behavior.
4.9.2. The psychomotor stimulant theory and reward-arousal homology issues
The theoretical framework and data discussed above indicate a close relationship between the rewarding effects and behavioral arousal effects of drugs. Indeed, a theory linking the reinforcing effects and the motor stimulant effects of drugs has been proposed by
Wise and Bozarth (1987). They argue, “all drugs that are positive reinforcers should elicit forward locomotion… they do so by activating the dopaminergic circuitry of the medial forebrain bundle”. It should be emphasized that there are multiple tasks or measures that qualify stimuli as rewarding. By positive reinforcers*,
Wise and Bozarth (1987) meant stimuli for which animals learn instrumental tasks. However, stimuli can be said to be rewarding if they induce conditioned place preference or if they are orally consumed.
Wise and Bozarth (1987) did not suggest that rewarding drugs defined by non-instrumental tasks elicit forward locomotion. As discussed above, mice without dopamine can consume food and water (
Szczypka et al., 1999;
2001) and prefer sweet solutions over plain water (
Cannon and Palmiter, 2003). Thus, these results and many other data demonstrate a dissociation between consumption and seeking/procurement (see
Ikemoto and Panksepp, 1999;
Baldo and Kelley, 2007), suggesting that dopamine is essential for seeking, but not necessary for generating preference for the oral consumption of particular stimuli.
In addition, conditioned place preference procedures appear to detect the rewarding effects of stimuli that are dopamine-independent. Whereas intravenous administration of cocaine induces conditioned place preference mediated via a dopamine dependent mechanism (
Spyraki et al., 1987), particularly that of the accumbens shell and olfactory tubercle (
Sellings et al., 2006b), intraperitoneal administration of cocaine induces conditioned place preference, which is not blocked by pretreatments with systemic dopamine antagonists or 6-OHDA lesions of the nucleus accumbens (
Spyraki et al., 1982a;
Sellings et al., 2006b).
Spyraki et al. (1982a) suggested that conditioned place preference induced by intraperitoneal cocaine results from the local anesthetic actions of cocaine, because intraperitoneal administration of procaine, which has a similar chemical structure to cocaine and similar local anesthetic properties, but little capacity to block dopamine uptake, induces conditioned place preference. Interestingly, intraperitoneal administration of procaine, which induces conditioned place preference, does not increase but instead decreases forward locomotion (
Wiechman et al., 1981;
Reith et al., 1985).
Similarly, opiates appear to have both dopamine-dependent and -independent actions. Dopamine deficient mice can learn to associate place cues and the affective action of morphine, and if they are treated with caffeine or L-DOPA, which temporarily restores their dopamine, just before the performance phase, they display a preference for the compartment paired with morphine (
Hnasko et al., 2005). These data suggest that dopamine is not necessary for stimulus-outcome learning between environmental cues and some affective effects of morphine, but is necessary for expression of that learning.
Dopamine deficient mice display heightened locomotion when treated with psychomotor stimulants if dopamine is selectively restored in the ventral striatum, but not the dorsal striatum (
Heusner et al., 2003), findings that are consistent with other studies (section 4.5). Therefore, as
Wise and Bozarth (1987) suggested, drug administration that is rewarding as assessed by instrumental tasks should elicit heightened locomotion.
Wise and Bozarth (1987) also stated, “the locomotor effects and the positive reinforcing effects of these drugs are homologous.” These effects are triggered via a common mechanism, namely the meso-limbic dopamine system, especially the meso-ventromedial striatal dopamine system. However, rewarding and arousal effects appear to be differentially dependent on the downstream mechanisms of this dopamine system; that is, these effects should be dissociated on anatomical grounds. The meso-ventromedial striatal dopamine system interacts with various other systems: the ventrolateral and dorsal striatal dopamine systems, action-arousal system and limbic, associative and sensory cortices (). The rewarding effects of drugs, which are assessed by behavioral tasks such as conditioned place preference or self-administration, appear to rely on basic associative processes (stimulus-outcome, stimulus-action, action-outcome and stimulus-response associations); the nature of the task and extent of experience with the task make a difference in the way these associative mechanisms interact and control animals’ behavior to display the rewarding effects of drugs (). On the other hand, drugs’ unconditioned motor stimulant effects, initiated via the meso-ventromedial striatal dopamine system, may not require elaboration of information with other meso-striatal systems or other higher systems (). This point is consistent with the above-mentioned finding that the restoration of dorsal striatal dopamine is not necessary for psychomotor stimulant-induced locomotion, if ventral striatal dopamine is restored in dopamine deficient mice (
Heusner et al., 2003).
In addition, rewarding and arousing effects may be dissociated on temporal grounds. This point is discussed in section 5.1.
4.9.3. Anergia hypothesis
Salamone and his colleagues (
Salamone and Correa, 2002;
Salamone et al., 2003) have suggested that nucleus accumbens dopamine is involved in energizing animals performing instrumental tasks. In their words, “interference with accumbens DA impairs the exertion of sustained effort over time” (p. 7,
Salamone et al., 2003). This suggestion is in general agreement with the role of ventromedial striatal dopamine in action-arousal. Dopaminergic activity in the ventromedial striatum may determine animals’ capacity to pursue seeking tasks of varying degrees of “difficulty”. The level of difficulty for which ventromedial striatal dopamine is important has not yet been determined. Because ventromedial striatal dopamine is involved in action-arousal or general drive states, the deficiency of ventromedial striatal dopamine may affect performance based on both physical and mental challenges.
4.9.4. Appetitive motivation/“wanting” vs. consummatory motivation/“liking” and incentive formation hypothesis
Goal-directed behavior is thought to be divided into two general phases, appetitive and consummatory phases (
Craig, 1918;
Konorski, 1967), served by distinct neuronal mechanisms. Appetitive motivation, or “wanting”, is hypothesized to be mediated partly by brain dopamine (
Blackburn et al., 1992;
Ikemoto and Panksepp, 1996;
Berridge and Robinson, 1998). However, the argument that dopamine is not important for the consummatory or “liking” phase needs careful consideration. As discussed above, dopaminergic neurons fire and dopamine is released during the consummatory, or liking, phase. Moreover, as exemplified by the
McFarland and Ettenberg (1995) study, conditioned responding is attenuated after the disruption of dopaminergic activity by systemic or intra-accumbens administration of dopamine antagonists during the consummatory, or liking, phase.
To account for the role of dopamine during consummatory behavior,
Ikemoto and Panksepp (1999) hypothesized that dopamine in the nucleus accumbens plays an essential role in incentive learning; increased dopaminergic activity in the accumbens during the consummatory phase increases the incentive value of stimuli, while dopaminergic inactivity during the consummatory phase decreases the incentive value of stimuli. Fast-scan cyclic voltammetry data indicate that the magnitude of dopaminergic signals in the accumbens occurring just after lever-pressing (the consummatory phase) becomes progressively smaller during the extinction phase, while response intervals progressively increase (
Stuber et al., 2005b), as if the magnitude indicates incentive motivation for the next lever-presses. The present variation-selection model considers incentive learning in two forms: stimulus-outcome and stimulus-action learning (). Treatment with dopamine receptor antagonists by itself does not appear to change stimulus-outcome or stimulus-action associations that have already been formed. Therefore, treatment with a dopamine receptor antagonist does not affect immediate appetitive performance, if the task is not physically or mentally demanding or the blockade of dopamine receptors in the ventromedial striatum is not extensive (severe inhibition leads to a lack of drive; see section 4.9.6). When rats execute an action and receive a reward, ventral striatal dopamine participates in the reconsolidation of incentive memory. The disruption of normal dopaminergic activity in the ventral striatum during consummatory behavior followed by appetitive behavior will lead to re-organization of stimulus-outcome and stimulus-action associations, such that the presentation of conditioned stimuli in the next trial will be ineffective in eliciting conditioned responding.
4.9.5. The incentive-salience hypothesis and incentive-sensitization theory of addiction
Berridge and Robinson (
Robinson and Berridge, 1993;
2003;
Berridge and Robinson, 1998) proposed an incentive salience hypothesis of dopamine function. Accordingly, striatal dopamine “transforms the brain’s neural representations of conditioned stimuli, converting an event or stimulus from a neutral ‘cold’ representation (mere information) into an attractive and ‘wanted’ incentive that can ‘grab attention’” (p. 5,
Robinson and Berridge, 1993). The present variation-selection hypothesis offers a mechanism for understanding how ventral striatal dopamine is involved in incentive salience. Increased ventromedial striatal dopamine energizes responding, while activities in the ventrolateral and dorsal striatum interacting with cortical regions guide responding toward novel, salient and conditioned stimuli. Ventromedial striatal dopamine during consummatory behavior is involved in the acquisition and, perhaps, the re-organization of stimulus-outcome association, and ventrolateral striatal dopamine during consummatory behavior is involved in the acquisition and re-organization of conditioned responding (stimulus-action association).
Although the present variation-selection model is not designed to address drug addiction (or dependence), it can offer some insights.
Robinson and Berridge (1993;
2003) proposed a theory to explain how prolonged drug use makes drugs addictive. They stated:
The incentive-sensitization theory of addiction focuses on how drug cues trigger excessive incentive motivation for drugs, leading to compulsive drug seeking, drug taking, and relapse… The central idea is that addictive drugs enduringly alter NAcc-related brain systems that mediate a basic incentive-motivational function, the attribution of incentive salience. As consequence, these neural circuits may become enduringly hypersensitive (or “sensitized”) to specific drug effects and to drug-associated stimuli (via activation by S-S association)… We proposed that this leads psychologically to excessive attribution of incentive salience to drug-related representations, causing pathological “wanting” to take drugs. (p. 36,
Robinson and Berridge, 2003)
As mentioned above,
Wyvell and Berridge (2000) found that amphetamine injections into the accumbens shell energize rats to respond on the active lever only when a Pavlovian-conditioned stimulus is present, a finding that is consistent with an incentive salience attribution effect of dopamine.
Wyvell and Berridge (2001) further found that previous experience with repeated administration of amphetamine has essentially the same effect on incentive salience attribution as intra-shell administration of amphetamine. The presentation of a Pavlovian-conditioned stimulus energizes rats with a history of repeated administration of amphetamine to respond on the active lever, even if the rats are free of the drug at the time of testing. These results raise the possibility that repeated drug experience “sensitizes” the meso-ventromedial striatal dopamine system, or its upstream or downstream systems, in response to conditioned stimuli.
Interestingly, intra-shell administration of amphetamine in those sensitized rats did not further increase conditioned responding upon the presentation of a conditioned stimulus, but rather increased general investigatory behaviors such as sniffing and rearing not targeted at the active lever (
Wyvell and Berridge, 2001). This observation may be explained by the hypothesis that the combination of previous repeated amphetamine and intra-shell amphetamine activated the meso-ventromedial striatal dopamine system so much that it overrode inhibitory control of selection mechanisms or conditioning effects, leading to an increase in unconditioned responding.
4.9.6. Hedonic homeostatic dysregulation
In contrast to Robinson and Berridge’s addiction theory based on hyper-function of reward systems,
Koob and Le Moal (1997;
2001;
2005) proposed an addiction theory based on hypo-function of reward systems. They adopted an opponent-process model to conceptualize changing mood states as a function of drug taking. In this model, drug taking results partly in a chronic deviation of reward mechanisms from a set point, leading to dysregulation of reward systems and increased drug taking.
The identification of the two dopamine systems and our recent unpublished data may offer some insights on Koob and Le Moal’s thesis. The meso-ventromedial striatal dopamine system may participate in not only the positive mood state (a-process), but also the negative mood state (b-process). As mentioned above, microinjections of the dopamine D2 receptor agonist quinpirole into the posteromedial VTA, which inhibit dopaminergic neurons projecting to the ventromedial striatum, disrupt conditioned place preference induced by food and conditioned place avoidance induced by naloxone (
Liu and Ikemoto, 2006), suggesting that inactivation of these dopamine neurons disrupts stimulus-outcome learning. Higher doses of quinpirole into the posteromedial VTA that lowered basal dopamine levels in the ventromedial striatum induced conditioned place avoidance by themselves, and in the presence of food, reduced food intake (Z. H. Liu and S. Ikemoto, unpublished observation). High doses of quinpirole injections into the anterolateral VTA or substantia nigra did not induce these effects. Our data suggest that hypo-activity of the meso-ventromedial dopamine system leads to a negative affective state. Some clinical data suggest that chronic psychomotor-stimulant abuse decreases activity of brain dopamine in patients (
Lago and Kosten, 1994;
Koob and Le Moal, 1997). These findings suggest that chronic drug taking leading to hypo-activity of the meso-ventromedial striatal dopamine system may result in a negative mood state, which in turn accelerates drug-taking (“self-medication”) in an attempt to improve the mood state (
Markou et al., 1998).
4.9.7. Prediction-error hypothesis and phasic-tonic functional issues
In a series of studies,
Schultz (1998;
2002) and his colleagues characterized how dopaminergic neurons in the ventral midbrain are excited by incentive stimuli in learning tasks. They found that those neurons respond to incentive stimuli as if their signals are conveying discrepancies between the expectation of a reward (with respect to timing and magnitude) and the actual result. The way those signals change in a learning task is found to fit well with the Rescorla-Wagner equation (
Rescorla and Wagner, 1972) or, more recently, temporal-difference models (
Sutton and Barto, 1981), which describe how organisms learn the relationship between a neutral stimulus and an unconditioned stimulus in Pavlovian learning tasks. Therefore, it is suggested that midbrain dopaminergic neurons encode a prediction error between the actual and predicted rewards.
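To make the notion of a prediction error concrete, the Rescorla-Wagner rule for a single cue can be sketched as follows. This is only an illustration of the learning rule, not a model of the recordings described above: the function name, the learning rate, the number of trials and the reward magnitudes are all arbitrary assumptions chosen for the example. The prediction error is the actual outcome minus the predicted outcome, and the prediction is adjusted by a fraction of that error on every trial.

```python
def rescorla_wagner(outcomes, learning_rate=0.2, initial_value=0.0):
    """Simulate single-cue Rescorla-Wagner learning (illustrative sketch).

    outcomes: reward magnitude delivered on each trial (hypothetical values).
    learning_rate: assumed constant that lumps the salience terms together.
    Returns the predicted value after each trial and the prediction error
    computed on each trial.
    """
    value = initial_value
    values, errors = [], []
    for outcome in outcomes:
        error = outcome - value          # prediction error: actual minus predicted
        value += learning_rate * error   # move the prediction toward the outcome
        values.append(value)
        errors.append(error)
    return values, errors


# Assumed schedule: 30 rewarded trials, then 10 trials with the reward omitted.
outcomes = [1.0] * 30 + [0.0] * 10
values, errors = rescorla_wagner(outcomes)
```

Under these assumptions the error is large and positive on the first rewarded trial, shrinks toward zero as the reward becomes predicted, and turns negative when an expected reward is omitted, which is the qualitative pattern that the prediction-error account attributes to the phasic signals.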
Two points are needed to clarify these observations. First, these electrophysiological signals are correlates of incentive stimuli during reward-seeking tasks, and the functional significance of these signals is not clearly understood. The observation that dopaminergic neurons increase firing activity similarly across the VTA and SNC in relation to reward-seeking tasks does not necessarily mean that dopamine released in various striatal target sites has the same functional consequence. Indeed, behavioral data indicate otherwise (
Di Chiara, 2002;
Kelley, 2004;
Everitt and Robbins, 2005;
Yin and Knowlton, 2006). Therefore, these phasic dopaminergic signals, when they reach the ventromedial, ventrolateral, and dorsal striatum, appear to be utilized differently by terminal regions and their downstream circuits.
Secondly, those phasic dopamine signals should be distinguished from tonic signals and may be critically involved in associative learning between environmental stimuli and internal states, between stimuli and actions and between actions and consequences (). Latencies of firing of dopamine neurons in relation to reward-related stimuli may be too short to signal rewards (
Redgrave et al., 1999). Indeed, dopaminergic neurons appear to fire not only in response to reward-related stimuli, but also those of potential importance such as loud noise (
Kiyatkin, 1988;
Pan et al., 2005).
Redgrave et al. (1999) suggested, “the initial burst of dopaminergic-neurone firing could represent an essential component in the process of switching attentional and behavioural selections to unexpected, behaviourally important stimuli. This switching response could be a crucial prerequisite for associative learning and might be part of a general short-latency response that is mediated by catecholamines and prepares the organism for an appropriate reaction to biologically significant events” (p. 146,
Redgrave et al., 1999). Phasic dopamine signals, therefore, may be involved in associative learning by enabling salient stimuli to be associated with internal states and actions, and actions to be associated with consequences. A similar conclusion is suggested based largely on observations concerning the effects of dopaminergic manipulations on latent inhibition (
Joseph et al., 2003), an effect in which a previously exposed (i.e., familiar) stimulus is more difficult to condition with an unconditioned stimulus than a novel stimulus.