|Home | About | Journals | Submit | Contact Us | Français|
The anhedonia hypothesis – that brain dopamine plays a critical role in the subjective pleasure associated with positive rewards – was intended to draw the attention of psychiatrists to the growing evidence that dopamine plays a critical role in the objective reinforcement and incentive motivation associated with food and water, brain stimulation reward, and psychomotor stimulant and opiate reward. The hypothesis called to attention the apparent paradox that neuroleptics, drugs used to treat a condition involving anhedonia (schizophrenia), attenuated in laboratory animals the positive reinforcement that we normally associate with pleasure. The hypothesis held only brief interest for psychiatrists, who pointed out that the animal studies reflected acute actions of neuroleptics whereas the treatment of schizophrenia appears to result from neuroadaptations to chronic neuroleptic administration, and that it is the positive symptoms of schizophrenia that neuroleptics alleviate, rather than the negative symptoms that include anhedonia. Perhaps for these reasons, the hypothesis has had minimal impact in the psychiatric literature. Despite its limited heuristic value for the understanding of schizophrenia, however, the anhedonia hypothesis has had major impact on biological theories of reinforcement, motivation, and addiction. Brain dopamine plays a very important role in reinforcement of response habits, conditioned preferences, and synaptic plasticity in cellular models of learning and memory. The notion that dopamine plays a dominant role in reinforcement is fundamental to the psychomotor stimulant theory of addiction, to most neuroadaptation theories of addiction, and to current theories of conditioned reinforcement and reward prediction. Properly understood, it is also fundamental to recent theories of incentive motivation.
The anhedonia hypothesis of neuroleptic action (Wise, 1982) was, from its inception (Wise et al., 1978), a corollary of broader hypotheses, the dopamine hypotheses of reward (Wise, 1978) or reinforcement (Fibiger, 1978). The dopamine hypotheses were themselves deviations from an earlier catecholaminergic theory, the noradrenergic theory of reward (Stein, 1968). The present review sketches the background, initial response, and current status of the inter-related dopamine hypotheses: the dopamine hypothesis of reward, the dopamine hypothesis of reinforcement, and the anhedonia hypothesis of neuroleptic action.
The notion that animal behavior is controlled by reward and punishment is certainly older than recorded history (Plato attributed it to his older brother). The notion that an identifiable brain mechanism subserves this function was anchored firmly to biological fact by the finding of Olds and Milner (1954) that rats will work for electrical stimulation of some but not other regions of the forebrain. This led to the postulation by Olds (1956) of "pleasure centers" in the lateral hypothalamus and related brain regions. Brain stimulation studies by Sem-Jacobsen (1959) and Heath (1963) confirmed that humans would work for such stimulation and found it pleasurable (Heath, 1972). Olds (Olds and Olds, 1963) mapped much of the rat brain for reward sites, and even as his title phrase "pleasure centers" (Olds, 1956) was capturing the minds of a generation of students he was thinking not about isolated centers so much as about interconnected circuit elements (Olds, 1956; 1959; Olds and Olds, 1965). Olds (1956) assumed these to be specialized circuits that "would be excited by satisfaction of the basic drives – hunger, sex, thirst and so forth."
The first hints of what neurotransmitters might carry reward-related signals in the brain came from pharmacological studies. Olds and Travis (1960) and Stein (1962) found that the tranquilizers reserpine and chlorpromazine dramatically attenuated intracranial self-stimulation, while the stimulant amphetamine potentiated it. Imipramine potentiated the effects of amphetamine (Stein, 1962). Reserpine was known to deplete brain noradrenaline, chlorpromazine was known to block noradrenergic receptors, amphetamine was known to be a noradrenaline releaser, and imipramine was known to block noradrenergic reuptake. Largely on the basis of these facts and the location of reward sites in relation to noradrenergic cells and fibers, Stein (1968) proposed that reward function was mediated by a noradrenergic pathway originating in the brainstem (interestingly, Stein initially identified the A10 cell group, which turned out to comprise dopaminergic rather than noradrenergic neurons, as the primary origin of this system). Pursuing his hypothesis, C.D. Wise and Stein (1969; 1970) found that inhibition of dopamine-β-hydroxylase the enzyme that converts dopamine to norepinephrine – abolished self-stimulation and eliminated the rate-enhancing action of amphetamine; intraventricular administration of l-norepinephrine reinstated self-stimulation and restored the ability of dopamine to facilitate it.
At the time of initial formulation of the noradrenergic theory of reward, dopamine was known as a noradrenergic precursor but not as a transmitter in its own right. At about this time, however, Carlsson et al. (1958) suggested that dopamine might be a neurotransmitter in its own right. The discovery that noradrenaline and dopamine have different distributions in the nervous system (Carlsson, 1959; Carlsson and Hillarp, 1962) appeared to confirm this assumption, and reward sites in the region of the dopamine-containing cells of the midbrain led Crow and others to suggest that the two catecholamine transmitters in forebrain circuitry – noradrenaline and dopamine – might each subserve reward function (Crow, 1972; Crow et al., 1972; Phillips and Fibiger, 1973; German and Bowden, 1974).
Evidence that eventually ruled out a major role for norepinephrine in brain stimulation and addictive drug reward began to accumulate from two sources: pharmacology and anatomy. The pharmacological issue was whether selective noradrenergic blockers or depletions disrupted reward function itself or merely impaired the performance capacity of the animals. For example, Roll (1970) reported that noradrenergic synthesis inhibition disrupted self-stimulation by making animals sleepy; waking them restored the behavior for a time, until the animals lapsed into sleep again (Roll, 1970). Noradrenergic receptor antagonists clearly disrupted intracranial self-stimulation in ways suggestive of debilitation rather than loss of sensitivity to reward (Fouriezos et al., 1978; Franklin, 1978). Also, noradrenergic antagonists failed to disrupt intravenous (IV) self-administration of amphetamine (Yokel and Wise, 1975; 1976; Risner and Jones, 1976) or cocaine (de Wit and Wise, 1977; Risner and Jones, 1980). Further, lesions of the noradrenergic fibers of the dorsal bundle failed to disrupt self-stimulation with stimulating electrodes near the locus coeruleus, where the bundle originates, or in the lateral hypothalamus, through which the bundle projects (Corbett et al., 1977). Finally, careful mapping of the region of the locus coeruleus and the trajectory of the dorsal noradrenergic bundle fibers that originate there revealed that positive reward sites in these regions did not correspond to the precise location of histochemically confirmed noradrenergic elements (Corbett and Wise, 1979).
On the other hand, as selective antagonists for dopamine receptors became available, evidence began to accumulate that dopamine receptor blockade disrupted self-stimulation in ways that implied a devaluation of reward rather than an impairment of performance capacity. There was considerable early concern that the effect of dopamine antagonists – neuroleptics – was primarily motor impairment (Fibiger et al., 1976). Our first study in this area was not subject to this interpretation because performance in our task was enhanced rather than disrupted by neuroleptics. In our study rats were trained to lever-press for IV injections of amphetamine, a drug that causes release of each of the four monoamine neurotransmitters – norepinephrine, epinephrine, dopamine, and serotonin. We trained animals to self-administer IV amphetamine and challenged with selective antagonists for adrenergic or dopaminergic receptors. Animals treated with low and moderate doses of selective dopamine antagonists simply increased their responding (as do animals tested with lower than normal amphetamine doses), while animals treated with high doses increased responding in the first hour or two but responded intermittently thereafter (as do animals tested with saline substituted for amphetamine) (Yokel and Wise, 1975; 1976). Similar effects were seen in rats lever-pressing for cocaine (de Wit and Wise, 1977). Very different effects were seen with selective noradrenergic antagonists; these drugs decreased responding from the very start of the session and did not lead to further decreases as the animals earned and experienced the drug in this condition (Yokel and Wise, 1975; 1976; de Wit and Wise, 1977). The increases in responding for drug reward could clearly not be attributed to performance impairment. The findings were interpreted as reflecting a reduction of the rewarding efficacy of amphetamine and cocaine, such that the duration of reward from a given injection was reduced by dopaminergic, but not noradrenergic, antagonists.
In parallel with our pharmacological studies of psychomotor stimulant reward, we carried out pharmacological studies of brain stimulation reward. Here, however, dopamine antagonists, like reward-reduction, reduced rather than increased lever-pressing. The reason that neuroleptics decrease responding for brain stimulation and increase responding for psychomotor stimulants are interesting and are now understood (Lepore and Franklin, 1992), but at the time decreased responding was suggested to reflect parkinsonian side-effects of dopaminergic impairment (Fibiger et al., 1976). The timecourse of our finding appeared to rule out this explanation. We tracked the time-course of responding in well-trained animals that were pre-treated with the dopamine antagonists pimozide or butaclamol. We found that the animals responded normally in the initial minutes of each session, when they would have expected normal reward from the prior reinforcement history, but they slowed or ceased responding, depending on the neuroleptic dose, as did animals unexpectectly tested under conditions of reduced reward (Fouriezos and Wise, 1976; Fouriezos et al., 1978). Animals pretreated with the noradrenergic antagonist phenoxybenzamine, in contrast, showed depressed lever-pressing from the very start of the session and they did not slow further as they earned and experienced the rewarding stimulation. Performance was poor in the phenoxybenzamine-treated animals, but it did not worsen as the animals gained experience with the reward while under the influence of the drug.
That dopaminergic but not noradrenergic antagonists impaired the ability of reward to sustain motivated responding was confirmed in animals tested in a discrete-trial runway test. Here, the animals ran a two-meter alleyway from a start box to a goal box where they could lever-press, on each of 10 trials per day, for 15 half-second trains of brain stimulation reward. After several days of training the animals were tested after neuroleptic pretreatment. Over the course of 10 trials in the neuroleptic condition, the animals stopped leaving the start box immediately when the door was opened, stopped running quickly and directly to the goal box, and stopped lever-pressing for the stimulation. Importantly, however, the consummatory response – earning the stimulation once they reached the goal box response – deteriorated before the instrumental responses – leaving the start box and running the alleyway deteriorated. The animals left the start box with normal latency for the first 8 trials, ran normally for only the first 7 trials, and lever-pressed at normal rates for only the first 6 trials of the neuroleptic test session. Thus the animals showed signs of disappointment in the reward – indicated by the decreased responding in the goal box – before they showed any lack of motivation indicated by approach responding.
These self-stimulation findings were again incompatible with the possibility that our neuroleptic doses were simply causing motor deficits. The animals showed normal capacity at the beginning of sessions, and continued to run the alleyway at peak speed until after they showed signs disappointment with the reward in the goal box. Moreover, in the lever-pressing experiments the neuroleptic-treated animals sometimes leaped out of their open-topped test chambers and balanced precariously on the edge of the plywood walls; thus the animals still had good motor strength and coordination (Fouriezos, 1985). Moreover, neuroleptic-treated animals that ceased responding after a few minutes did not do so because of exhaustion; they re-initiated normal responding when presented reward-predictive environmental stimuli (Fouriezos and Wise, 1976; Franklin and McCoy, 1979). Moreover, after extinguishing one learned response for brain stimulation reward, neuroleptic-treated rats will initiate, with normal response strength, an alternative, previously learned, instrumental response for the same reward (they then go through progressive extinction of the second response: Gallistel et al., 1982). Finally, moderate reward-attenuating doses of neuroleptics do not impose a lowered response ceiling as do changes in performance demands (Edmonds and Gallistel, 1974); rather they merely increase the amount of stimulation (reward) necessary to motivate responding at the normal maximum rates (Gallistel and Karras, 1984). These pharmacological findings suggested that whatever collateral deficits they may cause, neuroleptic drugs devalue the effectiveness of brain stimulation and psychomotor stimulant rewards.
In parallel with our pharmacological studies, we initiated anatomical mapping studies with two advantages over earlier approaches. First, we used a moveable electrode (Wise, 1976) so that we could test several stimulation sites within each animal. In each animal, then, we had anatomical controls: ineffective stimulation sites above or below loci where stimulation was rewarding. Electrode movements of 1/8 mm were often sufficient to take an electrode tip from a site where stimulation was not rewarding to a site where it was, or vice versa. This allowed us to identify the dorsal-ventral boundaries of the reward circuitry within a vertical electrode penetration in each animal. Second, we took advantage of a new histochemical method (Bloom and Battenberg, 1976) to identify the boundaries of the catecholamine systems in the same histological material that showed the electrode track. Previous studies had relied on single electrode sites in each animal and on comparisons between nissl-stained histological sections and line drawings showing the locations of catecholamine systems. Our mapping studies showed that the boundaries of the effective zones of stimulation did not correspond to the boundaries of noradrenergic cell groups or fiber bundles (Corbett and Wise, 1979) and did correspond to the boundaries of the dopamine cell groups in the ventral tegmental area and substantia nigra pars compacta (Corbett and Wise, 1980) and pars lateralis (Wise, 1981). While subsequent work has raised the question of whether rewarding stimulation activates high-threshold catecholamine systems directly or rather activates their low-threshold input fibers (Gallistel et al., 1981; Bielajew and Shizgal, 1986; Yeomans et al., 1988), the mapping studies tended to focus attention on dopamine rather than norepinephrine systems as substrates of reward.
The term "anhedonia" was first introduced in relation to studies of food reward (Wise et al., 1978). Here again, we found that when well-trained animals were first tested under moderate doses of the dopamine antagonist pimozide, they initiated responding normally for food reward. Indeed, pimozide-pretreated animals responded as much (at 0.5 mg/kg) or almost as much (at 1.0 mg/kg) the first day under pimozide treatment as they did when food was given in the absence of pimozide. When retrained for two days and then tested a second time under pimozide, however, they again responded normally in the early portion of their 45-min sessions but stopped responding earlier than normal and their total responding for this second session was significantly lower than on a drug-free day or on their first pimozide-test day. When retrained and tested a third and fourth time under pimozide, the animals still initiated responding normally but ceased responding progressively earlier. Normal responding in the first few minutes of each session confirmed that the doses of pimozide were not simply debilitating the animals; decreased responding after tasting the food in the pimozide condition suggested that the rewarding (response-sustaining) effect of food was devalued when the dopamine system was blocked.
In this study, a comparison group was trained the same way, but these animals were simply not rewarded on the four "test" days when the experimental groups were pretreated with pimozide. Just as the pimozide-treated animals lever-pressed the normal 200 times for food pellets on the first day, so did the non-rewarded animals lever-press the normal 200 times despite the absence of the normal food reward. On successive days of testing, however, lever-pressing in the non-rewarded group dropped to 100, 50, and 25 responses, showing the expected decrease in resistance to extinction that paralleled the pattern seen in the pimozide-treated animals. A similar pattern across successive tests is seen when animals trained under deprivation are tested several times under conditions of satiety; the first time tested the animals respond for and eat food that was freely available before or during the test. Like the habit-driven lever-pressing in our pimozide-treated or non-rewarded animals, the habit-driven eating under satiety decreases progressively with repeated testing. Morgan (1974) termed the progressive deterioration of responding under satiety "resistance to satiation," calling attention to the parallel with resistance to extinction. In all three conditions – responding under neuroleptics, responding under non-reward, and responding under satiety – the behavior is driven by a response habit that decays if not supported by normal reinforcement. In our experiment, an additional comparison group established that there was no sequential debilitating effect of repeated testing with pimozide, a drug with a long half-life and subject to sequestration by fat. The animals of this group received pimozide in their home cages but were not tested on the first three "test days"; they were allowed to lever-press for food only after the fourth of their series of pimozide injections. These animals responded avidly for food after their fourth pimozide treatment, just like animals that were given the opportunity to lever-press for food the first time they were treated with pimozide. Thus responding in Test 4 depended not just on having had pimozide in the past, but on having tasted food under pimozide conditions in the past. Something about the memory of food experience under pimozide – not just of pimozide alone – caused the progressively earlier response cessation seen when pimozide tests were repeated. The fact that pimozide-pretreated animals responded avidly for food until after they had tasted it in the pimozide condition led us to postulate that the food was not as enjoyable under the pimozide condition. The essential feature of what appeared to be a devaluation of reward under pimozide had been captured earlier in a remark of George Fouriezos in connection with our brain stimulation experiments: "Pimozide takes the jolts out of the volts."
The formal statement of the anhedonia appeared a few years after the food reward studies in a journal that published peer commentaries along with review papers (Wise, 1982). Two thirds of the initial commentaries either contested the hypothesis or proposed an alternative to it (Wise, 1990). For the most part, the primary arguments against the original hypothesis appealed to motor or other performance deficits (Freed and Zec, 1982; Koob, 1982; Gramling et al., 1984; Ahlenius, 1985). These were arguments addressed to the finding that neuroleptics caused decreased performance for food or brain stimulation reward but did not, for the most part, address the fact that neuroleptics disrupted maintenance rather than initiation of responding. They also failed to address the fact that when neuroleptic-treated animals stopped responding their responding could be reinstated by exposing them to previously conditioned reward-predictive stimuli (Fouriezos and Wise, 1976; Franklin and McCoy, 1979). Nor could these arguments be reconciled with the fact that such reinstated responding itself underwent apparent extinction. Finally, they did not address the fact that neuroleptics caused compensatory increases in lever-pressing for amphetamine and cocaine reward (Yokel and Wise, 1975; 1976; de Wit and Wise, 1977).
The most critical evidence against a motor hypothesis was elaborated before the formal statement of the anhedonia hypothesis. The paper (Wise et al., 1978) is still steadily cited, but is probably rarely now read in the original. The original findings are summarized above, but they continue to escape the attention of most remaining proponents of motor hypotheses (or other hypotheses of debilitation); for this reason the original paper is still worth reading. The critical findings are that moderate doses of neuroleptics only severely attenuate responding for food after the animal has had experience with that food while under the influence of the neuroleptic. If the animal has had experience with the neuroleptic in the absence of food, its subsequent effect on responding for food is minimal; however, after having had experience with the food under the influence of the neuroleptic, the effect of the neuroleptic becomes progressively stronger. Similar effects are seen when the only instrumental responses required of the animal are those of picking up the food, chewing it, and swallowing (Wise and Colle, 1984; Wise and Raptis, 1986).
Several of the criticisms of the anhedonia hypothesis have been more semantic than substantial. While agreeing that the effects of neuroleptics cannot be explained as simple motor debilitation, several authors have suggested other names for the condition. Katz (1982) termed it "hedonic arousal"; Liebman (1982) termed it "neuroleptothesia"; Rech (1982) termed it "neurolepsis' or "blunting of emotional reactivity"; Kornetsky (1985) termed it a problem of "motivational arousal"; and Koob (1982) begged the question by calling it a "higher order" motor problem. The various criticisms addressed differentially the anhedonia hypothsis, the reinforcement hypothesis, and the reward hypothesis.
The anhedonia hypothesis was really a corollary of the hypothesis that dopamine was important for objectively measured reward function. The initial statement of the hypothesis was that the neuroleptic pimozide "appears to selectively blunt the rewarding impact of food and other hedonic stimuli" (Wise, 1978). It was not really an hypothesis about subjectively experienced anhedonia but rather an hypothesis about objectively measured reward function. The first time the hypothesis was actually labeled the "anhedonia hypothesis" (Wise, 1982), it was stated thusly: "the most subtle and interesting effect of neuroleptics is a selective attenuation of motivational arousal that is (a) critical for goal-directed behavior, (b) normally induced by reinforcers and associated environmental stimuli, and (c) normally accompanied by the subjective experience of pleasure." The hypothesis linked dopamine function explicitly to motivational arousal and reinforcement – the two fundamental properties of rewards – and implied only a partial correlation with the subjective experience of the pleasure that "usually" accompanies positive reinforcement.
The suggestion that dopamine might be important for pleasure itself came in part from the subjective reports of patients (Healy, 1989) or normal subjects (Hollister et al., 1960; Bellmaker and Wald, 1977) given neuroleptic treatments. The dysphoria caused by neuroleptics is quite consistent with the suggestion that they attenuate the normal pleasures of life. Consistent with this view were that drugs like cocaine and amphetamine – drugs that are presumed to be addictive at least in part because of the euphoria they cause (Bijerot, 1980) – increase extracellular dopamine levels (vanRossum et al., 1962; Axelrod, 1970; Carlsson, 1970). The neuroleptic pimozide, a competitive antagonist at dopamine receptors (and the neuroleptic used in our animal studies), had been reported to decrease the euphoria induced by IV amphetamine in humans (Jönsson et al., 1971; Gunne et al., 1972).
The ability of neuroleptics to block the subjective effects of euphoria have been questioned on the basis of clinical reports of continued amphetamine and cocaine abuse in neuroleptic-treated schizophrenic patients and on the basis of more recent studies on the subjective effects of neuroleptic-treated normal humans. The clinical observations are difficult to interpret because of compensatory adaptations to chronic dopamine receptor blockade and because of variability in drug intake, neuroleptic dose, and compliance with treatment during periods of stimulant use. The more recent controlled studies of the effects of pimozide on amphetamine euphoria (Brauer and de Wit, 1996; 1997) are also problematic. First, there are issues of pimozide dose: the high dose of the early investigators was 20 mg (Jönsson et al., 1971; Gunne et al., 1972), whereas, because of concern about extrapyramidal side-effects, the high dose in the more recent studies was 8 mg. More troublesome are the differences in amphetamine treatment between the original and the more recent studies. In the original studies, 200 mg of amphetamine was given intravenously to regular amphetamine users; in the more recent studies, 10 or 20 mg was given to normal volunteers by mouth in capsules. One must wonder if normal volunteers are feeling and rating the same euphoria from their 20 mg capsules as is felt by chronic amphetamine users after their 200 mg IV injection (Grace, 2000; Volkow and Swanson, 2003).
The notion that neuroleptics attenuate the pleasure of food reward has also been challenged on the basis of rat studies (Treit and Berridge, 1990; Pecina et al., 1997). Here the challenge was based on the taste-reactivity test, putatively a test of the hedonic impact of sweet taste (Berridge, 2000). The test has been used to challenge directly the hypothesis that "pimozide and other dopamine antagonists produce anhedonia, a specific reduction of the capacity for sensory pleasure" (Pecina et al., 1997, p. 801). This challenge is, however, subject to serious caveats: "When using taste reactivity as a measure of 'liking' or hedonic impact it is important to be clear about a potential confusion. Use of terms such as 'like' and 'dislike' does not necessarily imply that taste reactivity patterns reflect a subjective experience of pleasure produced by a food" (Berridge, 2000, p. 192, emphasis as in the original), and that "We will place 'liking' and 'wanting' in quotation marks because our use differs in an important way from the ordinary use of these words. By their ordinary meaning, these words typically refer to the subjective experience of conscious pleasure or conscious desire" (Berridge and Robinson, 1998, p. 313). The taste reactivity test seems unlikely to directly measure the subjective pleasure of food, as "normal" taste reactivity in this paradigm is seen in decorticate rats (Grill and Norgren, 1978) and similar reactions are seen in anencephalic children (Steiner, 1973). Thus it appears that the initial interpretation of the taste reactivity test (Berridge and Grill, 1984) was correct: the test measures the fixed action patterns of food ingestion or rejection – more a part of swallowing than of smiling – reflecting hedonic impact only insomuch as it reflects the positive or negative valence of the fluid injected into the passive animal's mouth.
The anhedonia hypothesis was based on the observation that a variety of rewards failed to sustain normal levels of instrumental behavior in well-trained but neuroleptic-treated animals. This was not taken as evidence of neuroleptic-induced anhedonia, but rather evidence of neuroloptic-induced attenuation of positive reinforcement. Under neuroleptic treatment animals showed normal initiation but progressive decrements in responding both within and across repeated trials, and these decrements paralleled in pattern, if not in degree, the similar decrements seen in animals that were simply allowed to respond under conditions of non-reward (Wise et al., 1978). Moreover, naïve rats were found not to learn to lever-press normally for food if they were pretreated with neuroleptic for their training sessions (Wise and Schwartz, 1981). Thus the habit-forming effect of food is severely attenuated by dopamine blockade. These findings have not been challenged but have rather been replicated by critics of what has come to be labeled the anhedonia hypothesis (Tombaugh et al., 1979; Mason et al., 1980), who have argued that under their conditions neuroleptics cause performance deficits above and beyond clear deficits in reinforcement. Given the fact that neuroleptics block all dopamine systems, some of which are thought to be involved in motor function, this was not surprising or contested (Wise, 1985).
Clear similarities between the effects of non-reward and the effects of reward under neuroleptic treatment are further illustrated by two much more subtle paradigms. The first is a partial reinforcement paradigm. It is well established that animals respond more under extinction conditions if they are trained not to expect a reward for every response they make. That animals respond more in extinction if they have been trained under intermittent reinforcement is known as the partial reinforcement extinction effect (Robbins, 1971). Ettenberg and Camp found partial reinforcement extinction effects with neuroleptic challenges of food- and water-trained response habits. They tested animals in extinction of a runway task after training in each of three conditions. Food- or water-deprived animals were trained, one trial per day, to run 155 cm in a straight alley runway for food (Ettenberg and Camp, 1986b) or water (Ettenberg and Camp, 1986a) reward. One group was trained under a "continuous" reinforcement schedule; that is, they received their designated reward on each of the 30 days of training. A second group was trained under partial reinforcement; they received their designated reward on only 20 of the 30 training days; on 10 days randomly spaced in the training period, the animals found no food or water when they arrived at the goal box. The third group received food or water on every trial but were periodically treated with the neuroleptic haloperidol; on 10 of their training trials they found food or water in the goal box, but, having been pretreated with haloperidol on those days, they experienced the food or water under conditions of dopamine receptor blockade. The consequences of these training regimens were assessed in 22 subsequent daily "extinction" trials in which each group was allowed to run but received no reward in the goal box. All animals ran progressively slower as the extinction trials continued. However, the performance of animals that had been trained under conditioned reinforcement conditions deteriorated much more rapidly from day to day than did that of animals that had been trained under partial reinforcement conditions. The animals that had been trained under "partial" haloperidol conditions also persevered more than the animals with the continuous reinforcement training; the intermittent haloperidol animals had start-box latencies and running times that were identical to those of the animals trained under partial reinforcement. That is, the animals pretreated with haloperidol on 1/3 of their training days performed in extinction as if they had experienced no reward on 1/3 of their training days. There is no possibility of a debilitation confound here, first because the performance of the haloperidol-treated animals was better than that of the control animals and second because haloperidol was not given on the test days, only on some of the training days.
The second subtle paradigm is a two-lever drug discrimination paradigm. Here the animals are trained to continue responding on one of two levers as long as that lever yields food reward, and to shift to the other lever when no longer rewarded. With low-doses of haloperidol, animals inexplicably shift to the wrong lever as if they had earned no food with their initial lever-press (Colpaert et al., 2007). That is, haloperidol-treated rats that earned food on their initial lever-press behaved like normal rats that failed to earn food on their initial lever-press. This was not a reflection of some form of haloperidol-induced motor deficit, because the evidence that food was not rewarding under haloperidol involved not the absence of a response but rather the initiation of a response: a response on the second lever.
Thus it is increasingly clear that, whatever else they do, neuroleptics decrease the reinforcing efficacy of a range of normally positive rewards.
The most recent challenge to the anhedonia hypothesis comes from theorists who argue that the primary motivational deficit caused by neuroleptics is a deficit in the drive or motivation to find or earn reward rather than the reinforcement that accompanies the receipt of reward (Berridge and Robinson, 1998; Salamone and Correa, 2002; Robinson et al., 2005; Baldo and Kelley, 2007). The suggestion that dopamine plays an important role in motivational arousal was, in fact, stressed more strongly in the original statement of the anhedonia hypothesis than was anhedonia itself: "the most subtle and interesting effect of neuroleptics is a selective attenuation of motivational arousal which is (a) critical for goal-directed behavior…" (Wise, 1982). That elevations of extracellular dopamine can motivate learned behavior sequences is perhaps best illustrated by the "priming" effect that is seen when free reward is given to an animal that is temporarily not responding in an instrumental task (Howarth and Deutsch, 1962; Pickens and Harris, 1968). This effect is best illustrated by drug-induced reinstatement of responding in animals that have undergone repeated extinction trials (Stretch and Gerber, 1973; de Wit and Stewart, 1983). One of the most powerful stimuli for reinstatement of responding in animals that have extinguished a cocaine-seeking or a heroin-seeking habit is an unearned injection of the dopamine agonist bromocriptine (Wise et al., 1990). The inclusion of motivational arousal is the main feature that differentiates the dopamine hypothesis of reward from the narrower dopamine hypothesis of reinforcement (Wise, 1989; 2004).
While there is ample evidence that dopamine can amplify or augment motivational arousal, there is equally ample evidence that neuroleptic drugs do not block the normal motivational arousal that is provided for a well-trained animal by reward-predictive cues in the environment. As discussed above, neuroleptic-treated animals tend to initiate response habits normally. Such animals start but do not normally continue to lever-press, run, or eat in operant chambers, runways, or free-feeding tests. When given in a discrete-trial runway task, haloperidol- treated animals run normally during the trial when the haloperidol is given; their motivational deficit only appears the next day, when the haloperidol has been metabolized and all that is left of the treatment is the memory of the treatment trial (McFarland and Ettenberg, 1995; 1998). The start-box cues fail to trigger running down the runway for food or heroin not on the day when the animals are under the influence of haloperidol, but on the next day when they only remember what the reward was like on the haloperidol day. So the motivational arousal of the animal on the day it gets haloperidol treatment is not compromised by the treatment; rather it must be the memory of a degraded reward that discourages the animal the day after the treatment trial. This is the most salient message from studies of the effects of neuroleptics on instrumental behavior in the range of tasks; neuroleptics at appropriate doses do not interfere with the ability of learned stimuli to instigate motivated behavior until after the stimuli have begun to lose the ability to maintain that behavior because of experience of the reward in the neuroleptic condition (Fouriezos and Wise, 1976; Fouriezos et al., 1978; Wise et al., 1978; Wise and Raptis, 1986; McFarland and Ettenberg, 1995; 1998).
This is not to say that dopamine is completely irrelevant to motivated behavior, only that the surges of phasic dopamine that are triggered by reward-predictors (Schultz, 1998) are, for the moment, unnecessary for the normal motivation of animals with an uncompromised reinforcement history. Well-trained animals respond out of habit, and do so even under conditions of dopamine receptor blockade. If brain dopamine is completely depleted, however, there are very dramatic effects on motivated behavior (Ungerstedt, 1971; Stricker and Zigmond, 1974). This is evident from studies of mutant mice that do not synthesize dopamine; these animals, like animals with experimental dopamine depletions, fail to move unless aroused by pain or stress, a dopamine agonist, or the dopamine-independent stimulant caffeine (Robinson et al., 2005). Thus minimal levels of functional dopamine are necessary for all normal behavior; dopamine-depleted animals, like dopamine-depleted parkinsonian patients (Hornykiewicz, 1979), are almost completely inactive unless stressed (Zigmond and Stricker, 1989). Among the primary deficits associated with dopamine depletion are aphagia and adipsia, which have motivational as well as motor components (Teitelbaum and Epstein, 1962; Ungerstedt, 1971; Stricker and Zigmond, 1974). Reward-blocking doses of neuroleptics, however, fail to produce the profound catalepsy that is caused by profound dopamine depletion.
The dopamine terminal field that has received most attention with respect to reward function is nucleus accumbens. Attention was drawn to nucleus accumbens first because lesions of this but not other catecholamine systems disrupted cocaine self-administration (Roberts et al., 1977). Further attention was generated by the suggestions that nucleus accumbens septi should be considered a limbic extension of the striatum, rather than an extension of the septum (Nauta et al., 1978a,b) and that it is an interface between the limbic system – conceptually linked to functions of motivation and emotion – and the extrapyramidal motor system (Mogenson et al., 1980). Studies of opiate reward also suggested that it is the mesolimbic dopamine system – the system projecting primarily from the ventral tegmental area to the nucleus accumbens – that is associated with reward function. Morphine in the ventral tegmental area was found to activate (Gysling and Wang, 1983; Matthews and German, 1984), by disinhibiting them (Johnson and North, 1992), dopaminergic neurons, and microinjections of morphine in this region potentiated brain stimulation reward (Broekkamp et al., 1976), produced conditioned place preferences (Phillips and LePiane, 1980), and were self-administered in their own right (Bozarth and Wise, 1981).
One challenge to the dopamine hypotheses thus arose from the finding that nucleus accumbens lesions failed to disrupt all instrumental behavior (Salamone et al., 1997). Aside from the problem that it is almost impossible to lesion nucleus accumbens selectively and, at the same time, completely, there are other reasons to assume that nucleus accumbens lesions should not eliminate all of dopamine's motivational actions. First, cocaine is directly self-administered not only into nucleus accumbens (Carlezon et al., 1995; Ikemoto, 2003), but also – and more avidly – into the medial prefrontal cortex (Goeders and Smith, 1983; Goeders et al., 1986) and olfactory tubercle (Ikemoto, 2003). Intravenous cocaine reward is attenuated not only by microinjections of a D1 antagonist into the ventral tegmental area (Ranaldi and Wise, 2001) but also by similar injections into the substantia nigra (Quinlan et al., 2004). Finally, post-trial dopamine release in the dorsal striatum enhances consolidation of learning and memory (White and Viaud, 1991), and dopamine blockade in the dorsal striatum impairs long-term potentiation (a cellular model of learning and memory) in this region (Centonze et al., 2001). Potentiation of memory consolidation is, in essence, the substance of reinforcement (Landauer, 1969) and dopamine appears to potentiate memory consolidation in the dorsal striatum and a variety of other structures (White, 1989; Wise, 2004).
Thus, for a variety of reasons, the dopamine hypothesis should not be reduced to a nucleus accumbens hypothesis. Nucleus accumbens is but one of the dopamine terminal fields implicated in reward function.
While evidence has steadily accumulated for an important role of dopamine in reward function a role we originally summarized loosely as "motivational arousal" our understanding of the precise nature of this function continues to develop in subtlety and complexity. Four issues, in addition to variations on the old motor hypothesis, have arisen in the recent literature.
One suggestion, offered as a direct challenge to the anhedonia hypothesis and the dopamine hypothesis of reward (Salamone et al., 1994; 1997; 2005) is that what neuroleptics reduce is not motivation or reinforcement but rather the animal's willingness to exert effort (Salamone et al., 2003). This suggestion is merely semantic. The willingness to exert effort is the essence of what we mean by motivation or drive, the first element in the initial three-part statement of the anhedonia hypothesis (Wise, 1982).
Studies of mutant mice lacking dopamine in dopaminergic neurons (but retaining it in noradrenergic neurons) show that brain dopamine is not absolutely necessary for food-rewarded instrumental learning. If given caffeine to arouse them, dopamine-deficient mice can learn to choose the correct arm of a T-maze for food reward (Robinson et al., 2005). This implicates dopamine in the motivational arousal that is lacking in dopamine-deficient mice that are not treated with caffeine, and indicates that dopamine is not essential to – though it normally contributes greatly to – the rewarding effects of food. It is interesting to note, however, that caffeine – required if the mutant mice are to behave at all without dopamine – also restores the feeding response that is lost after neurotoxic lesions of dopamine neurons in adult animals (Stricker et al., 1977). The mechanism of the caffeine effects is not fully understood, but caffeine affects the same medium-sized spiny striatal neurons that are the normal neuronal targets of dopaminergic fibers of the nigro-striatal and meso-limbic dopamine systems. It acts there as a phosphodiesterase inhibitor that increases intracellular cyclic AMP (Greengard, 1976) and as an adenosine receptor antagonist (Snyder et al., 1981). Moreover, the adenosine receptors that are blocked by caffeine normally form heteromers with dopamine receptors and affect the intracellular response to the effects of dopamine at those receptors (Ferre et al., 1997; Schiffmann et al., 2007). The complex interactions of dopamine and adenosine receptors in the striatum raises the possibility that caffeine enables learning in dopamine-deficient mice by substituting for dopamine in a shared or overlapping intracellular signaling cascade.
Schultz and colleagues have shown that the ventral tegmental dopamine neurons implicated in reward function respond not only to food reward itself but, as a result of experience, to predictors of food reward (Romo and Schultz, 1990; Ljungberg et al., 1992). As the animal learns that an environmental stimulus predicts food reward, the 200 millisecond burst of dopaminergic nerve firing that was initially triggered by food presentation itself becomes linked, instead, to the food-predictive stimulus that precedes it. If the food-predictive stimulus predicts food on only a fraction of the trials, then the dopaminergic neurons burst, to a lesser extent, in response to both the predictor and to the food; the stronger the probability of prediction, the stronger the response to the predictor and the weaker the response to the food presentation.
The fact that the dopaminergic neurons cease to respond to food itself and respond instead to food predictors raises the issue of whether the taste of food is not itself merely a reward predictor (Wise, 2002). Some tastes appear to be unconditioned reinforcers from birth (Steiner, 1974), but others gain motivational significance through the association of their taste with their post-ingestional consequences (Sclafani and Ackroff, 1994).
The concept of "reinforcement" is a concept of "stamping in" of associations (Thorndike, 1898). Whether the association is between a conditioned and an unconditioned stimulus (Pavlov, 1928), a stimulus and a response (Thorndike, 1911), or a response and an outcome (Skinner, 1937), reinforcement refers to the strengthening of an association through experience. Another way to look at it is that reinforcement is a process that enhances consolidation of the memory trace for the association (Landauer, 1969). Studies of post-trial dopaminergic activation suggest that dopamine serves to enhance or reinforce the memory trace for recently experienced events and associations, and that it does so in a variety of dopamine terminal fields (White and Milner, 1992). Several lines of evidence (Reynolds et al., 2001; Wise, 2004; Hyman et al., 2006; Wickens et al., 2007) now implicate a modulatory role for dopamine in cellular models of learning and memory that is consistent with the view that dopamine plays an important role in reinforcement.
While variations of the anhedonia hypothesis or the dopamine hypotheses of reward or reinforcement continue to appear, the hypothesis as originally stated still captures the scope of the involvement of dopamine in motivational theory. Normal levels of brain dopamine are important for normal motivation, while phasic elevations of dopamine play an important role in the reinforcement that establishes response habits and stamps in the association between rewards and reward-predicting stimuli. Subjective pleasure is the normal correlate of the rewarding events that cause phasic dopamine elevations, but stressful events can also cause dopamine elevations; thus pleasure is not a necessary correlate of dopamine elevations or even reinforcement itself (Kelleher and Morse, 1968).