The ability to form associations between predictive environmental events and rewarding outcomes is a fundamental aspect of learned behavior. This apparently simple ability likely requires complex neural processing evolved to identify, seek, and utilize natural rewards and redirect these activities based on updated sensory information. Emerging evidence from both animal and human research suggests that this type of processing is mediated in part by the nucleus accumbens and a closely associated network of brain structures. The nucleus accumbens is required for a number of reward-related behaviors, and processes specific information about reward availability, value, and context. Additionally, this structure is critical for the acquisition and expression of most Pavlovian stimulus-reward relationships, and cues that predict rewards produce robust changes in neural activity in the nucleus accumbens. While processing within the nucleus accumbens may enable or promote Pavlovian reward learning in natural situations, it has also been implicated in aspects of human drug addiction, including the ability of drug-paired cues to control behavior. This article will provide a critical review of the existing animal and human literature concerning the role of the NAc in Pavlovian learning with non-drug rewards and consider some clinical implications of these findings.
Species survival and propagation require that individual organisms learn about their surroundings and continually adjust behavior accordingly. One elementary yet biologically critical form of learning involves the connection of positive outcomes with predictive cues. This ability enables organisms to track, locate, and secure food and necessary materials in demanding environments, and thus has obvious survival value. Moreover, such Pavlovian learning is often the background for both normal and maladaptive human behaviors. Thus, understanding reward-related Pavlovian learning could shed light on a variety of human activities, including drug taking, food seeking, social attachment, and sexual behavior.
Recent endeavors have sought to identify and deconstruct the neurobiological correlates of Pavlovian learning, and there is now little doubt that a distributed network of brain nuclei mediates the ability to form associations between rewarding events and their predictors. The nucleus accumbens (NAc) is central to this network, and ongoing research focuses on how this area processes natural and drug rewards as well as related behaviors. This review will discuss the behavioral importance of Pavlovian learning (with particular attention to natural rewards), examine the involvement of the nucleus accumbens in this process, and consider how these factors may contribute to drug abuse and addiction.
Historically, reward-related learning has been divided into stimulus-outcome (classical or Pavlovian) and action-outcome (operant or instrumental) branches. In the context of Pavlovian learning, biologically relevant outcomes such as food, water, and sexual stimuli are labeled unconditioned stimuli (US) because they are able to evoke innate or unconditioned responses (UR) such as salivation, approach, and consumption (Pavlov 1927). Pavlovian conditioning procedures involve the pairing of a neutral sensory stimulus, termed the conditioned stimulus (CS), with a US in a temporally contingent manner. Learning occurs as the previously neutral stimulus acquires predictive value for the coming reward through repeated pairings of the CS and US. Eventually, this novel cue is able to evoke a response that is often topographically similar to that produced by the US itself. The learned response that the CS elicits is called the conditioned response (CR).
The existence of a Pavlovian association is typically inferred from the presence or absence of a CR. Therefore, complete understanding of the neurobiological basis of Pavlovian learning requires not only advanced cellular and pharmacological technologies, but also reliable behavioral techniques that can measure the acquisition and maintenance of an association. For example, stimulus-reward learning in animals is often quantified using an “autoshaping” or “sign-tracking” design, in which a CS predicts the availability of a natural reward such as food (US). Initially, food delivery produces orienting and approach responses that are followed by consumption. With repeated CS-US pairings, the CS itself begins to elicit highly selective approach responses even though reward delivery is independent of any specific behavior (Bussey and others 1997; Robbins and Everitt 2002). After extended conditioning, approach responses are observed nearly every time the reward-predictive CS is presented to an animal, indicating a strong association between this cue and the future reward. Interestingly, conditioned approach behaviors have been observed in a variety of species, including rats, monkeys, pigeons, and humans (Brown and Jenkins 1968; Sidman and Fletcher 1968; Wilcove and Miller 1974; Bussey and others 1997), and track diverse rewards such as heat, food, water, intracranial stimulation, intravenous cocaine, and copulation (Peterson and others 1972; Jenkins and Moore 1973; Wasserman 1973; Burns and Domjan 1996, 2001; Uslaner and others 2006). Thus, although directed approach behaviors are unlike traditional Pavlovian visceral or glandular responses, they provide a convenient measure of associative reward learning that is highly similar across species.
In real life, organisms use environmental cues to update expectancies and allocate behavioral resources in a way that maximizes value and minimizes energy expenditure. Therefore, Pavlovian relationships may be embedded within virtually all operant circumstances. For example, general contextual stimuli (e.g., a place where rewards are consumed) may come to be explicitly associated with reward delivery and operate as conditioned stimuli. In other cases, conditioned stimuli may even acquire some of the motivational properties of primary goals like food and sex through Pavlovian learning. In turn, such stimuli could act as conditioned reinforcers, with the ability to strengthen and redirect behavior on their own in an operant manner. For humans, money is perhaps the best example of this phenomenon. Although government currency possesses no innate biological importance, it is consistently paired with items that do have motivational significance. This pairing allows money to serve as a predictor for future rewards, but also as a powerful conditioned reinforcer.
From an ecological standpoint, reward-related Pavlovian learning may provide several adaptive advantages for an organism in a rapidly changing environment. Successful identification and consumption of unconditioned stimuli such as food involves physical contact that engages proximal receptors of taste, olfaction, and somatosensation. However, because these sensory modalities are not equipped to identify stimuli in a larger environmental field, they may not make up a complete set of tools for efficient foraging behavior. Through Pavlovian learning, visual and auditory information can be incorporated in the foraging experience and utilized to detect and predict available rewards. Moreover, the ability of stimuli to act as conditioned reinforcers means that goal-directed behaviors can be maintained for longer periods of time in the absence of a primary reward. Importantly, each of these functions may increase the chances that an organism will secure necessary materials and meet energy needs.
The ability to form and use Pavlovian associations requires the existence and cooperation of brain circuits equipped to integrate sensory and motivational information and alter motor output. The nucleus accumbens has received much attention in this respect. In the rat, the NAc is composed of two primary subregions, the core and shell, which differ with respect to their anatomic organization and functional properties (Zahm and Brog 1992; Zahm and Heimer 1993; Zahm 1999, 2000; Voorn and others 2004). As illustrated in Figure 1, the core and shell of the rodent NAc receive afferent projections from a variety of cortical and subcortical structures including the basolateral amygdala (Zahm and Brog 1992; Brog and others 1993; Wright and others 1996), the prefrontal cortex (McGeorge and Faull 1989; Zahm and Brog 1992; Brog and others 1993), the subiculum of the hippocampus (Groenewegen and others 1987; Groenewegen and others 1991; Zahm and Brog 1992; Brog and others 1993), and the ventral tegmental area (Zahm and Brog 1992). Importantly, afferent projections to the NAc are not homogeneously distributed across the core and shell (Groenewegen and others 1987; McGeorge and Faull 1989; Groenewegen and others 1991; Zahm and Brog 1992; Brog and others 1993; Heimer and others 1995; Heimer and others 1997). For example, Brog and co-workers (1993) showed that a number of cortical afferents of the shell and core originate in separate areas (e.g., the infralimbic and posterior piriform cortices to the medial shell versus the dorsal prelimbic and anterior cingulate to the core).
Likewise, the efferent projections from the NAc differ between the core and shell subregions in the rat (Heimer and others 1991; Zahm and Brog 1992; Zahm and Heimer 1993; Zahm 1999). That is, the core parallels basal ganglia circuitry, sending outputs through the ventral pallidum (dorsolateral district), subthalamic nucleus, and substantia nigra. These outputs in turn project via the motor thalamus to premotor cortical areas. In contrast, the shell projects preferentially to subcortical limbic regions including the lateral hypothalamus, ventral pallidum (ventromedial district), and ventral tegmental area (VTA) (Zahm 1999). Interestingly, recent findings show direct interconnections between core and shell neurons, providing anatomic evidence that these NAc subregions do not function completely independently, but instead comprise interacting neuronal networks (van Dongen and others 2005).
Examination of the local organization of the striatum has revealed that over 90% of neurons in the neostriatum (caudate-putamen) are of the medium-spiny type, and that these medium spiny neurons account for the majority of projections from the striatum (Groves 1983). Other cell types include, among others, the aspinous cholinergic interneurons and medium-size aspinous GABAergic interneurons (Groves 1983; Groenewegen and others 1991). Investigations have confirmed that the NAc is composed of similar neuronal cell types (Voorn and others 1989; Groenewegen and others 1991), although there are morphological differences between core and shell neurons (Meredith and others 1992). Moreover, a heterogeneous distribution within the NAc subregions also exists with respect to markers for substances such as calbindin and enkephalin, among others (Voorn and others 1989; Groenewegen and others 1991; Zahm 1999).
Given the anatomic arrangement of the NAc, it was proposed by Mogenson (1987) and elaborated upon by others (Everitt and Robbins 1992; Pennartz and others 1994; Ikemoto and Panksepp 1999) that the NAc functions as a site for the integration of limbic information related to memory, drive and motivation, and the generation of goal-directed motor behaviors (termed ‘limbic-motor integration’). Within this model, the neurotransmitter dopamine plays a key role in this process by functioning to modulate or ‘gate’ the transfer of these neural signals and thereby influence goal-directed behaviors (Cepeda and Levine 1998). That is, dopamine functions as a neuromodulator, influencing the activation of NAc cells by specific afferents including the hippocampus, basolateral amygdala and prefrontal cortex.
Consistent with this view is the observation that NAc afferents make convergent synaptic contacts onto medium spiny neurons. Studies using immunocytochemistry in conjunction with electron microscopy showed that hippocampal and dopaminergic inputs make synaptic connections with the same NAc neuron (Totterdell and Smith 1989; Sesack and Pickel 1990). Likewise, Van Bockstaele and Pickel (1993) reported that 5-HT terminals were in direct contact with dopaminergic axons in both the core and shell of the rat. In addition, a convergence of inputs from the medial prefrontal cortex and the ventral subiculum on NAc neurons has recently been identified (French and Totterdell 2002), as has a convergence of inputs from the basolateral amygdala (BLA) and ventral subiculum (French and Totterdell 2003). These findings indicate that NAc afferents are capable of influencing NAc cell firing in behaving animals (Pennartz and others 1994; O'Donnell and Grace 1995; Carr and Sesack 2000; Pinto and Sesack 2000), consistent with a role of this structure in ‘limbic-motor’ integration (Mogenson 1987).
In order for Pavlovian learning to successfully modify reward-related behaviors, brain systems must first be able to process information about the identity and value of unconditioned stimuli that act as rewards. Further, once a reward is obtained, motor systems must appropriately redirect behavior to gain maximal utility from the reward. The application of in vivo electrophysiological recording techniques to the study of reward processing in the NAc has revealed a variety of changes in NAc activity during goal-directed behaviors. For example, a subset of NAc neurons exhibit phasic, time-locked alterations in firing rate when animals make operant responses to obtain rewards such as water and food (Apicella and others 1991; Carelli and Deadwyler 1994, 1997; Carelli and others 2000; Hassani and others 2001; Nicola and others 2004). However, these patterns of cellular activity are not homogeneous. In fact, some NAc cells display enhanced activation before a lever press, while the activity of other neurons may increase or decrease immediately after the lever press (Carelli and Deadwyler 1994, 1997; Carelli 2002). Thus, NAc neurons seemingly process remarkably different types of reward-related information, which could reflect the dual role of this structure in both reward seeking and reward consumption (Nicola and others 2004).
Electrophysiological studies typically investigate NAc reward processing using operant (action-outcome) tasks, making it difficult to distinguish NAc activity specific to rewards from activity related to reward seeking behaviors. However, a few recent studies have controlled for or circumvented this complication to assess reward-specific NAc activity. In one study, NAc cellular activity was monitored while naive rats received experimenter-controlled intra-oral infusions of rewarding sucrose (Roitman and others 2005a). Consistent with other reports (Nicola and others 2004; Taha and Fields 2006), the predominant response of NAc neurons to sucrose infusions was a decrease in activity (Fig. 2). As is evident in Figure 2, the same neurons exhibited opposite responses when an aversive quinine solution was delivered intra-orally. One hypothesis suggests that inhibitions observed during reward delivery occur among GABA-containing NAc neurons that project to important motor areas such as the ventral pallidum (VP). Through the dis-inhibition of target neurons, such a change in activity could provide a gating signal for reward-related behaviors such as consumption (Nicola and others 2004; Roitman and others 2005a; Taha and Fields 2006). In support of this hypothesis, a recent study found that individual VP neurons exhibit increases in firing rate during consumption of a rewarding sucrose solution (Tindell and others 2006). Notably, a separate subset of NAc neurons exhibit increases in activity when sucrose rewards are delivered (Taha and Fields 2005). However, the magnitude of activation varies based on the concentration of sucrose, indicating that these neurons encode the palatability of a food reward instead of reward delivery or consumption. Interestingly, not all inhibitory and excitatory NAc responses observed during the delivery of primary rewards are fixed or unconditional. Rather, a subgroup of NAc neurons exhibit differential responses based on the relative context of reward delivery, including the availability of more and/or less preferred rewards (Wheeler and others 2005).
Another approach to the study of NAc reward processing involves the use of pharmacological techniques to activate or inhibit specific neurotransmitter systems within the NAc. Using this method, multiple reports implicated glutamate, GABA, and opioid neurotransmission in key aspects of reward processing, and specifically in feeding behavior. Both GABA agonism and glutamate antagonism in the NAc produce increases in food consumption, further indicating that neuronal inhibition in this structure may play an important role in the initiation or maintenance of feeding behavior (Kelley 2004). Intra-NAc μ-opioid agonists have also been shown to boost food intake, while animals receiving μ-opioid antagonists exhibit attenuated consumption (Kelley and others 1996; Pecina and Berridge 2000). Interestingly, manipulations that increase food intake are most effective in the shell of the NAc, indicative of a functional division between NAc subregions. In addition, a spatially restricted area within the medial NAc shell has been specifically implicated in the ability of opioid agonists to alter hedonic reactions to both rewarding and aversive stimuli (Pecina and Berridge 2005). Thus, some categories of reward-related information may be processed by distinct neurotransmitter systems in functionally isolated regions of the NAc.
The midbrain dopaminergic projection to the NAc is critical for a number of reward-related behaviors, and several decades of experimental research have attempted to elucidate the precise functional role of this connection. Early support for the involvement of the mesocorticolimbic dopamine pathway in reward processing came from several studies demonstrating that the blockade of dopamine receptors produced a decrease in goal-directed behavior for food and other rewards (Wise and others 1978b; Wise and others 1978a; Wise and others 1992). Interestingly, although animals that received dopamine antagonists still worked to obtain rewards, responding decreased as a function of time, similar to what would be expected if rewards were removed altogether (Fouriezos and Wise 1976; Wise and others 1978b). These findings initially led to the suggestion that dopamine release in the NAc mediates the hedonic or “pleasure” aspects of rewarding stimuli, and, in turn, that both natural and drug rewards could be defined by this common path of activation (Wise and Bozarth 1985). However, this hypothesis has been questioned on a number of grounds. For example, dopamine antagonism in the NAc does not impair orofacial movements characteristic of reward “liking” (Pecina and others 1997), indicating that the hedonic value of a stimulus is not based on NAc dopamine transmission. Moreover, proper NAc dopamine function is also required for tasks that are motivated by aversion rather than by rewards (Blackburn and others 1992; Salamone 1994). Finally, NAc dopamine depletion disrupts behavioral performance when large amounts of effort are required to obtain rewards, but has little effect on easy tasks (Aberman and Salamone 1999). Taken together, these and other findings support a larger role for NAc dopamine beyond simple hedonic pleasure (Blackburn and others 1987; Salamone and others 1991; Schultz and others 1993; Hollerman and Schultz 1998; Waelti and others 2001; Salamone and others 2002; Pecina and others 2003).
Since the original “anhedonia” hypothesis, a number of new and/or revised theories have been developed to explain the function of NAc dopamine in reward processing (Blackburn and others 1992; Ikemoto and Panksepp 1999; Schultz 2001; Di Chiara 2002; Ungless 2004; Wise 2004; Salamone and others 2005). For example, one theory posits that NAc dopamine mediates the incentive salience possessed by rewards and other important stimuli (Berridge and Robinson 1998). According to this theory, the principal function of mesocorticolimbic dopamine is to promote the motivational aspects of goal-directed behavior. Another theory regards NAc dopamine as a more general system that enables an organism to respond to novel events, discern the meaning of environmental stimuli, and generate approach or withdrawal responses to rewarding and aversive stimuli (Ikemoto and Panksepp 1999). Still more hypotheses have proposed that dopamine release in the NAc is critical to the behavioral process of reinforcement or reward itself (Wise 2004), or is required to overcome larger response costs to obtain rewards (Salamone and others 2003).
Although the precise role of dopamine in reward processing is presently under much debate, new findings and technological advances have contributed greatly to our understanding of this issue. While microdialysis investigations have long reported increases in NAc dopamine levels during goal-directed behaviors and/or receipt of rewards (Di Chiara 2002), these investigations lack the temporal resolution necessary to associate dopamine with precise (real-time) behavioral observations. Recently, the ability to measure dopamine release on a physiologically and behaviorally relevant timescale has led to a focus on rapid NAc dopamine release events (Garris and others 1999; Phillips and others 2003; Robinson and others 2003). Using an electrochemical technique that permitted subsecond detection of dopamine, Roitman and colleagues (2004) demonstrated that operant responses for a sucrose reward were associated with brief but robust increases in NAc dopamine concentration. Similar dopamine signals have also been observed in male rats during exposure to and approach towards receptive females (Robinson and others 2001), suggesting that phasic changes in dopamine release in the NAc may dynamically modulate a variety of reward-directed behaviors. Furthermore, preliminary results indicate that subsecond increases in NAc dopamine concentration are promoted by primary rewards but not aversive stimuli, and that this response is innate (Roitman and others 2005b). Future studies will continue to examine the exact nature of dopamine release in the NAc and the role of fast dopamine signals in reward processing.
In addition to its role in processing information about unconditioned rewards, emerging evidence from animal research indicates that the NAc is also critically involved in the formation of learned Pavlovian associations between these rewards and exteroceptive stimuli. However, because NAc neurons are not believed to process primary sensory information, this function likely requires that individual NAc circuits undergo dynamic modification during stimulus-reward learning. Two intriguing studies have recently indicated that this may be the case. Setlow and colleagues (2003) paired olfactory cues with rewarding sucrose in a go/no-go task while monitoring the activity of neurons in the ventral striatum (including the NAc). Initially, delivery of olfactory cues produced a change in activity among very few neurons. However, as animals learned to associate olfactory stimuli with sucrose delivery, those cues began to evoke time-locked phasic responses in a number of neurons. In another study that employed a strictly Pavlovian design (Roitman and others 2005a), NAc neurons developed responses to reward-predictive audiovisual cues on the first day that these stimuli were paired with reward. Thus, although the majority of individual NAc neurons do not exhibit innate phasic responses to environmental stimuli, such responses quickly emerge as animals come to associate those stimuli with impending outcomes.
The ability of conditioned stimuli to elicit changes in NAc cell activity may only increase as stimulus-reward associations become stronger. In one experiment, rats were repeatedly exposed to a CS that was always followed by a sucrose reward as well as a control stimulus that was not paired with a reward (Day and others 2006). Across several conditioning sessions, rats gradually developed selective conditioned approach responses towards the reward-predictive cue, but not towards the unpaired cue. Consistent with another recent study that employed a similar paradigm (Wan and Peoples 2006), a majority (75%) of NAc neurons exhibited marked changes (increases and decreases) in firing rate during presentation of the reward-paired CS in well-conditioned rats. Of these cells, roughly half responded with a prolonged inhibition, while the other half were activated by the presence of the cue, again suggesting that individual neurons within the NAc may operate as a part of microcircuits with distinct functional responsibilities (Carelli and Wightman 2004). A characteristic excitatory NAc neuron is shown in Figure 3. Although this neuron displayed no change in activity during unpaired stimulus trials, presentations of the reward-paired cue produced a significant increase in firing rate. Moreover, the excitation produced by the CS was greater than that evoked by the reward itself. It has been suggested that such excitations among NAc neurons may originate from glutamatergic inputs from cortical and limbic structures that compete for access to motor resources through striatal circuits (Pennartz and others 1994). Through this mechanism, higher-order processing centers could gain direct influence over motor areas and promote behavioral responses to conditioned stimuli and other important cues.
Support for the involvement of the NAc in stimulus-reward learning also comes from single-unit recordings of midbrain dopamine neurons. In primates, a majority of these neurons exhibit brief increases in activity when rewards are delivered unexpectedly (Mirenowicz and Schultz 1994; Hollerman and Schultz 1998). However, if rewards are fully predicted by a CS, they no longer evoke activation among dopamine neurons. Instead, conditioned stimuli alone elicit increases in dopamine burst firing that vary in magnitude based on the likelihood of reward delivery as well as the value of the expected reward (Schultz and others 1997; Fiorillo and others 2003; Tobler and others 2005). A current hypothesis based on these observations proposes that dopamine neurons may provide a “prediction error” signal consistent with contemporary reward learning theories that are directly applicable to Pavlovian contingencies (Schultz and others 1997; Sutton and Barto 1998). According to this hypothesis, phasic activation of dopamine neurons signals unexpected reward delivery because this produces an error in ongoing predictions about reward availability. Likewise, as conditioned stimuli become valid reward predictors, reward delivery does not constitute a violation of expectancy and therefore does not produce phasic dopamine cell firing. At the cellular level, phasic dopamine signals in the NAc may facilitate synaptic modification (Calabresi and others 2000a), enabling NAc neurons to incorporate new information. With respect to Pavlovian learning, such plasticity could help organisms identify cues that predict rewards and update evaluation of those cues based on actual outcomes.
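The logic of the prediction error hypothesis can be illustrated with a minimal temporal-difference sketch of the kind of reward learning model described by Sutton and Barto (1998). This is an illustration under stated assumptions and not a model taken from the studies reviewed here: the function name, the learning rate `alpha`, the discount factor `gamma`, the number of trials, and the representation of a trial as a short chain of time steps between CS onset and reward are all hypothetical choices. The sketch assumes that CS onset itself cannot be predicted, so the value of the period before the CS is fixed at zero.

```python
def simulate_td_trials(n_trials=200, cs_to_reward_steps=5,
                       alpha=0.2, gamma=1.0, reward=1.0):
    """Return a list of (error_at_cs, error_at_reward) pairs, one per trial.

    values[i] is the learned reward prediction i time steps after CS onset.
    All parameter values are illustrative assumptions.
    """
    values = [0.0] * cs_to_reward_steps
    history = []
    for _ in range(n_trials):
        # CS onset: transition from the unpredictive pre-CS period
        # (value fixed at 0) into the first CS state, with no reward.
        error_at_cs = gamma * values[0] - 0.0
        error_at_reward = 0.0
        for i in range(cs_to_reward_steps):
            is_last = i == cs_to_reward_steps - 1
            r = reward if is_last else 0.0            # reward ends the trial
            next_value = 0.0 if is_last else values[i + 1]
            # Prediction error: what happened minus what was predicted.
            delta = r + gamma * next_value - values[i]
            values[i] += alpha * delta                # update the prediction
            if is_last:
                error_at_reward = delta
        history.append((error_at_cs, error_at_reward))
    return history


if __name__ == "__main__":
    history = simulate_td_trials()
    for trial in (0, 9, 49, 199):
        at_cs, at_reward = history[trial]
        print(f"trial {trial + 1:3d}: error at CS = {at_cs:.3f}, "
              f"error at reward = {at_reward:.3f}")
```

On the first trial the error occurs entirely at reward delivery. With repeated pairings, the error at reward shrinks toward zero while the error at CS onset grows, which mirrors the qualitative shift in dopamine cell firing from the reward to the predictive cue described above.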
Importantly, actual dopamine release during the presentation of conditioned stimuli may not be identical across all terminal regions. Using microdialysis to determine extracellular dopamine levels in the NAc core and shell, Bassareo and Di Chiara (1997, 1999) observed that while food rewards preferentially evoked increases in dopamine concentration in the NAc shell, conditioned stimuli paired with those rewards only elicited dopamine release in the NAc core. Furthermore, dopamine release in response to conditioned stimuli paired with cocaine rewards occurs selectively in the NAc core as well (Ito and others 2000). Based on these findings, it has been tentatively suggested that dopamine transmission in the NAc core specifically mediates associative learning processes, whereas dopamine increases in the NAc shell reflect primary reinforcement (Di Chiara 2002).
The functional role of the NAc and its dopaminergic innervation during Pavlovian conditioning has been explored extensively using site-specific lesions and pharmacological manipulations. These studies have also identified distinctions between NAc core and shell subregions. Parkinson and colleagues (1999) used an autoshaping paradigm to train rats to associate the presence of a previously neutral stimulus with the delivery of a food reward. Selective lesions were then made to either the core or shell of the NAc, and rats underwent additional pairing sessions in which conditioned approach responses towards the reward-paired cue were monitored. Lesions to the NAc core (but not shell) significantly impaired the expression of these approach responses, indicating that CS-US associations were disrupted. Similarly, dopamine antagonism or depletion in the NAc core also produces a profound impairment in the ability of animals to learn and express conditioned approach responses (Di Ciano and others 2001; Parkinson and others 2002). By comparison, NMDA antagonism disrupts conditioned responses only during acquisition, whereas AMPA antagonism preferentially impairs the expression of Pavlovian approaches (Di Ciano and others 2001). Taken together, these findings suggest that the reliance of conditioned approach responses on an intact NAc core reflects the contributions of dopamine and glutamate transmission within this structure.
The role of dopamine in associative reward learning may be selectively mediated by specific receptor subtypes within the NAc. Dopamine D1 and D2 receptors oppositely modulate the same intracellular cascade, and D1 receptor antagonism inhibits long-term potentiation of striatal synapses (Calabresi and others 2000b; Kerr and Wickens 2001). Consistent with the distinct cellular effects attributed to these receptors, Eyny and Horvitz (2003) reported that selective D1 and D2 antagonists also differentially affect stimulus-reward learning. In this study, the systemic blockade of D1 receptors produced a reduction in conditioned approaches towards reward-paired stimuli, while D2 antagonists actually promoted the expression of learned associations (Eyny and Horvitz 2003). Intra-NAc D1 receptor antagonism immediately after Pavlovian conditioning also blocks the performance of subsequent conditioned approach responses in an autoshaping task, indicating that D1 receptors in the NAc may play a vital role in the consolidation of learned stimulus-reward associations (Dalley and others 2005).
In addition to the dopaminergic projection from the VTA, a number of other structures may contribute specific information to the NAc during associative learning (Robbins and Everitt 2002). For example, excitotoxic lesions to the anterior cingulate cortex (ACC) impair the acquisition and performance of Pavlovian approach responses towards conditioned stimuli (Cardinal and others 2002). However, in contrast to NAc core lesions, ACC lesions do not abolish approach responses, but rather increase the likelihood that animals will approach non-predictive cues (Bussey and others 1997). One potential explanation for this effect is that the ACC rapidly acquires the ability to discriminate between stimuli and then “teaches” this discrimination to other regions, such as the NAc (Cardinal and others 2002; Robbins and Everitt 2002). In agreement with this view, disconnection lesions between the NAc core and ACC also impair the expression of learned associations (Parkinson and others 2000). Importantly, other brain structures may also contribute to stimulus-reward learning in a NAc-independent manner. Indeed, a number of studies have indicated that a separate neural circuit consisting of the central nucleus of the amygdala, substantia nigra pars compacta, and dorsolateral striatum mediates the learning and expression of conditioned orienting responses elicited by cues that predict favorable outcomes, as well as the potentiation of feeding by conditioned stimuli (Han and others 1997; Lee and others 2005; El-Amamy and Holland 2006).
Human brain imaging studies during reward learning tasks have confirmed and extended experimental findings that implicate the NAc in associative learning (Knutson and Cooper 2005). A number of investigations using fMRI techniques to assess blood oxygenation have reported increased activity in the ventral striatum during exposure to rewards ranging from water to money to sexual stimuli (McClure and others 2004). Consistent with the animal literature, reward prediction is a key feature in this pattern of activation. Berns and others (2001) found that unpredicted delivery of a rewarding juice substance to a volunteer’s mouth evoked a significantly greater change in activity of the ventral striatum than when rewards were delivered in a predictable fashion. Moreover, when rewards are predicted by a discrete conditioned stimulus, this CS itself can evoke a change in activity in the ventral striatum (McClure and others 2003; Ramnani and others 2004). Notably, the ventral striatum seems to encode such deviations from reward prediction in both passive (Pavlovian) and active (operant) tasks, whereas the dorsal striatum is only activated by prediction errors that occur in an operant situation (O'Doherty and others 2004). Thus, the ventral striatum and the NAc may have a wider role in linking stimuli with outcomes in a number of experimental and real-life situations. In keeping with this idea, event-related fMRI techniques have recently been employed to examine information processing in these areas in relation to social interactions, gambling, and cognition (Knutson and Cooper 2005).
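The notion of a "deviation from reward prediction" can be made concrete with a simple learning rule. The sketch below is an illustration only and is not drawn from the studies cited above: it assumes a Rescorla-Wagner style update for a single cue, and the function name `rescorla_wagner`, the learning rate `alpha`, and the reward magnitudes are hypothetical choices. It shows that the prediction error is largest when a reward is unexpected and shrinks as the cue comes to predict it, which parallels the larger ventral striatal response to unpredicted than to predicted rewards.

```python
def rescorla_wagner(rewards, alpha=0.2, v0=0.0):
    """Track the predicted value of one cue across trials.

    rewards: reward magnitude delivered on each trial (assumed values).
    alpha:   learning rate (assumed, not taken from the cited studies).
    Returns (values, errors): the prediction held before each trial
    and the prediction error computed on that trial.
    """
    v = v0
    values, errors = [], []
    for r in rewards:
        delta = r - v          # prediction error: outcome minus prediction
        values.append(v)
        errors.append(delta)
        v += alpha * delta     # move the prediction toward the outcome
    return values, errors


# Thirty trials in which the cue is always followed by a reward of 1.0.
values, errors = rescorla_wagner([1.0] * 30)
print(f"first-trial error: {errors[0]:.3f}")
print(f"last-trial error:  {errors[-1]:.3f}")
```

Under these assumptions the error on the first trial equals the full reward magnitude, and each later error is a fixed fraction (1 - alpha) of the one before it, so a fully predicted reward generates almost no error signal.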
Addiction to abused substances such as cocaine is typically characterized by cycles of compulsive drug use followed by periods of abstinence and resumption of drug consumption (relapse). In human addicts, it has been well documented that cues associated with prior drug use (such as places where the drug was consumed, people with whom the drug was taken, or drug paraphernalia) are strong elicitors of drug craving, and this craving is one of the leading causes of relapse (Gawin 1991; O'Brien and others 1998; Volkow and others 2006). Importantly, neural systems identified as critical for Pavlovian learning involving natural rewards (Figure 1) have been shown to be critically involved in the addiction process. Thus, Pavlovian learning that is beneficial to an organism under normal circumstances becomes maladaptive in the drug-addicted individual.
A number of studies conducted in human cocaine addicts using neuroimaging techniques such as PET or fMRI support this view. In these studies, cocaine-addicted individuals are typically placed in a brain scanner and shown videotapes of either neutral settings (e.g., nature scenes) or cocaine-associated events (e.g., portraying subjects smoking cocaine). Cocaine craving (as assessed with questionnaires) is determined under each condition. Using this approach, several researchers have reported activation in specific brain regions linked with cocaine craving during presentation of cocaine-associated cues, including portions of the prefrontal cortex (e.g., anterior cingulate and orbitofrontal cortex; Breiter and others 1997; Childress and others 1999; Volkow and others 1999; Wexler and others 2001), limbic regions (Childress and others 1999), and more recently the dorsal striatum (Volkow and others 2006). These changes are typically correlated with the degree of self-reported drug craving and are not observed during presentation of neutral stimuli. Thus, in human addicts, stimuli associated with cocaine can evoke strong craving for the drug and activation of brain systems involved in natural reward-related behaviors.
Animal studies have provided additional insight into neural mechanisms underlying Pavlovian conditioning involving abused substances (Everitt and Wolf 2002; Robbins and Everitt 2002). It should be noted that in animal models of drug addiction, Pavlovian contingencies are typically embedded within operant tasks. For example, a key animal model of human drug addiction is the drug self-administration procedure, in which animals are trained to make an operant response (e.g., press a lever) for intravenous infusion of a drug such as cocaine. Typically, cocaine infusion is paired with a discrete audiovisual stimulus (CS). Studies show that presentation of a cocaine-associated CS alone can increase cell firing in the NAc (Carelli 2000) and BLA (Carelli and others 2003) and evoke rapid dopamine release in the NAc (Phillips and others 2003). This latter finding is illustrated in Figure 4. In this case, an electrochemical technique (fast scan cyclic voltammetry) was used to measure dopamine release with subsecond temporal resolution (i.e., in 'real time') during CS presentation. As can be seen in Figure 4, dopamine was rapidly released during presentation of the audiovisual CS linked with cocaine infusion, but only in animals with a history of cocaine self-administration. Importantly, presentation of cocaine-associated conditioned stimuli can serve as a powerful elicitor of relapse to drug taking following removal of drug availability in animal models of cocaine addiction (Shaham and others 2003).
In general, the same brain regions involved in natural reward processing are activated by abused substances (Figure 1). However, electrophysiological studies employing multiple schedules of reinforcement (i.e., animals responded on one lever for cocaine and a second lever for water or food within the same behavioral session) revealed that distinct populations of NAc neurons differentially process information about goal-directed behaviors for cocaine versus natural reward (Carelli and others 2000; Carelli and Wondolowski 2003). Moreover, conditioned stimuli associated with cocaine self-administration only activate NAc neurons that process information about goal-directed behaviors for cocaine and not water reward (Carelli and Ijames 2001). Collectively, these findings reveal a complex neural system in which discrete microcircuits exist (at least at the level of the NAc) that selectively process information about drug versus ‘natural’ reward and conditioned stimuli associated with them (Carelli and Wightman 2004).
The ability of animals to successfully secure food and other rewards from their environment may appear simple in nature, but in fact involves a rather complex level of processing. The organism must remember where the reward is located in the environment and how to get there, must be motivated to respond for it, and must complete the appropriate set of behaviors to attain and consume the substance. This skill is enhanced by the capacity to form associations between predictive cues and their outcomes - the basis of Pavlovian learning. Research involving animal and human subjects indicates that the NAc and interconnected brain nuclei play a critical role in this phenomenon. The NAc processes information about unconditioned as well as conditioned aspects of Pavlovian learning that can ultimately be used to guide an organism's behavior toward rewards in its environment. This fundamental form of learning affords obvious survival value but can also have devastating negative consequences when such associations are linked with maladaptive circumstances, such as those involving abused substances.
This work was supported by grants from the National Institute on Drug Abuse: R01 DA14339 and DA017318 to RMC and F31 DA021979 to JJD.
Jeremy J. Day, Department of Psychology, The University of North Carolina at Chapel Hill, CB# 3270, Davie Hall, Chapel Hill, NC 27599-3270, Phone: 919-962-0419, Email: jjday/at/email.unc.edu.
Regina M. Carelli, Department of Psychology, The University of North Carolina at Chapel Hill, CB# 3270, Davie Hall, Chapel Hill, NC 27599-3270, Phone: 919-962-8775, FAX: 919-962-2537, Email: rcarelli/at/unc.edu.