In addition to its role in processing information about unconditioned rewards, emerging evidence from animal research indicates that the NAc is also critically involved in the formation of learned Pavlovian associations between these rewards and exteroceptive stimuli. However, because NAc neurons are not believed to process primary sensory information, this function likely requires that individual NAc circuits undergo dynamic modification during stimulus-reward learning. Two intriguing studies have recently indicated that this may be the case.
Setlow and colleagues (2003) paired olfactory cues with rewarding sucrose in a go/no-go task while monitoring the activity of neurons in the ventral striatum (including the NAc). Initially, delivery of olfactory cues produced a change in activity among very few neurons. However, as animals learned to associate olfactory stimuli with sucrose delivery, those cues began to evoke time-locked phasic responses in a number of neurons. In another study that employed a strictly Pavlovian design (Roitman and others 2005a), NAc neurons developed responses to reward-predictive audiovisual cues on the first day that these stimuli were paired. Thus, although the majority of individual NAc neurons do not exhibit innate phasic responses to environmental stimuli, such responses quickly emerge as animals come to associate those stimuli with impending outcomes.
The ability of conditioned stimuli to elicit changes in NAc cell activity may only increase as stimulus-reward associations become stronger. In one experiment, rats were repeatedly exposed to a CS that was always followed by a sucrose reward as well as a control stimulus that was not paired with a reward (
Day and others 2006). Across several conditioning sessions, rats gradually developed selective conditioned approach responses towards the reward-predictive cue, but not towards the unpaired cue. Consistent with another recent study that employed a similar paradigm (
Wan and Peoples 2006), a majority (75%) of NAc neurons exhibited marked changes (increases and decreases) in firing rate during presentation of the reward-paired CS in well-conditioned rats. Of these cells, roughly half responded with a prolonged inhibition, while the other half were activated by the presence of the cue, again suggesting that individual neurons within the NAc may operate as a part of microcircuits with distinct functional responsibilities (
Carelli and Wightman 2004). A characteristic excitatory NAc neuron is shown in . Although this neuron displayed no change in activity during unpaired stimulus trials, presentations of the reward-paired cue produced a significant increase in firing rate. Moreover, the excitation produced by the CS was greater than that evoked by the reward itself. It has been suggested that such excitations among NAc neurons may originate from glutamatergic inputs from cortical and limbic structures that compete for access to motor resources through striatal circuits (
Pennartz and others 1994). Through this mechanism, higher-order processing centers could gain direct influence over motor areas and promote behavioral responses to conditioned stimuli and other important cues.
Support for the involvement of the NAc in stimulus-reward learning also comes from single-unit recordings of midbrain dopamine neurons. In primates, a majority of these neurons exhibit brief increases in activity when rewards are delivered unexpectedly (
Mirenowicz and Schultz 1994;
Hollerman and Schultz 1998). However, if rewards are fully predicted by a CS, they no longer evoke activation among dopamine neurons. Instead, conditioned stimuli alone elicit increases in dopamine burst firing that vary in magnitude based on the likelihood of reward delivery as well as the value of the expected reward (
Schultz and others 1997;
Fiorillo and others 2003;
Tobler and others 2005). A current hypothesis based on these observations proposes that dopamine neurons may provide a “prediction error” signal consistent with contemporary reward learning theories that are directly applicable to Pavlovian contingencies (
Schultz and others 1997;
Sutton and Barto 1998). According to this hypothesis, phasic activation of dopamine neurons signals unexpected reward delivery because this produces an error in ongoing predictions about reward availability. Likewise, as conditioned stimuli become valid reward predictors, reward delivery does not constitute a violation of expectancy and therefore does not produce phasic dopamine cell firing. At the cellular level, phasic dopamine signals in the NAc may facilitate synaptic modification (
Calabresi and others 2000a), enabling NAc neurons to incorporate new information. With respect to Pavlovian learning, such plasticity could help organisms identify cues that predict rewards and update evaluation of those cues based on actual outcomes.
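The prediction error account described above can be illustrated with a minimal simulation. The sketch below is not taken from any of the cited studies; it is a simple Rescorla-Wagner-style update in the spirit of the reward learning theories mentioned in the text. The function name `simulate_conditioning`, the learning rate `alpha`, and the two-event trial structure (CS onset, then reward) are all illustrative assumptions. It further assumes that the CS itself arrives unpredictably and that there is no temporal discounting, so that the error at CS onset simply equals the value the CS has come to predict.

```python
import random


def simulate_conditioning(n_trials=50, alpha=0.2, reward=1.0, p_reward=1.0, seed=0):
    """Hypothetical sketch of prediction errors across CS-reward pairings.

    Returns two lists with one entry per trial: the prediction error at
    CS onset and the prediction error at the time of reward.
    """
    rng = random.Random(seed)
    value = 0.0  # reward currently predicted by the CS (assumed to start at zero)
    cs_errors = []
    reward_errors = []
    for _ in range(n_trials):
        # Reward is delivered with probability p_reward, otherwise omitted.
        delivered = reward if rng.random() < p_reward else 0.0
        # Assumption: CS onset is unpredicted, so its error equals the value it signals.
        cs_errors.append(value)
        # Error at reward time: what was received minus what the CS predicted.
        reward_error = delivered - value
        reward_errors.append(reward_error)
        # Move the prediction a fraction alpha of the way toward the outcome.
        value += alpha * reward_error
    return cs_errors, reward_errors


if __name__ == "__main__":
    cs, rw = simulate_conditioning()
    for trial in (0, 4, 49):
        print(f"trial {trial + 1:2d}: CS error = {cs[trial]:.3f}, reward error = {rw[trial]:.3f}")
```

With the default settings, the error at reward time starts large and shrinks toward zero as the CS becomes a valid predictor, while the error at CS onset grows, which mirrors the shift in phasic responding described in the text. Setting `p_reward` below 1 in this toy model makes the late-trial CS error fluctuate around the reward probability times its magnitude, and an omitted reward on a well-learned trial yields a negative error.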
Importantly, actual dopamine release during the presentation of conditioned stimuli may not be identical across all terminal regions. Using microdialysis to determine extracellular dopamine levels in the NAc core and shell,
Bassareo and Di Chiara (1997, 1999) observed that while food rewards preferentially evoked increases in dopamine concentration in the NAc shell, conditioned stimuli paired with those rewards elicited dopamine release only in the NAc core. Furthermore, dopamine release in response to conditioned stimuli paired with cocaine rewards occurs selectively in the NAc core as well (
Ito and others 2000). Based on these findings, it has been tentatively suggested that dopamine transmission in the NAc core specifically mediates associative learning processes, whereas dopamine increases in the NAc shell reflect primary reinforcement (
Di Chiara 2002).
The functional role of the NAc and its dopaminergic innervation during Pavlovian conditioning has been explored extensively using site-specific lesions and pharmacological manipulations. These studies have also identified distinctions between NAc core and shell subregions.
Parkinson and colleagues (1999) used an autoshaping paradigm to train rats to associate the presence of a previously neutral stimulus with the delivery of a food reward. Selective lesions were then made to either the core or shell of the NAc, and rats underwent additional pairing sessions in which conditioned approach responses towards the reward-paired cue were monitored. Lesions to the NAc core (but not shell) significantly impaired the expression of these approach responses, indicating that CS-US associations were disrupted. Similarly, dopamine antagonism or depletion in the NAc core also produces a profound impairment in the ability of animals to learn and express conditioned approach responses (
Di Ciano and others 2001;
Parkinson and others 2002). By comparison, NMDA antagonism disrupts conditioned responses only during acquisition, whereas AMPA antagonism preferentially impairs the expression of Pavlovian approaches (
Di Ciano and others 2001). Taken together, these findings suggest that the reliance of conditioned approach responses on an intact NAc core reflects the contributions of dopamine and glutamate transmission within this structure.
The role of dopamine in associative reward learning may be selectively mediated by specific receptor subtypes within the NAc. Dopamine D1 and D2 receptors oppositely modulate the same intracellular cascade, and D1 receptor antagonism inhibits long-term potentiation of striatal synapses (
Calabresi and others 2000b;
Kerr and Wickens 2001). Consistent with the distinct cellular effects attributed to these receptors,
Eyny and Horvitz (2003) reported that selective D1 and D2 antagonists also differentially affect stimulus-reward learning. In this study, the systemic blockade of D1 receptors produced a reduction in conditioned approaches towards reward-paired stimuli, while D2 antagonists actually promoted the expression of learned associations (
Eyny and Horvitz 2003). Intra-NAc D1 receptor antagonism immediately after Pavlovian conditioning also blocks the performance of subsequent conditioned approach responses in an autoshaping task, indicating that D1 receptors in the NAc may play a vital role in the consolidation of learned stimulus-reward associations (
Dalley and others 2005).
In addition to the dopaminergic projection from the VTA, a number of other structures may contribute specific information to the NAc during associative learning (
Robbins and Everitt 2002). For example, excitotoxic lesions to the anterior cingulate cortex (ACC) impair the acquisition and performance of Pavlovian approach responses towards conditioned stimuli (
Cardinal and others 2002). However, in contrast to NAc core lesions, ACC lesions do not abolish approach responses, but rather increase the likelihood that animals will approach non-predictive cues (
Bussey and others 1997). One potential explanation for this effect is that the ACC rapidly acquires the ability to discriminate between stimuli and then “teaches” this discrimination to other regions, such as the NAc (
Cardinal and others 2002;
Robbins and Everitt 2002). In agreement with this view, disconnection lesions between the NAc core and ACC also impair the expression of learned associations (
Parkinson and others 2000). Importantly, other brain structures may also contribute to stimulus-reward learning in a NAc-independent manner. Indeed, a number of studies have indicated that a separate neural circuit consisting of the central nucleus of the amygdala, substantia nigra pars compacta, and dorsolateral striatum mediates the learning and expression of conditioned orienting responses elicited by cues that predict favorable outcomes as well as the potentiation of feeding by conditioned stimuli (
Han and others 1997;
Lee and others 2005;
El-Amamy and Holland 2006).
Human brain imaging studies during reward learning tasks have confirmed and extended experimental findings that implicate the NAc in associative learning (
Knutson and Cooper 2005). A number of investigations using fMRI techniques to assess blood oxygenation have reported increased activity in the ventral striatum during exposure to rewards ranging from water to money to sexual stimuli (
McClure and others 2004). Consistent with the animal literature, reward prediction is a key feature in this pattern of activation.
Berns and others (2001) found that unpredicted delivery of a rewarding juice substance to a volunteer’s mouth evoked a significantly greater change in activity of the ventral striatum than when rewards were delivered in a predictable fashion. Moreover, when rewards are predicted by a discrete conditioned stimulus, this CS itself can evoke a change in activity in the ventral striatum (
McClure and others 2003;
Ramnani and others 2004). Notably, the ventral striatum seems to encode such deviations from reward prediction in both passive (Pavlovian) and active (operant) tasks, whereas the dorsal striatum is activated only by prediction errors that occur in an operant situation (
O'Doherty and others 2004). Thus, the ventral striatum and the NAc may have a wider role in linking stimuli with outcomes in a number of experimental and real-life situations. In keeping with this idea, event-related fMRI techniques have recently been employed to examine information processing in these areas in relation to social interactions, gambling, and cognition (
Knutson and Cooper 2005).