How does the brain translate information signaling potential rewards into motivation to get them? Motivation to obtain reward is thought to depend on the midbrain (particularly the ventral tegmental area, VTA), the nucleus accumbens (NAcc), and the dorsolateral prefrontal cortex (dlPFC), but it is not clear how the interactions amongst these regions relate to reward-motivated behavior. To study the influence of motivation on these reward-responsive regions and on their interactions, we used Dynamic Causal Modeling (DCM) to analyze functional magnetic resonance imaging (fMRI) data from humans performing a simple task designed to isolate reward anticipation. The use of fMRI permitted the simultaneous measurement of multiple brain regions while human participants anticipated and prepared for opportunities to obtain reward, thus allowing characterization of how information about reward changes physiology underlying motivational drive. Further, we modeled the impact of external reward cues on causal relationships within this network, thus elaborating a link between physiology, connectivity, and motivation. Specifically, our results indicated that dlPFC was the exclusive entry point of information about reward in this network, and that anticipated reward availability caused VTA activation only via its effect on the dlPFC. Anticipated reward thus increased dlPFC activation directly, whereas it influenced VTA and NAcc only indirectly, by enhancing intrinsically weak or inactive pathways from the dlPFC. Our findings of a directional prefrontal influence on dopaminergic regions during reward anticipation suggest a model in which the dlPFC integrates and transmits representations of reward to the mesolimbic and mesocortical dopamine systems, thereby initiating motivated behavior.
Motivation translates goals into action. The initiation and organization of motivated behavior is thought to depend on the brain's mesolimbic and mesocortical dopamine systems (Salamone et al., 2007); that is, the projections from the ventral tegmental area (VTA) to the nucleus accumbens (NAcc) and the prefrontal cortex (PFC) (Berridge and Robinson, 1998; Wise, 2004). Dopamine in the NAcc and PFC is indeed critical to the normal function of these regions (Goldman-Rakic, 1998; Durstewitz et al., 2000; Hazy et al., 2006; Alcaro et al., 2007). However, much remains unknown about how interactions amongst these regions relate to motivated behavior. VTA neurons respond to anticipated reward, but from where do they get information about reward cues in the environment? What dynamic changes in network function, triggered by reward anticipation, occur during motivation?
The VTA, NAcc, and PFC are each strongly implicated in motivation to obtain reward: VTA neurons respond to reward cues and increase their activity prior to goal-directed behavior (Ljungberg et al., 1992; Schultz, 1998; Fiorillo et al., 2003). The NAcc is essential for translating motivational drive into motor behavior (reviewed by Goto and Grace, 2005). The PFC, specifically the dorsolateral prefrontal cortex (dlPFC), is involved in the representation and integration of goals and reward information (Miller and Cohen, 2001; Wagner et al., 2001; Watanabe and Sakagami, 2007).
Physiological relationships amongst the VTA, NAcc, and PFC are similarly thought to be crucial for executing reward-motivated behavior. These include dopamine release in the NAcc following VTA activation (Fields et al., 2007; Roitman et al., 2008) and modulation of VTA responsivity by both the NAcc (Grace et al., 2007) and PFC (Gariano and Groves, 1988; Svensson and Tung, 1989; Gao et al., 2007). Importantly, studies of the role of the PFC in driving VTA activity have been conducted in anesthetized rodent models, thus imposing two important constraints on extant evidence. First, because rodents lack an expanded prefrontal cortex, the physiological role of dlPFC in modulating VTA remains unknown (Frankle et al., 2006). Second, and more fundamentally, characterization of physiological interactions in anesthetized animals cannot address the relation between physiology and motivated behavior. In spite of their well-documented interactions, the dynamics of the network linking dlPFC, VTA, and NAcc have yet to be investigated during motivated behavior. Specifically, it is unclear where information signaling potential reward enters this network and how it impacts the relationships between these reward-responsive regions (but see Bromberg-Martin et al., 2010).
To investigate how this network supporting motivated behavior responds to information about potential rewards, we used functional magnetic resonance imaging (fMRI) to measure activations in the dlPFC, VTA, and NAcc during a rewarded reaction-time task. This task allowed us to isolate activations associated with motivation, which occurred after the presentation of reward-informative cues and prior to the execution of goal-directed behavior. Using Dynamic Causal Modeling (DCM), an analysis technique optimized to model causal relationships in fMRI data (Friston et al., 2003), we identified where reward information enters and how it modulates this dopamine-dependent system.
The data analyzed in this study were originally collected to examine the anticipation of either gaining or losing monetary rewards for either oneself or a charity, as described in detail in a previous report (Carter et al., 2009). Twenty young adults completed the original study. Four subjects were excluded due to poor data acquisition (i.e. signal drop out, poor coverage), one was excluded due to a Beck Depression Inventory (BDI) score indicating depression, and three were excluded due to insufficient activation in at least one of the regions of interest (see DCM methods section), leaving 12 subjects in the data reported here (age: 23.9 yrs, SD: 3.8 yrs; 6 male).
To experimentally manipulate subjects' motivational state, we used a modified Monetary Incentive Delay (MID) task (Knutson et al., 2001; Carter et al., 2009). The current analyses used only the data acquired while participants were anticipating monetary gains for themselves, resulting in 20 trials per condition. During these trials, initial cues marked the start of the trial and indicated whether individuals could earn either $4 (cue: figure on a red background) or $0 (cue: figure on a yellow background) for a fast reaction time to an upcoming target. After a variable delay (4-4.5 s), a response target (target: a white square) appeared indicating that participants were to press a button using their right index finger as quickly as possible. Participants earned the amount indicated by the cue if they responded in time or earned nothing otherwise. Using information about response times on previous trials in the same condition, an adaptive algorithm set reaction time thresholds so that subjects won approximately 65% of the time.
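The report does not spell out the adaptive algorithm, so the following is only a minimal sketch of one way such a rule could work, assuming the threshold is set to an empirical quantile of earlier reaction times in the same condition. The function name, the default starting threshold, and the example reaction times are all hypothetical.

```python
def adaptive_threshold(previous_rts_ms, target_win_rate=0.65, default_ms=250.0):
    """Return an RT threshold (ms) that about `target_win_rate` of past RTs would beat.

    Hypothetical rule: a trial counts as a win when RT <= threshold, so the
    threshold is placed at the matching empirical quantile of previous RTs.
    """
    if not previous_rts_ms:
        # Hypothetical starting value used before any data exist.
        return default_ms
    ordered = sorted(previous_rts_ms)
    # Position of the quantile in the sorted list, clamped to valid indices.
    k = round(target_win_rate * len(ordered)) - 1
    k = max(0, min(len(ordered) - 1, k))
    return ordered[k]


# Illustrative reaction times (ms) from earlier trials in one condition.
example_rts = list(range(100, 300, 10))  # 100, 110, ..., 290
threshold = adaptive_threshold(example_rts)
win_rate = sum(rt <= threshold for rt in example_rts) / len(example_rts)
```

Computing the threshold separately for each condition, as the task did, keeps reinforcement rates comparable even when participants respond faster on high-reward trials.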
A 3T GE Signa MRI scanner was used to acquire blood oxygen level-dependent (BOLD) contrast images. Each of the two runs comprised 416 volumes (TR: 1 s; TE: 27 ms; flip angle: 77°; voxel size: 3.75 mm × 3.75 mm × 3.75 mm) of 17 axial slices positioned to provide coverage of the midbrain, while also including the striatum and dlPFC (for image of coverage see Carter et al., 2009). This restricted volume, which sacrificed superior parietal cortex, permitted a TR of 1 s for increased sensitivity in regions of interest where susceptibility artifact can be problematic. The GE Signa EPI sequence automatically passes images through a Fermi filter with a transition width of 10 mm and radius of half the matrix size, which results in an effective smoothing kernel of 4.8 mm. At the beginning of the scanning session, we collected localizer images in order to identify the participant's head position within the scanner. Additionally, we acquired IR-SPGR high-resolution whole-volume T1-weighted images (voxel size: 1 mm × 1 mm × 1 mm) and 17 IR-SPGR images, coplanar with the BOLD contrast images, for use with registration, normalization, and anatomical specification of regions of interest (see below).
Images were skull stripped using FSL's BET tool (Smith, 2002). Preprocessing was performed using SPM8 software (http://www.fil.ion.ucl.ac.uk/spm). Images from separate runs from each subject were concatenated and then realigned using a 4th-degree B-spline, with the mean image as a reference. Images were then smoothed using a 4-mm Gaussian kernel, yielding a cumulative effective smoothing kernel of 6.25 mm FWHM. This 4-mm kernel was previously tested against 2-, 6- and 8-mm kernels, for subcortical and medial temporal regions of interest, and gave both maximum Z scores and more limited spatial extent of activations, appropriate to the anatomy of these regions. For within-subject analysis, images were coregistered in two steps: first, IR-SPGR whole-volume T1-weighted images were coregistered to the 17-slice IR-SPGR images using a normalized mutual information function. Functional data were then coregistered to the resliced T1 images. For between-subject analyses, high-resolution structural images were normalized to MNI space and normalization parameters were applied to coregistered functional images.
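The cumulative figure can be checked by treating both filters as approximately Gaussian, in which case their widths combine in quadrature. This is an approximation (a Fermi filter is not exactly Gaussian), and the subscripts below are labels introduced here for illustration:

```latex
\mathrm{FWHM}_{\text{total}}
  = \sqrt{\mathrm{FWHM}_{\text{Fermi}}^{2} + \mathrm{FWHM}_{\text{Gauss}}^{2}}
  = \sqrt{4.8^{2} + 4^{2}}\ \text{mm}
  = \sqrt{39.04}\ \text{mm}
  \approx 6.25\ \text{mm}
```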
The first (within-subject) level statistical models were analyzed using a general linear model. Regressors included 2 task-related regressors of interest, 13 task-related regressors of no interest, 6 motion regressors of no interest, and 3 session effects as covariates of no interest. The task-related effects were modeled with fifteen columns in the design matrix that represented the anticipation and outcome possibilities for self, charity, and control. Both cue and outcome periods were modeled with 1-second boxcars at cue and outcome onset. This allowed for isolation of neural activity related to processing of the cue and preparation to execute motivated behavior. Task-related regressors of interest modeled the cue/anticipation periods for $4 and $0 gain trials. Task-related regressors of no interest included the outcome regressors for $4 and $0 self trials, as well as both cue and outcome regressors for charity and control trials. The beta estimates were then calculated using a general linear model with a canonical hemodynamic response basis function. Contrast images for $4 – $0 cue regressors were computed for each subject and entered into a between-subject random effects analysis. Statistical thresholds were set to p < 0.01, False Discovery Rate (FDR) corrected, with a cluster extent of 6 voxels for the group level GLM analyses (Genovese et al., 2002). For construction of DCMs, an additional first-level analysis was run that included a column with the combined anticipation periods for $4 and $0 gain trials.
All DCM analyses were conducted in DCM8 as implemented by SPM8. Dynamic Causal Modeling uses generative models of brain responses to infer the hidden activity of brain regions during different experimental contexts (Friston et al., 2003). A dynamic causal model (DCM) is comprised of a system of nodes that interact via unidirectional connections. Experimental manipulations are treated as perturbations in the system, which operate either by directly influencing the activity of one or more nodes (driving inputs), or by influencing the strength of connection between nodes (modulatory input). The latter effect of exogenous input represents how the coupling between two regions varies in response to experimental manipulations.
DCM represents the hidden neuronal population dynamics in each region with a state variable x. For a single input u, the state equation for DCM is:

$$\frac{dx}{dt} = (A + uB)\,x + Cu$$
Matrix A represents the strengths of context-independent or intrinsic connections, matrix B represents the modulation of context-dependent pathways, and matrix C represents the driving input to the network. The state equation is transformed into a predicted BOLD signal by a biophysical forward model of hemodynamic responses (Friston et al., 2000; Stephan et al., 2007). Model parameters are estimated using variational Bayes under the Laplace approximation, with the objective of maximizing the negative free energy as an approximation to the log model evidence, a measure of the balance between model fit and model complexity.
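To make the roles of A, B, and C concrete, the following is a minimal sketch that integrates the neuronal state equation with forward Euler steps, assuming the standard bilinear form dx/dt = (A + uB)x + Cu (Friston et al., 2003). It omits the hemodynamic forward model and parameter estimation entirely, and all numerical values are illustrative assumptions rather than estimates from this study.

```python
def simulate_bilinear_dcm(A, B, C, u, dt=0.01):
    """Forward-Euler integration of dx/dt = (A + u*B) x + C*u for one input u.

    A[i][j] and B[i][j] describe the influence of region j on region i.
    Returns the list of state vectors, one per time step.
    """
    n = len(A)
    x = [0.0] * n
    trajectory = []
    for u_t in u:
        dx = []
        for i in range(n):
            coupling = sum((A[i][j] + u_t * B[i][j]) * x[j] for j in range(n))
            dx.append(coupling + C[i] * u_t)
        x = [x[i] + dt * dx[i] for i in range(n)]
        trajectory.append(list(x))
    return trajectory


# Region order: dlPFC, VTA, NAcc. All values below are hypothetical.
A = [[-1.0, 0.0, 0.0],   # self-decay only; intrinsic dlPFC->VTA set to zero
     [0.0, -1.0, 0.0],
     [0.0, 0.0, -1.0]]
B_modulated = [[0.0, 0.0, 0.0],
               [0.5, 0.0, 0.0],   # input strengthens dlPFC -> VTA
               [0.3, 0.0, 0.0]]   # input strengthens dlPFC -> NAcc
B_none = [[0.0] * 3 for _ in range(3)]
C = [1.0, 0.0, 0.0]               # driving input enters at the dlPFC only
u = [1.0] * 500                   # input switched on for 5 s at dt = 0.01

with_modulation = simulate_bilinear_dcm(A, B_modulated, C, u)
without_modulation = simulate_bilinear_dcm(A, B_none, C, u)
```

In this toy configuration the VTA state stays at zero unless the input modulates the dlPFC-to-VTA connection, which illustrates how a driving input to one node can reach another node only through a context-dependent pathway.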
Volumes of interest were defined by taking the intersection of anatomical boundaries and significant functional activations. For the NAcc and VTA, anatomical boundaries were hand drawn using AFNI software on high-resolution anatomical images in individual space (afni.nimh.nih.gov/afni). The NAcc was drawn according to the procedure outlined in Breiter et al. (1997). The ventral tegmental area was drawn in the sagittal section and was identified with the following boundaries: the superior boundary was the most superior horizontal section containing the superior colliculus. The inferior boundary was the most inferior horizontal section containing the red nucleus. The lateral boundaries were drawn in the sagittal plane as vertical lines connecting the center of the colliculus and the peak of curvature of the interpeduncular fossa. The anterior boundary was clearly visible as cerebrospinal fluid, and the posterior boundary was a horizontal line bisecting the red nucleus in the axial plane. For the dlPFC, anatomical boundaries were defined on the MNI template (fmri.wfubmc.edu/cms) as the intersection of the left BA 46 and medial frontal gyrus, and back transformed into individual space.
The objective of DCM is to formulate and compare different possible mechanisms by which an established effect (local response) may have arisen. How this effect is initially detected and established, prior to the DCM analysis, depends on the particular question that is being asked. In the context of a GLM analysis, as in our case, this is usually done by requiring that there is a certain degree of activation in each region considered. Here, we operationalized this by requiring that, for each subject, each region showed an activation at the level of p < 0.05, uncorrected, with a cluster extent threshold of greater than 3 voxels in subcortical/midbrain structures and 5 voxels in cortical structures. This criterion eliminated 3 subjects from the DCM analysis (2 for NAcc, 1 for dlPFC). The timeseries for both the VTA and the NAcc were extracted from the peak voxel within each subject's VOI, following Adcock et al. (2006). For the dlPFC, the VOI was an 8 mm sphere around each individual's peak activation. Mean coordinates ([x y z]) in MNI space for the three VOIs were as follows: Left NAcc: [-8 14 -1]; Right NAcc: [12 14 -3]; Left VTA: [-2 -15 -13]; Right VTA: [3 -17 -12]; dlPFC: [-44 38 19].
To answer the questions of where information about potential reward enters the system and how the system is modulated by reward anticipation, the model space included all possible configurations of driving input and several possible combinations of context-sensitive connections. Driving inputs represented cue information about all reward types (high and low), while modulatory inputs represented only high reward cues. All models assumed reciprocal intrinsic connections between all regions due to the evidence from rodent literature of either monosynaptic pathways or direct functional relationships between the VTA, NAcc, and PFC (see introduction). In order to reduce the number of models, we employed a simplifying assumption that the bidirectional VTA-NAcc connections are modulated by the reward context. This is a reasonable assumption given the strong evidence that these two regions are critical for supporting a motivational state (for review see Haber and Knutson, 2010). The context-sensitivity of the four connections to and from the dlPFC was systematically varied, with each connection either modulated or not, resulting in 16 models. In addition, we crossed these models with all 7 possible combinations of driving input to one or more of the three regions. In total, 112 models per subject were fitted using a variational Bayes scheme, and posterior means (maximum a posteriori, MAP) and posterior variances were estimated for each connection of each model.
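The size of the model space follows from simple counting: four dlPFC connections that are each modulated or not (2^4 = 16), crossed with every non-empty set of driving-input targets among three regions (2^3 - 1 = 7). The sketch below enumerates these combinations; the data layout and names are illustrative assumptions and do not reflect how models are specified in SPM.

```python
from itertools import product

REGIONS = ("dlPFC", "VTA", "NAcc")

# Connections (source, target) whose modulation was systematically varied.
VARIED = (("dlPFC", "VTA"), ("VTA", "dlPFC"),
          ("dlPFC", "NAcc"), ("NAcc", "dlPFC"))

# Connections assumed to be modulated by reward context in every model.
ALWAYS_MODULATED = (("VTA", "NAcc"), ("NAcc", "VTA"))


def build_model_space():
    """Enumerate every (driving input, modulation pattern) combination."""
    models = []
    for driving_flags in product((False, True), repeat=len(REGIONS)):
        if not any(driving_flags):
            continue  # a model needs at least one driving input
        driving = tuple(r for r, on in zip(REGIONS, driving_flags) if on)
        for mod_flags in product((False, True), repeat=len(VARIED)):
            modulated = ALWAYS_MODULATED + tuple(
                conn for conn, on in zip(VARIED, mod_flags) if on)
            models.append({"driving": driving, "modulated": modulated})
    return models


models = build_model_space()
families = {}
for m in models:
    # Group models into families that share a driving-input configuration.
    families.setdefault(m["driving"], []).append(m)
```

Grouping by driving-input configuration in this way yields the 7 families of 16 models each that are compared at the family level.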
Bayesian Model Selection (BMS), in combination with family level inference and Bayesian Model Averaging, was used to determine the most likely model structure given our data (Penny et al.; Stephan et al.; Stephan et al., 2009). In a first comparison step, the model evidence, which balances model fit and complexity, was computed for each model using the negative free-energy approximation to the log-model evidence. Models were compared at the group level using a novel random-effects BMS procedure (Stephan et al., 2009). The models were assessed using the exceedance probability, that is, the probability that a given model explains the data better than all other models. In a second level of comparison, we employed family level inference using Gibbs sampling on 7 families (Penny et al., 2010), with all models in each family sharing a different driving input configuration. Models in the winning family were then subjected to the random-effects BMS procedure. Finally, we analyzed the winning family using Bayesian Model Averaging, a procedure that provides a measure of the most likely parameter values for an entire family of models across subjects (Stephan et al.; Stephan et al., 2009). Parameter significance was assessed by the fraction of samples in the posterior density that were greater than zero (posterior densities were sampled with 10,000 points), and parameters were considered significant at a posterior probability threshold of 95%.
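The sample-based significance criterion can be illustrated with a short sketch. It assumes the posterior for a connection parameter is available as a list of samples and draws stand-in samples from Gaussians with hypothetical means and standard deviations; it does not reproduce the Bayesian Model Averaging procedure itself.

```python
import random


def fraction_above_zero(samples):
    """Fraction of posterior samples that are greater than zero."""
    return sum(1 for s in samples if s > 0) / len(samples)


def is_significant(samples, threshold=0.95):
    """Apply the 95% posterior probability criterion to a set of samples."""
    return fraction_above_zero(samples) >= threshold


rng = random.Random(0)  # fixed seed so the illustration is repeatable
n_samples = 10_000      # number of posterior samples, as in the text

# Hypothetical posteriors: one clearly positive, one centered on zero.
strong_effect = [rng.gauss(0.5, 0.2) for _ in range(n_samples)]
null_effect = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

strong_fraction = fraction_above_zero(strong_effect)
null_fraction = fraction_above_zero(null_effect)
```

Under these assumed distributions the first parameter passes the criterion and the second does not, since only about half of its samples fall above zero.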
To assess subjects' motivational state, we examined reaction times to target stimuli. An algorithm calibrated response time thresholds separately within each condition so that participants were successful about 65% of the time ($4: M: 64%, SD: 7%; $0: M: 63%, SD: 6%), to ensure equivalent reinforcement rates across conditions. For all included participants, reaction times to the target in $4 gain trials (M: 201 ms, SD: 42 ms) were faster than in $0 gain trials (M: 226 ms, SD: 61 ms, p < 0.001), signifying that participants were more motivated to perform in the $4 condition. Behavioral data for the loss and charity trials not included in this analysis can be found in Carter et al. (2009).
To identify neural substrates of motivation to obtain reward, we used a general linear model (GLM) analysis of the period beginning with the presentation of the cue (either $4 or $0); hereafter, we refer to this period of anticipation of the opportunity to obtain reward as “reward motivation.” GLM analyses of the contrast $4 > $0 during this period revealed that reward motivation was associated with significant activations (p < 0.01, FDR-corrected, 6 voxel cluster extent threshold) in all of the regions of interest used for the DCM analysis: bilaterally in the VTA, NAcc, and dlPFC. Additional significant activations were observed bilaterally in the midbrain (surrounding the VTA), dorsal striatum, ventral striatum (surrounding the NAcc), posterior parietal cortex, inferior parietal lobule, insula, ventrolateral prefrontal cortex, cerebellum, and ventral visual stream, as well as in the left hippocampus.
We used Dynamic Causal Modeling (DCM) with Bayesian Model Selection (BMS) and model space partitioning to examine a network consisting of VTA, NAcc, and dlPFC during reward motivation. Our DCMs estimate the strength of driving inputs, whereby information signaling an upcoming opportunity to obtain reward directly influences neural activity in a region; intrinsic connectivity, whereby regions influence each other in the absence of reward information; and modulatory inputs whereby information signaling the opportunity for future reward changes the strength of coupling between regions (Friston et al., 2003). We used BMS to compare the relative evidence of alternate DCMs that varied both the location of driving inputs into the system as well as the pattern of modulatory inputs, as specified below. We then applied the recently-developed approach of model space partitioning, which allowed us to compare families of DCMs that varied with respect to driving input, factoring out all other differences in the models (Penny et al., 2010).
Using random-effects BMS, we compared 112 models that differed both in where information signaling potential reward entered the network and how this information influenced connectivity (Figure 1). We tested a subset of the full model space that included all possible combinations of driving inputs, full intrinsic connectivity, and varied all combinations of possible modulatory connections between all regions except the connections between the VTA and NAcc (see Methods for details). The exceedance probabilities (the probability that a model explains the data better than all others considered) of the top 8 models together summed to 81% (Figure 2). These top 8 models all shared the feature of driving input to the dlPFC.
We next used model space partitioning to compare families of models defined by their driving input configuration. This analysis determined the most likely target of driving input, regardless of modulatory connectivity. We grouped our 112 models into 7 families of 16 models each with the same driving input configuration, and compared these families using Bayesian family level inference. This analysis yielded very strong evidence (exceedance probability of .93) that the family of models with the driving input solely to the dlPFC provided a better fit to the data (i.e., had higher evidence) than the 6 other families considered, including those where there were driving inputs to both the dlPFC and other regions (Figure 3). Because exceedance probabilities across families sum to one, the relative probabilities of families are more informative than the absolute probability (Penny et al., 2010). In the present study, the ratio of the exceedance probability of the best to second best family was 21. The equivalent ratio for the expected posterior probability, the expected likelihood of obtaining a model in a particular family from any randomly selected subject, was 3.3. Thus, the exceedance probability of .93 observed here represents very strong evidence that reward information was best modeled as entering the system at the dlPFC.
The models with driving inputs to the dlPFC were then compared to one another using BMS within the winning family. As anticipated, no single superior model within this group was determined. Notably, the top 4 models within the winning family, summing to 69% of the exceedance probability, all had modulatory connections from dlPFC to VTA, indicating that the influence of the dlPFC on the VTA was also important for determining model fit.
To determine how reward information entering the dlPFC affects network dynamics across subjects, we used Bayesian Model Averaging to compute the weighted average of modulatory changes in connectivity for the winning family of models (Stephan et al., 2009) (Figure 4; Table 1). Parameter estimates were considered significant at a posterior probability threshold of 95% that the posterior mean is different from zero. At baseline, intrinsic connection strengths were strongest from the VTA to NAcc and NAcc to VTA. These connections were not significantly modulated by reward motivation even though all of the models tested allowed for the expression of context-dependent modulation of these connections. However, we emphasize that all intrinsic connections were included in all models, so the finding of strong intrinsic connectivity is independent of our constraints on the model space. This lack of significant modulation is unexpected given the role of these regions in supporting motivation.
Reward motivation induced significant increases in connection strength only in the connections from the dlPFC to the NAcc and to the VTA. This is not an artifact of the fact that driving input entered the dlPFC; in fact, it has been suggested that when a driving input to a region and a modulatory input to its efferents are correlated (as they are here), the constraints on the prior variances of the driving and modulatory connections result in an underestimation of the strength of the modulatory effect (SPM listserv, 022036). The modulatory effect of reward motivation in these two connections from dlPFC was very strong: the connection from the dlPFC to the VTA increased from zero, and the modulatory effect from the dlPFC to the NAcc was 290% of the intrinsic connection strength. Thus, reward motivation engaged previously weak or inactive pathways from the dlPFC to the VTA and NAcc, without significantly altering connectivity throughout the rest of the system.
We investigated the impact of reward motivation on the dynamics of mesolimbic and mesocortical dopaminergic regions in a network comprising VTA, NAcc and dlPFC. Importantly, our task structure permitted us to isolate neural activity concurrent with onset of information signaling potential reward and distinct from processing related to reward outcomes, allowing the observation of motivation preceding the execution of goal-directed behavior. Our results indicate that during a simple rewarded reaction time task, information about expected reward entered this network solely at the dlPFC. This reward information increased the dlPFC's modulation of the VTA and the NAcc, structures that are known to influence the physiology and plasticity of networks supporting motivated behavior, attention and memory throughout the brain. Together, these findings suggest that, in response to goal-relevant information, the PFC harnesses these modulatory pathways to generate physiological states that correspond to expectancy and motivation.
To characterize the network of brain regions involved in motivation, we used dynamic causal modeling (DCM). Importantly, DCM does not assume that temporal precedence is necessary for causality. Because the lag between neural activity and BOLD activation can theoretically vary across brain regions, due to vascular factors, DCM is particularly appropriate for detecting network interactions in BOLD data. In addition, DCM allows for inference about causal interactions between regions that depend on the experimental context. These inferences can be tested across a theoretically unlimited model-space, here allowing us to test amongst all possible driving input configurations.
Our DCM analysis of the VTA-NAcc-dlPFC network during reward motivation indicated that the driving input was exclusively to the dlPFC. This means that in this behavioral context and within the modeled network, information signaling potentially available reward entered the dopamine system at the dlPFC and not at the other regions in the model. Furthermore, driving input unique to the dlPFC appeared to be the feature of the models that was most important for determining model fit. Our findings demonstrate that reward cues directly increased dlPFC activation, and only influenced activation in the VTA or NAcc indirectly, via connections from the dlPFC.
Besides the regions modeled in our analysis, previous research has identified other candidate regions, such as the medial prefrontal cortex, orbitofrontal cortex, and habenula (Staudinger et al., 2009; Bromberg-Martin et al., 2010) that could plausibly initiate motivated behavior. The question of how these regions interact with this network, especially the dlPFC, is an important avenue of future research. However, it is important to note that if any of these regions were driving the modeled network via efferents to the VTA or NAcc, one would expect to see this influence expressed in our data as a driving input to the VTA or NAcc. Thus, our finding of a unique driving input to the dlPFC indicates that in this behavioral context, information signaling potential reward entered the modeled network neither via subcortical relays nor other prefrontal cortical inputs to the VTA, but rather via the dlPFC.
In addition to demonstrating PFC modulation of VTA in awake animals during motivated behavior, the current findings are, to our knowledge, the first demonstration of a prefrontal influence on the VTA in humans or non-human primates. Bayesian Model Averaging revealed strong modulation of the VTA by the dlPFC specifically during reward motivation; this dlPFC-to-VTA pathway was not engaged intrinsically. Moreover, there was a nearly three-fold increase in connectivity strength from the dlPFC to the NAcc during reward motivation. Conversely, intrinsic VTA-NAcc connectivity was significant, but was not modulated by reward. This result could indicate that connectivity between the VTA and NAcc is always strong regardless of the level of motivation. However, based on prior research showing that reward information has an effect on VTA modulation of NAcc (Bakshi and Kelley, 1991; Ikemoto and Panksepp, 1999; Parkinson et al., 2002), this interpretation is unlikely to be correct. More plausible is that changes in VTA-NAcc connectivity existed, but because their effect size was small relative to that of dlPFC connectivity, they did not contribute significantly to the model evidence, further suggesting that dlPFC modulation was highly influential for the function of this network.
The finding that there was no increase in the connection strength from the VTA to the dlPFC may seem to conflict with physiology literature demonstrating dopaminergic modulation over the PFC (Williams and Goldman-Rakic, 1995; Durstewitz et al., 2000; Gao and Goldman-Rakic, 2003; Paspalas and Goldman-Rakic, 2004; Seamans and Yang, 2004; Wang et al., 2004; Gao et al., 2007). Although we found a modest intrinsic influence of the VTA on the PFC, this influence did not change with motivational state. However, we do not believe our findings to be contradictory to the above literature, as the modulatory influences may contribute to separate, but strongly interacting, behavioral processes. We believe the PFC's modulatory role over the VTA contributes to goal-directed, instrumental components of behavior, and the VTA's modulatory role over the PFC may be especially important to other behavioral processes, such as updating or task-switching, which were not manipulated in our paradigm. Thus, our findings, in the context of the previous literature, suggest that paradigms that evoked motivated executive behaviors would reveal bi-directional modulations between the VTA and dlPFC.
Building on the wealth of previous research outlining the influence of midbrain dopamine on target regions, the current findings suggest a model in which the dlPFC integrates information about potential reward and implements goal-directed behavior by tuning mesolimbic dopamine projections. This interpretation is consonant with evidence from the rodent literature showing that the PFC is the only cortical region that projects to dopamine neurons in the VTA (Beckstead et al., 1979; Sesack and Pickel, 1992; Sesack and Carr, 2002; Frankle et al., 2006). The findings fill a critical gap in this literature: stimulation of the PFC has been shown to regulate the firing patterns of dopamine neurons in rodents (Gariano and Groves, 1988; Svensson and Tung, 1989; Gao et al., 2007), and multi-site recordings demonstrate phase-coherence between the PFC and the VTA that mediates slow-oscillation burst firing (Gao et al., 2007), but there has been no demonstration that these physiological relationships are driven by motivation and goal-directed behavior. Furthermore, the absence of an expanded frontal cortex in the rodent makes an appropriate rodent correlate of primate dlPFC unclear. Although there is evidence in primates for excitatory projections from the PFC to midbrain dopamine neurons (Williams and Goldman-Rakic, 1998; Frankle et al., 2006), the functional significance of these relatively sparse projections has been questioned. Our findings showing a physiological relationship between prefrontal cortex and VTA in humans thus fill a second critical gap in the extant literature on human (and non-human primate) motivation.
Within the PFC, dlPFC is well situated to orchestrate motivated behavior because of its role in planning and goal maintenance. Primate physiology studies have demonstrated that while both the orbitofrontal cortex and the dlPFC encode reward information, only dlPFC activity predicts which behaviors a monkey will execute (Wallis and Miller, 2003). Further, the dlPFC maintains goal-relevant information during working memory (Levy and Goldman-Rakic, 2000; Wager and Smith, 2003; Owen et al., 2005), updates this information as goals dynamically change during task switching (Dove et al., 2000; Kimberg et al., 2000; MacDonald et al., 2000; Rushworth et al., 2002; Crone et al., 2006; Sakai, 2008; Savine et al., 2010), and arbitrates between conflicting goals during decision-making (MacDonald et al., 2000; McClure et al., 2004; Ridderinkhof et al., 2004; Boettiger et al., 2007; McClure et al., 2007; Hare et al., 2009). These previous findings suggest a role for the dlPFC in implementing behavioral goals, but they do not characterize the nature and direction of interactions between dlPFC and other regions supporting motivated behavior. Computational and neuroimaging work has posited a role for the dlPFC in modulating the striatum in the context of instructed reward-learning (Li et al., 2011; Doll et al., 2009). Our results corroborate these recent findings, and further implicate the dlPFC in initiating motivated behavior, via the novel demonstration of a directed influence on the VTA. Transcranial magnetic stimulation of the dlPFC changes the valuation of both cigarettes (Amiaz et al., 2009) and food (Camus et al., 2009), and also induces dopamine release in the striatum (Pogarell et al., 2006; Ko et al., 2008); however, these results do not reveal how dlPFC activation affects network activation or dynamics. The current findings directly demonstrate dlPFC influence over not only the NAcc but also the VTA during reward-motivated behavior, as postulated by prior work.
In summary, we found that motivation to obtain reward is instantiated by a transfer of information from the dlPFC to the NAcc and VTA; we saw no evidence of the reverse. These findings show that the dlPFC can orchestrate the dynamics of this neuromodulatory network in a contextually appropriate manner. Further, by suggesting an anatomical source for information about expected reward that activates dopaminergic regions, the findings also shed light on the fundamental question of how dopamine neurons define value. Finally, because of the widespread effects of VTA activation and resultant dopamine release, this interaction represents a candidate mechanism whereby dorsolateral prefrontal cortex modulates physiology and plasticity throughout the brain in order to support goal-directed behavior.
This project was supported by NIMH 70685 (SAH), NINDS 41328 (SAH), NARSAD (RAA) and the Dana Foundation (RAA). RMC is supported by NIH Fellowship (NIH51156). SAH was supported by an Incubator Award from the Duke Institute for Brain Sciences. RAA is supported by the Alfred P. Sloan Foundation and The Esther A. & Joseph Klingenstein Fund Foundation. We gratefully acknowledge K.E. Stephan's help with methodological considerations related to the DCM analyses.
Conflicts of Interest: The authors have no conflicts of interest to report.