The mathematical formulations used to study the neurophysiological signals governing choice behavior fall under one of two major theoretical frameworks: “choice probability” or “subjective value”. These two formulations represent behavioral quantities closely tied to the decision process, but it is unknown whether one of these variables, or both, dominates the neural mechanisms that mediate choice. Value and choice probability are difficult to distinguish in practice, because higher-valued options are chosen more frequently in free choice tasks. This distinction is particularly relevant for sensorimotor areas such as parietal cortex, where both value information and motor signals related to choice have been observed. We recorded the activity of neurons in the lateral intraparietal area (LIP) while monkeys performed an intertemporal choice task for rewards differing in delay to reinforcement. Here we show that the activity of parietal neurons is precisely correlated with the individual-specific discounted value of delayed rewards, with peak subjective value modulation occurring early in task trials. In contrast, late in the decision process these same neurons transition to encode the selected action. When directly compared, the strong delay-related modulation early during decision-making is driven by subjective value rather than the monkey's probability of choice. These findings show that in addition to information about gains, parietal cortex also incorporates information about delay into a precise physiological correlate of economic value functions, independent of the probability of choice.
Decision-making involves the transformation of information into a behavioral choice. In perceptual decision tasks, this flow of information links sensory processing to the selection of an action. LIP neurons are hypothesized to mediate direct sensorimotor transformations, responding to stimuli in a selective region of visual space and firing before a saccadic eye movement to the same location (Gnadt and Andersen, 1988; Andersen and Buneo, 2002). In addition to spatially congruent sensorimotor activity, LIP neurons also represent sensory information that can be used to specify saccade metrics even when the stimulus is not co-localized with the reinforced movement. For example, in different experimental paradigms, LIP activity has been shown to reflect accumulated motion evidence, target color, temporal information, and probabilistic cues (Shadlen and Newsome, 2001; Roitman and Shadlen, 2002; Toth and Assad, 2002; Leon and Shadlen, 2003; Yang and Shadlen, 2007). Typically, decision-related activity in these experiments has been taken to represent the probability of choice: increased activation correlates with a higher likelihood of a correct decision, and thus a higher probability of behavioral selection (Sugrue et al., 2005; Gold and Shadlen, 2007).
While the choice probability framework has proven powerful, decision processes can also incorporate nonsensory, internally-derived information such as value, strategic planning, and attention. In particular, recent experiments suggest that LIP activity reflects value information such as the probability or magnitude of reinforcement, reward history, and strategic game valuation (Platt and Glimcher, 1999; Dorris and Glimcher, 2004; Sugrue et al., 2004) even though these properties are not instantiated as sensory signals at the level of single trials. These findings have led to the proposal that decision-related activity may represent the subjective value of a specific action (Glimcher et al., 2005). In this framework, LIP activity combines all relevant reward information and sensory evidence into a single decision variable that reflects the overall subjective value of the saccade (or attentional target) encoded by the neuron under study. Like choice probability coding, subjective value is presumed to act via modulation of the spatially tuned response fields widely observed in parietal cortex.
Thus LIP neurons could be modulated by the behavioral probability that a response field movement would be chosen for execution or by the underlying subjective value of the action. Additionally, it has been suggested that both subjective value and choice probability may be represented, in sequential stages, during the decision process (Sugrue et al., 2005). Importantly, although subjective value and choice probability are separable quantities in principle, value and choice can be difficult to disambiguate: higher valued options are chosen more frequently. For example, in experiments utilizing matching law behavior, choice behavior is by definition directly proportional to the relative values of the options, and LIP activity correlates with both signals (Platt and Glimcher, 1999; Sugrue et al., 2004). In general, this correlation between choice probability and value confounds many free choice paradigms, particularly those in which the differences in (or ratios between) the option values under consideration span only a narrow experimental range. The result is that existing studies cannot distinguish between these two representations, and it is unclear if parietal cortex carries a value signal distinguishable from the probability of choice.
To investigate these issues we recorded the activity of LIP neurons while monkeys performed an oculomotor intertemporal choice task between a small immediately available reward and a larger delayed reward. Using a traditional psychophysical approach we measured the choice probabilities of two monkeys as they made these decisions, and using an experimental economics approach we quantified the individual subjective value of rewards as a function of delay, enabling us to examine how each variable controls LIP activity.
Two male rhesus monkeys (Macaca mulatta; monkey D, ~8.6 kg; monkey W, ~6.0 kg) were used as subjects. All experimental procedures were performed in accordance with the Public Health Service's Guide for the Care and Use of Laboratory Animals and approved by the New York University Institutional Use and Care Committee.
Experiments were conducted in a dimly-lit sound-attenuated room. The monkeys were head-restrained and seated in a plexiglass enclosure that permitted arm and leg movements. Visual stimuli were generated using an array of tri-state light-emitting diodes (LEDs) situated on a tangent screen 145 cm from the eyes of the monkey. The LEDs formed a grid with points spaced at 2° intervals, spanning 40° horizontally and 40° vertically. Eye movements were monitored using the scleral search coil technique (Fuchs and Robinson, 1966) with horizontal and vertical eye position sampled at 500 Hz using a quadrature phase detector (Riverbend Electronics). Presentation of visual stimuli and water reinforcement delivery were controlled with an integrated software and hardware system (Gramalkn; Ryklin Software).
Each trial began with the monkey fixating a central fixation target. Two peripheral targets were then presented, a red target associated with a small immediate reward and a green target associated with a larger delayed reward. After 800 ms, the fixation target was dimmed for 200 ms, followed by the presentation of a central instruction cue for 500 ms. In forced choice trials, the color of the central cue specified the saccade target; in free choice trials, a yellow cue indicated that a saccade to either target would be rewarded. At 1500 ms, the central fixation cue was extinguished, indicating that the monkey was permitted to initiate a saccade; peripheral target cues were extinguished after the monkey completed a saccade to one of the presented targets. Rewards were delivered either immediately or after the designated delay; monkeys were not required to maintain fixation over the delay interval. In immediate reward trials, an additional interval was imposed after the immediate reward was delivered, thus equalizing the duration of immediate and delayed reward trials. Each session was conducted in blocks of 40 forced choice followed by 20 free choice trials. Delay and reward magnitudes were held constant across a block. Delays were varied between blocks and chosen to span choice threshold in behavior-only sessions. In electrophysiology sessions, delays were 0, 1, 2, 4, 8, and 12 s, presented in randomized order (3-6 blocks). The immediate reward was 0.130 ml water; the delayed reward was constant in a session and either 0.143, 0.163, 0.196, or 0.260 ml water.
Monkeys were implanted with a Cilux recording chamber (Crist Instruments) targeting the lateral bank of the intraparietal sulcus, centered 3 mm caudal and 12 mm lateral to the intersection of the midsagittal and interaural planes in either the left hemisphere (Monkey D) or the right hemisphere (Monkey W). Chamber location was verified using anatomical magnetic resonance imaging (3T; Siemens). At the start of each recording session, a 23 gauge guide tube was positioned in a support grid (1 mm spacing; Crist Instruments) and inserted through intact dura. A tungsten steel electrode (8-10 MΩ; FHC) was lowered through the guide tube using a computer-controlled micropositioner (EPS; Alpha-Omega). Electrophysiological signals were amplified, band-pass filtered, and digitized, and individual neurons were isolated based on waveform characteristics (MAP; Plexon).
Within a given session, recording was initiated once stable electrophysiological signals were obtained from a depth corresponding to LIP according to the magnetic resonance images. Single intraparietal neurons were identified and response fields were characterized as previously described (Platt and Glimcher, 1999). Once a stable response field was estimated, the intertemporal choice task was run with the delayed reward target location placed within the estimated response field, and the immediate reward target placed outside the response field, typically in the opposite hemifield and at an equivalent distance from fixation. Neurons were recorded while monkeys performed 3-6 blocks of the intertemporal choice task, with randomized selection of delays between blocks. For neural analyses, the first two forced trials in a block to each target were excluded to minimize block transition effects, while all free choice trials were included.
The intertemporal choice task was conducted under four different conditions of delayed reward magnitude (0.143, 0.163, 0.196, 0.260 ml), randomized across sessions, in order to quantify the discount function (or more precisely, the discounted “utility” function). Four choice curves and a discount function were fit to the complete binary choice dataset using a two-parameter binary logit model, with a separate fit for each monkey. The choice function:
where pL is the probability of choosing the delayed reward as a function of the difference between the subjective values of the two options (SVL, SVS) and a noise parameter β, and the discount function:
$$SV = \frac{A}{1 + kD}$$
where the decline in subjective value SV is a function of delay D, amount A, and a discount parameter k, were simultaneously fit by maximum likelihood estimation. Bootstrap distributions were obtained for each discount factor k by resampling the behavioral data, treating individual blocks of free choice data as samples. Each resample yielded one bootstrap estimate of k; the procedure was repeated for a total of 2000 iterations, and 95% percentile confidence intervals were computed for significance testing.
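The fitting and bootstrap procedure can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. It assumes a logistic choice rule of the form pL = 1 / (1 + exp(−β (SVL − SVS))), where the placement of β is an assumption (it could equally divide the value difference). It also assumes that the immediate reward is undiscounted, and it replaces whatever optimizer was actually used with a simple grid search. The generating values `TRUE_K` and `TRUE_BETA`, the grids, and the synthetic blocks are all hypothetical.

```python
import math
import random

A_SMALL = 0.130  # immediate reward in ml (from the task description)

def subjective_value(amount, delay, k):
    """Hyperbolic discounting: SV = A / (1 + k * D)."""
    return amount / (1.0 + k * delay)

def p_delayed(amount_large, delay, k, beta):
    """Assumed logistic choice rule on the subjective value difference."""
    diff = subjective_value(amount_large, delay, k) - A_SMALL
    return 1.0 / (1.0 + math.exp(-beta * diff))

def neg_log_likelihood(blocks, k, beta):
    """blocks: list of (amount_large, delay, n_delayed_choices, n_trials)."""
    total = 0.0
    for amount, delay, n_delayed, n_trials in blocks:
        p = min(max(p_delayed(amount, delay, k, beta), 1e-9), 1.0 - 1e-9)
        total -= n_delayed * math.log(p) + (n_trials - n_delayed) * math.log(1.0 - p)
    return total

def fit(blocks, k_grid, beta_grid):
    """Grid-search maximum likelihood estimate of (k, beta)."""
    candidates = [(k, b) for k in k_grid for b in beta_grid]
    return min(candidates, key=lambda kb: neg_log_likelihood(blocks, kb[0], kb[1]))

def bootstrap_k_interval(blocks, k_grid, beta_grid, n_iter, rng):
    """95% percentile interval for k, resampling whole blocks with replacement."""
    estimates = []
    for _ in range(n_iter):
        resample = [rng.choice(blocks) for _ in blocks]
        estimates.append(fit(resample, k_grid, beta_grid)[0])
    estimates.sort()
    return estimates[int(0.025 * (n_iter - 1))], estimates[int(0.975 * (n_iter - 1))]

AMOUNTS = [0.143, 0.163, 0.196, 0.260]  # delayed reward magnitudes (ml)
DELAYS = [0, 1, 2, 4, 8, 12]            # delays (s)
K_GRID = [round(0.002 * i, 3) for i in range(1, 126)]
BETA_GRID = [20.0 * j for j in range(1, 11)]
TRUE_K, TRUE_BETA = 0.158, 100.0        # hypothetical generating values

# Noiseless check: expected choice counts for 20 free choice trials per block.
ideal_blocks = [(a, d, 20 * p_delayed(a, d, TRUE_K, TRUE_BETA), 20)
                for a in AMOUNTS for d in DELAYS]
k_hat, beta_hat = fit(ideal_blocks, K_GRID, BETA_GRID)

# Noisy synthetic blocks for the bootstrap (coarser grid and 200 iterations
# instead of 2000, purely to keep the sketch fast).
rng = random.Random(0)
noisy_blocks = [(a, d, sum(rng.random() < p_delayed(a, d, TRUE_K, TRUE_BETA)
                           for _ in range(20)), 20)
                for a in AMOUNTS for d in DELAYS]
k_low, k_high = bootstrap_k_interval(noisy_blocks, K_GRID[::5], BETA_GRID[::2], 200, rng)
```

Resampling whole blocks rather than individual trials mirrors the description above, in which individual blocks of free choice data are treated as the samples.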
Individual LIP neurons were run under a single delayed reward magnitude condition, with neurons from Monkey W collected under two conditions (0.143, 0.260 ml) and from Monkey D under three conditions (0.143, 0.163, 0.260 ml). Because discounting data, normalized to the zero delay condition for that magnitude condition (see below), were not significantly different between different magnitude conditions, normalized neural data recorded under the magnitude conditions were combined together for each monkey.
For population neural analyses, each neuron was normalized by its mean neural firing rate across the zero delay condition trials. To construct the neural discount function, the mean normalized population activity (0-200 ms) was quantified for each delay condition and normalized to the population zero delay condition mean. This procedure produces a delay-dependent activity function, relative to the zero delay condition; this function allows comparison to the behavioral discount function, which describes relative subjective value as a function of delay. For forced choice trial data, the first two trials to each target after a block transition were excluded from analysis so as to examine data only after the animal had already sampled the new reward contingencies. In free choice trial blocks, the number of trials with saccades into the response field was delay dependent and therefore varied by block, resulting in some blocks with few or no trials where the monkey chose the delayed reward target (i.e. at long delays). For the neural discount function analyses, free choice neural data were included only from blocks with a minimum of eight trials with a saccade into the response field.
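The construction of the neural discount function can be illustrated with a short sketch. It assumes that each neuron's rates have already been divided by that neuron's zero delay mean, and it assumes a least-squares fit of the relative form 1 / (1 + kD); the source does not state which fitting criterion was used. The function names and the three synthetic neurons are hypothetical.

```python
def neural_discount_function(rates_by_delay):
    """rates_by_delay: {delay_s: [normalized firing rate of each neuron]}.

    Returns population mean activity per delay, relative to the zero delay mean.
    """
    means = {d: sum(v) / len(v) for d, v in rates_by_delay.items()}
    zero_delay_mean = means[0]
    return {d: m / zero_delay_mean for d, m in means.items()}

def fit_hyperbolic_k(relative_activity, k_grid):
    """Least-squares fit of 1 / (1 + k * D) over a grid of candidate k values."""
    def sum_squared_error(k):
        return sum((y - 1.0 / (1.0 + k * d)) ** 2
                   for d, y in relative_activity.items())
    return min(k_grid, key=sum_squared_error)

# Synthetic example: three hypothetical neurons whose normalized rates follow
# a hyperbola with k = 0.04 per second, scaled by small gain differences.
delays = [0, 1, 2, 4, 8, 12]
rates = {d: [g / (1.0 + 0.04 * d) for g in (0.9, 1.0, 1.1)] for d in delays}

relative = neural_discount_function(rates)
k_grid = [round(0.001 * i, 3) for i in range(1, 301)]
k_neural = fit_hyperbolic_k(relative, k_grid)
```

Because both the neural and the behavioral functions are expressed relative to the zero delay condition, the fitted `k_neural` is directly comparable to the behavioral discount parameter.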
To examine the relative contribution of subjective value and choice to LIP activity, univariate and multiple regression analyses were performed in non-overlapping 100 ms windows across the duration of free choice trials. For each block of free choice data, neural activity was quantified as mean firing rate across the block; because individual neurons were not recorded under all delay conditions, single neuron firing rates were normalized by the average activity of the neuron in the zero delay condition to allow comparison across delays and neurons. Each block of free choice data was associated with a subjective value, calculated from the delay to the delayed reward implemented in that block and the individual monkey discount function, and a choice probability, quantified as the average probability of delayed reward target choice during that particular block. Regression analyses were conducted on the population of block data points, combining across monkeys (n = 279 blocks). In each temporal window, univariate linear regression was performed with either subjective value or observed choice probability as predictors, and multiple linear regression was performed including both subjective value and choice probability. Neural data from all free choice trials were included in this analysis.
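The comparison between univariate and multiple regression can be sketched without any external library, using the closed-form expression for the coefficient of determination with two predictors. The block values below are invented for illustration; only the structure (one subjective value, one choice probability, and one normalized firing rate per block) follows the description above.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def r2_univariate(x, y):
    """R^2 of a simple linear regression of y on x."""
    return pearson_r(x, y) ** 2

def r2_two_predictors(x1, x2, y):
    """R^2 of a linear regression of y on x1 and x2 (with intercept)."""
    r1, r2, r12 = pearson_r(x1, y), pearson_r(x2, y), pearson_r(x1, x2)
    return (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)

# Hypothetical block-level data: firing rate depends on subjective value only.
sv = [0.2, 0.4, 0.5, 0.7, 0.9, 1.0]      # relative subjective value per block
cp = [0.1, 0.6, 0.4, 1.0, 0.9, 1.0]      # observed choice probability per block
rate = [0.5 + s for s in sv]             # normalized firing rate per block

r2_sv = r2_univariate(sv, rate)
r2_cp = r2_univariate(cp, rate)
r2_both = r2_two_predictors(sv, cp, rate)
```

In this toy case the multiple regression explains no more variance than subjective value alone, which is the pattern the analysis is designed to detect; in the real analysis the same comparison is repeated in every 100 ms window.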
We trained two monkeys (Macaca mulatta) to perform an oculomotor version of an intertemporal choice task (Fig. 1a). In each trial the monkey viewed two targets, one associated with a small immediate reward and the other with a larger delayed reward. In forced choice trials, a change in the color of the central fixation cue instructed the monkey to make a saccade to either the target that yielded the smaller immediate reward or the target that yielded the larger delayed reward; in free choice trials the monkey could select either reward. Each block of trials began with forced choice trials, in which both targets had an equal probability of instruction, followed by a series of free choice trials. Reward contingencies (delay and magnitude) were fixed during a block, so that for any given block monkeys first learned the values of the two alternatives and subsequently expressed their preference between the two options. The total trial duration was also fixed, regardless of the choices of the monkey, to ensure that selecting the smaller immediate reward could not lead to higher overall reward rates. To examine the effect of delay on subjective value, we varied between blocks the delay required to receive the larger reward, holding both reward magnitudes constant over the course of any single session.
Intertemporal choice behavior is governed by the delay to reward in a wide array of species (Mazur, 1987; Rachlin et al., 1991; Myerson and Green, 1995; Laibson, 1997; Kim et al., 2008). In the present task, monkeys chose the saccade yielding the larger reward when both rewards were offered immediately, but as the delay to the larger reward increased they eventually preferred the smaller but more immediate option (Fig. 1b). Because trial duration within a block was identical regardless of the monkey's choice, preference for the smaller but more immediately available option reflects a true subjective preference rather than an underlying rate-maximization strategy.
Logistic choice functions fit to these data quantify an indifference point (a point of subjective equality) for each magnitude condition: the delay at which the monkey showed equal preference for the small immediate and larger delayed rewards. Figure 1c shows delay-dependent choice behavior for four different magnitudes of delayed reward. Changing the size of the delayed reward across days correspondingly shifted the position of the indifference point; monkeys would wait longer for larger rewards. Together, these temporally defined indifference points for different magnitudes of reward describe a function, closely related to the discounted utility function of neoclassical economic theory, which we term the behavioral discount function (Fig. 1d). Consistent with previous behavioral studies (Mazur, 1987; Rachlin et al., 1991; Myerson and Green, 1995; Laibson, 1997), this decline in subjective reinforcer value (SV) as a function of delay is well described by a simple hyperbolic equation:
$$SV = \frac{A}{1 + kD}$$
where A is the reward magnitude, D the delay to reinforcement, and k the discount factor quantifying how steeply the discount function declines. For each animal, the four preference curves and the discount function were fit simultaneously using a binary logit model, which requires only two free parameters. Importantly, while these discount functions were stable for each individual animal (see Supplementary Data, Fig. S1), the rate of discounting differed significantly between the two monkeys we studied (Monkey D: k = 0.040 s⁻¹; Monkey W: k = 0.158 s⁻¹; p < 10⁻⁴⁵, permutation test).
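The link between the hyperbolic form SV = A / (1 + kD) and the indifference points can be made explicit with a short derivation. It is a sketch under two assumptions not stated in the text: that discounting acts directly on reward magnitude, and that the immediate reward is undiscounted. The numerical values are model-implied illustrations computed from the reported k values, not results reported by the authors.

```latex
% Assumptions: SV = A / (1 + kD) applied to raw magnitude, and SV_S = A_S
% for the immediate reward (D = 0). At the indifference delay D^* the two
% options have equal subjective value.
\begin{align}
  \frac{A_L}{1 + k D^{*}} &= A_S \\
  D^{*} &= \frac{1}{k}\left(\frac{A_L}{A_S} - 1\right)
\end{align}
% Illustration with A_S = 0.130 ml and A_L = 0.260 ml, so that D^* = 1/k:
%   Monkey D, k = 0.040 s^{-1}:  D^* = 25 s
%   Monkey W, k = 0.158 s^{-1}:  D^* \approx 6.3 s
```

Under these assumptions the indifference delay scales with 1/k, so a steeper discounter reaches indifference at a shorter delay for the same pair of rewards, and a larger delayed reward shifts the indifference point to longer delays.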
Given these choice and behavioral discount functions, is the activity of neurons in LIP better correlated with the objective value (magnitude) of the offered reward, the subjective value of the offered reward, or the subsequent choice behavior? To answer this question we recorded the activity of 71 LIP neurons while monkeys performed the intertemporal choice task described above. Neurons in LIP are spatially tuned, increasing their firing rate when a visual stimulus appears in a circumscribed region of space termed the response field. Consistent with a decision-related visuomotor transformation, many of these neurons also show pre-saccadic activity specifically for eye movements that carry gaze into the response field. To examine the effect of delay on neural activity, we placed the target yielding the larger delayed reward in the response field of each recorded neuron and placed the target yielding the small immediate reward outside the response field. We then monitored neuronal activity as the delay-to-reward was changed across blocks.
Figure 2 shows example activity from single LIP neurons during forced choice trials in which the monkey was instructed to make a saccade to the delayed reward target. In all such trials, the monkey views the same visual stimuli and performs the same saccadic movement; only the delay to reinforcement after the saccade is complete varies between blocks of trials. As previously reported (Gnadt and Andersen, 1988), the spiking activity of these neurons evolves throughout the trial, with activation typically highest immediately after target onset, then maintained above baseline throughout the trial, and finally rising before a saccade into the response field (Fig. 2a). We found that under longer delay conditions, LIP neurons showed lower firing rates throughout much of the trial despite the presence of identical reward magnitudes. These neurons were thus sensitive to delay-to-reward, a variable that influences both subjective value and choice but not objective value in our task (Fig. 2b). The majority of sampled LIP neurons (47/71, 66.2%) showed significant modulation by delay (regression analysis, activity in the epoch 0-200 ms after target onset), and delay strongly modulates the population response of these neurons (Fig. 3a).
Does this effect of delay represent the monkeys' subjective valuation of delayed reward? To answer this question we computed for each animal a neural discount function, defined, analogously to the traditional behavioral discount function described above, as the best hyperbolic fit to the population firing rate as a function of delay. We examined activity in the epoch immediately after target onset (0-200 ms), pooling data for all of the neurons studied in each animal, at all magnitudes, normalized by average response to the zero delay condition. Thus activity is represented as a fraction of the neuronal response to a given immediate reward magnitude. Importantly, because the behavioral discount functions differed significantly between monkeys, we analyzed the individual monkey neural data separately. Because the behavioral (red line) and neural (black line) discount functions are defined in the same units (discounted value as a function of delay), we can directly compare delay-discounted subjective value and LIP activity (Fig. 3b). We found that for each monkey the neural discount function matches the behavioral discount function with surprising precision (Monkey W: k_neural = 0.157 s⁻¹, k_behav = 0.158 s⁻¹; Monkey D: k_neural = 0.038 s⁻¹, k_behav = 0.040 s⁻¹; 95% bootstrap confidence intervals shaded in Fig. 3b). Furthermore, each neural discount function differs significantly from both the behavioral and neural discount functions of the other monkey (see Supplementary Data), suggesting a specific psychometric-neurometric match between perceived value and neuronal activity in each individual.
The preceding data, however, only reflect neural activity during forced choice trials. If these representations drive decision-making processes, then subjective value should modulate LIP neurons during free choice trials as well. We therefore examined neural activity during free choice trials, restricting our analysis to the subset of trials in which the monkey chose the target in the neuron's response field. Despite a smaller number of sampled trials imposed by the subject's preferences, it is clear that LIP population activity during free choice is strongly modulated by delay (Fig. 3c, displayed for trials with saccades into the RF), and the free choice neural discount functions also match the behavioral discount functions (Fig. 3d, Monkey W: k_neural = 0.140 s⁻¹, k_behav = 0.158 s⁻¹; Monkey D: k_neural = 0.048 s⁻¹, k_behav = 0.040 s⁻¹; 95% bootstrap confidence intervals shaded).
Comparison of neural activity across both monkeys confirms that subjective value is a more parsimonious explanation of LIP activity than delay. For each neuron, we quantified the influence of either delay or subjective value using separate linear regression models. Figure 4 plots the regression slopes relating delay to firing rate versus the regression slopes relating subjective value to firing rate for all neurons, separated by monkey, as well as the cumulative marginal regression weight distributions. This figure indicates that while LIP neurons in both monkeys show similar regression slopes for subjective value (p = 0.70, Wilcoxon rank-sum test), delay regression slopes differ significantly between monkeys (p = 0.0008, Wilcoxon rank-sum test). Thus, while subjective value controls LIP activity in the same manner in both monkeys, delay to reward more strongly modulates neural firing rates in Monkey W than Monkey D. Importantly, these neural results are consistent with the behavioral data, in which subjective value also declines more quickly as a function of delay in Monkey W (k = 0.158) than Monkey D (k = 0.040): when subjective value is a steeper function of delay, neural firing rates are also more strongly delay-modulated.
This encoding of subjective value is evident in the strong correspondence between the behavioral and neural discount functions seen in Figure 3. To examine this directly, we plot in Figure 4b the normalized population neural activity of both monkeys as a function of delay (left) and subjective value (right). Note that the computation of subjective value relies solely on choice behavior in the discounting task, and does not rely on neuronal data. Nevertheless, compared to delay, subjective value explains more of the variance in the population data in both Monkey W (R²_d = 0.87, R²_sv = 0.96) and Monkey D (R²_d = 0.87, R²_sv = 0.93). Furthermore, combining all data, population LIP activity is much better characterized as a function of subjective value (R² = 0.95) than delay (R² = 0.58).
The preceding data show that immediately after target onset, LIP activity precisely covaries with delay-discounted subjective value. Decision areas such as LIP are also known to ultimately signal the chosen action (Gnadt and Andersen, 1988; Andersen and Buneo, 2002), which is a function of subjective value – subjectively higher-valued targets are by definition chosen over lower-valued ones (Platt and Glimcher, 1999; Sugrue et al., 2004). Thus LIP activity appears to represent both input and output variables necessary in decision-making: value and selected action. It has, however, been proposed that option values are transformed into choice probability functions as an intermediate step in generating stochastic choice behavior, and that LIP activity may actually reflect these underlying choice probabilities (Sugrue et al., 2005). Could the modulation we observe simply reflect the animals' upcoming probability of choice rather than a distinct representation of subjective value per se?
To examine the relative influence of subjective value and choice probability on LIP activity, we quantified single neuron firing rates for individual blocks of free choice trials (n = 279 blocks); each block was associated with a subjective value (determined by the specified delay to reward and the individual-specific discount function) and a mean choice probability (averaged over the monkey's choices in that block). Though subjective value is directly calculated from overall choice behavior, two properties of this dataset allow us to effectively dissociate subjective value and choice probability. First, choice behavior at the block level exhibits variability between blocks for identical subjective value conditions. Second, the subjective value of the delayed target continues to diminish even after choice probabilities have reached asymptotically high levels (Fig. 1). As evident in Figure 5, there is considerable variation between these two parameters, particularly when data are grouped across magnitude conditions (variance inflation factor = 1.26). Utilizing this relative independence, we employed regression analyses to determine if neural activity is better correlated with choice probability or subjective value, and how this encoding changes during decision-making.
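For a model with exactly two predictors, the variance inflation factor is VIF = 1 / (1 − r²), where r is the correlation between the predictors. The sketch below computes the VIF from data and also inverts the relation to show what a VIF of 1.26 implies. The implied correlation is simple arithmetic under the two-predictor assumption and is not a value reported in the text; the function names and toy vectors are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """Variance inflation factor when the model has exactly two predictors."""
    return 1.0 / (1.0 - pearson_r(x1, x2) ** 2)

def shared_variance_from_vif(vif):
    """Invert VIF = 1 / (1 - r^2) to recover r^2 between the two predictors."""
    return 1.0 - 1.0 / vif

# Toy predictors with r = 0.8, giving VIF = 1 / (1 - 0.64).
toy_vif = vif_two_predictors([1, 2, 3, 4], [1, 3, 2, 4])

# What the reported VIF of 1.26 would imply under the two-predictor assumption.
implied_r2 = shared_variance_from_vif(1.26)   # about 0.21
implied_r = math.sqrt(implied_r2)             # about 0.45 in magnitude
```

A shared variance of roughly one fifth leaves most of the variance in each predictor independent of the other, which is what makes the regression comparison informative.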
We performed a sliding window analysis across the duration of all free choice trials to examine modulation of LIP activity over the decision process. In each non-overlapping 100 ms bin, we quantified the influence of subjective value or choice probability on normalized block-averaged firing rates by univariate linear regression (individual neuronal firing rates normalized by mean zero delay activity; see Materials and Methods). As shown in Figure 6a, subjective value (blue line) explains a significant proportion of LIP population variability across the length of free choice trials, with a peak in the coefficient of determination (R²) immediately after target presentation. The small but significant value modulation during fixation likely reflects the task design, in which target locations and rewards are fixed within blocks, making information about the value of saccades available to the animals before target onset. In contrast to strong value modulation, neural activity early in the trial is minimally explained by choice probability (red line). R² values from multiple regression analysis (black line) confirm that including choice probability as a factor provides little additional explanatory power beyond that of subjective value alone.
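The windowing step of this analysis amounts to counting spikes in consecutive, non-overlapping 100 ms bins and converting the counts to rates. A minimal sketch follows; the function name, the argument names, and the example spike times are hypothetical.

```python
def binned_rates(spike_times_ms, start_ms, stop_ms, width_ms=100):
    """Firing rate (spikes/s) in consecutive non-overlapping windows.

    Windows cover [start_ms, stop_ms); spikes outside that range are ignored.
    """
    n_bins = int((stop_ms - start_ms) // width_ms)
    counts = [0] * n_bins
    for t in spike_times_ms:
        index = int((t - start_ms) // width_ms)
        if 0 <= index < n_bins:
            counts[index] += 1
    # Convert counts to spikes per second.
    return [c / (width_ms / 1000.0) for c in counts]

# Example: six spikes in the first 300 ms after target onset.
rates = binned_rates([5, 50, 120, 250, 260, 299], start_ms=0, stop_ms=300)
```

The rates in each window would then be averaged within a block, normalized by the neuron's zero delay activity, and regressed against subjective value and choice probability, one regression per window.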
To examine these results more directly, we plot in Figure 6b population average firing rates as a function of either subjective value or choice probability for the 0-1000 ms period following target presentation. Consistent with the sliding window analysis, firing rate is significantly dependent on subjective value (R²_SV = 0.151, p < 10⁻¹¹, F-test) but not choice probability (R²_CP = 0.005, p = 0.23, F-test). Furthermore, when data are restricted to blocks where choice probability was equal to one (i.e., the monkey chose the delayed option exclusively), LIP activity is still a significant function of subjective value (R²_SV = 0.163, p < 0.00001, F-test; data not shown). We note that the use of local, block-average choice probability represents a conservative approach to estimating the influence of subjective value: under the alternative hypothesis that firing rates are driven by choice probability and not subjective value, using block-level probabilities more closely tied to daily variations in behavior and global subjective values would make it more difficult to detect subjective value-related modulations. To ensure that our results did not depend on the particular definition of choice probability we employed, we repeated the univariate regression analysis with global, experiment-averaged choice probabilities. Using this formulation of choice probability, subjective value (R²_SV = 0.151, p < 10⁻¹¹, F-test) is still a stronger predictor of LIP activity than choice probability (R²_CP = 0.035, p = 0.002, F-test). The higher coefficient of determination for global versus local choice probability likely arises from the relationship between global choice probability and subjective value (in our analysis, average choice probability is a function of subjective value); when both global choice probability and subjective value are included in a multiple regression analysis, LIP activity is only dependent on subjective value (SV regression slope = 0.96, 95% C.I. [0.65, 1.24]; global CP regression slope = −0.13, 95% C.I. [−0.28, 0.03]). Thus, LIP neurons are dependent on subjective value and not choice probability, defined either locally or globally, early in the decision process.
LIP activity does not show peak modulation by choice probability in this task until late in trials, in the interval immediately preceding saccadic eye movement (Fig. 6a, red). Thus, there is a shift in the population response: at the onset of each trial, activity reflects the subjective value of the option in the response field, regardless of the subsequent choice behavior; as the trial progresses, activation associated with the monkey's choice grows, peaking immediately before movement onset. These results were not driven by either magnitude or individual animal effects; additional regression analyses on data segregated by either reward magnitude or individual subject (see Supplemental Material) demonstrated equivalent results in both animals: LIP activity reflects subjective value and not choice probability.
We examined the relationship between parietal neuron activity and subjective values for actions during value-guided decision-making. Monkeys were trained to choose between rewards differing in magnitude and delay-to-reinforcement in a delay-discounting task, enabling a precise behavioral quantification of the subjective values of saccadic targets. We found that LIP activity is tightly correlated to the delay-discounted value of a saccade, independent of the underlying probability of choice. While previous studies have demonstrated subjective value-related activity in LIP (Platt and Glimcher, 1999; Dorris and Glimcher, 2004; Sugrue et al., 2004), our results extend these findings in three significant ways. First, the sensitivity of parietal activity to delay indicates that this value representation extends beyond manipulations of expected reward like probability of reinforcement and magnitude to include exclusively subjective components of value such as delay. Neural signals related to reinforcement delay have been observed in other brain regions, notably the frontal cortices (Roesch and Olson, 2005; Kim et al., 2008), but typically only modulate a minority of neurons. The present discounting modulation was observed to strongly influence a majority of parietal neurons, suggesting the importance of subjective value (action value) coding at this stage of visuomotor processing (Glimcher et al., 2005). Second, precise quantification of delay-discounted subjective value demonstrates a surprisingly accurate match between the behavioral and neural value representations. This relationship is similar to the psychometric-neurometric correspondence of sensory signals (Newsome et al., 1989), now extended to parietal cortex and the domain of value representation. 
This precise neural representation of subjective value suggests that parietal activity is not simply modulated by reward-related variables, but instead may reflect the underlying neural value signal guiding choice, a neural correlate of the economic discounted utility function. Finally, our comparison of choice and value signals demonstrates for the first time that LIP neurons carry a subjective value signal that can be separated from signals encoding the probability of choosing an option. We find that value and choice signals are temporally dissociated, with subjective value representation early in the decision process giving way to representation of the chosen action near the time of saccade; this value-to-choice transformation in neural activity may represent the critical input and output stages hypothesized in standard models of the value-guided decision process.
In studies of perceptual decision-making rooted in signal detection theory, choice probabilities are typically constructed by presenting a perceptually ambiguous stimulus that varies from trial to trial and measuring the aggregate probability that a subject will make one of two evaluative responses. In the classic paradigm of this type, monkeys view a random-dot stimulus that contains net image motion in one of two possible directions and choose one of two responses; if the monkey selects the response associated with the correct motion signal, it receives a reward (Newsome et al., 1989; Britten et al., 1992). The goal is to demonstrate that neural activity (the neurometric function) is correlated with measured response probabilities (the psychometric function) across different stimulus conditions, and evidence for such neurometric-psychometric matches exists in multiple cortical areas.
Neoclassical economists developed an alternative approach to the behavioral study of choice, hypothesizing that subjects choosing between two gains behave as if those gains were represented on an internal scale (Von Neumann and Morgenstern, 1944; Samuelson, 1947; Savage, 1954). For this reason, a number of neurobiological studies have proposed that the brain must contain neural signals representing the subjective values of options in a way that is at least partially independent of observed choice probabilities (Dorris and Glimcher, 2004; Glimcher et al., 2005; Padoa-Schioppa and Assad, 2006, 2008). Employing economic theory, these studies argue that the largely transitive nature of monkey choices necessitates an underlying representation of subjective value that is distinct from choice (or choice probabilities). The correlation between neural activity and external reward value is taken as evidence that these signals encode subjective value, with an implied physiological mapping rule from external to mean internal value.
In practice, however, distinguishing these two frameworks is difficult because choice probability and subjective value are often tightly correlated. Consider for instance the recent neurophysiological decision-making studies utilizing the matching law (Sugrue et al., 2004; Lau and Glimcher, 2008) in which reward magnitude was systematically manipulated while choices were observed. Under the specified experimental conditions, the animals distributed their choices according to Herrnstein's matching law:
where R1/R2 is the ratio of reward magnitudes, C1/C2 is the observed ratio of choices, and α is a fixed constant. When behavior follows the matching law, choice probabilities and relative reward values are directly related. Thus, while there is increasing evidence for decision-related neurophysiological signals, these studies cannot discriminate between the choice probability and value frameworks.
In this experiment, we took advantage of two characteristics of choice behavior to effectively dissociate subjective value and choice probability. First, stochastic choice behavior is a function of the values of both options in a decision. Critically, this means that the subjective value of a single option (the RF target) can vary widely without a change in the associated choice probability, if the value of the other option remains much lower (or higher). Second, choice behavior at the block level exhibits considerable variability between blocks for identical subjective value conditions, suggesting a role for additional, value-independent sources of variance in choice. Together, these characteristics produced an effective experimental dissociation of value and choice probability, enabling a comparison of their relative influences on LIP activity.
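The first of these characteristics can be illustrated with a minimal numerical sketch. It assumes a logistic choice rule with an arbitrary slope; neither the rule nor the parameter values are taken from the present analysis, and the function name is hypothetical. Under that assumption, when the alternative option has a much lower value, the subjective value of the RF target can vary substantially while the predicted choice probability stays close to 1.

```python
import math


def choice_probability(value_rf, value_other, slope=10.0):
    """Probability of choosing the RF target under an assumed logistic rule.

    The logistic form and the slope are illustrative assumptions,
    not the choice model fitted in this study.
    """
    return 1.0 / (1.0 + math.exp(-slope * (value_rf - value_other)))


# Hypothetical values: the non-RF option is held at a low subjective value
# while the RF target's value is varied over a wide range.
value_other = 0.1
for value_rf in (0.6, 0.8, 1.0):
    p = choice_probability(value_rf, value_other)
    print(f"RF value = {value_rf:.1f}, P(choose RF) = {p:.4f}")
```

In this regime the choice probability is saturated, so any remaining variation in neural activity with the RF target's value cannot be attributed to choice probability alone.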
We examined temporal discounting behavior, which displays a clear, well-established behavioral subjective value signal: the discount function. Detailed quantification of the hyperbolic form of the discount function showed that parietal activity encodes a precise neural correlate of subjective value. Furthermore, comparison of neural activity to both the discount function and observed choice probabilities revealed that subjective value is the primary influence on early LIP activity, independent of the ultimate choice behavior. The striking correspondence between LIP activity and the discount function across animals suggests that neural activity is linearly related to the internal, subjective experience of value.
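For reference, the hyperbolic form commonly used in the discounting literature can be written as SV = A / (1 + kD), where A is reward magnitude, D is delay, and k is an individual-specific discount parameter. The sketch below evaluates this form for two illustrative values of k; the function name, the delays, and the k values are assumptions chosen for illustration rather than fitted values from this study.

```python
def hyperbolic_value(magnitude, delay, k):
    """Discounted value under the standard hyperbolic form A / (1 + k * D)."""
    return magnitude / (1.0 + k * delay)


# Hypothetical delays (arbitrary time units) and discount parameters.
delays = [0.0, 2.0, 4.0, 8.0, 12.0]
for k in (0.1, 0.5):
    values = [hyperbolic_value(1.0, d, k) for d in delays]
    print(f"k = {k}: " + ", ".join(f"{v:.3f}" for v in values))
```

A larger k produces a steeper decline in value with delay, which is how individual differences in discounting would appear in such a function.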
Classically, neurons in LIP respond to a spatially restricted subset of visual stimuli and saccadic eye movements, suggesting that these neurons mediate the visuomotor transformations underlying saccade selection and attentional allocation (Andersen and Buneo, 2002; Goldberg et al., 2002). However, LIP activity has also been shown to be modulated by additional classes of information, including more abstract, nonspatial task variables ranging from color or shape, to elapsed time, to reward probabilities, to the accumulation of sensory signals (Platt and Glimcher, 1999; Shadlen and Newsome, 2001; Roitman and Shadlen, 2002; Toth and Assad, 2002; Leon and Shadlen, 2003; Sugrue et al., 2004; Janssen and Shadlen, 2005; Yang and Shadlen, 2007). While each of these responses can be characterized with a task-specific model, one is led to wonder whether a unifying framework exists that could relate these various findings.
We show here that the initial activity of a given LIP neuron during intertemporal choice is tightly correlated with the delay-dependent subjective value of the associated saccade. This representation of subjective value may allow a reinterpretation of many previous LIP studies within a broader common framework. For example, early studies showed that LIP activity in a motion discrimination task reflects the integral of the motion signal, a quantity encoding the accumulated evidence for motion in a particular direction (Shadlen and Newsome, 2001; Roitman and Shadlen, 2002). Because reward was contingent on the monkey correctly indicating the true motion direction with a saccade, the integral of the motion signal was very closely related to the probability of reward - the more evidence for a particular direction of motion the higher the subjective value of the associated saccade. Similarly, LIP activity in tasks involving the perception of elapsed time could also be seen as reflecting not time per se, but rather how such temporal information affects the instantaneous subjective value the subject places on a particular eye movement (Leon and Shadlen, 2003; Janssen and Shadlen, 2005). More generally, LIP neurons have been shown to encode stimulus attributes such as color only when such features are behaviorally relevant for obtaining rewards (Toth and Assad, 2002); such activity may not reflect color, but rather the information color carries about the value of making a particular saccade. Given the relative specificity of LIP for eye movements compared to other actions such as reaches (Snyder et al., 1997; Andersen and Buneo, 2002), the representation of saccadic subjective value we observe in LIP may be paralleled by similar subjective value coding for different types of actions in adjacent regions of parietal cortex (e.g., the parietal reach region).
Information about the subjective values of actions is certainly not unique to the parietal cortex, but present in a larger brain network that processes rewards, updates stored value-information, and guides behavior (Schultz, 2004). One would expect the nature of such value information to vary from region to region in the brain, in a way that corresponds to the specific role of a given region in learning values and guiding behavior. For example, activity in dopaminergic nuclei, postulated to guide the learning of stimulus and action values, has been hypothesized to encode the difference between predicted and received rewards (Schultz et al., 1997). In decision-related areas, value information would be expected to closely approximate subjective valuations because such valuations are quantified directly from choice behavior. This representation of subjective value should thus combine all relevant information guiding choice, ranging from reward characteristics like magnitude to cost information such as required delay or effort.
We thus propose that subjective value representation in LIP operates within the existing parietal spatial framework: the subjective value associated with a given visuospatial location modulates the corresponding response field activity. It should be noted that while most LIP activity is spatially tuned, consistent with intentional or attentional activity, recent studies have also demonstrated nonspatial modulation by information such as learned categorical membership, effector limb usage, stimulus shape characteristics, and cognitive task rules (Sereno and Maunsell, 1998; Stoet and Snyder, 2004; Freedman and Assad, 2006; Oristaglio et al., 2006). The value framework may be difficult to reconcile with these nonspatial functions of LIP, and such nonspatial processing may represent an additional role for parietal cortex in visuomotor processing.
In addition to its role in oculomotor decision making, the parietal cortex is also modulated by both top-down and bottom-up attentional processing (Gottlieb et al., 1998; Goldberg et al., 2002), raising the question of whether the signals we observed in area LIP reflect the allocation of attention independent of any movement-related phenomena. It is often difficult to separate the effects of value and attention, since these concepts are closely coupled in the real world: attention is naturally directed towards more valuable objects or locations (Maunsell, 2004). Several details tentatively suggest, however, that the subjective value model may account for LIP activity in a way that is dissociable from general models of attentional allocation in this particular experiment. First, LIP firing rates are strongly correlated with subjective value even during the cue presentation period of forced trials, when the monkeys might be expected to direct their attention towards the central instruction cue. This finding is analogous to data from motion discrimination experiments, where the activity of LIP neurons reflects the accumulated motion information for or against a particular saccade, even though the monkeys are presumed to be attending to the central motion stimulus (Shadlen and Newsome, 2001; Roitman and Shadlen, 2002). Together these findings suggest that the locus of spatial attention does not uniquely specify LIP activity. Second, we observed reaction times that were nonmonotonic functions of delay (Supplemental Data, Fig. S8). Given the general relationship between attention and reaction times (Posner, 1980), these data also tentatively suggest that in this task attention may not be strongly correlated with delay to reward or subjective value.
However, without direct behavioral measures of attentional allocation, we cannot exclude the possibility that our data reflect a delay- and reward-dependent allocation of spatial attention; this could be explicitly addressed in future work by employing a nonspatial choice mechanism, such as a lever release. Given the strong one-to-one correspondence between LIP single unit activity and delay-discounted value we observed, if attention mediates this parietal modulation then these findings would imply the novel conclusion that subjective value serves as a primary and precise determinant of attentional allocation in this task.
We find that neural activity in the posterior parietal cortex is linearly related to the private, idiosyncratic experience of subjective value. We inferred this from a novel type of psychometric-neurometric match, one that specifically relates a subjective internal percept of value to a neural activation; such a physiological variable provides an empirical link between brain function and existing theoretical models of value, such as utility. Over the course of the decision process, this close match between LIP activity and subjective value evolves into a correlation between activity and choice in these same neurons. Both the unexpectedly linear mapping between activity of LIP neurons and subjective preference and the transition these neurons undergo during choice are precisely the kind of signals expected in decision-making circuits, and may provide avenues for future studies at the intersection of valuation and decision-making.
We thank B. Lau, J. Kable, A. Caplin, G. Frechette, and E. DeWitt for discussions and reading of the manuscript, and M. Grantner and E. Ryklin for technical support.