The ability to maintain information so that it can be manipulated, integrated with other information and then used to guide behavior has been variously described as working, scratchpad or representational memory, and it depends crucially on the prefrontal cortex [2
]. Within the prefrontal cortex, the OFC, by its connections with limbic areas, is uniquely positioned to enable associative information regarding outcomes or consequences to access representational memory (Box 1
). Indeed, a growing number of studies suggest that a neural correlate of the expected value of outcomes is present and perhaps generated in the OFC. For example, human neuroimaging studies show that blood flow changes in the OFC during anticipation of expected outcomes and also when the value of an expected outcome is modified or not delivered [3
]. This activation appears to reflect the incentive value of these items and is observed when that information is being used to guide decisions [7
]. These results suggest that neurons in the OFC increase activity when such information is processed. Accordingly, neural activity in the OFC increases in advance of predicted rewards or punishments, typically reflecting the incentive values of these outcomes [8
]. For example, when monkeys are presented with visual cues paired with differently preferred rewards, neurons in the OFC fire selectively according to whether the anticipated outcome is the preferred or non-preferred reward within that trial block [10
]. Moreover, Roesch and Olson [11
] have recently demonstrated that firing in the OFC tracks several other specific metrics of outcome value. For example, neurons fire differently for a reward depending on its expected size, the anticipated time required to obtain it and the possible aversive consequences associated with inappropriate behavior [11
Box 1. The anatomy of the orbitofrontal circuit in rats and primates
Rose and Woolsey [53
] proposed that prefrontal cortex might be defined by the projections of the mediodorsal thalamus (MD) rather than by ‘stratiographic analogy’ [54
]. This definition provides a foundation on which to define prefrontal homologs across species. However, it is the functional and anatomical similarities that truly define homologous areas (Figure I of this box).
In the rat, the MD can be divided into three segments [55
]. Projections from the medial and central segments of the MD define a region that includes the orbital areas and the ventral and dorsal agranular insular cortices [55
]. These regions of the MD in the rat receive direct afferents from the amygdala, medial temporal lobe, ventral pallidum and ventral tegmental area, and they receive olfactory input from the piriform cortex [55
]. This pattern of connectivity is similar to that of the medially located, magnocellular division of primate MD, which defines the orbital prefrontal subdivision in primates [60
]. Thus, a defined region in the orbital area of rat prefrontal cortex is likely to receive input from thalamus that is very similar to that reaching primate orbital prefrontal cortex. Based, in part, on this pattern of input, the projection fields of medial and central MD in the orbital and agranular insular areas of rat prefrontal cortex have been proposed as homologous to the primate orbitofrontal region [55
]. These areas in rodents include the dorsal and ventral agranular insular cortex, and the lateral and ventrolateral orbital regions. This conception of the rat orbitofrontal cortex (OFC) does not include the medial or ventromedial orbital cortex, which lie along the medial wall of the hemisphere. This region has patterns of connectivity with the MD and other areas that are more similar to other regions on the medial wall.
Other important connections highlight the similarity between the rat OFC and the primate OFC. Perhaps most notable are reciprocal connections with the basolateral complex of the amygdala (ABL), a region thought to be involved in affective or motivational aspects of learning [66
]. In primates, these connections have been invoked to explain specific similarities in behavioral abnormalities resulting from damage to either the OFC or the ABL [14
]. Reciprocal connections between basolateral amygdala and areas within rat OFC, particularly the agranular insular cortex [58
], suggest that interactions between these structures might be similarly important for regulation of behavioral functions in rats. In addition, in both rats and primates, the OFC provides a strong efferent projection to the nucleus accumbens, overlapping with innervation from limbic structures such as the ABL and subiculum [81
]. The specific circuitry connecting the OFC, limbic structures and nucleus accumbens presents a striking parallel across species that suggests possible similarities in functional interactions among these major components of the forebrain [81
Figure I Anatomical relationships of the OFC (blue) in rats and monkeys. Based on their pattern of connectivity with the mediodorsal thalamus (MD, green), amygdala (orange) and striatum (pink), the orbital and agranular insular areas in rat prefrontal cortex are ...
Such anticipatory activity appears to be a common feature of firing activity in the OFC across many tasks in which events occur in a sequential, and thus predictable, order (Box 2
). Importantly, however, these selective responses can be observed in the absence of any signaling cues, and they are acquired as animals learn that particular cues predict a specific outcome. In other words, this selective activity represents the expectation of an animal, based on experience, of likely outcomes. These features are illustrated in Figure 1, which shows the population response of OFC neurons recorded in rats as they learn and reverse novel odor-discrimination problems [8
]. In this simple task, the rat must learn that one odor predicts reward in a nearby fluid well, whereas the other odor predicts punishment. Early in learning, neurons in the OFC respond to one but not to the other outcome. At the same time, the neurons also begin to respond in anticipation of their preferred outcome. Over a number of studies, 15–20% of the neurons in the OFC developed such activity in this task, firing in anticipation of either sucrose or quinine presentation [8
]. The activity in this neural population reflects the value of the expected outcomes, maintained in what we have defined here as representational memory.
Box 2. Orbitofrontal activity provides an ongoing signal of the value of impending events
The orbitofrontal cortex (OFC) is well positioned to use associative information to predict and then signal the value of future events. Although the main text of this review focuses on activity during delay periods before rewards to isolate this signal, the logical extension of this argument is that activity in the OFC encodes this signal throughout the performance of a task. Thus, the OFC provides a running commentary on the relative value of the current state and of possible courses of action under consideration.
This role is evident in the firing activity of OFC neurons during sampling of cues that are predictive of reward or punishment [86
]. For example, in rats trained to perform an eight-odor discrimination task, in which four odors were associated with reward and four odors were associated with non-reward, OFC neurons were more strongly influenced by the associative significance of the odor cues than by the actual odor identities [87
]. Indeed, if odor identity is made irrelevant, OFC neurons will ignore this sensory feature of the cue. This was demonstrated by Ramus and Eichenbaum [89
], who trained rats on an eight-odor continuous delayed non-match-to-sample task, in which the relevant construct associated with reward is not odor identity but rather the ‘match’ or ‘non-match’ comparison between the cue on the current and preceding trial. They found that 64% of the responsive neurons discriminated this match–non-match comparison, whereas only 16% fired selectively to one of the odors.
Although cue-selective firing has been interpreted as associative encoding, we suggest that this neuronal activity actually represents the ongoing evaluation of potential outcomes by the animal. Thus, the selective firing of these neurons does not simply reflect the fact that a specific cue has been reliably associated with a particular outcome in the past, but instead reflects the judgment of the animal, given current circumstances, that acting on that associative information will lead to that outcome in the future. This judgment is represented as the value of that specific outcome relative to internal goals or desires, and these expectancies are updated constantly. Thus, the firing in the OFC reflects, in essence, the expected value of the subsequent state that will be generated given a particular response, whether that state is a primary reinforcer or simply a step towards that ultimate goal. Consistent with this proposal, a review of the literature shows that encoding in the OFC reliably differentiates many events, even those removed from actual reward delivery, if they provide information about the likelihood of future reward (Figure I of this box). For example, in odor-discrimination training, OFC neurons fire in anticipation of the nose-poke that precedes odor sampling. The response of these neurons differs according to whether the sequence of recent trials [87
] or the place [91
] predicts a high probability of reward.
Figure I Neural activity in the OFC in anticipation of trial events. Neurons in the rat OFC were recorded during performance of an eight-odor, Go–NoGo odor-discrimination task. The activity in four different orbitofrontal neurons is shown, synchronized ...
Figure 1 Signaling of outcome expectancies in the orbitofrontal cortex. Black bars show the response on trials involving the preferred outcome of the neurons in the post-criterion phase. White bars show the response to the non-preferred outcome. Activity is synchronized ...
After learning, these neurons come to be activated by the cues that predict their preferred outcomes, thereby signaling the expected outcome even before a response is made. This is evident in the population response presented in Figure 1, which exhibits higher activity, after learning, in response to the odor cue that predicts the preferred outcome of the neuronal population. These signals would allow an animal to use expectations of likely outcomes to guide responses to cues and to facilitate learning when expectations are violated.
The notion that the OFC guides behavior by signaling outcome expectancies is consistent with the effects of OFC damage on behavior. These effects are typically evident when the appropriate response cannot be selected using simple associations, but instead requires outcome expectancies to be integrated over time or to be compared between alternative responses. For example, humans with damage to the OFC are unable to guide behavior appropriately based on the consequences of their actions in the Iowa gambling task [14
]. In this task, subjects must choose from decks of cards with varying rewards and penalties represented on the cards. To make advantageous choices, subjects must be able to integrate the value of these varying rewards and penalties over time. Individuals with OFC damage initially choose decks that yield higher rewards, indicating that they can use simple associations to direct behavior according to reward size; however, they fail to modify their responses to reflect occasional large penalties in those decks. Integrating information about the occasional, probabilistic penalties would be facilitated by an ability to maintain information about the value of the expected outcome in representational memory after a choice is made, so that violations of this expectation (occasional penalties) could be recognized. This deficit is analogous to the reversal deficits demonstrated in rats, monkeys and humans after damage to the OFC [15
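The difference between choosing by reward size alone and integrating rewards and penalties over time can be made concrete with a toy calculation. The following is a minimal sketch, not a model taken from the studies cited: the deck payoffs, the penalty frequency and the function names are hypothetical and were chosen only for illustration.

```python
# Hypothetical decks: (reward per card, penalty size, cards per penalty).
# These numbers are illustrative assumptions, not the actual task parameters.
DECKS = {
    "high_reward": (100, 1250, 10),
    "low_reward": (50, 250, 10),
}


def reward_only_value(deck):
    """Value based on the immediate reward alone (a simple association)."""
    reward, _penalty, _cards_per_penalty = DECKS[deck]
    return reward


def integrated_value(deck):
    """Average net value per card once occasional penalties are included."""
    reward, penalty, cards_per_penalty = DECKS[deck]
    return reward - penalty / cards_per_penalty


def preferred_deck(value_fn):
    """Return the deck with the highest value under the given valuation."""
    return max(DECKS, key=value_fn)


print(preferred_deck(reward_only_value))  # choice driven by reward size
print(preferred_deck(integrated_value))   # choice driven by net value
```

With these assumed numbers, a chooser that tracks only reward size favors the deck with larger rewards, whereas a chooser that integrates the occasional penalties favors the other deck. This mirrors, in schematic form, the pattern of choices described above.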
This ability to hold information about expected outcomes in representational memory has also been probed in a recent study in which subjects made choices between two stimuli that predicted punishment or reward at varying levels of probability [22
]. In one part of this study, subjects were given feedback about the value of the outcome that they had not selected. Normal subjects were able to use this feedback to modulate their emotion about their choice and to learn to make better choices in future trials. For example, a small reward made them happier when they knew that they had avoided a large penalty. Individuals with OFC damage showed normal emotional responses to the rewards and punishments that they selected; however, feedback about the unselected outcome had no effect on either their emotions or their subsequent performance. That is, they were happy when they received a reward, but they were no happier if they were informed that they had also avoided a large penalty. This impairment is consistent with a role for the OFC in maintaining associative information in representational memory to compare different outcome expectancies. Without this signal, individuals cannot compare the relative value of the selected and unselected outcomes and thus fail to use this comparative information to modulate emotional reactions and facilitate learning.
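The comparison between selected and unselected outcomes can be illustrated with a simple rule. This is a sketch under stated assumptions, not the analysis used in the study: the rating is assumed to be the obtained outcome plus a weighted difference between the obtained and the unselected outcome, and the function name, weight and payoff values are hypothetical.

```python
def emotional_rating(obtained, unselected, comparison_weight):
    """Assumed rule: rating = obtained outcome plus a weighted comparison
    with the outcome of the option that was not chosen."""
    return obtained + comparison_weight * (obtained - unselected)


# Hypothetical payoffs: a small reward was obtained and a large penalty avoided.
small_reward, large_penalty = 50, -200

# A nonzero weight stands for an intact comparison of the two outcomes;
# a weight of zero stands for the absence of that comparison.
with_comparison = emotional_rating(small_reward, large_penalty, 0.5)
without_comparison = emotional_rating(small_reward, large_penalty, 0.0)

print(with_comparison, without_comparison)
```

Under this assumed rule, the same small reward is rated more highly when the avoided penalty enters the comparison, and the rating depends only on the obtained outcome when the comparison is absent.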
Although these examples are revealing, a more direct demonstration of the crucial role of the OFC in generating outcome expectancies to guide decision-making comes from reinforcer devaluation tasks. These tasks assess the control of behavior by an internal representation of the value of an expected outcome. For example, in a Pavlovian version of this procedure (Figure 2), rats are first trained to associate a light cue with food. After conditioned responding is established to the light, the value of the food is reduced by pairing it with illness. Subsequently, in the probe test, the light cue is presented again in a non-rewarded extinction session. Animals that have received food-illness pairings respond less to the light cue than do non-devalued controls. Importantly, this decrease in responding is evident from the start of the session and is superimposed on the normal decreases in responding that result from extinction learning during the session. This initial decrease in responding must reflect the use of an internal representation of the current value of the food in combination with the original light-food association. Thus, reinforcer devaluation tasks provide a direct measure of the ability to manipulate and use outcome expectancies to guide behavior.
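The logic of the probe test can be sketched in a few lines. This is a deliberately simple illustration, not the authors' model: responding is assumed to be the product of the learned cue-outcome association and the value assigned to the outcome, and all names and numbers are hypothetical.

```python
def conditioned_response(association, outcome_value):
    """Assumed rule: responding scales with association strength times
    the value assigned to the expected outcome."""
    return association * outcome_value


def probe_response(association, trained_value, current_value, uses_expectancy):
    """Responding at the start of the non-rewarded probe session.

    uses_expectancy=True: the cue evokes a representation of the outcome,
    so responding reflects the current (possibly devalued) outcome value.
    uses_expectancy=False: responding reflects the value the outcome had
    during training, as if the updated value could not be accessed.
    """
    value = current_value if uses_expectancy else trained_value
    return conditioned_response(association, value)


# Hypothetical numbers: a well-learned light-food association (1.0),
# food valued at 1.0 during training and 0.25 after food-illness pairings.
devalued = probe_response(1.0, 1.0, 0.25, uses_expectancy=True)
control = probe_response(1.0, 1.0, 1.0, uses_expectancy=True)
no_expectancy = probe_response(1.0, 1.0, 0.25, uses_expectancy=False)

print(devalued, control, no_expectancy)
```

Under these assumptions, responding is reduced from the first probe trial only when the current value of the outcome can be combined with the original association. A system without access to the updated value responds at the level of non-devalued controls.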
Figure 2 Effects of neurotoxic lesions of the orbitofrontal cortex (OFC) on performance in a reinforcer devaluation task. (a) Control rats and rats with bilateral neurotoxic lesions of the OFC were trained to associate a conditioned stimulus (CS, light) with an ...
Rats with OFC lesions fail to show any effect of devaluation on conditioned responding in this paradigm, despite normal conditioning and devaluation of the outcome [23
]. In other words, they continue to respond to the light cue and attempt to obtain the food, even though they will not consume it if it is presented (Figure 2). Importantly, OFC-lesioned rats display a normal ability to extinguish their responses within the test session, demonstrating that their deficit does not reflect a general inability to inhibit conditioned responses [24
]. Rather, the OFC has a specific role in controlling conditioned responses according to internal representations of the new value of the expected outcome. Accordingly, OFC lesions made after learning continue to affect behavior in this task [25
]. Similar results have been reported in monkeys trained to perform an instrumental version of this task [19
Rats with OFC lesions also show neurophysiological changes in downstream regions that are consistent with the loss of outcome expectancies. In one study [26
], responses were recorded from single units in the basolateral amygdala, an area that receives projections from the OFC, in rats learning and reversing novel odor discriminations in the task described earlier. Under these conditions, OFC lesions disrupted outcome-expectant firing normally observed in the basolateral amygdala. Furthermore, without OFC input, neurons of the basolateral amygdala became cue-selective much more slowly, particularly after cue-outcome associations were reversed. Slower associative encoding in the basolateral amygdala as a result of OFC lesions, particularly during reversal, is consistent with the idea that outcome expectancies facilitate learning in other structures, especially when expectations are violated as they are in reversals. Thus, the OFC appears to generate and represent outcome expectancies that are critical not only to the guidance of behavior according to expectations about the future, but also to the ability to learn from violations of those expectations. Without this signal, animals engage in maladaptive behavior, driven by antecedent cues and stimulus-response habits, rather than by a cognitive representation of an outcome or goal.