In avoidance learning, an animal or human learns to perform a response in order to avoid an aversive outcome. Here we provide evidence with fMRI that during such learning a part of the human brain previously implicated in responding to reward outcomes, the medial OFC, increases in activity following successful avoidance of the aversive outcome. These results are compatible with the possibility that activity in the medial OFC during avoidance reflects an intrinsic reward signal that serves to reinforce avoidance behavior.
Activity in the medial OFC not only increased after avoiding an aversive outcome or receiving reward, but also decreased after failing to obtain a reward or receiving an aversive outcome. Consequently, this region shows a fully opponent response profile to rewarding and aversive outcomes and their omission [22]. This finding suggests the relevance of opponent process theory to avoidance learning, following a recent report of similar underlying processes in pain relief [31]. These OFC responses cannot be explained as PE signals, because activity neither decreases to rewarding outcomes nor increases to aversive outcomes, even as these outcomes become better predicted over the course of learning. Rather, responses to rewarding and aversive outcomes in this region likely reflect a positive affective state arising from the successful attainment of reward and a negative affective state arising from failure to avoid an aversive outcome. Similarly, differential activity in this region to avoiding an aversive outcome and missing reward may reflect a positive affective response to successfully avoiding an aversive outcome and a negative affective state arising from failure to obtain a reward. Thus, our findings indicate that medial OFC activity at the time of outcome reflects the affective (or reinforcing) properties of goal attainment. This interpretation is bolstered by a number of previous neuroimaging studies that implicate this region in responding to receipt of many different types of reward, including not only money but also attractive faces, positively valenced facial expressions, pleasant music, pleasant odors, and foods [27–30, 32, 33]. While other studies have reported a role for the medial OFC in complex emotions such as “regret,” which may contain both positive and negative affective components [34], the results of these previous reward studies, when combined with the present findings, suggest a specific (though not necessarily exclusive) role for the medial OFC in encoding the positive hedonic consequences of attaining both extrinsic and intrinsic reward. The finding described here, of a specific role for the medial OFC in signaling goal attainment, adds to a burgeoning literature implicating the ventromedial prefrontal cortex as a whole in goal-directed decision making and motivational control [35–42].
Alongside outcome-related responses, activity in the medial OFC was also found to correlate with an expected reward value signal derived from our reinforcement-learning model. This is reflected by an increase in activity following the onset of reward trials, during which (following learning) delivery of reward is expected, as well as by a decrease in activity following the onset of avoidance trials, during which (following learning) delivery of an aversive outcome is expected. This expected reward value signal co-exists in the same region of the medial OFC found to respond to reward outcomes. While we observe the same region of the medial OFC responding during both anticipation and receipt of reward, limits in the spatial resolution of fMRI preclude us from determining whether the same population of neurons within the medial OFC is sensitive to both reward expectation and receipt of reward outcomes, or whether two distinct but spatially intermingled populations of neurons within this region exhibit selective responses to either expectation or receipt of reward. Nevertheless, we also found a region of the lateral OFC responding during reward expectation that did not respond during receipt of reward, indicating that these two components of reward processing are at least partially dissociable [43].
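The sign of the expected value signal on the two trial types can be illustrated with a minimal sketch. It assumes a simple delta-rule (Rescorla-Wagner style) update, which is not necessarily the reinforcement-learning model fitted in this study; the learning rate, the outcome coding (+1 for reward, -1 for an aversive outcome, 0 for no outcome), and the outcome sequences are hypothetical values chosen only for illustration.

```python
def expected_values(outcomes, alpha=0.2, v0=0.0):
    """Delta-rule learning of a cue's expected value.

    Returns the expectation held at the onset of each trial and the
    final expectation after the last update. alpha and v0 are
    illustrative assumptions, not fitted parameters.
    """
    v = v0
    at_onset = []
    for outcome in outcomes:
        at_onset.append(v)            # value signal at trial onset
        v += alpha * (outcome - v)    # update toward the observed outcome
    return at_onset, v


# Hypothetical outcome coding: +1 reward, -1 aversive outcome, 0 no outcome.
reward_trials = [1, 1, 0, 1, 1] * 6
avoidance_trials = [-1, 0, 0, -1, 0] * 6

_, v_reward = expected_values(reward_trials)
_, v_avoid = expected_values(avoidance_trials)

print(f"expected value on reward trials:    {v_reward:+.2f}")
print(f"expected value on avoidance trials: {v_avoid:+.2f}")
```

Under these assumptions the learned value is positive on reward trials and negative on avoidance trials, matching the direction of the onset-related activity changes described above.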
The findings reported here also help to address previous discrepancies in the reward neuroimaging literature as to the differential role of the medial versus lateral OFC in processing rewarding and aversive outcomes [29, 44, 45]. In the present study we show that the medial OFC responds to reward outcomes (as well as following successful avoidance of aversive outcomes), whereas both medial and lateral OFC respond during anticipation of reward. Indeed, when we tested for regions showing increases in activity to receipt of an aversive outcome or omission of reward, we found a region of the lateral prefrontal cortex extending down to the lateral orbital surface with this response profile, implicating this region in responding to monetary aversive outcomes [30]. These findings suggest the possibility that dissociable activity within the medial versus lateral OFC may be evident during receipt of rewarding and punishing events, but not during their anticipation.
We also tested for regions of the human brain involved in encoding PE signals during both reward and avoidance learning. We found a fully signed reward PE signal in the ventral striatum on reward trials, whereby activity increases following unexpected delivery of reward, but decreases following unexpected omission of reward (as shown previously [10, 46]). However, we did not find an aversion-related PE signal in the ventral striatum on avoidance trials, that is, a signal that increases following unexpected delivery of an aversive event and decreases following unexpected omission of the aversive event. This contrasts with previous studies that have reported such signals during aversive learning with pain or even a least preferred food stimulus [31, 47–49]. In our study, PE signals were significantly greater in reward trials than avoidance trials in this region, even following presentation of an unexpected aversive stimulus. Yet, decreases (rather than increases) in activity in the ventral striatum during aversive learning have been reported in at least a few other studies, specifically those featuring receipt of monetary aversive outcomes [50, 51]. One plausible explanation for these apparently contradictory findings is that monetary loss, as a secondary reinforcer, may be processed differently in the ventral striatum than more primary punishing stimuli such as aversive flavors or pain.
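The sign conventions behind these contrasts can be made concrete with a short sketch. It assumes the standard definition of a PE as outcome minus expectation and treats an aversion-related PE as the same quantity with its sign flipped; the numerical expectations and the outcome coding are hypothetical and are not taken from the fitted model.

```python
def reward_pe(outcome, expected):
    """Signed reward prediction error: positive when better than expected."""
    return outcome - expected


def aversive_pe(outcome, expected):
    """Aversion-related PE (assumed here to be the sign-flipped reward PE):
    positive when the outcome is worse than expected."""
    return -(outcome - expected)


# Hypothetical coding: +1 reward, -1 aversive outcome, 0 no outcome.
cases = {
    "unexpected reward delivery": reward_pe(1.0, 0.2),
    "unexpected reward omission": reward_pe(0.0, 0.8),
    "unexpected aversive delivery": aversive_pe(-1.0, -0.2),
    "unexpected aversive omission": aversive_pe(0.0, -0.8),
}

for label, pe in cases.items():
    print(f"{label}: {pe:+.2f}")
```

In this sketch, a region carrying a signed reward PE would follow the first two cases, whereas an aversion-related PE signal of the kind reported in the pain studies would follow the last two. A region whose activity decreases after an unexpected aversive outcome would be more consistent with the reward PE convention applied to both trial types.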
The main finding of this study is that the medial OFC responds during successful avoidance of an aversive outcome as well as during receipt of explicit rewards. An important caveat is that the results presented here do not necessarily provide a complete explanation for why, in some animal learning studies, behavior is maintained even after complete avoidance of such outcomes [15]. Unlike in those studies, avoidance behavior in the present study could have been maintained by virtue of the fact that the participants continued to receive aversive outcomes from time to time. Nonetheless, it is plausible that opponent reinforcement mechanisms similar to those shown here could also play a role even when a punisher can be completely avoided. In that case, however, additional mechanisms may also come into play to account for resistance to extinction, such as the onset of habitual control processes (see Mackintosh, 1983, Chapter 6) [52].
A role for the medial OFC in responding following avoidance of an aversive outcome provides an important insight into the conundrum of avoidance learning. It seems that the same neural circuitry is recruited during avoidance of an aversive outcome as during receipt of reward. Consequently, this neural avoidance signal may itself act as a reinforcer and, just as a reward does, bias action selection so that actions leading to this outcome are chosen more often. More generally, our results point to a key role for the medial OFC in mediating the affective components of goal attainment, whether the goal is to obtain reward or to avoid an aversive outcome.