One of the more important features of game-theoretic probes is the solution concept for the game, that is, the way that a rational self-interested player should play the game in order to maximize their own returns. Solution concepts for game-theoretic probes are valuable because they can be used to guess the kinds of control or learning signals that an organism would need to generate in order to play the game optimally (Camerer, 2003; Rangel et al., 2008). Simple games therefore provide an excellent way to expose the computations underlying value-dependent choice in humans (Guth et al., 1982; Axelrod, 1984; Roth et al., 1995; Dickhaut et al., 1995; Camerer, 2003; Glimcher et al., 2009; Montague et al., 2006).
In the domain of solution concepts, the idea of Nash equilibrium has been a leading principle (Fudenberg and Tirole, 1991; Nash, 1950). In a strategic interaction between two agents (two humans, a human and an institution, two institutions, etc.), the Nash equilibria are the sets of choices by the two agents where no unilateral change by either agent can improve that agent’s outcome. More colloquially, in the absence of some other knowledge, if my choice is at a Nash equilibrium, then any other choice that I might have made does not improve my payoff. This statement also applies to my partner. The concept of Nash equilibrium is illustrated in . In this case, we show the payoffs for each player in a table. For illustration purposes we show the simple example of a simultaneous game where each player chooses their action (there are two available to each player) independently of their partner, after which the choices and their attendant payoffs are revealed to both players.
Now consider the case where player 1 takes action 1 and player 2 takes action 2. This places each player in the upper right hand cell of the payoff table. This is not a Nash equilibrium since the outcome would improve for either player had they chosen the alternative action. This same conclusion would be true if player 1 had chosen action 2 and player 2 had chosen action 1, thus placing them in the lower left hand cell of the table. In this illustration, the Nash equilibria are indicated by green circles. Let’s suppose that the players make choices that place them in the upper left hand cell. This set of choices is a Nash equilibrium because if either player were to unilaterally change their action, then their outcome would get worse. This is a very simple example and we have made the actions mutually exclusive for illustration purposes (this is called a pure strategy Nash equilibrium). In some games, especially those that model pertinent real-world situations, a player can allocate some fraction f of their choices to action 1 and the remaining fraction 1-f to action 2. This is called a mixed strategy since a player’s choices are a mixture of the available actions. The game shown in also possesses a mixed strategy equilibrium; however, it exposes one flaw in this particular equilibrium concept: with the pure strategy equilibria one player always makes more than the other, and in the real world this might discourage participation altogether. The Nash concept shows what a player’s best response should be provided that one’s partner is playing their equilibrium strategy. In practice, partners do not always play their equilibrium strategy: humans often deviate from Nash play even when they know this to be sub-optimal (Van den Bos et al., 2008), and in these instances playing according to a Nash equilibrium does not equate to the best response. These equilibrium concepts, and others, provide rational ways to identify the best response in a game, a central concept in economics (Fudenberg and Tirole, 1991).
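The two equilibrium notions described above can be made concrete with a short sketch. The payoff table below is hypothetical (it is not taken from the article's figure); it was chosen only so that it matches the verbal description: the upper left and lower right cells are pure strategy Nash equilibria, the other two cells are not, and one player earns more than the other at each pure equilibrium. The function names are likewise illustrative. The mixed strategy equilibrium is found by the standard indifference argument: each player mixes so that the partner's two actions yield the same expected payoff.

```python
from fractions import Fraction

# Hypothetical payoffs, NOT the values from the article's figure.
# PAYOFFS[r][c] = (payoff to player 1, payoff to player 2) when player 1
# picks row r and player 2 picks column c (index 0 means "action 1").
PAYOFFS = [
    [(2, 1), (0, 0)],
    [(0, 0), (1, 2)],
]


def pure_nash_equilibria(payoffs):
    """Return every (row, col) cell where no unilateral deviation helps."""
    n_rows, n_cols = len(payoffs), len(payoffs[0])
    equilibria = []
    for r in range(n_rows):
        for c in range(n_cols):
            u1, u2 = payoffs[r][c]
            # Player 1 can only change the row; player 2 only the column.
            row_ok = all(payoffs[alt][c][0] <= u1 for alt in range(n_rows))
            col_ok = all(payoffs[r][alt][1] <= u2 for alt in range(n_cols))
            if row_ok and col_ok:
                equilibria.append((r, c))
    return equilibria


def mixed_equilibrium_2x2(payoffs):
    """Return (f1, f2), the fractions of choices that players 1 and 2
    allocate to action 1 in a fully mixed equilibrium of a 2x2 game,
    or None if no such interior equilibrium exists."""
    (a, b), (c, d) = [[cell[0] for cell in row] for row in payoffs]
    (e, f), (g, h) = [[cell[1] for cell in row] for row in payoffs]
    denom_1 = e - f - g + h  # from player 2's indifference condition
    denom_2 = a - b - c + d  # from player 1's indifference condition
    if denom_1 == 0 or denom_2 == 0:
        return None
    f1 = Fraction(h - g, denom_1)  # player 1's fraction on action 1
    f2 = Fraction(d - b, denom_2)  # player 2's fraction on action 1
    if not (0 < f1 < 1 and 0 < f2 < 1):
        return None
    return f1, f2
```

For this hypothetical table, `pure_nash_equilibria(PAYOFFS)` returns the upper left and lower right cells, and `mixed_equilibrium_2x2(PAYOFFS)` gives f = 2/3 for player 1 and f = 1/3 for player 2.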
Mixed strategy games can be much better representations of real-world choice situations. For example, consider a weasel that visits a clearing possessing two holes. The weasel can poke its head into hole 1 and get water from an underground spring that sometimes flows and is otherwise dry. Or the weasel can poke its head into hole 2, where another animal may or may not have stashed nuts. How should the weasel allocate the fraction of its choices to each of these alternatives once it arrives in the clearing? In this case, each choice has variable yields and the weasel’s nervous system must decide the relative value of likely returns from each choice before nudging the weasel into a specific choice. Since the returns from these choices are stochastic, the weasel must bring a good model of the returns from each hole, and it must continually re-estimate those returns conditioned on its past experiences with each hole. This example shows that the weasel must compute a mixed strategy for the two choices, but it also illustrates the limit of equilibrium solutions to this game: the weasel must use some of its processing power to re-estimate the value of each choice, so simply focusing on equilibrium solutions would miss some of the central features of this real-world problem. Here, the weasel is constructing and updating predictive models of its (possibly dynamic) food sources, which represents a departure from simple equilibrium-based ‘best response’ models.
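One way to picture the weasel's problem is as a simple learning loop in which the estimated return from each hole is nudged toward each new outcome and choices are allocated in proportion to those estimates. The sketch below is only an illustration under stated assumptions: the delta-rule update, the softmax choice rule, the learning rate, the temperature, and the yield probabilities are all hypothetical choices, not quantities specified in the text.

```python
import math
import random


def update_estimate(estimate, reward, learning_rate):
    """Move the running estimate a fraction of the way toward the new reward."""
    return estimate + learning_rate * (reward - estimate)


def choice_probabilities(estimates, temperature):
    """Softmax rule: options with higher estimated value get more choices."""
    top = max(estimates)  # subtract the max for numerical stability
    weights = [math.exp((v - top) / temperature) for v in estimates]
    total = sum(weights)
    return [w / total for w in weights]


def simulate(true_probs, n_visits=1000, learning_rate=0.1,
             temperature=0.2, seed=0):
    """Simulate repeated visits to the clearing.

    true_probs[k] is the (hypothetical) chance that hole k pays off.
    Returns the final value estimates and the fraction of visits
    allocated to each hole, i.e. the empirical mixed strategy.
    """
    rng = random.Random(seed)
    estimates = [0.5] * len(true_probs)  # assumed uninformed starting point
    counts = [0] * len(true_probs)
    for _ in range(n_visits):
        probs = choice_probabilities(estimates, temperature)
        hole = rng.choices(range(len(true_probs)), weights=probs)[0]
        reward = 1.0 if rng.random() < true_probs[hole] else 0.0
        estimates[hole] = update_estimate(estimates[hole], reward,
                                          learning_rate)
        counts[hole] += 1
    return estimates, [c / n_visits for c in counts]
```

Calling `simulate([0.3, 0.7])` returns the weasel's final estimates and the fraction of visits it made to each hole. Because the estimates are updated on every visit, the allocation can track yields that drift over time, which a fixed equilibrium mixture would not do.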
This same reasoning would hold for humans interacting in real-world exchanges and suggests a kind of blending of strict equilibrium-based accounts of best responses and predictive models of others’ behavior. Here, rational agents in the real world should estimate what fraction of the other agents will choose their equilibrium strategy and possess or guess something about what the non-equilibrium agents will do. These kinds of blended accounts (blends between economic models and predictive behavioral models) have given rise to cognitive hierarchy models of how human agents should model those with whom they interact (Camerer and Ho, 2004; also see Ray et al., 2008 for use of a cognitive hierarchy model during social exchange).
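The blending idea can be sketched as a best response to a mixed population. This is a deliberately simplified two-type illustration and not the cognitive hierarchy model of Camerer and Ho: it assumes that a fraction of partners play a known equilibrium strategy, that the remainder follow a guessed non-equilibrium strategy, and that the agent simply maximizes expected payoff against the blend. All names and numbers are hypothetical.

```python
from fractions import Fraction


def expected_payoffs(my_payoffs, eq_strategy, other_strategy, frac_equilibrium):
    """Expected payoff of each of my actions against a blended population.

    my_payoffs[a][b] : my payoff when I take action a and my partner takes b
    eq_strategy      : partner's equilibrium probabilities over actions
    other_strategy   : guessed probabilities for non-equilibrium partners
    frac_equilibrium : estimated fraction of partners playing equilibrium
    """
    blended = [
        frac_equilibrium * e + (1 - frac_equilibrium) * o
        for e, o in zip(eq_strategy, other_strategy)
    ]
    return [sum(p * u for p, u in zip(blended, row)) for row in my_payoffs]


def best_response(my_payoffs, eq_strategy, other_strategy, frac_equilibrium):
    """Index of the action with the highest expected payoff (first on ties)."""
    values = expected_payoffs(my_payoffs, eq_strategy, other_strategy,
                              frac_equilibrium)
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical example: my payoffs in a 2x2 game, a partner whose mixed
# equilibrium puts 1/3 on action 1, and non-equilibrium partners who are
# guessed to always take action 2.
MY_PAYOFFS = [[2, 0], [0, 1]]
EQ_STRATEGY = [Fraction(1, 3), Fraction(2, 3)]
ALWAYS_ACTION_2 = [0, 1]
```

With these numbers, an agent who believes everyone plays equilibrium is indifferent between its two actions, whereas an agent who believes that half of its partners always take action 2 does better by taking action 2 itself. The estimate of the population mixture, not the equilibrium alone, determines the best response.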
Some of the most precise connections among valuation, choice, and neural function have been produced using game-theoretic settings in non-human primates paired with single-unit electrophysiological recordings (Dorris and Glimcher, 2004; Sugrue et al., 2004; Seo and Lee, 2007; also see Platt and Glimcher, 1999; Hayden et al., 2009). shows three examples where the visual system was used primarily as an input-output device and the parameters of interest involved the valuation of ‘where to look’. The experiment illustrated at the left represents work by Dorris and Glimcher using what is called the work-shirk inspection game. The animal gets one payoff if he is working when the ‘employer’ looks in, another payoff if he is shirking his duties when the ‘employer’ looks in, and so on. The best response for this game is a mixed strategy, that is, the animal should distribute its actions across its two behavioral options (e.g. 20% to one option, 80% to the other) depending on the setting of a parameter. This game took place while the experimenters recorded from neurons in the posterior parietal cortex. The clever maneuver in this experiment is that the experimenters parameterized the Nash equilibrium in this game (controlled by parameter i in the left panel). The main conclusion in that study was that the subjective desirability of a behavioral option covaried with firing rate changes in the recorded neurons independent of the objective parameters related to reward acquisition. This explanation of the observed data has been questioned (Sugrue et al., 2005) on the basis of an alternative experiment that manipulated local changes in a reward-harvesting task (Sugrue et al., 2004, see below) while monitoring neural activity in the same region. We highlight these examples (Dorris and Glimcher, 2004; Sugrue et al., 2004) to demonstrate that the nature of the quantitative economic choice model used in constructing an experiment is very important. Similar approaches will continue to improve our model-based understanding of the decision variables encoded in recordable neural activity.
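To see how a single parameter can move the equilibrium of an inspection game, consider a textbook-style version of the game. The payoff structure below is an assumption made for illustration and is not the payoff schedule used by Dorris and Glimcher: the employee receives a wage, pays an effort cost when working, and forfeits the wage if caught shirking; the employer pays an inspection cost to look in. The parameter `inspection_cost` is assumed here to play a role analogous to the parameter i mentioned above.

```python
def inspection_equilibrium(wage, effort_cost, inspection_cost):
    """Mixed equilibrium of a hypothetical work-shirk inspection game.

    Returns (p_shirk, p_inspect). Each probability makes the OTHER player
    indifferent between their two actions:
      employer indifferent : inspection_cost = p_shirk * wage
      employee indifferent : wage - effort_cost = (1 - p_inspect) * wage
    """
    if not (0 < inspection_cost < wage and 0 < effort_cost < wage):
        raise ValueError("costs must lie strictly between 0 and the wage")
    p_shirk = inspection_cost / wage
    p_inspect = effort_cost / wage
    return p_shirk, p_inspect


def employer_payoffs(p_shirk, wage, output, inspection_cost):
    """Employer's expected payoff for (inspecting, not inspecting)."""
    base = (1 - p_shirk) * (output - wage)  # earned when the employee works
    inspect = base - inspection_cost        # shirkers are caught and unpaid
    no_inspect = base - p_shirk * wage      # shirkers are paid anyway
    return inspect, no_inspect


def employee_payoffs(p_inspect, wage, effort_cost):
    """Employee's expected payoff for (working, shirking)."""
    work = wage - effort_cost
    shirk = (1 - p_inspect) * wage          # wage is lost only if inspected
    return work, shirk
```

Under these assumptions the equilibrium shirking rate equals `inspection_cost / wage`, so raising the inspection cost from 0.2 to 0.8 of the wage moves the predicted allocation from 20% shirking to 80% shirking. This is the sense in which an experimenter can set the equilibrium mixture by adjusting one parameter.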
Neuroeconomic approaches to studying choice behavior in non-human primates paired with single unit electrophysiological recordings
A similar kind of game theory strategy was used by Lee and colleagues, where a ‘matching pennies’ game (a coordination game like the example in , right panel) was used in monkeys while single-unit recordings were carried out in the brain. Again, correlations between neural activity and variables that in theory could (or should) influence outcomes were observed. Sugrue et al. (2004) reported another experiment that used a game-like probe while recording neural activity. This group used a visual reward-harvesting task and also recorded from neurons in the parietal cortex while the rate of reward from different behavioral options was controlled (a dynamic foraging task akin to the example in , middle panel). These investigators found that their recordings were most consistent with the relative value of competing options (here local probability of eye movements to one of two targets) rather than subjective desirability (the average payoff of each target). These examples represent a small fraction of an ever-growing literature using game-theory-designed tasks to probe animals while recording some kind of neural variable.
This work is in its early days; however, one common theme emerges from these experiments, that is, the ability to use game-theoretic probes to expose and model expected changes in subjects’ behavior. While the connection of economic variables to single neuron activity in these studies remains either provisional or in some cases disputed, there is no dispute that the experimental probes provide an excellent way to examine value-dependent choice in primates at the behavioral level. This conclusion is supported by the fact that all three studies highlighted in produced excellent and quite sensitive behavioral models of the animal’s observed choice behavior (Dorris and Glimcher, 2004; Sugrue et al., 2004; Seo and Lee, 2007). The conflicting results (Dorris and Glimcher, 2004; Sugrue et al., 2004) may simply reflect a limitation of the current experimental capacity to record from a sufficient number and range of neurons during tasks. Alternatively, they may reflect a deeper issue of the myriad ways that different brains implement solutions to economic problems at the level of neural networks. Nevertheless, the behavioral lesson in the animal work has been taken to heart in the human neurobiology community, where detailed game-theoretic probes have been used to probe everything from responses to monetary reward and risk to responses to the risks involved in exchanging with other humans.