|Home | About | Journals | Submit | Contact Us | Français|
We develop a new method for studying stochastic evolutionary game dynamics of mixed strategies. We consider the general situation: there are n pure strategies whose interactions are described by an n × n payoff matrix. Players can use mixed strategies, which are given by the vector (p1, …, pn). Each entry specifies the probability to use the corresponding pure strategy. The sum over all entries is one. Therefore, a mixed strategy is a point in the simplex Sn. We study evolutionary dynamics in a well-mixed population of finite size. Individuals reproduce proportional to payoff. We consider the case of weak selection, which means the payoff from the game is only a small contribution to overall fitness. Reproduction can be subject to mutation; a mutant adopts a randomly chosen mixed strategy. We calculate the average abundance of every mixed strategy in the stationary distribution of the mutation-selection process. We find the crucial conditions that specify if a strategy is favored or opposed by selection. One condition holds for low mutation rate, another for high mutation rate. The result for any mutation rate is a linear combination of those two. As a specific example we study the Hawk-Dove game. We prove general statements about the relationship between games with pure and with mixed strategies.
Evolutionary game theory is the study of frequency dependent selection (Maynard Smith & Price, 1973; Maynard Smith, 1982; Hofbauer & Sigmund, 1988; Weibull, 1995; Samuelson, 1997; Cressman, 2003; Vincent & Brown, 2005; Nowak, 2006; Skyrms, 2003). The fitness of individuals is not constant but depends on the composition of the population. Constant selection can be seen as adaptation on a constant fitness landscape, but for frequency dependent selection the fitness landscape changes as the population moves over it (Nowak & Sigmund, 2004). The classical approach to evolutionary game dynamics is based on the replicator equation (Taylor & Jonker, 1978; Hofbauer & Schuster, 1979; Zeeman, 1980; Hofbauer & Sigmund, 1988, 1998). The population is well-mixed and infinitely large. Any two individuals are equally likely to interact. The fitness of individuals is given by the expected payoff from all interactions. The dynamics are described by deterministic differential equations.
Evolutionary game theory represents one of the foundations for the mathematical approach to evolution. Evolutionary games occur whenever the reproductive success of individuals depends on interaction with other individuals, which is almost always the case. Therefore, evolutionary game theory permeates every area of biology including viral and bacterial evolution (Turner & Chao, 1999; Kerr et al., 2002), host parasite interactions (Nowak & May, 1994), the evolution of metabolic pathways (Pfeiffer et al., 2001), theoretical approaches to immunology (Nowak et al., 1991, 1995), and the study of animal and human behavior (Parker, 1974; Enquist & Leimar, 1983; McNamara & Houston, 1986; Milinski, 1987; Fudenberg & Tirole, 1991; Binmore, 1994; Ohtsuki & Iwasa, 2004; Nowak & Sigmund, 2005). Understanding the evolution of animal communication and human language requires evolutionary game theory (Nowak & Krakauer, 1999; Nowak et al., 2002). The Lotka-Volterra equation, which is a fundamental concept in ecology describing the interaction of species in an ecosystem, is equivalent to the replicator equation of evolutionary game dynamics (Hofbauer & Sigmund, 1998).
Typically a game is formulated in terms of pure strategies, which can be stochastic or deterministic. A payoff matrix describes the outcome of an interaction between any two pure strategies. Sometimes these pure strategies are the only options available to the players. But in other situations it could be natural that players have the possibility to use ‘mixed strategies’. A mixed strategy is a vector whose entries specify the probability for using each one of the pure strategies. A game with just pure strategies need not have a Nash equilibrium. But a game with pure and mixed strategies always has a Nash equilibrium (Nash, 1950). Mixed strategies allow a certain kind of randomization which could be advantageous in certain games (Sklansky, 2005).
Consider a game with n pure strategies. The payoff values are given by the n × n payoff matrix A = (aij). This means that an individual using pure strategy i receives payoff aij when interacting with an individual that uses pure strategy j. Let us now consider mixed strategies. A player can choose to play pure strategy i with probability pi. A mixed strategy is thus given by a stochastic vector p = (p1, … , pn) with 0 ≤ pi ≤ 1 and p1 + … + pn = 1. We denote the set of all such mixed strategies by Sn; this is a simplex in Rn. The unit vectors ei correspond to the pure strategies. The payoff of a mixed strategy p against a mixed strategy q is given by the function A(p, q) = pAqT.
We focus on stochastic evolutionary dynamics in well-mixed populations of finite size (Schaffer, 1988; Kandori et al., 1993; Kandori & Rob, 1995; Schreiber, 2001; Nowak et al., 2004; Taylor et al., 2004; Wild & Taylor, 2004; Traulsen et al., 2005, 2007a; Antal et al., 2006; Imhof et al., 2005; Imhof & Nowak, 2006; Lessard & Ladret, 2007). Evolutionary updating occurs according to the frequency dependent Moran process (Nowak et al., 2004; Taylor et al., 2004), the frequency dependent Wright-Fisher process (Imhof & Nowak, 2006) or the Pairwise Comparison process (Szabó and Tőke, 2006; Traulsen et al., 2007b). Reproduction is proportional to fitness and subject to mutation probability u. With probability 1 − u the offspring inherits the strategy of the parent. With probability u, the offspring chooses one of the mixed strategies uniformly at random. In a population of N players mutations occur at rate μ = Nu.
A state of our system describes the strategy of each individual. Our state space is . We look at the stationary distribution of the mutation-selection process and ask what are the average stationary frequencies (abundances) of each strategy (Antal et al., 2009b,c; Tarnita et al., 2009a,b). Note that we do not calculate the stationary distribution over . Instead, we calculate the average density of each mixed strategy in the stationary distribution. Then, we ask which strategies are favored, on average, by selection, i.e. which strategies are more abundant than the mean.
We study the case of weak selection. For the frequency dependent Moran process, the effective payoff of an individual is given by f = 1+δ· payoff. Here δ determines the intensity of selection. Weak selection means δ → 0. To obtain results in the limit of weak selection, we use a perturbation theory method also employed in Antal et al. (2009b,c) and Tarnita et al. (2009a).
For the game dealing only with the n pure strategies in a finite well-mixed population, Antal et al. (2009c) obtained the following result. In the limit of weak selection, all strategies have approximately the same abundance, 1/n, in the stationary distribution of the mutation-selection process. There are only minor deviations from this uniform distribution. One can say that selection favors a strategy if its abundance exceeds 1/n. Selection opposes a strategy if its abundance is less than 1/n. It has been shown that for low mutation probability (u 1/N), selection favors strategy k if
For high mutation probability (u 1/N), selection favors strategy k if
For arbitrary mutation probability, u, the general expression for selection to favor strategy k is
where μ = Nu is the rate of mutation. Moreover, strategy k is more abundant than strategy j if and only if
Finally, the equilibrium abundance of strategy k is:
All these results of Antal et al. (2009c) hold for large, but finite population size, N.
In this paper we analyze the same questions but for games with mixed strategies. By analogy with Antal et al. (2009c), we use global mutation rates. Hence, if a mutation occurs, then a mixed strategy is chosen at random from a uniform distribution over the simplex Sn. For a mixed strategy p = (p1, … , pn) with 0 ≤ pi ≤ 1 and p1+…+pn = 1, let p be the probability density of players using strategy p. In the game of n pure strategies, one determines which strategy is favored by comparing its abundance to the average, 1/n. In the game with mixed strategies, the equivalent condition for a strategy to be favored is that its abundance is greater than the mean, .
We establish the following results. For low mutation (u 1/N), strategy p is favored by selection if and only if
where . Note that condition (6) gives a quadratic hypersurface in p. For high mutation (u 1/N), the condition for strategy p to be favored by selection is
Note that (7) gives a hyperplane in p. For any mutation probability u, strategy p is favored by selection if and only if
where μ = Nu is the mutation rate. Moreover, strategy p is favored over strategy q if and only if
Finally, the abundance (density) of strategy p is:
All these results hold for large, but finite population size, 1 N 1/u and for payoff functions A(·, ·) which are Riemann-integrable. They allow a characterization of games with mixed strategies, in the limit of weak selection and for global mutation.
To gain some qualitative understanding of our results, let us first discuss the conditions for low and high mutation. When the mutation is low, all players use the same strategy, until a mutant strategy appears. This mutant strategy either takes over or dies out before any new mutant appears. Thus, for low mutation, only two strategies are involved in a takeover at any time. This explains why the low mutation condition is an average over all pairwise comparisons, A(p, p)+A(p, q)−A(q, p)−A(q, q). Conversely, for high mutation rates, the densities of all strategies are close to 1 all the time. Hence, the payoff of strategy p is ∫Sn A(p, q)dq. Strategy p is favored if its payoff is greater than the average payoff of the population, which is ∫Sn ∫Sn A(r, q)drdq. This difference is equivalent to (7). Note that the condition for high mutation rate holds for any intensity of selection, while the condition for low mutation requires weak selection.
The rest of the paper is structured as follows. In Section 2 we derive the general conditions for strategy selection for any mutation rates. Games with two pure strategies are analyzed in Section 3, and the Hawk-Dove game is considered as an example. In Section 4 we present a few results for games with n pure strategies. Our findings are discussed in Section 5.
We consider a well-mixed population of N individuals. Each of them plays a mixed strategy, based on n pure strategies. The payoffs of the n pure strategies are given by the n × n matrix A = (aij). The payoff of a mixed strategy p = (p1, … ,pn) Sn playing another mixed strategy q = (q1, … , qn) Sn is given by the function
First, we discuss the n = 2 case and then turn to the general n case, which is completely analogous.
When n = 2, a mixed strategy is given by p = (p, p′) S2 where p is the probability to play strategy 1 and p′ is the probability to play strategy 2. Moreover, p + p′ = 1. To integrate over the simplex S2, we use the projection onto the line p′ = 0. This is a line segment of length 1 whose left end corresponds to p = 0 which is the pure strategy 2 and whose right end corresponds to p = 1 which is the pure strategy 1. We partition the [0, 1] interval into segments of length 1/m, as shown in Fig. 1, and allow only m types of mixed strategies. This means that strategy k = [k/m, (m − k)/m], plays the pure strategy 1 with probability k/m, and pure strategy 2 with probability (m−k)/m. We are interested in the stationary abundance of these strategies.
and use (5) to obtain the frequency of strategy k. Here, for simplicity we continue to write A(p, q) for what in fact is the function A((p, 1 − p), (q, 1 − q)).
Taking the limit m → ∞, and given our assumption that the payoff function is Riemann-integrable, the sums in (5) converge to the integrals
For general n, completely analogously to the n = 2 case, we first partition each coordinate of the simplex Sn into m equal parts, and use the corresponding roughly (asymptotically) (n − 1)!mn−1 discrete strategies. We can use the pure strategy formulas (1), (2), (5) for this problem, where each sum breaks into n − 1 sums, corresponding to the n − 1 coordinates. Taking the limit m → ∞ we obtain (6), (7), and (10). Hence we conclude as before that the condition for strategy p to be favored by selection is indeed (8).
Consider the game between two strategies A and B. The payoff matrix is given by
In addition to the two pure strategies, we have mixed strategies p = (p, 1 − p) where p is the probability to play strategy A and 1 − p is the probability to play strategy B. The payoff of strategy p against strategy q is given by (11):
Then condition (6) that strategy p is favored by selection, in the limit of low mutation becomes
For p = (0, 1), which is the pure strategy B, we obtain:
This condition is equivalent to ρA < 1/N, where ρA is the fixation probability of an A mutant in a population of B individuals (Nowak et al., 2004; Ohtsuki, 2007; Imhof & Nowak, 2006; Lessard & Ladret, 2007). The above condition can thus be interpreted as follows: in a 2 × 2 game with mixed strategies, pure strategy B is favored if pure strategy A is not an advantageous mutant in a pure B population, ρA < 1/N. It is the so-called ‘1/3-rule’: A is an advantageous mutant when invading B if the fitness of A at a frequency of 1/3 is greater than the fitness of B. By symmetry, for p = (1, 0), which is the pure strategy A, (16) becomes 2a + b > 2c + d. This condition is equivalent to ρB < 1/N. Thus, in a 2 × 2 game with mixed strategies, A is favored if B is not an advantageous mutant, ρB < 1/N.
Now we can ask when strategy A is better than strategy B. For this we need to compare A to B; we conclude that A is favored over B if and only if a+b > c+d. This is the usual risk-dominance condition of strategy A over strategy B (Kandori et al., 1993; Nowak, 2006; Antal et al., 2009a). Thus, adding the mixed strategies does not change the ranking of the two pure strategies. If a + b > c + d then A is more abundant than B in the stationary distribution of the mutation-selection process with or without mixed strategies. This result will be generalized for games with n pure strategies in Lemma 2.
Next we analyze the case of high mutation. Condition (7) becomes
As before, the condition that selection favors p = (1, 0), which is the pure strategy A is
This is the usual condition for risk dominance of strategy A over strategy B. By symmetry, the condition that pure strategy B is favored by selection is a + b < c + d, which is the usual risk-dominance condition of B over A. Note that for high mutation, adding mixed strategies does not change the selection conditions for the pure strategies. This result will be generalized for games with n pure strategies in Lemma 3. Moreover, for high mutation rate, if a + b > c + d, then all strategies with p > 1/2 are favored. Thus, if A is risk dominant over B, all mixed strategies that play A with higher probability than B are favored. By symmetry, if B is risk-dominant over A, i.e. a + b < c + d, then strategies which are more likely to play B (i.e. p < 1/2) are favored.
Note that the low mutation condition is a quadratic equation in p. Hence, the best (or worst) strategy can be mixed. For high mutation, the condition is linear in p. Hence one of the two pure strategies is the best (most favored by selection) and the other pure strategy is the worst. In general, for any mutation rate u, the condition for strategy p = (p, 1 − p) to be favored by selection can be deduced from (8) to be
This condition describes a parabola. The mixed strategy corresponding to the tip of the parabola is = (, 1 − ) where
We are interested in the part of this parabola supported by the interval [0, 1]. Based on where is situated relative to the interval [0, 1] as well as on the sign of the coefficient of p2 in (20), the relevant part of the parabola can have several shapes, as shown in Fig 2.
For a non-degenerate (a ≠ d) payoff matrix (14), a > d can always be achieved by renaming the strategies. Moreover, under weak selection adding an overall constant to all elements or multiplying all elements by the same positive number does not affect the dynamics; it only changes the intensity of selection δ. Hence we can define an equivalent payoff matrix (see also Antal et al. (2009b))
with only two parameters
This means that each specific game given by numerical values in (14) corresponds to a point in the (α, β) plane. A class of games then corresponds to a particular region in this plane. In Figure 3.1 we show a few common games in the (α, β) plane. For example all Prisoner’s Dilemma games belong to the second quarter plane α < 0, β > 0. In the Hawk Dove game we first have to flip the Hawk and the Dove strategies to obtain a matrix with a > d. Then the corresponding region is 0 < α < 1 and β > 0. Moreover, the simplified Hawk-Dove game (26), which is the subject of the next subsection, is given by α = / and β = 1 − /. Hence, in the (α, β) plane, the simplified Hawk-Dove game is given by the line segment β = 1 − α, with 0 < α < 1.
Now we present the same analysis as before for mixed strategies in 2 × 2 games using the simplified payoff matrix. For any finite mutation rate, μ, we have three different regions in the (α, β) parameter space, as depicted in Fig. 4. The best strategy is
In the mixed case the optimal value of p is given by
In the small mutation rate limit μ → 0, the mixed phase extends to the quarter-plane α > 0, β > 0. For large mutation rates μ → ∞, however, the mixed phase disappears.
As an example, we consider the Hawk-Dove game (Maynard Smith, 1982; Houston & McNamara, 1988, 1991; Mesterton-Gibbons, 1994; Killingback & Doebeli, 1996; Nakamaru & Sasaki, 2003), which is given by the payoff matrix
The two strategies are hawk (H) and dove (D). While hawks escalate fights, doves retreat when the opponent escalates. The benefit of winning the fight is . The cost of injury is . We have 0 < < . If two hawks meet, then the fight will escalate. One hawk wins, while the other is injured. Since both hawks are equally strong, the probability of winning or losing is 1/2. Hence, the expected payoff for each of them is ( − )/2. If a hawk meets a dove, the hawk wins and receives payoff , while the dove retreats and receives payoff 0. If two doves meet, there is no injury, but one of them wins eventually. The expected payoff is /2. Since < , neither strategy is a Nash equilibrium. If everyone else plays hawk, then it is better to play dove and vice versa. Hence, hawks and doves can coexist. At the stable equilibrium, the frequency of hawks is given by /.
Next we consider mixed strategies that play hawk with probability p and dove with probability 1 − p. The strategy space is given by the interval [0, 1]. The pure strategies are p = 0 (D) and p = 1 (H). The evolutionarily stable strategy (ESS) is given by the mixed strategy that plays hawk with probability p* = /. No other strategy can invade this ESS under deterministic evolutionary dynamics.
We can now study the Hawk-Dove game with the methods developed in the previous section. Using (20), we can write the condition that the mixed strategy p = (p, 1 − p) is favored
The first observation is that the ESS strategy p* = (p*, 1 − p*) is always favored. We can see this by substituting p* into (27) and noting that it is always positive. However, p* is not always the best strategy. In (27) the leading coefficient is − < 0; hence the parabola is concave, as in Fig. 2 (a–e). Thus, the best strategy is either the tip of the parabola or one of the two pure strategies. The tip of the parabola is given by
The best strategy is the same as the ESS when = p*. From (28), this happens when either μ → 0 or / = 1/2. Otherwise, we can compare the tip of the parabola to the ESS and we find that > p* if and only if / > 1/2, while < p* if and only if / < 1/2. Thus if the ESS leans more towards one pure strategy, increasing the mutation rate pushes the best strategy even closer towards that pure strategy. An intuitive explanation is as follows. The mixed ESS, p*, dominates every other strategy p, in the sense that A(p*, p*) = A(p, p*) but A(p*, p) > A(p, p). Therefore it is intuitively obvious that the mixed ESS is the best strategy for low mutation, μ → 0. For high mutation, μ → ∞, the best strategy must be one of the pure strategies, because of the linearity of the payoff functions. Thus for any positive mutation rate the best strategy should be between the mixed ESS and one of the pure strategies.
We will now proceed with the analysis of the game. For low mutation, μ → 0, the parabola looks like Fig. 2 (b), (c) or (d). The favored strategies always form a neighborhood around the best strategy, p* = . If / < 1/3 then this neighborhood includes pure dove, p = 0. If 1/3 < / < 2/3 then this neighborhood includes neither pure dove nor pure hawk. If 2/3 < / then this neighborhood includes pure hawk, p = 1.
For high mutation, the condition for strategy p = (p, 1 − p) to be favored is
Therefore, If / < 1/2 then the best strategy is pure dove, p = 0. If 1/2 < / then the best strategy is pure hawk, p = 1. The mixed ESS, p*, is always favored but is never the best strategy.
When mutation is neither low nor high, we have a mixture of the above two cases. All situations of Fig. 2 (a–e) are possible. The mixed ESS, p*, is always favored, but is never the best strategy, unless / = 1/2. The tip of the parabola, , is given by (28).
Note that lower / values favor pure doves, while higher / values favor pure hawks. This makes sense, because / is the ratio of the benefit gained from winning the fight over the cost of possible injury. If this ratio is small, then it is better not to escalate fights (which means to play dove). It is interesting to note that the mixed ESS of classical game theory, p*, is always one of the strategies that is favored by selection, but is never the best strategy for any positive mutation rate μ > 0.
These results agree with the general picture of Section 3.1. Note that (after flipping the H and D strategies), the simplified Hawk-Dove game (26) corresponds to the segment β = 1 − α, with 0 < α < 1 on the plane of Fig. 3. Then the region where mixed strategies are favored is described by (24), and it can be seen on Figure 4.
In order to check the validity of our theory we have performed simulations of our stochastic model. We measured the abundance (density) in the stationary state for a Hawk-Dove game and compared it to the theoretical asymptotic result of (10). In Figure 5 we find good agreement between theory and simulations for benefit = 2 and cost = 5. Note that the results are quite close to the asymptotic limit already for N = 10 and Nδ = 0.2.
In this section we present a few results which characterize the general behavior of pure strategies. We will compare the behavior of pure strategies, k, in a pure n × n game with the behavior of pure strategies, ek in a mixed n × n game. In an n × n game with only pure strategies, the condition that strategy k is favored is given by (1).
In the limit of weak selection and low mutation in an n × n game with mixed strategies, the condition that pure strategy ek is favored is
The proof is by induction. From this lemma, it is immediate that Li − Lj = ei − ej. Hence, we can make the following statement.
In the limit of weak selection and low mutation, the relative ranking of pure strategies in a game with mixed strategies is the same as that in a game with just the pure strategies. Hence, adding mixed strategies does not change the relative ranking of the pure strategies.
A direct proof for this statement is immediate. To find the difference ei − ej, one simply needs to integrate:
We note that: A(ei, ei) = aii, A(ei, q) = Σk qkaik and A(q, ei) = Σk qkaki, where qn = 1 − q1 − … −qn−1. All these functions are linear and hence easily integrable over the simplex Sn. For example, we have .
In the limit of weak selection and high mutation we have
Hence, the condition for a pure strategy k to be favored in a mixed game is the same as in the pure game.
The proof is again immediate by induction. Note that this lemma is stronger than Lemma 2. In the high mutation case, the conditions that the pure strategies are favored are exactly the same for both the mixed and the pure game. In the low mutation case, only the relative behavior of the pure strategies is the same between the two cases. The actual conditions for selection are different between the mixed game and the pure game.
Finally, for any mutation, the condition for a strategy to be selected is a linear combination between the low mutation and high mutation behaviors. Hence, combining the results in Lemmas 2 and 3 we conclude that:
In the limit of weak selection, for any mutation rate, the relative ranking of the pure strategies is not affected by the addition of mixed strategies.
We have presented a method to study games with mixed strategies in finite sized populations under stochastic evolutionary dynamics. We assume that individuals reproduce at a rate that is proportional to the payoff earned in the game. Reproduction is subject to mutation. We characterize the average abundance of each mixed strategy in the mutation-selection equilibrium. Strategies that are favored by selection are more abundant than average. Strategies that are opposed by selection are less abundant than average. If strategy A is more abundant than strategy B in the mutation-selection equilibrium, then we say that A is favored over B.
Our results allow a simple and straightforward characterization of games with mixed strategies. The crucial condition for low mutation rate (6) is based on averaging over all pairwise comparisons. The crucial condition for high mutation rate (7) is based on evaluating the fitness of each strategy at the point where all strategies are equally abundant. Intriguingly, the condition for any mutation rate is simply a linear combination of the conditions for low and high mutation rates. We have no intuitive explanation why this condition is linear in the mutation rates.
Although our results provide interesting new insights into evolutionary game dynamics, our approach has limitations. (i) We can only consider the case of weak selection. This means the fitness of an individual is 1 + δ · payoff where δ is small. One way to interpret weak selection is to say that the game under consideration provides only a small contribution to overall fitness. Another way to explain weak selection is the following: when choosing to update their strategies, players make random choices that are weakly affected by payoff; this means players do not really know what is going on. Both of these aspects are fairly realistic and should apply in many situations. Hence weak selection is an important case to be studied. (ii) We consider only global mutation. A new mutant is picked at random from a uniform distribution of the strategy space, which is given by the simplex Sn. The location of the mutant in strategy space does not depend on the parent. (iii) We calculate the average abundance of each mixed strategy in the stationary distribution of the mutation-selection process, but we do not in fact calculate the stationary distribution over the entire state space. Our stochastic process has state space : for each of the N players we must specify which strategy (in Sn) is being used. Instead of calculating the stationary distribution over , we calculate the average abundance of each mixed strategy. With this method though, we are able to decide whether a strategy is favored or not by selection in the stationary state.
We are grateful to two anonymous referees for their extremely helpful comments which improved our paper. This work was supported by the John Templeton Foundation, the National Science Foundation/National Institutes of Health joint program in mathematical biology (NIH Grant R01GM078986), the Japan Society for the Promotion of Science, the China Scholarship Council and J. Epstein.