Evolution and the maintenance of polymorphism under the multilocus Levene model with soft selection are studied. The number of loci and alleles, the number of demes, the linkage map, and the degree of dominance are arbitrary, but epistasis is absent or weak. We prove that, without epistasis and under mild, generic conditions, every trajectory converges to a stationary point in linkage equilibrium. Consequently, the equilibrium and stability structure can be determined by investigating the much simpler gene-frequency dynamics on the linkage-equilibrium manifold. For a haploid species an analogous result is shown. For weak epistasis, global convergence to quasi-linkage equilibrium is established. As an application, the maintenance of multilocus polymorphism is explored if the degree of dominance is intermediate at every locus and epistasis is absent or weak. If there are at least two demes, then arbitrarily many multiallelic loci can be maintained polymorphic at a globally asymptotically stable equilibrium. Because this holds for an open set of parameters, such equilibria are structurally stable. If the degree of dominance is not only intermediate but also deme independent, and loci are diallelic, an open set of parameters yielding an internal equilibrium exists only if the number of loci is strictly less than the number of demes. Otherwise, a fully polymorphic equilibrium exists only nongenerically, and if it exists, it consists of a manifold of equilibria. Its dimension is determined. In the absence of genotype-by-environment interaction, however, a manifold of equilibria occurs for an open set of parameters. In this case, the equilibrium structure is not robust to small deviations from no genotype-by-environment interaction. 
In a quantitative-genetic setting, the assumptions of no epistasis and intermediate dominance are equivalent to assuming that in every deme directional selection acts on a trait that is determined additively, i.e., by nonepistatic loci with dominance. Some of our results are exemplified in this quantitative-genetic context.
To achieve a proper understanding of the evolutionary dynamics of phenotypic traits, it is essential to study the effects of selection on multiple linked or unlinked loci. Because many species are subdivided into colonies, or demes, and selection varies geographically, the consequences of migration and spatially varying selection need to be taken into account. Each of these aspects has been studied extensively but mostly separately. Multilocus selection and the maintenance of polygenic variation in a panmictic population inhabiting a constant, homogeneous environment have been prime topics of research during the past decades; for reviews or recent treatments of general models, see Karlin (1978), Turelli and Barton (1990), Lyubich (1992), Nagylaki (1992), Zhivotovsky and Gavrilets (1992), Christiansen (1999), Bürger (2000), Kirkpatrick et al. (2002), and Ewens (2004).
Spatially varying selection in subdivided populations was intensively investigated as well, mainly for the single-locus case. A particularly prominent role has been played by the Levene (1953) model, which assumes a finite number of demes, selection within demes, and individuals dispersing independently of their deme of origin. As a consequence, there is no population structure despite geographically variable selection. This property makes it more amenable to mathematical analysis than general migration–selection models. A good overview of the literature can be acquired from the articles of Karlin (1977, 1982) and Nagylaki and Lou (2008), and the pertinent chapters in the books of Nagylaki (1992) and Christiansen (1999).
Work on multilocus selection in subdivided populations is relatively scarce. There is early work by Li and Nei (1974), who showed that even in the absence of epistasis and dominance, migration–selection balance in two demes can maintain linkage disequilibrium (see also Christiansen and Feldman, 1975). Zhivotovsky et al. (1996) used a multilocus Levene model to study the evolution of phenotypic plasticity. Wiehe and Slatkin (1998) investigated a haploid Levene model in which linkage disequilibrium is caused by epistasis. Christiansen (1999) derived sufficient conditions for the protection of gametes in a multilocus context. More recently, Spichtig and Kawecki (2004) and van Doorn and Dieckmann (2006) performed numerical studies on the maintenance of multilocus polymorphism in two demes for arbitrary migration and for the Levene model, respectively. Roze and Rousset (2008) derived recursions for the allele frequencies and for various types of genetic associations in a multilocus infinite-island model. Barton (2010) explored certain aspects of speciation using a generalized, haploid multilocus Levene model that admits habitat preferences.
The notorious complexity of the evolutionary dynamics of multilocus systems as well as the richness of evolutionary phenomena in subdivided populations leave little hope for a general theory combining both aspects. However, considerable progress has been made recently for important limiting or special cases. These include weak migration and weak selection (Bürger, 2009a,b) and the Levene model without epistasis (Nagylaki, 2009b; Bürger, 2009c). If either migration or selection is weak, the evolutionary dynamics are perturbations of relatively simple limiting dynamics which are amenable to mathematical analysis. In both cases, global convergence of trajectories to equilibria in quasi-linkage equilibrium could be proved under natural, quite general assumptions (Bürger, 2009a). If migration is weak, this conclusion requires the assumption of weak epistasis; if selection is weak, then equilibrium states are also spatially quasi-homogeneous.
These results were applied in Bürger (2009b) to study the maintenance of multilocus polymorphism if epistasis is weak or absent and dominance is intermediate. In a panmictic population, polymorphism is impossible under such conditions. For strong migration, however, arbitrarily many recombining loci can be maintained polymorphic if there are at least two demes, and this holds for an open set of parameters. By contrast, for weak migration, the maximum number of loci that can be maintained polymorphic on an open set of parameters equals the number of demes.
Nagylaki (2009b) investigated evolution under the multiallelic multilocus Levene model without epistasis. He demonstrated that geometric-mean fitness depends only on the vector of gene frequencies and is monotone increasing except at equilibria. Therefore, the vector of gene frequencies converges generically, i.e., for almost all parameters and initial data, to a gene-frequency equilibrium that is a local maximum of geometric-mean fitness. In addition, Nagylaki proved global convergence to linkage equilibrium if there are either only two loci or there are multiple loci without dominance. He conjectured that the set of gene-frequency equilibria that are in linkage equilibrium is globally attracting, hence that global convergence to linkage equilibrium always occurs. For deme-independent degree of intermediate dominance (DIDID) he showed that, generically, the number of diallelic loci that can segregate at equilibrium is at most one less than the number of demes.
In Bürger (2009c), the equilibrium and stability structure of the diallelic two-locus Levene model with two demes was derived in considerable generality. Epistasis was ignored but dominance admitted. Absence of genotype-by-environment (G × E) interaction was shown to lead to nongeneric, and nonrobust, properties.
In this paper, we shall prove Nagylaki’s conjecture for multiple multiallelic loci under the generic assumption that gene-frequency equilibria are isolated. As a consequence, evolution in the Levene model without epistasis can be fully understood by studying the much simpler gene-frequency dynamics on the linkage-equilibrium manifold for which geometric-mean fitness is monotone increasing along nonconstant solutions. More generally, we establish convergence of trajectories to a stationary point in quasi-linkage equilibrium if epistasis is sufficiently weak. Analogous, and even stronger, results hold if selection acts on haploids. We apply these results to investigate the maintenance of multilocus polymorphism in the Levene model for diallelic nonepistatic loci and, especially, for a quantitative trait that is under linear selection in every deme.
In Section 2, we formulate the multilocus Levene model and summarize the results on the gene-frequency dynamics that are needed subsequently. Section 3 is devoted to the study of convergence to linkage equilibrium in the absence of epistasis. The main result is Theorem 3.1, which states global convergence to stationary points in linkage equilibrium under mild, generic conditions. The proof is based on the simple observation that, at gene-frequency equilibrium, the dynamics reduces to a panmictic dynamics without epistasis. The theorem follows by employing the results of Nagylaki (2009b) on convergence of gene frequencies in the Levene model and those of Kun and Lyubich (1979, 1980) on convergence in the panmictic case without epistasis. Corollary 3.3 is most useful for applications because it formulates several of the conclusions that can be deduced for the full dynamics by analyzing the much simpler gene-frequency dynamics at linkage equilibrium. In Section 3.2, geometric convergence to a unique, globally asymptotically stable equilibrium is established under the assumption of DIDID (Theorem 3.5). Among other cases, this includes absence of G × E interaction. In Section 4, the haploid Levene model is studied and geometric convergence to a unique, globally asymptotically stable equilibrium is proved.
The maintenance of multilocus polymorphism is investigated in Section 5. Result 5.1 is a slight extension of Theorem 2.2 in Bürger (2009b) which shows that in the Levene model an arbitrary number of loci can be polymorphic at a globally asymptotically stable equilibrium, and this holds for an open set of parameters. The main results are Theorems 5.3 and 5.5. They apply if the degree of dominance is intermediate and deme independent and show that this assumption considerably restricts the possibility of multilocus polymorphism relative to intermediate dominance that varies among demes. The first theorem establishes that Nagylaki’s (2009b) generic upper bound for the number of segregating loci at equilibrium, one less than the number of demes, is assumed on an open set of parameters. The second theorem shows that in the nongeneric case, in which an internal equilibrium with more polymorphic loci than this bound exists, there is a manifold of equilibria with generic dimension , where is the number of loci. This highly degenerate case occurs for instance in the absence of G × E interaction.
Section 6 applies these results to a quantitative-genetic model with linear directional selection in each deme. Corollary 6.3, which summarizes the main results, should serve as a warning when studying models under highly specialized assumptions. It shows that natural assumptions, such as absence of genotype-by-environment interaction, may lead to nongeneric model behavior. Indeed, the analysis of such a simple degenerate model (two diallelic loci, two demes, no dominance and linear fitnesses; see Bürger, 2009c) and the ensuing discussions with Thomas Nagylaki initiated the recent series of papers on multilocus migration–selection models by Nagylaki and the author.
In Section 7, weak epistasis is studied. In Section 7.1, we establish a perturbation result that yields global convergence of trajectories to quasi-linkage equilibrium. In Section 7.2, some of the results of Section 5 on the maintenance of polymorphism are extended to weak epistasis. In Section 8, we recapitulate and discuss our main findings, and mention some open problems.
We briefly introduce the multilocus Levene model and summarize some basic results from Nagylaki (2009b), hereafter abbreviated as N09b, that will be needed. There the model is developed in detail.
We suppose that there are diploid loci with alleles () at locus . We designate the set of all loci by and the set of all demes by , where is assumed. The relative size of deme is denoted by , hence . (Whenever no summation range is indicated, it is assumed to be over all admissible values; here, .) We shall consistently use the letters for loci, for gametes, and for demes (see Table 1 for a glossary of symbols). The linkage map is arbitrary, except for the assumption that all recombination probabilities are positive.
Throughout, we assume the Levene model with soft selection, which means that population regulation by selection occurs within demes. This assumption induces frequency-dependent selection. Because in the Levene model migration rates are independent of the deme of origin, there is no population structure, and gamete and gene frequencies before selection are deme independent (Levene, 1953; Nagylaki, 1992). We denote the frequency of gamete , which carries allele at locus , by , and the frequency of allele by . Let be the fitness of the diploid genotype in deme . Then the marginal fitness of gamete and the mean fitness of the population in deme are
respectively. Further, let denote the probability that a parent of genotype produces a gamete during meiosis. Because there is soft selection, adult dispersal, and random mating within demes, the gamete frequencies evolve according to (N09b, Eq. (2.42))
where the prime, ′, signifies the next generation. We note that these recursions are obtained whether dispersion precedes recombination or not (Bürger, 2009a). Moreover, they admit the classical interpretation that intrademic selection is followed by random mating in the entire population (cf. Nagylaki and Lou, 2008).
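The recursion described above can be illustrated for the simplest multilocus case. The following is a minimal sketch, not the paper's general formulation: it assumes two diallelic loci, two demes, the classical two-locus form of the recombination term, and hypothetical fitness contributions in which the locally favoured allele is partially dominant. All names (`levene_two_locus_step`, `u_a`, `u_b`) and all numerical values are illustrative assumptions.

```python
import numpy as np

# Gametes are ordered (AB, Ab, aB, ab); allele index 0 is the capital letter.
A_IDX = np.array([0, 0, 1, 1])  # allele carried at locus A
B_IDX = np.array([0, 1, 0, 1])  # allele carried at locus B
EPS = np.array([1.0, -1.0, -1.0, 1.0])  # sign of the recombination term


def linkage_disequilibrium(p):
    """Two-locus linkage disequilibrium D = p_AB * p_ab - p_Ab * p_aB."""
    return p[0] * p[3] - p[1] * p[2]


def levene_two_locus_step(p, u_a, u_b, c, r):
    """One generation: soft selection within demes, then pooling of gametes.

    u_a[alpha], u_b[alpha] are symmetric 2x2 arrays of single-locus fitness
    contributions in deme alpha; c holds relative deme sizes; r is the
    recombination probability between the two loci.
    """
    d = linkage_disequilibrium(p)
    p_next = np.zeros(4)
    for alpha, c_alpha in enumerate(c):
        # No epistasis: genotype fitness is the sum of locus contributions.
        w = u_a[alpha][np.ix_(A_IDX, A_IDX)] + u_b[alpha][np.ix_(B_IDX, B_IDX)]
        w_marg = w @ p          # marginal fitness of each gamete
        w_bar = p @ w_marg      # mean fitness in deme alpha
        p_next += c_alpha * (p * w_marg - EPS * r * w[0, 3] * d) / w_bar
    return p_next


# Hypothetical parameters: selection in opposite directions in the two demes.
u_a = np.array([[[0.6, 0.55], [0.55, 0.3]], [[0.3, 0.55], [0.55, 0.6]]])
u_b = np.array([[[0.5, 0.45], [0.45, 0.2]], [[0.2, 0.45], [0.45, 0.5]]])
c = np.array([0.5, 0.5])

p = np.array([0.5, 0.1, 0.05, 0.35])
d_start = linkage_disequilibrium(p)
for _ in range(2000):
    p = levene_two_locus_step(p, u_a, u_b, c, r=0.2)
d_final = linkage_disequilibrium(p)
print(d_start, d_final, p)
```

With these assumed values the run starts far from linkage equilibrium, so it can be used to watch how the linkage disequilibrium behaves over time in a nonepistatic example.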
The state space is the simplex of probability vectors of length , where is the number of gametes. We write for the vector of gametic frequencies. The vector consisting of all gene frequencies (for every and every ) is denoted by , where
is the space of gene frequencies, or the gene-frequency space, for short.
For the rest of this section, we assume absence of epistasis. Then we can assign fitness contributions to single-locus genotypes. We denote the contributions at locus in deme by , where and refer to the alleles carried by the genotype at locus , and assume . Thus, we posit that the fitness of genotype is given by
An easy calculation shows that the mean fitness in deme becomes
is the fitness contribution of allele in deme . Importantly, depends only on the vector of gene frequencies, but not on the vector of gamete frequencies.
We introduce the compact notation
and note that , , and are nonnegative functions of , and
With this notation, the dynamics of gene frequencies (Eq. (2.48) in N09b) can be written as
as the set of gametic frequencies at gene-frequency equilibrium, or the set of gene-frequency equilibria for short, and
as the linkage-equilibrium manifold. Further, let
denote the geometric-mean fitness and define
The following result will play a fundamental role. It depends crucially on the fact that and are functions only of rather than of .
In the absence of epistasis, the dynamics (2.2) of gamete frequencies has the following properties:
Statement (a) about is Theorem 3.1 in N09b. The statement about is a trivial consequence. Statement (b) is Theorem 3.3 in N09b.
We call a property generic if it holds in an open dense set of full measure.
Simple important consequences of statement (a) are (Remark 3.2 in N09b):
If there is linkage equilibrium, i.e., on , the recursions (2.10) for the gene frequencies simplify to
and define and as the covariance matrix with entries . Then, for every locus , we obtain
and are Lyapunov functions for (2.16), and for (2.17), and the asymptotically stable equilibria of (2.16) are the local maxima of . The internal equilibria of (2.16) are exactly the internal critical points of , or of , on , and this holds for every lower-dimensional subsystem. They are precisely the solutions of (2.15). Comparison with Result 2.1 yields that is an (asymptotically stable) equilibrium of (2.16) if and only if it gives rise to an (asymptotically stable) gene-frequency equilibrium of (2.2).
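The Lyapunov property of geometric-mean fitness for the gene-frequency dynamics at linkage equilibrium can be checked numerically. The sketch below assumes the standard form of that recursion, in which the marginal fitness of an allele is its own marginal contribution plus the mean contributions of all other loci. The function names, the randomly drawn fitness contributions, and the deme sizes are hypothetical choices made only for illustration.

```python
import numpy as np


def locus_means(freqs, u):
    """Marginal allele contributions and mean contributions per locus and deme."""
    marg = [u_n @ p_n for u_n, p_n in zip(u, freqs)]       # each (demes, alleles)
    mean = [m_n @ p_n for m_n, p_n in zip(marg, freqs)]    # each (demes,)
    return marg, mean


def geometric_mean_fitness(freqs, u, c):
    """Product over demes of mean fitness raised to the relative deme size."""
    _, mean = locus_means(freqs, u)
    w_bar = np.sum(mean, axis=0)
    return float(np.prod(w_bar ** c))


def le_gene_frequency_step(freqs, u, c):
    """One generation of the gene-frequency dynamics at linkage equilibrium."""
    marg, mean = locus_means(freqs, u)
    w_bar = np.sum(mean, axis=0)                           # mean fitness per deme
    new_freqs = []
    for p_n, m_n, mean_n in zip(freqs, marg, mean):
        # Marginal allele fitness: own contribution plus the other loci's means.
        w_allele = m_n + (w_bar - mean_n)[:, None]
        new_freqs.append(p_n * (c @ (w_allele / w_bar[:, None])))
    return new_freqs


# Hypothetical example: three loci with 2, 3, and 2 alleles in three demes.
rng = np.random.default_rng(1)
alleles = (2, 3, 2)
c = np.array([0.5, 0.3, 0.2])
u = []
for k in alleles:
    x = rng.uniform(0.1, 1.0, size=(len(c), k, k))
    u.append((x + x.transpose(0, 2, 1)) / 2)               # symmetric contributions

freqs = [rng.dirichlet(np.ones(k)) for k in alleles]
history = [geometric_mean_fitness(freqs, u, c)]
for _ in range(500):
    freqs = le_gene_frequency_step(freqs, u, c)
    history.append(geometric_mean_fitness(freqs, u, c))
increments = np.diff(history)
print(history[0], history[-1], increments.min())
```

The random contributions impose no dominance restrictions, so the example probes monotonicity of geometric-mean fitness only, not the maintenance of polymorphism.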
We designate equilibrium points of (2.16) by . Every gives rise to the set
of gene-frequency equilibria of (2.2). We write , , and if , , and , respectively, are evaluated at .
We will need the following two assumptions on the gene-frequency equilibria:
Clearly, (2.20) is satisfied if no single-locus genotype is lethal everywhere. A simple consequence is . We recall that we also assume
These three conditions hold generically.
In the absence of epistasis, generic global convergence to , hence to a stationary point in linkage equilibrium, was proved in N09b if there are two multiallelic loci (Theorem 4.6) or if the number of loci is arbitrary and there is no dominance (Theorem 4.14). Theorem 4.13 in N09b states (local) asymptotic stability of for multiple multiallelic loci. However, as Professor Nagylaki informed me, the induction proof of this theorem has a gap that may require extensive calculations to fill. (The gap does not affect the validity of the proof of his Theorem 4.14 for no dominance.) Motivated by this failure, I found a simple alternative method that is more general and yields global convergence to linkage equilibrium for the general multiallelic multilocus model without epistasis. It is based on a simple observation and employs the results in N09b on the gene-frequency dynamics (summarized in Result 2.1) as well as the result of Kun and Lyubich (1979, 1980) on convergence to equilibrium in the panmictic multilocus model without epistasis. Throughout this section, we assume absence of epistasis.
The crucial observation is the following. Because there is no epistasis, is a function of the gene frequencies only. For a given gene-frequency equilibrium , we write and obtain from (2.4) and (2.8a):
On , every is constant because every is. Therefore, on , the dynamics (2.2) of gamete frequencies becomes
is constant. Hence, (3.2) is equivalent to a panmictic multilocus selection dynamics without epistasis. It describes the (panmictic) evolution of linkage disequilibria under selection. Global convergence of trajectories to an equilibrium point in linkage equilibrium now follows from Theorems 1 and 2 in Kun and Lyubich (1979); for a detailed treatment in English, see Section 9.6 in Lyubich (1992). (Application of their result requires the assumptions (2.20) and (2.21).) Because the equilibria of the gene-frequency dynamics are isolated, every converges as to some . Hence, for every solution of (2.2), there exists a such that the limit points of are contained in (Remark 2.2). Now LaSalle’s invariance principle (LaSalle, 1977, p. 10) yields the following theorem:
Theorem 3.1 holds more generally if every trajectory converges to an equilibrium point . However, if a manifold of equilibria exists in , convergence has not been proved. A proof would require a generalization of the argument that establishes Theorem 9.6.3 in Lyubich (1992), which seems very challenging.
Because we have shown convergence to linkage equilibrium under generic conditions, the full dynamics of gamete frequencies as well as the equilibrium and stability structure can be determined by studying the much simpler gene-frequency dynamics (2.16). To formulate the result precisely, we denote the vector of all linkage disequilibria in the -locus system by , where is the linkage disequilibrium defined in Eq. (4.47) of N09b. Then we can write
Under the assumptions of Theorem 3.1 the following hold:
Corollary 3.3(c) applies in particular if there is no dominance (cf. Theorem 4.14 in N09b). For diallelic loci, it applies whenever fitness contributions at every locus are sublinear, i.e., if for every and every ,
holds. This assumption, meaning that there is either no dominance or the beneficial allele is (partially) dominant, implies that every is concave (Bürger, 2009c). Hence, is concave.
Here, we make an assumption about the genetic architecture. Following N09b, we say there is deme-independent degree of intermediate dominance (DIDID) if
holds for constants such that
for every , every , and every pair . In particular, .
Obviously, DIDID covers complete dominance or recessiveness ( or if ), and no dominance (), but not multiplicativity. We also note that DIDID includes the biologically important case of absence of genotype-by-environment interaction. In general, is not concave under DIDID. Nevertheless, Theorem 3.14 in N09b establishes that under DIDID there exists exactly one stable gene-frequency equilibrium (point or manifold), and it is globally attracting. If an internal gene-frequency equilibrium exists, it is globally asymptotically stable.
Let us assume DIDID and that is an internal gene-frequency equilibrium. It can be shown that
for which convergence to linkage equilibrium is well known (Geiringer, 1944). Of course, it is sufficient to consider internal gene-frequency equilibria because, otherwise, restriction to the subsimplex that supports yields the result. Under pure recombination, the linkage-equilibrium manifold is globally attracting at a uniform geometric rate. For generic initial conditions the rate of approach is , where is the smallest two-locus recombination rate (see Lyubich, 1992; Nagylaki, 1993; Nagylaki et al., 1999). Therefore, the assumption , which is weaker than (2.21), is sufficient to establish convergence to linkage equilibrium. We summarize these results as follows.
It would be interesting to identify the dominance patterns that lead to (3.7), hence, to (3.8). The set of these dominance patterns is a proper subset of all dominance patterns because for the internal equilibrium computed in Theorem 4.3 in Bürger (2009c), in general, the first locus does not satisfy (3.7).
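The pure recombination process invoked in this section can be made concrete in the two-locus, two-allele case, where linkage disequilibrium decays by the factor one minus the recombination rate per generation while gene frequencies stay fixed. The sketch below covers only this special case; the function names and the numerical values are illustrative assumptions.

```python
import math

import numpy as np

EPS = np.array([1.0, -1.0, -1.0, 1.0])  # gametes ordered (AB, Ab, aB, ab)


def linkage_disequilibrium(p):
    """Two-locus linkage disequilibrium D = p_AB * p_ab - p_Ab * p_aB."""
    return p[0] * p[3] - p[1] * p[2]


def recombination_step(p, r):
    """One generation of random mating and recombination without selection."""
    return p - EPS * r * linkage_disequilibrium(p)


r = 0.1
generations = 20
p0 = np.array([0.4, 0.1, 0.2, 0.3])
p = p0.copy()
for _ in range(generations):
    p = recombination_step(p, r)

d0 = linkage_disequilibrium(p0)
d_t = linkage_disequilibrium(p)
freq_a_start, freq_a_end = p0[0] + p0[1], p[0] + p[1]   # frequency of allele A
print(d0, d_t, d0 * (1 - r) ** generations)
```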
Here we consider a species that is haploid but reproduces sexually with recombination. General properties of the multilocus dynamics with viability selection in a panmictic population of haploids were derived by Kirzhner and Lyubich (1997). Various aspects of the one-locus haploid Levene model were studied by Strobeck (1979); see also Nagylaki and Lou (2008). Wiehe and Slatkin (1998) and Barton (2010) investigated haploid multilocus Levene models with certain forms of epistasis. As in the above-treated diploid case, and as in Wiehe and Slatkin (1998) and in one of Barton’s (2010) models, we assume that the life cycle consists of viability selection, dispersal, and recombination. Moreover, soft selection is assumed.
All unexplained notation is as in Section 2. In haploids, the constant fitness is assigned to gamete in deme . The mean fitness in deme is simply . Because recombination may occur between haplotypes originating from different demes, the recursion relations for the gamete frequencies are given by
From now on we assume absence of epistasis and set
Furthermore, we define
Without epistasis we have
and (4.1) simplifies to
Essentially the same, sometimes slightly simpler, calculations as in Sections 2 and 3 of N09b prove that the dynamics of gene frequencies has the form (2.10) and that Result 2.1 and Remarks 2.2 and 2.3 hold. An argument analogous to that yielding Corollary 3.8 in N09b shows that there exists exactly one stable gene-frequency equilibrium (point or manifold), and it is globally attracting. If an internal gene-frequency equilibrium exists, it is globally asymptotically stable.
For the rest of this section we assume (2.19). The results about the gene-frequency dynamics ensure that every converges to some . Hence, for every solution of (4.1), there exists a such that the limit points of are contained in . (For generic initial conditions solutions are attracted by the set that is generated by the (unique) maximum of geometric-mean fitness .) For a given internal gene-frequency equilibrium , the equilibrium condition (2.15) together with (4.4) yields
Therefore (4.6) reduces to
which is the well-known dynamics of gamete frequencies under a pure recombination process. For (4.8) convergence to linkage equilibrium is well known and requires only instead of (2.21) (see Section 3.2). Thus, we have proved the following result:
An analog of Remark 3.6 applies. We also note that the haploid model is not equivalent to the diploid model without epistasis and multiplicative fitnesses within loci. Instead, it is equivalent to the diploid model with additive fitnesses within gametes and multiplicative fitnesses between them (see Remark 3.4 and Kirzhner and Lyubich, 1997).
In this section, we study the maintenance of multilocus polymorphism in the Levene model. In addition to assuming absence of epistasis, we assume that the degree of dominance is intermediate at every locus and in every deme. Thus, overdominance and underdominance are excluded. In a panmictic population, no polymorphism is possible in the absence of epistasis if there is intermediate dominance (Bürger, 2009b, Proposition 3.2 and Corollary 3.4). Although the Levene model lacks population structure, the following result shows that, nevertheless, it harbors the potential for extensive multilocus polymorphism under such conditions:
Assume an arbitrary number of multiallelic loci, demes, and let all recombination rates be positive and fixed. Then there exists a nonempty open set of parameters such that for every parameter combination in this set, there is a unique, internal, asymptotically stable equilibrium point of the dynamics (2.2). This equilibrium is in linkage equilibrium and it is globally asymptotically stable.
Essentially, this result was proved in Bürger (2009b, Theorem 2.2, Remark 2.3 (iii), Remark 2.4). There, only convergence to quasi-linkage equilibrium was shown. Theorem 3.1 demonstrates convergence to linkage equilibrium. More generally, Theorem 2.2 in Bürger (2009b) shows that Result 5.1 holds for arbitrary ergodic, i.e., irreducible and aperiodic, migration patterns, and not only for the Levene model. Such extensive polymorphism can occur if selection is weak relative to migration and recombination. The constructive proof in Bürger (2009a,b) requires balancing selection and a certain form of average overdominance across demes that can be achieved only if the direction of selection and the degree of dominance vary among demes.
If, however, the degree of dominance is not only intermediate within demes, but there is DIDID, generically, at most diallelic loci can segregate (Proposition 3.18 in N09b). If, in addition, selection is sufficiently weak, then no polymorphism can be maintained (Proposition 2.6 and Remark 2.7 in Bürger, 2009b). Obviously, these results are in sharp contrast to Result 5.1 and underline that dominance plays an important role in maintaining genetic variation in a subdivided population.
For the rest of this section, we assume diallelic loci. Our main aim is to prove that the above-mentioned upper bound is indeed assumed on an open set of parameters. As a consequence, such polymorphic equilibria are structurally stable. We also show that if the number of loci is at least the number of demes and if, as is nongeneric but may occur under additional constraints on the parameters, there exists an internal equilibrium, then it lies on a manifold of equilibria.
For diallelic loci, we write for the frequency of , and for the frequency of . The vector of gene frequencies can then simply be represented as
where the definition of is modified. The condition (2.15) for internal gene-frequency equilibria reduces to
where the fitness contributions of the two alleles at locus in deme become
The contribution to mean fitness of locus in deme is
From (5.3), we obtain
Therefore, and yield if . If for every , then (5.5a) and Remark 2.3 imply that allele becomes fixed. This confirms the intuition that, in the absence of overdominance and underdominance, an allele that is the best in every deme is fixed. Hence, a locus can be polymorphic only if each of the alleles is the best in at least one deme.
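A single diallelic locus suffices to illustrate the last two statements. The following minimal sketch iterates the one-locus Levene recursion with soft selection for two hypothetical fitness schemes: one in which the first allele is the best in both demes, and one without dominance in which each allele is the best in exactly one deme. The function names and the fitness values are assumptions chosen for illustration.

```python
def levene_diallelic_step(p, w, c):
    """One generation of the one-locus, two-allele Levene model.

    p is the frequency of allele A1; w[alpha] = (w11, w12, w22) are the
    fitnesses of A1A1, A1A2, A2A2 in deme alpha; c[alpha] is the deme size.
    """
    p_next = 0.0
    for (w11, w12, w22), c_alpha in zip(w, c):
        w1 = w11 * p + w12 * (1 - p)                       # marginal fitness of A1
        w_bar = w11 * p * p + 2 * w12 * p * (1 - p) + w22 * (1 - p) ** 2
        p_next += c_alpha * p * w1 / w_bar
    return p_next


def iterate(p, w, c, generations):
    """Iterate the recursion and return the final frequency of A1."""
    for _ in range(generations):
        p = levene_diallelic_step(p, w, c)
    return p


c = (0.5, 0.5)

# A1 is the best allele in both demes (intermediate dominance): fixation.
w_directional = [(1.1, 1.05, 1.0), (1.2, 1.1, 1.0)]
p_fix = iterate(0.01, w_directional, c, 5000)

# Each allele is the best in one deme (no dominance): polymorphism.
w_opposing = [(1.5, 1.0, 0.5), (0.5, 1.0, 1.5)]
p_poly = iterate(0.1, w_opposing, c, 5000)

print(p_fix, p_poly)
```

In the second scheme the parameters are symmetric, so the polymorphic equilibrium sits at frequency one half.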
For two alleles, also the definition of DIDID simplifies. There is DIDID if for every , there exist constants such that for every ,
and if . We shall write . Therefore, the dynamics of gamete frequencies (2.2) is uniquely determined by the linkage map and by the parameters , , and , where and .
Our main result in this section is the following.
Assume diallelic loci with DIDID, i.e., (5.7). Let be arbitrary but fixed. Then there exists an open nonempty set of parameters such that for every parameter combination in , there is a unique, internal, asymptotically stable equilibrium point of (2.2). This equilibrium is in linkage equilibrium and globally attracting. The set is independent of the choice of the recombination rates.
Together with Proposition 3.18 in N09b, this result implies that under DIDID and if , at most diallelic loci can be maintained polymorphic for an open set of parameters. The proof of the above theorem is based on
The basic idea for the proof of Proposition 5.4 is to find parameter combinations such that is an equilibrium of the gene-frequency dynamics (5.5a), and then to show that every sufficiently small perturbation of the parameters still yields an (isolated) internal equilibrium. The proof of this proposition is rather technical and constructs the set , which is the same in Theorem 5.3 and Proposition 5.4. A more detailed formulation of the proposition and its proof are given in the Appendix.
Statements (a) and (b) of Corollary 3.3 show that for every in the open set of Proposition 5.4, is the desired unique and internal equilibrium point of (2.2). Because the proof of Proposition 5.4 is based entirely on the gene-frequency dynamics (5.5a), the construction of is independent of the linkage map. □
Now we formulate and prove our second main result of this section.
Assume diallelic loci with DIDID. If there exists an internal equilibrium of the gene-frequency dynamics (5.5a), then there is a manifold of equilibrium points containing it. Generically, this manifold has dimension .
holds. In matrix form this reads
where . Generically, we therefore have and, because ,
Although scalar multiples of also solve (5.11), scalar multiples of do not give rise to further solutions of (5.10) because, by Result 2.1, geometric-mean fitness is maximized, hence constant, at gene-frequency equilibria.
Therefore, it is sufficient to determine the dimension of the solution space of the system
for the given vector . This is a system of quadratic equations in the unknowns . We transform it into a system of linear equations in the unknowns by using the nonlinear transformation defined by
This is the specification of (3.20) in N09b for diallelic loci. By Lemma 3.1 in Nagylaki (2009a), every is a homeomorphism on . Denoting , we obtain from (3.23) in N09b by a simple calculation,
which is linear in for every (cf. Lemma 3.13 in N09b). Therefore, the system (5.13) of quadratic equations in is equivalent to the linear system
in of the same dimension. This can be written as
Here, the second equality follows from (5.12) which holds generically. □
Because under the conditions of the above theorem a manifold of gene-frequency equilibria exists, it cannot be inferred that convergence to linkage equilibrium occurs although this seems likely. The reason is that the application of Theorem 3.1 requires isolated gene-frequency equilibria or at least a proof that every converges (to a single point); cf. Remark 3.2.
We apply the above results and those of N09b to a simple quantitative-genetic model and discuss their implications. To this end, we consider a quantitative trait that is determined additively (i.e., without epistasis) by multiallelic loci. Intermediate dominance and genotype-by-environment (G × E) interaction are admitted. Let the multilocus genotype have the trait value
in deme , where is the effect on the trait in deme assigned to the single-locus genotype . To avoid degeneracy, we posit if .
We assume that the trait is under linear directional selection in each of the demes, i.e.,
where the range of possible values and the selection coefficients are constrained such that for every . Therefore, we define the single-locus contribution of to fitness by
Then, as in (2.4),
A glance at (6.3) reveals that, for given , , and selection coefficients , (6.3) establishes a one-to-one correspondence between the single-locus fitness contributions and the single-locus trait effects .
Therefore, this model of linear selection on a quantitative trait is as general as the nonepistatic model introduced in Section 2 and studied in Sections 3 and 5. Obviously, no generality would be lost by setting for every . However, we refrain from doing so. Moreover, it is immediate from (6.3) that no or intermediate dominance at the trait level is equivalent to no or intermediate dominance, respectively, at the fitness level. The dominance relations are reversed in demes with .
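The correspondence between trait effects and fitness contributions can be illustrated with a small sketch. The displayed equations (6.2) and (6.3) are not reproduced here, so the code assumes a hypothetical linear form, fitness = `a + s * trait`, in a single deme; the names `a`, `s`, `effects`, and `locus_contribution` are illustrative choices and are not the paper's notation. Under that assumption, splitting the constant `a` evenly over the loci gives single-locus fitness contributions that sum to the genotype's fitness, and a negative `s` reverses the order of the contributions relative to the order of the trait effects.

```python
from itertools import product

def trait_value(effects, genotype):
    """Additive trait: sum of single-locus effects (no epistasis)."""
    return sum(effects[n][g] for n, g in enumerate(genotype))

def fitness(trait, a, s):
    """Assumed linear directional selection in one deme (hypothetical form)."""
    return a + s * trait

def locus_contribution(effect, a, s, n_loci):
    """Assumed single-locus fitness contribution: an equal share of `a` plus s * effect."""
    return a / n_loci + s * effect

# Two diallelic loci with intermediate dominance on the trait level (made-up numbers).
effects = [
    {"AA": 0.0, "Aa": 0.3, "aa": 1.0},
    {"BB": 0.0, "Bb": 0.8, "bb": 1.0},
]
a, s = 1.0, -0.2  # negative s: smaller trait values are favoured in this deme

# Fitness computed from the trait equals the sum of single-locus contributions.
max_gap = 0.0
for genotype in product(*(e.keys() for e in effects)):
    direct = fitness(trait_value(effects, genotype), a, s)
    summed = sum(
        locus_contribution(effects[n][g], a, s, len(effects))
        for n, g in enumerate(genotype)
    )
    max_gap = max(max_gap, abs(direct - summed))

# Contributions at the first locus: still intermediate, but in reversed order.
u_AA, u_Aa, u_aa = (locus_contribution(effects[0][g], a, s, 2) for g in ("AA", "Aa", "aa"))
print(max_gap, u_AA, u_Aa, u_aa)
```

Because the map from effect to contribution is affine with slope `s`, a heterozygote effect lying between the homozygote effects always yields a heterozygote contribution lying between the homozygote contributions, whatever the sign of `s`.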
In analogy to (3.6), we say there is DIDID on the trait level if there exist constants such that
for every , every , and every pair . Trivially, the relation (6.3) transforms DIDID on the trait level to DIDID on the fitness level, and vice versa; the constants are the same for trait and fitness.
Genotype-by-environment interaction is absent on the trait level if there exist constants and such that
Therefore, (6.6) implies (6.5). Summarizing, there is no G×E interaction on the trait level if and only if there is DIDID and (6.6) holds for all homozygous single-locus effects. A further simple consideration establishes equivalence of absence of G×E interaction at the trait and the fitness level.
An important consequence of this one-to-one relation between fitness and trait value is that results that hold generically for the model defined in terms of the fitness contributions also hold generically for this model of a quantitative trait under linear selection. Especially, Result 5.1 and Theorems 5.3 and 5.5 apply. Thus, if the number of demes is at least two, for an open set of parameters a globally asymptotically stable equilibrium exists such that
Parameter combinations yielding (i) can be constructed by applying the procedure in Proposition 2.1 in Bürger (2009b) with and substituting for the parameters used there (here, is the backward migration rate into ). In fact, it is easy to show that this is possible if , i.e., in the absence of interactions between homozygous effects and environment.
An unresolved question concerns the number of alleles that can be maintained at a locus if there is DIDID. In N09b (Corollary 3.9) it is shown that if , then generically no internal gene-frequency equilibrium exists. Therefore, if the number of alleles is the same at every locus ( for every ), the number of demes is a generic upper bound on the number of alleles per locus. Clearly, a generalization of (ii) to multiallelic loci would be desirable.
Therefore, the condition (2.15) for internal gene-frequency equilibria becomes
for every and . We conjecture that, at least generically, for every , holds for at least one (in fact, for all but one). If this is true, then
is a necessary and sufficient condition for the existence of an internal gene-frequency equilibrium in the absence of G×E interaction.
In the absence of dominance, this is easy to prove because we have for every and every , hence
Obviously, for each , can hold for at most one . Thus, (6.13) is the condition for an internal gene-frequency equilibrium.
Here, we specialize to diallelic loci and use the notation introduced in Sections 2 and 5, e.g., and . By Corollary 3.3, the equilibrium and stability structure can be determined by studying the relatively simple system (5.5a), where a straightforward calculation yields
Because is quadratic in , (6.13) is equivalent to a single polynomial equation (of degree ) in the variables . Therefore, if an internal solution exists, there is a manifold of solutions of (generic) dimension . For more than two demes, this exceeds the dimension derived in Theorem 5.5. As a consistency check, we observe that the matrix defined in the proof of Theorem 5.5 has the entries because . Therefore, it has rank 1 and a calculation as in (5.18) yields . Thus, if , the equilibrium structure in the absence of G×E interaction is even more highly degenerate than under DIDID alone.
The above examples demonstrate that (6.13) is a necessary and sufficient condition for an internal gene-frequency equilibrium if there is no G×E interaction and either there is no dominance or loci are diallelic. In the absence of G×E interaction, an internal equilibrium cannot exist if all selection coefficients have the same sign (for a more general statement, see Remark 5.2). If, for instance, for every , then the gamete with the largest genotypic value has maximum fitness and becomes fixed (cf. N09b, Proposition 3.15). If the selection coefficients vary in sign, it is always possible to choose the relative deme sizes such that (6.13) has an internal solution , hence a manifold of solutions. In fact, there is an open set of such parameter combinations . For other deme sizes, there is at least one isolated, asymptotically stable equilibrium. At such a stationary state, at most loci can be polymorphic.
We summarize the main results of this section.
Assume diallelic loci and let all recombination rates be positive and fixed. Further, assume the selection model given by (6.1)–(6.4), i.e., linear directional selection on a quantitative trait that is determined additively by diallelic loci exhibiting intermediate dominance.
Among others, this result demonstrates that biologically reasonable but special assumptions can lead to nongeneric, and nonrobust, model behavior. Therefore, caution is necessary when generalizing conclusions obtained from specific models.
A detailed and rather complete analysis of the diallelic two-locus case in two demes was performed in Bürger (2009c).
Several of our results can be extended to weak epistasis. By weak epistasis we mean that there is a small , such that the fitness scheme has the form
First, we generalize Theorem 3.1 and show that global convergence to quasi-linkage equilibrium occurs under weak epistasis. Then we briefly point out some applications for the maintenance of polymorphism. Throughout, we assume (2.19)–(2.21).
As noted by Nagylaki et al. (1999), the assumption of hyperbolicity of every equilibrium is not robust to small perturbations because limit sets need not change continuously. What has good behavior under perturbations is the set of chain-recurrent points, which contains the limit sets of all orbits (Conley, 1978; Akin, 1993). For the Levene model without epistasis, we have the following:
In the case of no epistasis, the only chain-recurrent points of (2.2) are its equilibria.
By Result 2.1, the Lyapunov function takes only finitely many values on the set of gene-frequency equilibria. Therefore, Theorem 3.16 in Akin (1993) shows that every chain-recurrent point is contained in . By our general assumption (2.19), has finitely many components . Since, by (3.1), on the dynamics reduces to the panmictic multilocus dynamics (3.2) with no epistasis, Lemma 2.2 of Nagylaki et al. (1999) yields the assertion. □
To formulate the main result of this section, we introduce the following notation. We write for the set of equilibria of (2.2) if there is no epistasis (), and we write for the set of equilibria for a fitness scheme that satisfies (7.1) with .
Let fitness contributions be given, such that without epistasis every equilibrium of (2.2) is hyperbolic. Further, let all recombination probabilities be fixed (and positive), and let be sufficiently small. Then for every set of fitnesses satisfying (7.1) the following holds:
Parts (a) and (b) are consequences of the implicit function theorem and the Hartman–Grobman theorem, and follow immediately from Theorem 4.4 in Karlin and McGregor (1972). The only issue that remains to be shown in (b) is that perturbed equilibria do not leave (cf. Remark 2.3 in Nagylaki et al., 1999). This follows from the explicit characterization
of equilibria if ; see (2.16). Let be some equilibrium for , and denote the set of alleles present at locus by . The face of determined by if is invariant under the dynamics (2.2) for every , as can be seen from the alternative representation
where is a measure of linkage disequilibrium in gamete in deme (see Eq. (4.47) in N09b). Therefore, the equilibrium , which is hyperbolic for , persists in this face for .
(c) Upon replacing the reference to Lemma 2.2 in the proof of Theorem 2.3 in Nagylaki et al. (1999) by one to the above Lemma 7.1, their proof applies unaltered. However, our assumption (7.1) is slightly weaker than their assumption (2.1). Our assumption is uniform in the sense that we first choose and then admit all parameter combinations satisfying (7.1), whereas following their formulation, we first had to fix a set of and then . Because Corollary 32 in Akin (1993), which is used in the proof of Theorem 2.3 in Nagylaki et al. (1999), admits this greater generality, their proof indeed applies. □
We conjecture that an equilibrium of (2.2) with is hyperbolic if is a hyperbolic equilibrium of (2.16). In the presence of selection, in general, this cannot be inferred from the results or proofs in Lyubich (1992) because it has not been shown that convergence to linkage equilibrium occurs at a geometric rate. If there are only two loci, or if there is DIDID, or if selection acts on haploids, then Theorem 4.6 in N09b, or Remark 3.6, or the remark following Theorem 4.1, respectively, establish the conjecture. It can also be established for three diallelic loci (Bürger, unpublished).
With the help of Theorem 7.2, several of the results on the maintenance of polymorphism can be generalized to weak epistasis. Of course, Result 5.1 generalizes to weak epistasis. The only difference then is that equilibria are not necessarily in linkage equilibrium but in quasi-linkage equilibrium. In fact, this was already proved in Bürger (2009b, Theorem 2.2 and Remark 2.4). Proposition 3.18 in N09b and Theorem 5.3 have the following generalization.
Assume weak epistasis and diallelic loci with DIDID.
Also statements (a) and (b) in Corollary 6.3 can be generalized to weak epistasis. In (d), quasi-linkage equilibrium can be stated.
The analysis of multilocus systems is greatly simplified, and often only feasible, if linkage equilibrium or, at least, quasi-linkage equilibrium can be assumed (e.g. Karlin and Liberman, 1979; Turelli and Barton, 1990; Christiansen, 1999; Bürger, 2000). For the classical multilocus selection model, it has long been known that global convergence to linkage equilibrium occurs if there is no (additive) epistasis (Kun and Lyubich, 1979, 1980; Lyubich, 1992). If either epistasis or selection is weak, generic convergence to a stationary point in quasi-linkage equilibrium has been proved (Nagylaki, 1993; Nagylaki et al., 1999). The latter results can be generalized to subdivided populations as follows. If migration and epistasis are weak (relative to recombination) or if selection is weak (relative to migration and recombination), then global convergence to a stationary point in quasi-linkage equilibrium occurs generically (Bürger, 2009a). However, if migration is moderately strong, stable linkage disequilibrium may persist in the absence of epistasis (Li and Nei, 1974).
For the multilocus Levene model without epistasis, and under the generic assumption that all equilibria are isolated, Nagylaki (2009b) proved global convergence to an equilibrium point in linkage equilibrium if there are two (multiallelic) loci or if there is no dominance. He conjectured global convergence for multiple multiallelic loci, even if equilibrium points are not isolated. In Section 3, we establish this conjecture under the generic assumption of isolated gene-frequency equilibria (Theorem 3.1). For the haploid Levene model, a slightly stronger result is demonstrated in Section 4.
Whenever convergence to an equilibrium point in linkage equilibrium occurs in the Levene model without epistasis, the analysis of the model simplifies greatly. Then it is sufficient to study the system of recursion equations (2.16), which describes the dynamics of gene frequencies under the assumption of linkage equilibrium. Corollary 3.3 summarizes the most important conclusions that can be drawn from investigating (2.16) and its Lyapunov function (2.14). Further useful properties of the gene-frequency dynamics may be found in Section 3 of N09b.
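As a minimal illustration of gene-frequency dynamics with a Lyapunov function, the sketch below iterates the classical Levene recursion for a single diallelic locus in two demes. This is a deliberately reduced case, not the multilocus system (2.16) itself; the fitness values, deme proportions, and function names are assumptions chosen for the example. Each deme has intermediate dominance with the locally fitter allele partially dominant, and the logarithm of the geometric-mean fitness (the deme-weighted sum of log mean fitnesses) is tracked along the trajectory.

```python
import math

def mean_fitness(p, w11, w12, w22):
    """Mean fitness in one deme at allele frequency p, under random mating."""
    q = 1.0 - p
    return p * p * w11 + 2.0 * p * q * w12 + q * q * w22

def next_frequency(p, demes):
    """One generation of the single-locus Levene model with soft selection."""
    q = 1.0 - p
    total = 0.0
    for c, (w11, w12, w22) in demes:
        marginal = p * w11 + q * w12  # marginal fitness of allele 1 in this deme
        total += c * marginal / mean_fitness(p, w11, w12, w22)
    return p * total

def log_geometric_mean_fitness(p, demes):
    """Log of the geometric-mean fitness: sum over demes of c * log(mean fitness)."""
    return sum(c * math.log(mean_fitness(p, *w)) for c, w in demes)

# (deme proportion, (w11, w12, w22)); illustrative values with opposing selection.
demes = [
    (0.5, (1.0, 0.9, 0.5)),
    (0.5, (0.5, 0.9, 1.0)),
]

p = 0.1
lyapunov = [log_geometric_mean_fitness(p, demes)]
for _ in range(200):
    p = next_frequency(p, demes)
    lyapunov.append(log_geometric_mean_fitness(p, demes))
p_final = p
print(p_final, lyapunov[0], lyapunov[-1])
```

With these symmetric parameters the trajectory approaches the polymorphic equilibrium at frequency 0.5, and the recorded values of the log geometric-mean fitness never decrease, which is the single-locus counterpart of the Lyapunov property used in the text.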
With the help of perturbation methods (Karlin and McGregor, 1972; Nagylaki et al., 1999), the convergence result Theorem 3.1 can be extended to weak epistasis. Then equilibria are not necessarily in linkage equilibrium, but they are in quasi-linkage equilibrium (Theorem 7.2). Therefore, many results that hold without epistasis can be extended to weak epistasis. For an example, see Theorem 7.4.
In Section 5, we apply our convergence results to investigate the maintenance of multilocus polymorphism. Overdominance and underdominance (within demes) are excluded. It has been shown previously that, if there is intermediate dominance in every deme, arbitrarily many multiallelic loci can be fully polymorphic on an open set of parameters (Result 5.1). In the proof, this open set was constructed by assuming, among others, that at each locus and in each deme, the fitter alleles are partially dominant. Because an internal equilibrium requires that the direction of selection is different in at least two demes, the proof of this result invokes G×E interaction. In N09b it was shown that for DIDID, generically, the number of segregating loci is strictly less than the number of demes. Hence, some form of G×E interaction is necessary to maintain or more loci polymorphic. Theorem 5.3 complements Nagylaki’s result and shows that with DIDID, loci can indeed be maintained at a stable equilibrium for an open set of parameters. If and, as is nongeneric, an internal equilibrium exists, Theorem 5.5 establishes that a manifold of equilibria exists. Generically, its dimension is . If there is DIDID and selection is sufficiently weak, then no polymorphism is possible in the Levene model without epistasis.
Of course, DIDID itself is a nongeneric property. Perturbations of isolated equilibrium points can be studied using, for instance, Karlin and McGregor’s (1972) method of small parameters which requires that equilibria are hyperbolic. Thus, if there is an asymptotically stable equilibrium under DIDID, there will be an asymptotically stable equilibrium in its neighborhood if the deviation from DIDID is small. A much stronger result can be inferred from Lemma 7.1 and the proof of Theorem 7.2. If all equilibria are hyperbolic when there is DIDID, under small deviations from DIDID, the global dynamics will remain qualitatively unchanged.
In Section 6, these results are applied to a quantitative trait that is determined by additive loci (i.e., without epistasis but with intermediate dominance). If the trait is under linear directional selection in every deme, the resulting model is equivalent to the general nonepistatic model studied in this paper. Hence, the results of Section 5 concerning the maintenance of variation apply unaltered. Interestingly, under the assumption of no G×E interaction on the trait level, there is an open set of parameters such that an internal equilibrium exists. However, because absence of G×E interaction implies DIDID, this is a manifold of equilibria. In fact, in this case, its dimension is instead of . Thus, if there are more than two demes, the assumption of no G×E interaction in the Levene model with linear selection leads to a highly degenerate equilibrium structure. In such a case, an arbitrarily small perturbation of the parameters that introduces G×E interaction may drastically change the equilibrium structure and the dynamics (see Bürger, 2009c). As already mentioned in the Introduction, these results (Corollary 6.3) should serve as a warning when studying models under highly specialized assumptions, even if they are biologically well motivated.
The fact that the maximum possible number of polymorphic loci may be constrained by the number of demes demonstrates that, even in the absence of epistasis and in the presence of linkage equilibrium, one cannot simply extrapolate single-locus results to multiple loci. It would be of considerable interest to find out for which genetic architectures and migration patterns such constraints occur.
It remains an unresolved problem how many alleles at how many loci can be maintained segregating under the assumption of DIDID or under the stronger assumption of no G×E interaction. Theorem 3.14 and Corollary 3.9 in N09b imply that if there is DIDID and the number of alleles is the same at each locus, then the number of demes is a generic upper bound on the number of segregating alleles per locus.
Eventually, one would like to quantify how frequent polymorphism is in the Levene model without epistasis, and how this depends on the selection scheme and the dominance relations. Numerical results for two diallelic loci and two demes show that for arbitrary intermediate dominance, the volume of the parameter space in which a full polymorphism is maintained varies between about 12% (for very weak selection) and about 16% (for very strong selection). Under the assumption of sublinear fitnesses (see (3.5) and Corollary 3.3(c)), these numbers increase to 22% and 36%, respectively. In the absence of G×E interaction, they are 0% and 43%, respectively (Bürger, 2009c). With an increasing number of alleles or loci, the volume of parameter space in which a stable internal equilibrium exists will certainly decrease rapidly, at least if the parameter range is unconstrained. Still, the set of parameter combinations for which a significant fraction of loci can be maintained polymorphic may be quite large.
Therefore, we expect that spatially varying selection has the potential to maintain considerable multilocus polymorphism, especially if locally beneficial alleles tend to be partially dominant. Properties of the genetic architecture, such as the dominance pattern or the presence or absence of (certain forms of) G×E interaction, may greatly constrain this potential. In view of this, it seems worthwhile to study phenotypic plasticity and the evolution of genetic architecture in a spatial context (see e.g. Zhivotovsky et al., 1996; de Jong, 1999; Otto and Bourguet, 1999; de Jong and Gavrilets, 2000; van Doorn and Dieckmann, 2006).
Whereas DIDID plays an important role in the Levene model in constraining the amount of polymorphism, this is not necessarily so for general migration patterns. Peischl (2010) studied a single-locus model with two demes and general migration. He proved that, if there is DIDID, three alleles can be maintained at a stable equilibrium. His numerical computations suggest that more than three alleles can be maintained for an open set of parameters. His results have interesting interpretations in the context of the coexistence of specialists and generalists. Further, it is yet unknown whether DIDID also plays a prominent role in the Levene model with epistasis, for instance, if there is spatially varying stabilizing selection on a quantitative trait. This reconfirms the conviction that studying the role of genetic architecture for the maintenance of polymorphism in a spatial context seems to be a worthwhile enterprise.
The author is most grateful to Professor Thomas Nagylaki for extensive conversation and sharing of unpublished results. Many thanks go to Tom Nagylaki, Nick Barton, and an anonymous reviewer for perceptive comments on previous versions of this paper. This work was supported by a grant of the Vienna Science and Technology Fund (WWTF) and by grant P21305 of the Austrian Science Fund (FWF).
For every locus , we introduce the nonnegative matrices , , , and as follows:
is diagonal with
and is diagonal with
A simple calculation shows that is stochastic:
Because we consider only internal equilibria, we have for every and . Hence, is nonsingular. Furthermore,
whence is also nonsingular.
If is a gene-frequency equilibrium and is nonsingular, then
Let . Then, in vector form, (A.2) reads
where , , and all depend on . Since the internal gene-frequency equilibria are precisely the solutions of (2.15), they are precisely the solutions of
where . Because is stochastic, is a solution of (A.11). If is nonsingular, for any given , it is the unique solution. Therefore, is satisfied for every . The assertion now follows from (A.1) and (3.6b). □
Our aim now is to show that is nonsingular. We write for the matrix with all entries 1 and recall that a real matrix is skew-symmetric if .
Let be a skew-symmetric matrix and . Then for every and .
We begin by deriving a recursive formula for . If we first subtract row from all other rows of and then (the resulting) column from all other columns, we obtain the matrix
where the entry of the matrix is
Because , a simple calculation yields . Clearly, if , the determinant of the matrix (A.12) is simply . Developing the determinant of (A.12) with respect to the last row (or column), it follows immediately that
For an arbitrary number of alleles, an arbitrary matrix of dominance parameters satisfying (3.6b), and an arbitrary positive matrix , the matrix is nonsingular.
We shall prove . For the proof, we omit the locus superscripts and assume that all matrices are . Because is nonsingular, (A.3)–(A.7) show that
where is the identity matrix. By assumption (3.6b), can be written as , where is skew-symmetric. Therefore, Lemma A.2 shows that . Because this holds for arbitrary dimension, every principal minor (of arbitrary order) of is also nonnegative. Since is a diagonal matrix with positive elements on the diagonal, we have . Because this holds for every dimension and because is diagonal, the principal minors of are the products of the corresponding principal minors of and (the latter being just products of the corresponding diagonal elements). Hence, all principal minors of are nonnegative.
It is another well-known fact (e.g. Gantmacher, 1986, p. 99) that the coefficient of of the characteristic polynomial of a square matrix can be written as the sum of all principal minors of order . As a consequence, , which equals the characteristic polynomial of evaluated at , can be written as 1 plus the sum of all principal minors of order . Since by the above reasoning all of them are nonnegative, we have
which finishes the proof. □
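The determinant expansion invoked in this proof can be checked on a small example. The following sketch verifies, in exact integer arithmetic, that the determinant of the identity plus a square matrix equals 1 plus the sum of all principal minors of that matrix (of every order). The 4×4 test matrix and the helper names are arbitrary choices for illustration and are unrelated to the matrices of the proof.

```python
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def principal_submatrix(m, idx):
    """Keep the rows and columns whose indices are in idx."""
    return [[m[i][j] for j in idx] for i in idx]

def sum_principal_minors(m):
    """Sum of the principal minors of every order 1..n."""
    n = len(m)
    return sum(
        det(principal_submatrix(m, idx))
        for k in range(1, n + 1)
        for idx in combinations(range(n), k)
    )

# Arbitrary integer test matrix (illustrative only).
a = [
    [2, -1, 0, 3],
    [1, 0, 4, -2],
    [0, 5, 1, 1],
    [-3, 2, 2, 0],
]
n = len(a)
identity_plus_a = [[a[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]

lhs = det(identity_plus_a)
rhs = 1 + sum_principal_minors(a)
print(lhs, rhs)
```

If every principal minor is nonnegative, as the argument above establishes for the matrix in question, the right-hand side is at least 1, so the determinant cannot vanish.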
Throughout, we assume two alleles per locus and DIDID, i.e., (5.7). Let denote the set of all matrices with strictly positive entries. For given alleles at locus , we denote by the matrix containing all fitness contributions , where and and . Further, let
Then, and .
For any given matrix , where () is a column vector, we write
for the matrix obtained by omitting the last column (which requires ). For any given vector , means that every component of is positive, by which we always mean strictly positive.
We shall need the following parameter sets:
Our aim is to prove Proposition 5.4, which here we formulate more precisely.
The proof is based on three lemmas.
is an equilibrium of the gene-frequency dynamics (5.5a) if and only if
We denote .
Let . There exists an open subset such that for every a unique solution of (B.4) exists. It satisfies .
We solve this system for . Let and .
Multiplying (B.8) by and summing over all , we obtain
If , (B.9) has the unique solution
which exists and is uniquely determined if . This yields the desired solution . It satisfies if
It remains to show that (B.12) holds on an open set of parameters. Because , we have whenever . The latter is easy to achieve by choosing the , , sufficiently small. As a consequence and because for every , the estimate (B.12) can be realized for every and every by choosing a sufficiently small such that . Choosing from an arbitrary, but bounded, open subset , we can find a uniformly small . Thus, we can choose the desired set , where for sufficiently small . Thus, for every , there exists an open bounded set and a vector with the desired properties. □
depends smoothly on and on .
The set can be chosen such that it applies to every . This follows from the proof because and, therefore, the critical can be chosen independently of . Hence, the left-hand side in (B.12) can be made small uniformly in .
Next we show that small perturbations of yield equilibria close to . For this, we need the Hessian matrix of , evaluated at . We denote it by , i.e.,
where if , and otherwise.
Now we have all ingredients to formulate and prove our last lemma.
Let and . Then there exists an open subset such that for every and corresponding solution of (B.4), the Hessian is invertible.
Let denote the diagonal matrix with entries along the diagonal, and its square root, which exists and is invertible because for every . Moreover, we set
and define as the matrix with these entries, i.e., . Then we can write
Because (B.4) informs us that for every , generically, has rank . If , then, generically, . Because is valid for every matrix , we obtain , where the latter equality holds because multiplication by a nonsingular matrix leaves the rank unchanged. Therefore, is (generically) invertible if and only if .
Finally, if denotes the preimage of the open subset of of matrices of rank under the continuous map , then we can choose , where is as in the proof of Lemma B.3. Therefore the set is open and nonempty. □
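The two rank facts used in this proof can be illustrated numerically. The stripped identity is assumed here to be the standard one, namely that a real matrix and its Gram matrix (the transpose times the matrix) have the same rank; the second fact, stated in the text, is that multiplication by a nonsingular matrix leaves the rank unchanged. The sketch uses exact rational arithmetic, and the matrices and helper names are made up for the example.

```python
from fractions import Fraction

def rank(m):
    """Rank by Gaussian elimination over the rationals (exact, no rounding)."""
    rows = [[Fraction(x) for x in row] for row in m]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

# A 4x3 matrix of rank 2: row 2 is twice row 1, and row 1 = row 3 + 2 * row 4.
a = [
    [1, 2, 3],
    [2, 4, 6],
    [1, 0, 1],
    [0, 1, 1],
]
gram = matmul(transpose(a), a)            # 3x3 Gram matrix
d = [[2, 0, 0], [0, 3, 0], [0, 0, 5]]     # nonsingular diagonal matrix
scaled = matmul(a, d)

rank_a, rank_gram, rank_scaled = rank(a), rank(gram), rank(scaled)
print(rank_a, rank_gram, rank_scaled)
```

All three ranks coincide, which is the pattern the proof relies on when it reduces invertibility of the Hessian to a rank condition on a single matrix.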
In the absence of DIDID, this Hessian matrix is of the form , where is a nonnegative diagonal matrix that vanishes if there is DIDID; cf. (B.15). Then is generically invertible whenever and . In fact, the proof can be extended to arbitrary dominance relations, thus giving a very different proof of Result 5.1 that does not assume weak selection. However, the generalizations of Lemmas B.3 and B.6 require considerable work.
We choose and the corresponding according to Lemma B.6, i.e., . Clearly, . By Lemma B.2, is an internal equilibrium of (5.5a), and by Lemma B.6 the Hessian is invertible. Therefore, is an isolated equilibrium of (5.5a). Since (B.15), together with (5.3), (5.4), (2.5) and Remark B.4, demonstrates that is smooth in , the Implicit Function Theorem shows that there exists an open neighborhood of for which there is a unique, isolated gene-frequency equilibrium close to . Theorem 3.14 in N09b yields global asymptotic stability. □