The generation of two-way RIL by selfing and of two-, four-, and eight-way RIL by sibling mating is shown in the accompanying figure. The notation for generation numbers for RIL can be confusing. The numbering indicated there is used throughout, with F_{1} being the first generation in which all parental alleles are present in a single individual. In the following, I abbreviate G_{1} : F_{k} in four-way RIL and G_{2} : F_{k} in eight-way RIL as simply F_{k}. In particular, G_{1} in four-way RIL and G_{2} in eight-way RIL are both called F_{0}.

Consider a particular crossing strategy, and let *X*_{k} denote the parental type at generation F_{k}. For RIL by selfing, this is the diplotype of the individual; for RIL by sibling mating, this is the pair of diplotypes for the two siblings. For example, in considering two loci in four-way RIL by sibling mating, one possible state is the starting state at F_{0}, *AA*|*BB* × *CC*|*DD*. (In this notation, the pairs of letters on each side of the vertical bar denote the two haplotypes for an individual; the first and second letters in each haplotype correspond to the alleles at the first and second loci, respectively.) The sequence *X*_{0}, *X*_{1}, *X*_{2}, …, forms a Markov chain. That is, *X*_{k+1} is conditionally independent of *X*_{0}, *X*_{1}, …, *X*_{k−1}, given *X*_{k}.

Let *P* denote the transition matrix of the Markov chain, defined by *P*_{ij} = Pr(*X*_{k+1} = *j* | *X*_{k} = *i*). Our goal is to calculate the *k*-step probabilities, π_{k} = π_{0}*P*^{k}, where π_{0} is the starting distribution (at F_{0}), which contains 1 at the fixed starting state and 0 for all other states.

First note that, for RIL by selfing, it is sufficient to consider two-way RIL, and for RIL by sibling mating, it is sufficient to consider four-way RIL. This is due to the bottleneck with two chromosomes in RIL by selfing at generation F_{1} and with four chromosomes in RIL by sibling mating at generation F_{0}. The results may be extended from two-way RIL by selfing to four-way RIL by selfing, or from four-way RIL by sibling mating to eight-way RIL by sibling mating, by considering an additional generation of recombination. One may obtain the results for two-way RIL by sibling mating from the results for four-way RIL by sibling mating by collapsing states: let *A* ≡ *B* and *C* ≡ *D*.

The major technique for deriving the *k*-step probabilities, π_{k}, is to derive the eigen decomposition of the transition matrix: *P* = *V*Λ*V*^{−1}, where Λ is the diagonal matrix of eigenvalues and *V* is a matrix whose columns are the corresponding eigenvectors. Then *P*^{k} = *V*Λ^{k}*V*^{−1}, and Λ^{k} is obtained from Λ by taking the *k*th powers of the eigenvalues.
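As a minimal numerical sketch of this identity, consider a deliberately tiny chain that is not one of the chains analyzed here: one locus under repeated selfing, tracking only whether the individual is heterozygous or fixed. The state ordering, the variable names, and the choice to start from the heterozygote are assumptions made for illustration.

```python
import numpy as np

# Toy chain (an illustrative assumption): one locus under selfing,
# states ordered (heterozygous, fixed). A heterozygote stays
# heterozygous with probability 1/2; fixation is absorbing.
P = np.array([[0.5, 0.5],
              [0.0, 1.0]])
pi0 = np.array([1.0, 0.0])  # start from the heterozygote

# Eigen decomposition P = V Lambda V^{-1}
eigvals, V = np.linalg.eig(P)
V_inv = np.linalg.inv(V)

def k_step(k):
    """pi_k = pi_0 V Lambda^k V^{-1}; only the eigenvalues are powered."""
    return pi0 @ V @ np.diag(eigvals ** k) @ V_inv

# Direct matrix power, for comparison
direct = pi0 @ np.linalg.matrix_power(P, 3)
```

Here `k_step(3)` and `direct` should agree, with the heterozygous probability halving each generation. The two eigenvalues of this toy matrix are distinct, so *V* is invertible; a matrix that cannot be diagonalized would need different handling.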

Such an eigen decomposition is straightforward in theory but is unwieldy in practice, due to the extremely large number of possible states. And so the second major technique is to take account of various symmetries to collapse the states into a smaller number. For two-way RIL by selfing with two loci, the simplest formulation would give 2^{4} = 16 possible states (two possible alleles at each locus on each of the two chromosomes). But considering that the order of the two haplotypes is immaterial, these may be reduced to 10 possible diplotypes. As shown in Haldane and Waddington (1931), these may be further reduced to just five states, by taking account of two additional symmetries: the order of the two loci may be ignored, and the symbols *A* and *B* may be switched.
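The counts 16, 10, and 5 can be checked by brute-force enumeration. The following is a sketch only: representing haplotypes as two-character strings and picking the lexicographically smallest image as the class label are illustrative choices.

```python
from itertools import product

# Two-locus haplotypes with two alleles per locus
haplotypes = ["AA", "AB", "BA", "BB"]

# Ordered pairs of haplotypes: the 2^4 = 16 "simplest formulation" states
ordered = list(product(haplotypes, repeat=2))

# Ignore the order of the two haplotypes
diplotypes = {tuple(sorted(pair)) for pair in ordered}

SWAP = {"A": "B", "B": "A"}

def images(d):
    """All images of diplotype d under swapping loci and/or swapping A and B."""
    out = []
    for swap_loci in (False, True):
        for swap_alleles in (False, True):
            new = []
            for h in d:
                if swap_alleles:
                    h = SWAP[h[0]] + SWAP[h[1]]
                if swap_loci:
                    h = h[1] + h[0]
                new.append(h)
            out.append(tuple(sorted(new)))
    return out

# One representative (the smallest image) per equivalence class
classes = {min(images(d)) for d in diplotypes}

print(len(ordered), len(diplotypes), len(classes))
```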

Let us formalize this idea. (For a more rigorous approach, see Burke and Rosenblatt 1958.) Let the possible states of the chain be *S* = {*s*_{1}, …, *s*_{n}}. Partition *S* into *m* subsets of equivalent states, *S*_{i} ⊂ *S*, so that, for any pair *i* and *j*, Pr(*X*_{k+1} ∈ *S*_{j} | *X*_{k} = *s*) = *q*_{ij} for all *s* ∈ *S*_{i}. The *q*_{ij} form an *m* × *m* transition matrix, *Q*, for the collapsed states. Let *Z* denote the *n* × *m* incidence matrix defined by *z*_{ij} = 1 if *s*_{i} ∈ *S*_{j} and 0 otherwise. Then

*PZ* = *ZQ* (1)

and so *P*^{k}*Z* = *ZQ*^{k}. As a result, π_{k}*Z* = π_{0}*P*^{k}*Z* = π_{0}*ZQ*^{k}. Thus, one may work with the *m* × *m* transition matrix *Q* in place of the *n* × *n* transition matrix *P*.
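The relation between *P*, *Z*, and *Q* can be illustrated on a three-state toy chain. This is a sketch under stated assumptions: the chain is one locus under selfing with states *AA*, *AB*, *BB*, the two homozygotes are collapsed into one class, and reading *Q* off one representative row per class is a convenience of this example that is valid only because the collapse condition holds.

```python
import numpy as np

# One locus under selfing; states ordered AA, AB, BB (illustrative toy chain)
P = np.array([[1.00, 0.0, 0.00],
              [0.25, 0.5, 0.25],
              [0.00, 0.0, 1.00]])

# Incidence matrix: classes ordered (homozygous, heterozygous)
Z = np.array([[1, 0],
              [0, 1],
              [1, 0]])

# q_ij read from one representative state per class (AA and AB)
representatives = [0, 1]
Q = (P @ Z)[representatives, :]

# The collapse is valid only if PZ = ZQ holds for every state
collapse_ok = np.allclose(P @ Z, Z @ Q)

# k-step probabilities of the collapsed states, computed both ways
pi0 = np.array([0.0, 1.0, 0.0])  # start at AB
k = 3
full = pi0 @ np.linalg.matrix_power(P, k) @ Z
collapsed = pi0 @ Z @ np.linalg.matrix_power(Q, k)
```

If rows of *PZ* differed within a class, `collapse_ok` would be false and no such *Q* would exist for that partition.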

For this collapse of states to be useful, the multiple states within each equivalence class, *S*_{i}, need to have equal probabilities at each generation, so that the probabilities of the individual states may be derived from the probabilities of the collapsed states. This will depend on the starting distribution. For example, consider one locus in two-way RIL by sibling mating. If the starting state is *AA* × *BB*, then at any future generation, the chance of being in state *AA* × *AB* is the same as that of being in state *AB* × *BB*. However, if the starting state is *AA* × *AB*, then there will be a lack of symmetry between *A* and *B*. (For the asymmetric case of two-way RIL initiated from a backcross, see Johannes and Colomé-Tatché 2011.)
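This dependence on the starting state can be checked numerically for the one-locus example. The sketch below builds the six-state chain directly from Mendelian segregation; coding genotypes by their number of *B* alleles and the function names are illustrative assumptions.

```python
import itertools
import numpy as np

# Genotypes coded by number of B alleles: 0 = AA, 1 = AB, 2 = BB.
# A state is an unordered sibling pair, stored as a sorted tuple.
states = [(a, b) for a in range(3) for b in range(a, 3)]
index = {s: i for i, s in enumerate(states)}

def offspring_dist(g1, g2):
    """Genotype distribution of one offspring of parents g1 and g2."""
    p1, p2 = g1 / 2, g2 / 2  # chance each parent transmits B
    return np.array([(1 - p1) * (1 - p2),
                     p1 * (1 - p2) + (1 - p1) * p2,
                     p1 * p2])

# Transition matrix: the two offspring are drawn independently
P = np.zeros((len(states), len(states)))
for (g1, g2), i in index.items():
    d = offspring_dist(g1, g2)
    for a, b in itertools.product(range(3), repeat=2):
        P[i, index[tuple(sorted((a, b)))]] += d[a] * d[b]

def k_step(start, k):
    pi0 = np.zeros(len(states))
    pi0[index[start]] = 1.0
    return pi0 @ np.linalg.matrix_power(P, k)

sym = k_step((0, 2), 4)   # start at AA x BB
asym = k_step((0, 1), 2)  # start at AA x AB

# Probabilities of AA x AB and AB x BB under each start
print(sym[index[(0, 1)]], sym[index[(1, 2)]])
print(asym[index[(0, 1)]], asym[index[(1, 2)]])
```

With the symmetric start the two printed values coincide; with the *AA* × *AB* start they differ, so the collapsed probabilities could not be split evenly between the two states.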

Kimura (1963) described a further technique that has been critical in this work. In many instances, we do not need the full distribution π_{k}, but only various linear combinations, say π_{k}*z* = π_{0}*P*^{k}*z*, where *z* is an *n* × 1 vector. Kimura (1963) demonstrated how to expand *z* to an *n* × *m* matrix *Z* in such a way that there exists a matrix *Q* satisfying Equation 1. Then we again have π_{k}*Z* = π_{0}*P*^{k}*Z* = π_{0}*ZQ*^{k} and may work with the *m* × *m* matrix *Q* in place of the *n* × *n* matrix *P*. Here, the matrix *Q* is not a transition matrix but simply defines a recursion. The first element of π_{k}*Z* is the target quantity, π_{k}*z*.

Consider, for example, the probability of a random two-locus haplotype drawn from generation F_{k} in the formation of RIL by sibling mating. Let *C*_{k}(*AA*) denote the chance that *AA* is drawn. This could either be an intact haplotype, transmitted without recombination from generation F_{k−1}, or be the result of recombination between the two haplotypes in a random F_{k−1} individual. Consider drawing a single random allele at the first locus from generation F_{k} and then taking the allele at the second locus but on the opposite chromosome in that individual. Let *S*_{k}(*AA*) denote the probability that these two alleles are both *A*. Then *C*_{k}(*AA*) = (1 − *r*)*C*_{k−1}(*AA*) + *rS*_{k−1}(*AA*), where *r* is the recombination fraction between the two loci. Further, *S*_{k}(*AA*) = *T*_{k−1}(*AA*), where *T*_{k}(*AA*) is the chance that, if one draws a random allele at the first locus from generation F_{k} and then a random allele from the opposite individual at the second locus, both alleles are *A*. We may further write *T*_{k}(*AA*) as a function of *C*_{k−1}, *S*_{k−1}, and *T*_{k−1}, forming the recursion matrix, *Q*, which is shown in Supporting Information, Table S1. Moreover, this same recursion applies for all of the other haplotypes; one just needs to use different starting distributions, π_{0}*Z*. For the three distinct cases for four-way RIL by sibling mating, these are shown in Table S2.
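A sketch of how such a recursion might be iterated follows. The first two columns of the matrix encode the relations for *C*_{k} and *S*_{k} stated above. The third column, *T*_{k} = *C*_{k−1}/4 + *S*_{k−1}/4 + *T*_{k−1}/2, is derived here from the verbal definition (the two alleles trace to the same parent with probability 1/2, and then to the same or the opposite chromosome with probability 1/2 each); it is an assumption of this sketch and is not copied from Table S1. The starting vector is likewise an assumption: it is for the two-way cross *AA*|*AA* × *BB*|*BB*, not one of the four-way cases in Table S2.

```python
import numpy as np

def recursion_matrix(r):
    """3 x 3 recursion on the row vector (C, S, T).

    Column j gives the weights of (C, S, T) at generation k-1
    in the j-th quantity at generation k. The T column is an
    assumption derived for this sketch.
    """
    return np.array([
        [1 - r, 0.0, 0.25],
        [r,     0.0, 0.25],
        [0.0,   1.0, 0.50],
    ])

def haplotype_prob(r, k, start):
    """C_k: first element of start @ Q^k."""
    v = np.asarray(start, dtype=float)
    return (v @ np.linalg.matrix_power(recursion_matrix(r), k))[0]

r = 0.1
# Assumed two-way start AA|AA x BB|BB: C_0 = 1/2, S_0 = 1/2, T_0 = 0
start_AA = [0.5, 0.5, 0.0]

c_limit = haplotype_prob(r, 200, start_AA)
recombinant_fraction = 1 - 2 * c_limit  # chance of AB or BA at fixation
```

As a check on the assumed third column, `recombinant_fraction` should approach 4*r*/(1 + 6*r*), the classical limit for two-way RIL by sibling mating. The rows of this matrix do not sum to 1, consistent with *Q* defining a recursion and not a transition matrix.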

A particularly useful aspect of Kimura's technique is that the recursion matrix can be constructed by probabilistic arguments, without the need to form the full transition matrix, *P*, or even the *n* × *m* matrix *Z*. For two autosomal loci in four-way RIL by sibling mating, there are 4^{8} = 65,536 diplotype pairs without accounting for any symmetries. This may be reduced to 9316 after accounting for the obvious symmetries (exchange the two haplotypes in each individual and exchange the two individuals) and then to 700 diplotype states after accounting for the less obvious symmetries (exchange the two loci, exchange alleles *A* and *B*, exchange alleles *C* and *D*, and exchange both *A* for *C* and *B* for *D*). By the technique of Kimura (1963), one may work with a 3 × 3 matrix in place of the 700 × 700 transition matrix, if only haplotype probabilities are desired.
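These state counts can be reproduced by enumeration. The sketch below rests on one interpretive assumption: each allele exchange is applied at both loci simultaneously, so that the symmetry group is the eight allele permutations generated by the three exchanges, combined with the optional exchange of the two loci. The string representation of haplotypes and the function names are illustrative.

```python
from itertools import product

ALLELES = "ABCD"
haplotypes = [a + b for a in ALLELES for b in ALLELES]  # 16 haplotypes

def diplotype(h1, h2):
    return tuple(sorted((h1, h2)))      # haplotype order is immaterial

def state(d1, d2):
    return tuple(sorted((d1, d2)))      # order of the two individuals is immaterial

raw = list(product(haplotypes, repeat=4))                 # 4^8 ordered states
states = {state(diplotype(a, b), diplotype(c, d)) for a, b, c, d in raw}

# Allele permutations generated by A<->B, C<->D, and (A<->C, B<->D)
gens = [dict(zip(ALLELES, "BACD")),
        dict(zip(ALLELES, "ABDC")),
        dict(zip(ALLELES, "CDAB"))]
identity = dict(zip(ALLELES, ALLELES))
group, frontier = [identity], [identity]
while frontier:
    p = frontier.pop()
    for g in gens:
        q = {a: g[p[a]] for a in ALLELES}  # apply p, then g
        if q not in group:
            group.append(q)
            frontier.append(q)

def act(s, perm, swap_loci):
    """Relabel alleles at both loci and optionally exchange the loci."""
    def hap(h):
        a, b = perm[h[0]], perm[h[1]]
        return b + a if swap_loci else a + b
    return state(*(diplotype(hap(h1), hap(h2)) for h1, h2 in s))

def canonical(s):
    return min(act(s, p, sw) for p in group for sw in (False, True))

n_classes = len({canonical(s) for s in states})
print(len(raw), len(states), n_classes)
```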