
J Theor Biol. Author manuscript; available in PMC 2010 November 21.

Published in final edited form as:

Published online 2009 August 11. doi: 10.1016/j.jtbi.2009.08.006

PMCID: PMC2760668

NIHMSID: NIHMS138753

Joseph Lachance, Graduate Program in Genetics, Department of Ecology and Evolution, Stony Brook University, Stony Brook, NY 11794-5222;

Email: Joseph.Lachance@sunysb.edu, Phone: 631-632-8588, Fax: 631-632-9797

The publisher's final edited version of this article is available at J Theor Biol


How many generations ago did the common ancestor of all present-day individuals live, and how does inbreeding affect this estimate? The number of ancestors within family trees determines the timing of the most recent common ancestor of humanity. However, mating is often non-random and inbreeding is ubiquitous in natural populations. Rates of pedigree growth are found for multiple types of inbreeding. This data is then combined with models of global population structure to estimate biparental coalescence times. When pedigrees for regular systems of mating are constructed, the growth rates of inbred populations contain Fibonacci *n*-step constants. The timing of the most recent common ancestor depends on global population structure, the mean rate of pedigree growth, mean fitness, and current population size. Inbreeding reduces the number of ancestors in a pedigree, pushing back global common ancestry times. These results are consistent with the remarkable findings of previous studies: all humanity shares common ancestry in the recent past.

All modern humans are ultimately related and the most recent common ancestor (MRCA) of humanity lived in the recent past. The exact timing of the MRCA depends on whether gene lineages or family trees are considered. Substantial numbers of individuals can trace their heredity to the likes of Niall of the Nine Hostages and Genghis Khan (Moore *et al.*, 2006; Zerjal *et al.*, 2003). On a grander scale, global coalescence times have been found for mtDNA and non-recombining Y chromosomal DNA lineages (Ayala, 1995; Cann *et al.*, 1987; Thomson *et al.*, 2000). However, the so-called “mitochondrial Eve” and “Y-chromosome Adam” need not be the most recent common ancestors of humanity. Relatedness does not require individuals to share an unbroken matrilineal or patrilineal lineage. Instead, two individuals are related if they share at least one direct ancestor (i.e. the same individual appears in the pedigrees of both individuals). The organismal MRCA is defined here as the most recent individual that is in the family tree of every present-day individual. This biparental definition captures the colloquial meaning of common ancestry. As family trees of present-day individuals are traced backwards in time, the likelihood of common ancestry increases. The number of ancestors in the pedigree of a single present-day individual can be used to calculate biparental coalescence times for an entire population (Chang, 1999). These methods yield estimates of global coalescence times as low as 33 generations for panmictic populations (Chang, 1999) and 76 generations for subdivided populations (Rohde *et al.*, 2004).

Organismal lineages coalesce much faster than gene genealogies. Population size *t* generations ago is defined as *N*_{t}, with present-day population size equal to *N*_{0}. Gene trees coalesce on the order of 2*N*_{0} generations, while organismal lineages in randomly mating populations coalesce on the order of log_{2} *N*_{0} generations (Chang, 1999). This difference in time scales arises because genes are inherited uniparentally, whereas organismal ancestry is biparental (every individual has a mother and a father). However, the genetic contribution of the organismal MRCA to present-day individuals can be quite small and two individuals need not inherit the same genes from a shared ancestor (Hein, 2004; Matsen and Evans, 2008). Note that biparental coalescence times are much less variable than uniparental coalescence times (Chang, 1999). Additionally, the existence of a MRCA does not imply that only a single pair of individuals was alive: other individuals existed at the time of the MRCA (Ayala, 1995).

Population structure, such as inbreeding, influences the timing of the MRCA of humanity. Mating is rarely panmictic and inbreeding is ubiquitous in natural populations (Hedrick and Kalinowski, 2000; Keller and Waller, 2002). Regional estimates of consanguinity (second cousin or closer mating) range from less than 1% to greater than 50% (Bittles, 2001). Inbreeding is defined here as positive assortative mating with respect to heredity, and does not refer to incidental matings between close relatives in small populations. Inbreeding causes the number of direct ancestors in a pedigree to differ from 2^{*t*} (where

In this paper, rates of pedigree growth are found for multiple types of inbreeding. This information is then combined with observed levels of inbreeding to estimate the timing of the most recent common ancestor of humanity. By incorporating inbreeding, estimates of *T _{MRCA}* become more realistic. Inbreeding results in increased biparental coalescence times.

A population of diploid individuals with discrete generations is modeled. Population sizes are finite, but large enough that sex ratios do not differ appreciably from 1:1. Inbreeding is the sole exception to random mating within demes. Thus, the Wright-Fisher model of population genetics is modified to include inbreeding and biparental inheritance. Within demes, individuals preferentially mate with relatives. A proportion of matings involve siblings, a proportion of matings involve first cousins, a proportion of matings involve second cousins, etc. This type of inbreeding is independent of population size. The global population is composed of a number of demes connected by migration. Under this formulation global population structure can be represented via an evolutionary graph (as per Rohde *et al.*, 2004).

Levels of inbreeding are assumed to be uncorrelated within families (i.e. an individual’s probability of inbreeding is independent of the type of mating of his or her parents). If inbred pairings are clustered within certain families, those families will have smaller pedigrees. This phenomenon is a ramification of individuals being double-counted within pedigrees. To fully describe such a process would require a large transition matrix (where each row and column represents a particular level of inbreeding). Consequently, mathematical tractability is retained by assuming that mating patterns are uncorrelated in families.

Organismal coalescence times hinge upon the number of ancestors (*A _{t}*) relative to population size (

Previous studies compute global coalescence times from the number of ancestors in a pedigree (Chang, 1999; Rohde *et al.*, 2004). This approach can be extended to situations where the number of direct ancestors of each individual differs from two, such as species that reproduce both sexually and asexually (Donnelly *et al.*, 1999; Hein *et al.*, 2005). Similarly, inbreeding causes the rate of pedigree growth to differ from two. How fast do pedigrees grow for different types of inbreeding? This question is answered by creating pedigrees where every mating pair shares the same level of relatedness. These pedigrees grow deterministically at a rate determined by the type of inbreeding. For example, pedigrees where every mating involves first cousins grow more slowly than pedigrees where each mating involves second cousins.

Each generation, ancestors are added using a two-step algorithm. First, parents are generated for every individual at the current pedigree depth. Second, each instance of inbreeding involves the removal of a pair of putative ancestors and the closing of hereditary loops. The type of inbreeding determines the size and shape of hereditary loops. For example, first cousin mating results in shared grandparents and hereditary loops that span two generations. This algorithm is iterated backwards in time, resulting in multiple generations of ancestors. The greater the level of inbreeding, the more slowly pedigrees expand in size. Inbred pedigrees are characterized by repeating motifs. Let *g* denote the number of generations separating an inbreeding pair from their most recent common ancestor (*g* = 2 for first cousins). Pedigrees for regular systems of mating are constructed, where every individual within a pedigree shares the same, uniform level of inbreeding (Figure 1). This enables the rate of pedigree growth for a particular type of inbreeding to be found.

Uniform pedigrees for different types of inbreeding. Males are represented by squares, females with circles. The proband is indicated with an arrow. Diagonal lines correspond to trans-generation matings. A. Sibling mating, no growth of pedigree size. **...**

The recursive nature of uniform pedigrees allows the number of individuals at a particular depth of a pedigree to be computed from other levels of the family tree. These recursion equations recapitulate the pedigree construction algorithm. The number of ancestors *t* generations ago is equal to two times the number of ancestors *t*−1 generations ago, minus the number of ancestors *t*−*g* generations ago.

$${A}_{t}=2{A}_{t-1}-{A}_{t-g}$$

(1)

Iteration of Equation 1 gives the exact number of ancestors *t* generations ago for different types of inbreeding (Table 1). Looking backwards in time, the ratio of the number of ancestors in consecutive generations quickly converges to the mating-specific rate of pedigree growth (*r _{g}*). For regular systems of mating with

$${r}_{g}=\frac{{A}_{t}}{{A}_{t-1}}$$

(2)

Algebraic manipulation of Equations 1 and 2 (see Appendix A) indicates that mating-specific rates of pedigree growth are equal to the largest root of:

$${({r}_{g})}^{g-1}({r}_{g}-2)+1=0$$

(3)
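The step from Equations 1 and 2 to Equation 3 can be sketched informally as follows (the full treatment is in Appendix A). The sketch assumes that the pedigree is deep enough for the ratio of consecutive generations to have settled at *r _{g}*, so that ancestor counts *g* and *g*−1 generations apart are related by powers of *r _{g}*:

$${A}_{t}\approx {r}_{g}^{\,g}{A}_{t-g},\qquad {A}_{t-1}\approx {r}_{g}^{\,g-1}{A}_{t-g}$$

Substituting both expressions into Equation 1 and dividing by *A _{t−g}* gives

$${r}_{g}^{\,g}=2{r}_{g}^{\,g-1}-1\quad \Longrightarrow \quad {({r}_{g})}^{g-1}({r}_{g}-2)+1=0$$

For second cousins (*g* = 3) the polynomial factors as $({r}_{g}-1)({r}_{g}^{2}-{r}_{g}-1)$, whose largest root is $(1+\sqrt{5})/2\approx 1.6180$.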

Values of *r _{g}* in Table 1 are calculated numerically from Equation 3.
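Equations 1 and 3 can be explored numerically with a short Python sketch. It is illustrative only and is not the code used to produce Table 1. The function names are hypothetical, and the starting values (*A _{t}* = 2^{t} for *t* ≤ *g*, with the recursion applied only for *t* > *g*) are an assumption based on the statement that inbreeding takes *g*+1 generations to modify the size of a pedigree. The root of Equation 3 is found by bisection on the interval (1.5, 2), which brackets the largest root for *g* ≥ 3; for sibling and first cousin mating the largest root is exactly 1.

```python
def ancestor_counts(g, generations):
    """Return [A_1, ..., A_generations] for a uniform pedigree (Equation 1).

    Assumption: the pedigree doubles for the first g generations, and the
    recursion A_t = 2*A_{t-1} - A_{t-g} applies from generation g+1 onward.
    """
    counts = []  # counts[t - 1] holds A_t
    for t in range(1, generations + 1):
        if t <= g:
            counts.append(2 ** t)
        else:
            counts.append(2 * counts[t - 2] - counts[t - g - 1])
    return counts


def growth_rate(g, tolerance=1e-12):
    """Largest root of r**(g-1) * (r - 2) + 1 = 0 (Equation 3)."""
    if g <= 2:
        # g = 1 gives r - 1 = 0 and g = 2 gives (r - 1)**2 = 0.
        return 1.0

    def f(r):
        return r ** (g - 1) * (r - 2) + 1

    low, high = 1.5, 2.0  # f(low) < 0 < f(high) for every g >= 3
    while high - low > tolerance:
        mid = (low + high) / 2
        if f(mid) < 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2


if __name__ == "__main__":
    for g in range(1, 7):
        print(g, ancestor_counts(g, 8), round(growth_rate(g), 4))
```

Under these assumptions the first cousin pedigree (*g* = 2) gains two ancestors per generation, while for *g* ≥ 3 the ratio of consecutive counts approaches the root returned by `growth_rate`.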

Pedigrees grow more slowly if matings span generations. For example, matings might involve older males and younger females. In this scenario the number of generations within a pedigree differs from the number of “generations” of absolute time that have transpired. Uniform pedigrees with trans-generation mating can be constructed by modifying step one of the pedigree construction algorithm. Here, each individual has one parent that is a single “generation” older and a second parent that is two “generations” older. Trans-generation mating causes pedigrees to become stretched (see diagonal lines in Figure 1). The number of ancestors *t* generations ago for a trans-generation outbred pedigree follows below. Note that this recursion equation generates terms of the Fibonacci sequence.

$${A}_{t}={A}_{t-1}+{A}_{t-2}$$

(4)
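A minimal Python sketch of the trans-generation recursion, in which each term is the sum of the two preceding terms. The function name and the starting values are assumptions: *A*_{0} = 1 is the proband and *A*_{1} = 1 is the single parent who is one “generation” older, with ancestors counted by absolute time.

```python
def transgeneration_ancestors(generations):
    """Return [A_0, ..., A_generations] for an outbred trans-generation pedigree.

    Assumed starting values: A_0 = 1 (the proband) and A_1 = 1 (the parent
    who is one "generation" older). Each later term sums the previous two.
    """
    counts = [1, 1]
    for _ in range(2, generations + 1):
        counts.append(counts[-1] + counts[-2])
    return counts[: generations + 1]


if __name__ == "__main__":
    counts = transgeneration_ancestors(30)
    print(counts[:8])
    print(counts[-1] / counts[-2])  # ratio of consecutive terms
```

The ratio of consecutive terms approaches the golden ratio, the same growth rate obtained for second cousin pedigrees.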

Although pedigrees for regular levels of inbreeding are generated via a time-backward algorithm, subsequent proofs and computer simulations require time-forward growth rates. With a few restrictions, it is possible to construct regular inbred pedigrees in a time-forward manner. First, each individual is required to have two offspring (one female, one male). Second, a single inbreeding event occurs *g* generations in the future for each individual. For example, two of an individual’s eight great-grandchildren mate in second cousin pedigrees (*g*=3, see Figure 2). Over multiple generations, this results in fewer descendants relative to outbred expectations. Third, pedigrees are constructed so that lines do not cross. An example of a pedigree created using these rules is shown in Figure 2. Let *τ* denote the number of generations in the future and *D _{τ}* denote the number of descendants at time

Pedigree growth rates for regular systems of mating are the same forward and backwards in time. All matings involve second cousins (*g*=3). The proband is indicated with an arrow, and individuals that are not direct descendants of the proband are labeled **...**

$${D}_{\tau}=2{D}_{\tau -1}-{D}_{\tau -g}$$

(5)

Note that this recursion equation has the same form as Equation 1. This indicates that the number of relatives in regular inbred pedigrees grows at the same rate whether one looks forward or backward in time.

Real populations contain a mixture of different types of inbreeding. The mean rate of pedigree growth for a population (*r̄*) will be used to calculate biparental coalescence times. In large populations, *r̄* depends on the proportion of each type of mating and mating-specific rates of pedigree growth. This is because pedigree sizes are large and we have assumed that levels of inbreeding are uncorrelated within families. Let *p _{g}* denote the expected proportion of matings that involve a particular type of inbreeding. Realized proportions of each type of mating follow a multinomial distribution. Looking backwards in time, the number of ancestors in a pedigree grows geometrically. In only a few generations the number of direct ancestors becomes quite large. The Law of Large Numbers indicates that if

$$\overline{r}=\sum _{g=0}^{\infty}{p}_{g}{r}_{g}$$

(6)
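Equation 6 is a weighted average, which the following Python sketch illustrates. The mating proportions are hypothetical values chosen for illustration and are not the Swedish data of Table 2. The growth rates of 1, 1.6180, and 2 for first cousin, second cousin, and outbred matings are the values quoted in the text.

```python
def mean_growth_rate(mating_types):
    """Equation 6: sum of p_g * r_g over all mating types.

    mating_types maps a label to (proportion, growth_rate).
    """
    total_proportion = sum(p for p, _ in mating_types.values())
    if abs(total_proportion - 1.0) > 1e-9:
        raise ValueError("mating proportions must sum to one")
    return sum(p * r for p, r in mating_types.values())


# Hypothetical population (illustrative proportions only).
example_population = {
    "first cousin": (0.02, 1.0),
    "second cousin": (0.05, 1.6180),
    "outbred": (0.93, 2.0),
}

if __name__ == "__main__":
    print(round(mean_growth_rate(example_population), 4))
```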

The mean rate of pedigree growth can be calculated by plugging *p _{g}* and

Biparental coalescence times depend on the relative sizes of ancestral pedigrees and past populations. Consider a single individual who lived *t* generations ago and a single present-day individual. The probability that they are related is:

$$P(\text{past}\phantom{\rule{0.16667em}{0ex}}\text{and}\phantom{\rule{0.16667em}{0ex}}\text{present}\phantom{\rule{0.16667em}{0ex}}\text{individuals}\phantom{\rule{0.16667em}{0ex}}\text{are}\phantom{\rule{0.16667em}{0ex}}\text{related})=\frac{{A}_{t}}{{N}_{t}}$$

(7)

For a past individual to be a common ancestor, they must be related to all *N _{0}* individuals living in the present.

$$P(\text{specific}\phantom{\rule{0.16667em}{0ex}}\text{past}\phantom{\rule{0.16667em}{0ex}}\text{individual}\phantom{\rule{0.16667em}{0ex}}\text{is}\phantom{\rule{0.16667em}{0ex}}\text{not}\phantom{\rule{0.16667em}{0ex}}\text{a}\phantom{\rule{0.16667em}{0ex}}\text{common}\phantom{\rule{0.16667em}{0ex}}\text{ancestor})=1-{\left(\frac{{A}_{t}}{{N}_{t}}\right)}^{{N}_{0}}$$

(8)

Each of the *N _{t}* individuals living in the past is a potential common ancestor, and the probability that each is a common ancestor is assumed to be independent. The probability that a common ancestor existed

$$P(\text{a}\phantom{\rule{0.16667em}{0ex}}\text{common}\phantom{\rule{0.16667em}{0ex}}\text{ancestor}\phantom{\rule{0.16667em}{0ex}}\text{exists}\phantom{\rule{0.16667em}{0ex}}t\phantom{\rule{0.16667em}{0ex}}\text{generations}\phantom{\rule{0.16667em}{0ex}}\text{ago})=1-{\left(1-{\left(\frac{{A}_{t}}{{N}_{t}}\right)}^{{N}_{0}}\right)}^{{N}_{t}}$$

(9)

However, the existence of a common ancestor *t* generations ago does not imply that a common ancestor first appeared *t* generations ago. The cumulative probability that no biparental coalescence has occurred in *t* −1 generations is:

$$P(\text{no common ancestor in }t-1\text{ generations})=\prod _{i=1}^{t-1}{\left(1-{\left(\frac{{A}_{i}}{{N}_{i}}\right)}^{{N}_{0}}\right)}^{{N}_{i}}$$

(10)

If there has been no biparental coalescence in *t* −1 generations but a common ancestor existed *t* generations ago, then the MRCA existed *t* generations ago.

$$P(\text{MRCA existed }t\text{ generations ago})=\left[1-{\left(1-{\left(\frac{{A}_{t}}{{N}_{t}}\right)}^{{N}_{0}}\right)}^{{N}_{t}}\right]\times \prod _{i=1}^{t-1}{\left(1-{\left(\frac{{A}_{i}}{{N}_{i}}\right)}^{{N}_{0}}\right)}^{{N}_{i}}$$

(11)
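The chain of probabilities in Equations 7 through 11 can be evaluated numerically. The Python sketch below is a hedged illustration rather than the author's implementation. It assumes that the number of ancestors grows geometrically at the mean rate of pedigree growth (called `r_bar` here), capped at the population size, and that past population size is present-day size divided by mean fitness (`w_bar`) once per generation. Function and parameter names are hypothetical. Logarithms are used so that the large exponents do not overflow.

```python
import math


def prob_common_ancestor(t, n0, r_bar, w_bar):
    """Equation 9: probability that a common ancestor lived t generations ago."""
    n_t = n0 * w_bar ** (-t)  # population size t generations ago
    # log(A_t / N_t), assuming A_t = r_bar**t capped at N_t.
    log_ratio = min(0.0, t * math.log(r_bar) - math.log(n_t))
    x = math.exp(n0 * log_ratio)  # (A_t / N_t) ** N_0
    if x >= 1.0:
        return 1.0
    return 1.0 - math.exp(n_t * math.log1p(-x))


def mrca_distribution(n0, r_bar, w_bar, max_t):
    """Equation 11 for t = 1..max_t, returned as a list of probabilities."""
    distribution = []
    no_ancestor_so_far = 1.0  # running product of Equation 10
    for t in range(1, max_t + 1):
        p = prob_common_ancestor(t, n0, r_bar, w_bar)
        distribution.append(p * no_ancestor_so_far)
        no_ancestor_so_far *= 1.0 - p
    return distribution


if __name__ == "__main__":
    for t, p in enumerate(mrca_distribution(1000, 2.0, 1.0, 15), start=1):
        print(t, p)
```

Because the exponents are population sizes, the probability switches sharply from near zero to near one once the number of ancestors approaches the population size.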

Exponents in Equation 11 involve population size (which is assumed to be large). The probability that there exists a MRCA approaches unity when the ratio of *A _{t}* to

The number of ancestors *t* generations ago is approximately equal to the mean rate of pedigree growth raised to the *t* power.

$${A}_{t}\approx {\overline{r}}^{t}$$

(12)

However, Equation 12 is inexact for two reasons. First, the number of ancestors in a pedigree must be an integer. It takes *g*+1 generations for inbreeding to modify the size of a pedigree (see Figure 1). Even if a pedigree grows geometrically at a rate of 1.6180, the first generation in the past must include two parents. Thus, Equation 12 underestimates *A _{t}* when

Equations for *T _{MRCA}* rely upon the fact that a common ancestor exists when the number of ancestors is close to population size (Chang, 1999). Chang found that biparental coalescence times for outbred populations are given by

$${T}_{\mathit{MRCA},\mathit{outbred}}={log}_{2}{N}_{t}$$

(13)

Wiuf and Hein argue that Chang’s reasoning can be extended to situations where family trees do not double every generation (Donnelly *et al.*, 1999; Hein *et al.*, 2005). Inbreeding is one such scenario, suggesting that the base 2 logarithm of Equation 13 can be replaced by a different logarithm for inbred populations. Here, an equation for the biparental coalescence time of a single, inbred population is derived and tested via multiple approaches. First, the number of ancestors relative to population size yields an estimate of *T _{MRCA}*. Computer simulations subsequently verify this approximation. In addition, Chang’s 1999 proof is extended to cases where pedigrees do not double every generation (see Appendix B).

An approximation of *T _{MRCA}* can be found by setting the number of ancestors equal to population size (

$${N}_{0}={N}_{t}{\overline{w}}^{t}$$

(14)

Algebraic manipulation of Equation 14 yields:

$${N}_{t}={N}_{0}{\overline{w}}^{-t}$$

(15)

Equation 12 gives the number of ancestors *t* generations ago, and Equation 15 gives population size *t* generations ago. By definition, *t* = T_{MRCA} at the time of the most recent common ancestor. Setting *A _{t}* equal to

$${\left(\overline{r}\,\overline{w}\right)}^{{T}_{\mathit{MRCA}}}={N}_{0}$$

(16)

Taking the base-*r̄w̄* logarithm of both sides of the above equation gives an approximation of *T _{MRCA}*.

$${T}_{\mathit{MRCA}}={log}_{\overline{r}\overline{w}}{N}_{0}$$

(17)
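Equation 17 is a change-of-base logarithm, which can be evaluated as in the following Python sketch. The function name and the error handling are assumptions; the inputs are present-day population size, the mean rate of pedigree growth (`r_bar`), and mean fitness (`w_bar`).

```python
import math


def t_mrca(n0, r_bar, w_bar=1.0):
    """Equation 17: log base (r_bar * w_bar) of N_0 for a single deme."""
    base = r_bar * w_bar
    if base <= 1.0:
        # The logarithm is undefined or negative when the product is <= 1.
        raise ValueError("r_bar * w_bar must exceed one")
    return math.log(n0) / math.log(base)


if __name__ == "__main__":
    print(t_mrca(1024, 2.0))    # outbred, constant population size
    print(t_mrca(1024, 1.827))  # inbred, constant population size
```

For a constant-sized outbred population the expression reduces to log_{2} *N*_{0}, and any mean rate of pedigree growth below two lengthens the estimate.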

The validity of Equation 17 is verified by Monte Carlo simulations (see Table 3). These time-forward MATLAB (Mathworks, 2005) simulations begin with a single individual (labeled *I*) at time *t*. The number of descendants over time is modeled as a Galton-Watson process, and the number of offspring per individual is assumed to be a Poisson random variable with a mean of *.* Parentage of offspring in subsequent generations is assigned with the restriction that each individual can have at most two parents that are connected to the putative common ancestor (i.e. double counting is allowed). This process is iterated until the lineage of individual *I* dies out or the entire population in the current generation is related to individual *I*. The number of generations for the entire population to share common ancestry is recorded. As each of the *N _{t}* individuals living at time
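A time-forward simulation of this general kind can be sketched in Python. The sketch is a deliberately simplified stand-in and is not the MATLAB code described above: it assumes a constant population size, random (outbred) mating in which each offspring draws two parents uniformly at random with replacement, and no Poisson offspring distribution. All names are hypothetical.

```python
import random


def generations_until_all_related(pop_size, rng, max_generations=1000):
    """Follow the descendants of one founder forward in time.

    Returns the number of generations until every individual descends from
    the founder, or None if the founder's lineage dies out first (or the
    generation limit is reached).
    """
    related = [False] * pop_size
    related[0] = True  # the founder, individual I
    for generation in range(1, max_generations + 1):
        next_related = []
        for _ in range(pop_size):
            mother = rng.randrange(pop_size)
            father = rng.randrange(pop_size)
            # A child is related if at least one parent is related.
            next_related.append(related[mother] or related[father])
        related = next_related
        count = sum(related)
        if count == 0:
            return None
        if count == pop_size:
            return generation
    return None


if __name__ == "__main__":
    rng = random.Random(1)
    results = [generations_until_all_related(200, rng) for _ in range(100)]
    times = [t for t in results if t is not None]
    print(len(times), sum(times) / len(times))
```

Runs that return `None` correspond to founders whose lineages die out; the remaining runs give the time for the whole population to share that founder as an ancestor.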

In a previous study (Rohde *et al.*, 2004), evolutionary graphs were used to compute global coalescence times and this method is repeated here. Graphs involve a collection of nodes (demes) that are connected via edges (with connections indicating that migration can occur between a pair of demes). The radius (*R*) of a graph is the length of the longest path from the center of a graph to the perimeter. Looking forward in time, a common ancestor first appears in a central deme. It takes log_{*r̄w̄*} *N*_{0} generations for every individual in the initial deme to be related to the common ancestor. As this is occurring, individuals from the source deme migrate to adjacent demes and establish lineages. Common ancestry spreads outwards via migration to encompass the entire graph. This takes *R*+1 pulses, and Equation 17 gives the duration of each pulse. However, individuals related to the common ancestor can emigrate before every individual in a source deme is related to the common ancestor. This is modeled by treating one of the pulses of common ancestry as a partial pulse. The parameter Δ quantifies the effect of migratory head starts, and ranges from zero to one. Migratory head starts depend on both the connectivity of central nodes and the size of graphs. If there are multiple paths of length

$$\mathrm{\Delta}=max\left(\frac{\nu -1}{\nu},\frac{1}{R}\right)$$

(18)

Computer simulations (see below) assess the suitability of Equation 18. Population sizes change over time, and every deme is assumed to have the same population size. Looking forward in time, the proportion of individuals in a single deme that are related to the common ancestor increases at a rate of each generation. Pulse times for populations follow from Equation 17, and migration rates are assumed to be one migrant per generation (*Nm* = 1). Molecular data indicates that rates of human gene flow are at least this high (Santos et al., 1997). After common ancestry fills the central node, it expands via a number of pulses equal to the radius of the graph (i.e. it takes R + Δ pulses for the entire population to become related). Considering outbred populations, Rohde *et al*. found

$${T}_{\mathit{global}\phantom{\rule{0.16667em}{0ex}}\mathit{MRCA},\mathit{outbred}}=(R+\mathrm{\Delta}){log}_{2}{N}_{0}$$

(19)

Rohde et al. demonstrated via probabilistic analysis that Equation 19 provides a good estimate of global coalescence times for large *N* (see supplemental information of Rohde et al., 2004). Note that Equations 19 and 20 are heuristic approximations that are justified by subsequent computer simulations. Incorporating inbreeding and changing population size yields the following equation for the global coalescence time:

$${T}_{\mathit{global}\phantom{\rule{0.16667em}{0ex}}\mathit{MRCA}}=(R+\mathrm{\Delta}){log}_{\overline{r}\overline{w}}{N}_{0}$$

(20)
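Equation 20 can be evaluated with a few lines of Python. The sketch below is illustrative: the function name is hypothetical, and the deme size of 6.8 × 10^8 is an assumption made here (roughly the 2009 world population divided among ten equal demes) rather than a value stated in the text. The graph parameters *R* = 3 and Δ = 0.333, the mean rate of pedigree growth of 1.827, and the mean fitness of 1.06 are the values quoted elsewhere in the paper.

```python
import math


def global_t_mrca(radius, delta, deme_size, r_bar, w_bar):
    """Equation 20: (R + delta) * log base (r_bar * w_bar) of N_0."""
    base = r_bar * w_bar
    if base <= 1.0:
        raise ValueError("r_bar * w_bar must exceed one")
    return (radius + delta) * math.log(deme_size) / math.log(base)


ASSUMED_DEME_SIZE = 6.8e8  # assumption, not taken from the text

if __name__ == "__main__":
    outbred = global_t_mrca(3, 0.333, ASSUMED_DEME_SIZE, 2.0, 1.06)
    inbred = global_t_mrca(3, 0.333, ASSUMED_DEME_SIZE, 1.827, 1.06)
    print(round(outbred, 1), round(inbred, 1))
```

With the assumed deme size, the outbred and inbred estimates come out at roughly 90 and 103 generations.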

As per Rohde *et al*., global population structure is represented by a simple 10-node graph (*R* = 3, Δ = 0.333). Here, nodes refer to sub-Saharan Africa, North Africa, Eurasia, Northeast Asia, North America, Greenland, South America, Indonesia, Australia, and Oceania (Figure 4). Real-world population structures are much more complex than this graph. However, use of a simple 10-node graph allows comparisons with previous studies.

Global population structure viewed as an evolutionary graph. As per “Modeling the recent common ancestry of all living humans” (Rohde *et al.*, 2004), the graph has 10 nodes, a radius of 3, and a diameter of 5. This graph has two centers **...**

Forward time Monte Carlo simulations allow the validity of Equation 20 to be assessed. Computer simulations were coded in MATLAB (Mathworks, 2005). Each simulation run involves deterministic growth of family trees within demes and stochastic migration between demes. These simulations quantify the effects of graph structure and the number of migrants per generation (Table 4). Simulations were run 1000 times for each set of parameter values (*R*, Δ, *Nm*, *r̄*, *w̄*, and *N _{t}*). See Supplemental Material for MATLAB code.

Inbred pedigrees for regular systems of mating are shown in Figure 1. The number of ancestors at each level of a family tree and mating-specific rates of pedigree growth (*r _{g}*) are listed in Table 1. Pedigrees in which every mating involves siblings contain two individuals each generation. Qualitative differences exist between first cousin and second cousin pedigrees. First cousin matings result in pedigrees that increase linearly as a function of time, whereas second cousin matings result in pedigrees that increase geometrically as a function of time. The more outbred the type of mating, the closer the rate of pedigree growth is to two. Rates of pedigree growth for second, third, fourth, and fifth cousin matings are 1.6180, 1.8393, 1.9276 and 1.9659, respectively.

Some forms of inbreeding (uncle-niece, aunt-nephew, first cousins once removed, etc.) involve individuals that belong to different generations. In these cases two separate processes reduce the number of ancestors in a pedigree: inbreeding and trans-generation mating. When every mating involves unrelated individuals separated by one generation, the number of ancestors follows the Fibonacci sequence and the rate of pedigree growth is 1.6180 (see Figure 1 and Table 1). When individuals mate later in life, pedigrees are vertically stretched, and the number of ancestors grows at a reduced rate. Conversely, when individuals mate early in life the number of ancestors grows at an increased rate. Ancestral pedigrees grow at reduced rates when a single recent ancestor, as opposed to two recent ancestors, is shared (i.e. half-sibling vs. full sibling mating). When only a single recent ancestor is shared, rates of pedigree growth are one level more outbred than in Figure 3A. For example, half-first cousin pedigrees grow at the rate of full-second cousin pedigrees (1.6180 in both cases).

Fibonacci *n*-step constants, decay of heterozygosity, and rates of pedigree growth. A. Proportion of heterozygosity retained (*λ*_{g}) and rates of pedigree growth (*r*_{g}) for multiple levels of inbreeding (*g*). Shared colors indicate values that are multiples **...**

Intriguingly, the golden ratio (*ϕ*, 1.6180) appears in pedigrees where every mating involves second cousins. While the presence of this number in the growth rate of an inbred pedigree may seem unexpected, *ϕ* is often associated with recursive patterns. The ratio of consecutive terms of a Fibonacci sequence converges to *ϕ*, as does the rate of pedigree growth for X chromosome and haplodiploid pedigrees (Crow and Kimura, 1970; Livio, 2002). A generalized form of Fibonacci numbers involves summing the *n* preceding terms to generate the next term of a sequence. The ratio of consecutive terms converges to the Fibonacci *n*-step constants. The tribonacci (*n* = 3), tetranacci (*n* = 4), and pentanacci (*n* = 5) constants are equal to 1.8393, 1.9276, and 1.9659, respectively (Vajda, 1989). Inspection of Table 1 indicates that rates of growth for third, fourth, and fifth cousin pedigrees are Fibonacci *n*-step constants.

What are realistic levels of inbreeding? Historical data (Bittles and Egerbladh, 2005) yield *r̄* = 1.827 for a population in rural Sweden (Table 2). A large proportion of matings in this population are trans-generation. While other natural populations are expected to have different levels of inbreeding, these data indicate that ancestral pedigrees cannot be assumed to double in size each generation.

The effect of inbreeding on coalescence times is greater in large populations. This is because it takes longer for *A*_{t} to approach *N*_{t} in large populations, and pedigree size changes are compounded over multiple generations. The mean rate of pedigree growth (*r̄*) is much less than two when inbreeding is prevalent and/or a high proportion of matings are trans-generation. Both of these conditions occur in natural populations, suggesting that previous studies underestimate the actual *T _{MRCA}*.

Approximately 206 million individuals lived 1500 years ago (Cavalli-Sforza *et al.*, 1994). Since then, the global population size of humanity has multiplied 32-fold. Population increases of 6% per generation are consistent with generation times of 25 years (*w̄* = 1.06).

All humanity shares common ancestry in the last few thousand years. Equation 20 gives *T _{MRCA}* as a function of global population structure, the mean rate of pedigree growth, mean fitness, and current population size. Global coalescence times for outbred populations are 90.2 generations. Incorporating inbreeding (*r̄* = 1.827) yields biparental coalescence times of 102.5 generations. Thus, inbreeding increases

Using Equation 20, a general relationship for the relative coalescence times of inbred and outbred populations is obtained.

$$\frac{{T}_{\mathit{inbred}}}{{T}_{\mathit{outbred}}}=\frac{(R+\mathrm{\Delta}){log}_{\overline{r}\overline{w}}{N}_{0}}{(R+\mathrm{\Delta}){log}_{2\overline{w}}{N}_{0}}={log}_{\overline{r}\overline{w}}2\overline{w}$$

(21)

The ratio of inbred to outbred coalescence times is 1.1501 for constant-sized populations (*r̄* = 1.827, *w̄* = 1.00), and 1.1369 for growing populations (*r̄* = 1.827, *w̄* = 1.06). Equation 21 indicates that the relative *T _{MRCA}* of inbred populations is independent of deme size. In addition, graph size does not affect this ratio.
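The two ratios quoted above follow directly from Equation 21, as the following Python sketch shows. The function name is hypothetical; the inputs are the mean rate of pedigree growth (`r_bar`) and mean fitness (`w_bar`).

```python
import math


def inbred_to_outbred_ratio(r_bar, w_bar):
    """Equation 21: log base (r_bar * w_bar) of (2 * w_bar)."""
    return math.log(2.0 * w_bar) / math.log(r_bar * w_bar)


if __name__ == "__main__":
    print(round(inbred_to_outbred_ratio(1.827, 1.00), 4))
    print(round(inbred_to_outbred_ratio(1.827, 1.06), 4))
```

Neither population size nor the graph parameters appear in the function, which mirrors the statement that the ratio is independent of deme size and graph size.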

The Fibonacci sequence also appears in equations for the decay of heterozygosity under regular systems of mating (Jennings, 1914; Wright, 1921). Heterozygosity decays geometrically in inbred populations (Crow and Kimura, 1970), with the parameter *λ*_{g} indicating the proportion of heterozygosity retained each generation (Figure 3B). Note that these regular systems of mating in Figure 3B involve multiple instances of inbreeding (e.g. double-first cousin and quadruple-second cousin matings), while regular systems of mating in Figure 3C involve single instances of inbreeding (e.g. single-first cousin and single-second cousin matings). Interestingly, values of *r*_{g} and *λ*_{g} are related. Rates of pedigree growth are twice the proportion of heterozygosity that is retained, but with an offset of two generations (Figure 3A). For example, matings between second cousins result in pedigrees that grow at a rate of 1.6180, two times the proportion of heterozygosity retained each generation by sibling mating (0.8090).

$${r}_{g}=2{\lambda}_{g-2}$$

(22)

Equation 17 is a reasonable approximation for the time until the most recent common ancestor of a single deme (see Table 3). As confirmed by computer simulations, inbred populations have longer coalescence times than outbred populations. In both cases, it takes only a few generations for the entire population to share common ancestry. Discrepancies between analytic approximations and computer simulations arise because Equation 12 underestimates the number of ancestors when *t* is small, and it overestimates the number of ancestors when *t* is large.

Some parameter values fail to result in biparental coalescence. When the product of *r̄* and *w̄* is less than one, Equation 17 yields negative coalescence times. Looking backwards in time, this refers to a scenario where population size “expands” faster than an individual’s pedigree grows. However, populations where *r̄w̄* < 1 are highly unlikely to be observed. This is because *r̄* tends to be slightly less than two, and *w̄* tends to be close to one.

Inbred populations exhibit greater homozygosity, and recessive lethality leads to increased mortality (Bittles and Neel, 1994; Charlesworth and Charlesworth, 1999). Populations with inbreeding depression have longer coalescence times than populations where such selection is absent, as there is a reduction in present-day individuals relative to the number of ancestors. Note, however, that greater reproductive success has been observed for 3^{rd} and 4^{th} cousins than outbred pairs in Iceland (Helgason *et al.*, 2008).

It is notable that Equations 17 and 20 do not incorporate Wright’s inbreeding coefficient (*ƒ*), where *ƒ* is defined as the probability that two alleles in an individual are identical by descent (Crow and Kimura, 1970). This is because multiple systems of mating can have the same value of *ƒ* (Nordborg and Krone, 2002; Pollak, 1987; Slatkin, 1991). For example, populations where every mating involves first cousins possess the same inbreeding coefficient as populations where 25% of matings involve siblings and 75% of matings are outbred (*ƒ* = 0.0625). These two scenarios lead to different numbers of ancestors as a function of time. This indicates that multiple types of inbreeding must explicitly be considered when calculating the time until the MRCA of an inbred population (instead of using a summary statistic like *ƒ*).
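The contrast between the two mating schemes can be made concrete with a small Python sketch. It assumes the standard single-generation inbreeding coefficients of 1/4 for offspring of siblings and 1/16 for offspring of first cousins, and uses the growth rates of 1 for sibling and first cousin matings and 2 for outbred matings, combined as in Equation 6. Names are hypothetical.

```python
# Each entry: (proportion of matings, inbreeding coefficient f, growth rate r)
ALL_FIRST_COUSINS = [(1.0, 1 / 16, 1.0)]
SIBLING_OUTBRED_MIX = [(0.25, 1 / 4, 1.0), (0.75, 0.0, 2.0)]


def mean_inbreeding_coefficient(scheme):
    """Population-wide average of f, weighted by mating proportions."""
    return sum(p * f for p, f, _ in scheme)


def mean_growth_rate(scheme):
    """Mean rate of pedigree growth, weighted as in Equation 6."""
    return sum(p * r for p, _, r in scheme)


if __name__ == "__main__":
    for scheme in (ALL_FIRST_COUSINS, SIBLING_OUTBRED_MIX):
        print(mean_inbreeding_coefficient(scheme), mean_growth_rate(scheme))
```

Both schemes share *ƒ* = 0.0625, yet the mean rate of pedigree growth is 1 in the first case and 1.75 in the second.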

Biparental coalescence times are remarkably recent (within the last 2500 years). This occurs regardless of the amount of inbreeding. It is likely that Figure 4 underestimates the true complexity of global population structure. However, increasing the size of this graph 4-fold still results in common ancestors who lived in a post-agricultural world. Increasing the number of demes also has the side effect of decreasing within-deme coalescence times. This is because population sizes of individual demes are inversely proportional to the number of demes. Population structure affects both uniparental and biparental coalescence times, and gene trees coalesce much more slowly than organismal lineages. A migrant has a good chance of becoming one of the organismal common ancestors of their new deme (so long as they manage to establish a foothold and leave a significant number of great grandchildren). However, the probability that a migrant will become the mtDNA or Y-chromosome common ancestor of a new deme is low (1/*N _{females,t}* or 1/

The net effect of inbreeding is an increase in global coalescence times. This is a direct consequence of smaller rates of pedigree growth for inbred populations. Inbreeding’s effect on global coalescence times is greatest in graphs composed of relatively few large nodes, and is smallest in graphs composed of many small nodes. Somewhat paradoxically, randomly selected pairs of individuals are more likely to be closely related in inbred populations. Randomly selected individuals in an inbred population exhibit greater variance in the degree of relatedness than randomly selected individuals in a panmictic population. When inbreeding is present, subsets of a population share recent common ancestry but coalescence times for the entire population are lengthened.

It is worth noting that the global coalescence times in this paper are crude estimates (as global levels of inbreeding are likely to differ from Swedish estimates and fine-scale population structure is ignored). In addition, high variance in male reproductive success exists in real-world populations (Segurel *et al.*, 2008). Sex-specific variance in reproductive success requires numbers of male and female ancestors to be tracked separately. This variance reduces effective population size, leading to a reduction in coalescence times. If levels of inbreeding change over time, generation-specific rates of pedigree growth must be calculated. The number of ancestors *t* generations ago is equal to the product of generation-specific pedigree growth rates. The overall rate of pedigree growth is therefore equal to the geometric mean of the generation-specific rates.
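The relationship between the product of generation-specific rates and their geometric mean can be illustrated with a minimal sketch. The rates below are hypothetical values chosen for illustration, not estimates from the paper, and the function names are assumptions.

```python
import math


def ancestors_after(rates):
    """Ancestors of one individual after len(rates) generations, where
    generation i multiplies the pedigree size by rates[i]."""
    return math.prod(rates)


def overall_growth_rate(rates):
    """Geometric mean of the generation-specific pedigree growth rates."""
    return math.exp(sum(math.log(r) for r in rates) / len(rates))


# Hypothetical generation-specific growth rates (inbreeding varies over time).
rates = [2.0, 1.9, 1.8, 1.62]

r_bar = overall_growth_rate(rates)
t = len(rates)

# Raising the geometric mean to the power t recovers the product of the rates.
print(r_bar, r_bar ** t, ancestors_after(rates))
```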

Ultimately, global coalescence times hinge upon within-deme population structure and between-deme patterns of migration. The number of ancestors in a family tree increases very quickly even when there is significant inbreeding. While inbreeding within local populations pushes the global *T _{MRCA}* back a significant number of years, the qualitative conclusions of previous studies hold. Present-day individuals share common ancestors that lived in the relatively recent past. Exceptions involve isolated demes. However, the majority of humanity is connected via migration. Coalescence times in this paper are most sensitive to the radius of the population graph (i.e. how far the central deme is from peripheral demes), and the number of individuals per deme at the time of the MRCA. Future estimates of biparental coalescence times will benefit from the inclusion of inbreeding, finer resolution evolutionary graphs, and more accurate migratory data. In the words of Alex Haley: “When you start about family, about lineage and ancestry, you are talking about every person on earth” (Marmon, 1977).

I thank members of Stony Brook University’s Department of Ecology and Evolution, A. Bittles, J. Crow, J. Flowers, L. Jung, A. Onstine, P. Ralph, J. True, S. Yeh, and R. Yukilevich for stimulating discussions and constructive criticism during the preparation of this manuscript. Additional thanks are directed towards my own recent ancestors. This work was supported by an NIH Predoctoral Training Grant (5 T32 GM007964-24).

Each term of Equation 1 can be divided by *A _{t−1}*:

$$\frac{{A}_{t}}{{A}_{t-1}}=\frac{2{A}_{t-1}}{{A}_{t-1}}-\frac{{A}_{t-g}}{{A}_{t-1}}$$

(A.1)

Substituting Equation 2 gives:

$${r}_{g}=2-\frac{{A}_{t-g}}{{A}_{t-1}}$$

(A.2)

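If the pedigree grows by a constant factor *r _{g}* each generation, then $A_{t-g}/A_{t-1}={r_g}^{-(g-1)}$, so that: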

$${r}_{g}=2-{{r}_{g}}^{-(g-1)}$$

(A.3)

After some algebra:

$${({r}_{g})}^{g-1}({r}_{g}-2)+1=0$$

(A.4)
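Equation A.4 has no convenient closed form for general *g*, but its relevant root can be found numerically. The sketch below uses plain bisection, which is an illustrative choice rather than the method used in the paper, and the function names are assumptions. Note that *r* = 1 always satisfies Equation A.4, so the search is restricted to the interval (1.5, 2], which brackets the growth rate for *g* ≥ 3.

```python
def pedigree_polynomial(r, g):
    """Left-hand side of Equation A.4: r**(g-1) * (r - 2) + 1."""
    return r ** (g - 1) * (r - 2) + 1


def growth_rate(g, tol=1e-12):
    """Root of Equation A.4 in (1.5, 2], found by bisection (needs g >= 3)."""
    if g < 3:
        raise ValueError("the bisection bracket assumes g >= 3")
    lo, hi = 1.5, 2.0  # polynomial is negative at 1.5 and equals 1 at 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if pedigree_polynomial(mid, g) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for g in (3, 4, 5):
    print(g, growth_rate(g))
```

For *g* = 3 the polynomial factors as $(r-1)(r^{2}-r-1)$, so the root is the golden ratio; for *g* = 4 it factors as $(r-1)(r^{3}-r^{2}-r-1)$, whose real root is the tribonacci constant.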

The following appendix sketches how the proof of Theorem 1 in (Chang, 1999) can be extended to situations where pedigrees do not double every generation. Note that these arguments are only heuristic, as a complete extension of Chang’s proof is beyond the scope of this paper. Chang’s five-stage proof uses base 2 logarithms because outbred pedigrees double in size every generation. A number of differences arise when the base 2 logarithm of Chang’s proof is replaced with a different logarithm. However, the central logic of this proof remains unchanged. Stage-specific details are listed below.

For consistency with (Chang, 1999), the notation in this section differs from the main body of the paper. Population size at the time of the MRCA is represented by *n* and *t* is the number of generations since the MRCA. Equation 15 allows coalescence times in Appendix B to be translated into the notation used in the main body of this paper:

$${log}_{\overline{r}}n={log}_{\overline{r}\overline{w}}{N}_{o}$$

(A.5)
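Equation A.5 can be read as a conversion between the two notations. The sketch below takes both sides of Equation A.5 to represent the same number of generations *T*, which is an interpretive assumption, and solves for *n*. The parameter values and function names are hypothetical.

```python
import math


def generations_to_mrca(n_present, r_bar, w_bar):
    """Right-hand side of Equation A.5: log base (r_bar * w_bar) of N_o."""
    base = r_bar * w_bar
    if base <= 0 or base == 1:
        raise ValueError("r_bar * w_bar must be positive and not equal to one")
    return math.log(n_present) / math.log(base)


def size_at_mrca(n_present, r_bar, w_bar):
    """Solve Equation A.5 for n: n = r_bar ** log_{r_bar * w_bar}(N_o)."""
    return r_bar ** generations_to_mrca(n_present, r_bar, w_bar)


# Hypothetical values: present-day size, pedigree growth rate, mean fitness.
N_o, r_bar, w_bar = 1e6, 1.9, 1.01

T = generations_to_mrca(N_o, r_bar, w_bar)
n = size_at_mrca(N_o, r_bar, w_bar)
print(T, n)

# When r_bar * w_bar < 1 the logarithm changes sign and T is negative.
print(generations_to_mrca(N_o, 1.7, 0.55))
```

Under this reading, $n\,\overline{w}^{T}=N_{o}$ follows algebraically from Equation A.5, which is consistent with a population that changes in size by a factor of $\overline{w}$ each generation.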

$\overline{r}$ is assumed to be between 1.6180 and 2, and per-generation changes in population size are assumed to be small (i.e. $\overline{w}$ is close to one). As will be shown below, it takes approximately $\log_{\overline{r}} n$ generations for an entire population to become related to the most recent common ancestor. Thus, biparental coalescence times follow

Start with a single individual labeled *I*. The number of descendants of individual *I* can be modeled as a Galton-Watson process. Here, the number of descendants of every individual is distributed as a Poisson random variable with a mean of *.* Note that successful establishment of a lineage is more likely in growing populations (there is a reduced chance of having zero descendants in a given generation). How long does it take for an individual to establish a lineage with at least ($\log_{\overline{r}} n$)

During Stage 2, the number of descendants of individual *I* increases geometrically to at least *n ^{β}* (where

$${T}_{\text{Stage}2}\approx {log}_{\overline{r}}{n}^{\beta}=\beta {log}_{\overline{r}}n$$

(A.6)

Here, the number of individuals related to the MRCA increases from *n ^{β}* to

$${g}_{t+1}={g}_{t}(\overline{r}-{g}_{t})$$

(A.7)

As $\overline{r}$ ≥ 1.618 and *g _{t}* ≤ 0.5, $\overline{r}$ −

$${T}_{\text{Stage}3}\le {log}_{\overline{r}-{g}_{t}}\frac{n/2}{{n}^{\beta}}=(1-\beta ){log}_{\overline{r}-{g}_{t}}n-{log}_{\overline{r}-{g}_{t}}2$$

(A.8)

Note that $\log_{\overline{r}-g_{t}}\overline{r}<5$. Converting Equation A.8 to a base $\overline{r}$ logarithm:

$${T}_{\text{Stage}3}<5[(1-\beta ){log}_{\overline{r}}n-{log}_{\overline{r}}2]$$

(A.9)
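The Stage 3 bound can be checked numerically with a rough sketch. It assumes that *g _{t}* is the fraction of the population related to individual *I*, starting at $n^{\beta}/n$ and ending at 1/2, as suggested by Equation A.8. The values of *n* and *β* and the function names are hypothetical.

```python
import math


def stage3_generations(n, beta, r_bar):
    """Iterate Equation A.7 from g = n**beta / n until g reaches 1/2."""
    if not 1.5 < r_bar <= 2:
        raise ValueError("r_bar must lie in (1.5, 2] for g to reach 1/2")
    g = n ** beta / n
    steps = 0
    while g < 0.5:
        g = g * (r_bar - g)
        steps += 1
    return steps


def stage3_bound(n, beta, r_bar):
    """Right-hand side of Equation A.9."""
    def log_r(x):
        return math.log(x) / math.log(r_bar)
    return 5 * ((1 - beta) * log_r(n) - log_r(2))


def change_of_base_factor(r_bar, g=0.5):
    """log base (r_bar - g) of r_bar, the factor bounded by 5 in the text."""
    return math.log(r_bar) / math.log(r_bar - g)


n, beta = 10 ** 6, 0.5  # hypothetical values
for r_bar in (1.618, 1.8, 2.0):
    print(r_bar,
          stage3_generations(n, beta, r_bar),
          stage3_bound(n, beta, r_bar),
          change_of_base_factor(r_bar))
```

In this sketch the change-of-base factor is largest at the lower limit $\overline{r}$ = 1.618 with *g _{t}* = 0.5, which is where the bound of 5 is tightest.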

Recall that *β* is between 0 and 1. This indicates that Stage 3 will take at most ~5 $\log_{\overline{r}} n$ generations. Setting

This stage is unchanged from Chang’s proof. Here, the perspective switches from the fraction of the population that is related to individual *I* to the fraction that is unrelated to individual *I* (defined as *B _{t}*). Since individuals that are unrelated to individual

$$({B}_{t+1}\mid {B}_{t},{B}_{t-1},\dots )\sim \frac{1}{n}\mathit{Bin}(n,{B}_{t}^{2})$$

(A.10)

Note that this equation is the same as Equation 7 in (Chang, 1999). From Equation A.10, *B _{t}* is expected to square each generation. The net result of Stage 4 is that
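The expected squaring of *B _{t}* implies that Stage 4 is very short. The sketch below follows only the expectation of Equation A.10, not the full binomial process. Both the starting value of 1/2 and the stopping threshold of 1/*n* are assumptions made for illustration.

```python
def stage4_generations(n, b_start=0.5):
    """Generations until the expected unrelated fraction falls below 1/n,
    when that fraction squares each generation."""
    b = b_start
    steps = 0
    while b >= 1 / n:
        b = b * b
        steps += 1
    return steps


for n in (10 ** 3, 10 ** 6, 10 ** 9):
    print(n, stage4_generations(n))
```

Because repeated squaring from 1/2 gives $2^{-2^{k}}$ after *k* generations, the count grows roughly like $\log_{2}\log_{2} n$ under these assumptions.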

The final stage takes a single generation, and is unchanged from Chang’s proof. If most of the population is related to individual *I* (i.e. *B _{t}* ≤

The upper and lower bounds of the time until the MRCA follow (Chang, 1999). The important point here is that the majority of time is spent in Stages 2 and 3. Since Stages 2 and 3 take order $\log_{\overline{r}} n$ generations, the coalescence times take approximately log


- Ayala FJ. The Myth of Eve: Molecular Biology and Human Origins. Science. 1995;270:1930–6.
- Bittles AH. Consanguinity and its relevance to clinical genetics. Clinical Genetics. 2001;60:89–98.
- Bittles AH, Neel JV. The costs of human inbreeding and their implications for variations at the DNA level. Nat Genet. 1994;8:117–21.
- Bittles AH, Egerbladh I. The Influence of Past Endogamy and Consanguinity on Genetic Disorders in Northern Sweden. Annals of Human Genetics. 2005;69:549–558.
- Cann RL, Stoneking M, Wilson AC. Mitochondrial DNA and human evolution. Nature. 1987;325:31–6.
- Cavalli-Sforza LL, Menozzi P, Piazza A. The History and Geography of Human Genes. Princeton University Press; Princeton, N.J: 1994.
- Chang JT. Recent common ancestors of all present-day individuals. Adv Appl Probab. 1999;31:1002–1026.
- Charlesworth B, Charlesworth D. The genetic basis of inbreeding depression. Genet Res. 1999;74:329–40.
- Crow JF, Kimura M. An Introduction to Population Genetics Theory. Harper and Row; New York: 1970.
- Donnelly P, Wiuf C, Hein J, Slatkin M, Ewens WJ, Kingman JFC. Discussion: recent common ancestors of all present-day individuals. Adv Appl Probab. 1999;31:1027–1035.
- Hedrick PW, Kalinowski S. Inbreeding depression and conservation biology. Ann Rev Ecol Syst. 2000;31:139–162.
- Hein J. Human evolution: pedigrees for all humanity. Nature. 2004;431:518–9.
- Hein J, Schierup MH, Wiuf C. Gene Genealogies, Variation and Evolution: A Primer in Coalescent Theory. Oxford University Press; Oxford: 2005.
- Helgason A, Palsson S, Gudbjartsson DF, Kristjansson T, Stefansson K. An association between the kinship and fertility of human couples. Science. 2008;319:813–6.
- Jennings HS. Formulae for the results of inbreeding. Am Nat. 1914;48:693–696.
- Jobling MA, Hurles M, Tyler-Smith C. Human Evolutionary Genetics: Origins, Peoples and Disease. Garland Science; New York: 2003.
- Keller LF, Waller DM. Inbreeding effects in wild populations. Trends Ecol Evol. 2002;17:230–241.
- Kuhner MK, Yamato J, Felsenstein J. Maximum likelihood estimation of population growth rates based on the coalescent. Genetics. 1998;149:429–34.
- Livio M. The golden ratio: the story of phi, the world’s most astonishing number. Broadway Books; New York: 2002.
- Marmon W. Haley’s Rx: Talk, Write, Reunite. Time. 1977;109:72–75.
- Mathworks. MATLAB 7. Mathworks; Natick, MA: 2005.
- Matsen FA, Evans SN. To what extent does genealogical ancestry imply genetic ancestry? Theor Popul Biol. 2008;74:182–90.
- Moore LT, McEvoy B, Cape E, Simms K, Bradley DG. A Y-Chromosome Signature of Hegemony in Gaelic Ireland. Am J Hum Genet. 2006;78:334–8.
- Nordborg M, Krone S. Separation of time scales and convergence to the coalescent in structured populations. In: Slatkin M, Veuille M, editors. Modern Developments in Theoretical Population Genetics. Oxford University Press; Oxford: 2002. pp. 194–232.
- Ohno S. The Malthusian parameter of ascents: what prevents the exponential increase of one’s ancestors? Proc Natl Acad Sci USA. 1996;93:15276–8.
- Pollak E. On the theory of partially inbreeding finite populations. I. Partial selfing. Genetics. 1987;117:353–60.
- Rohde DL, Olson S, Chang JT. Modeling the recent common ancestry of all living humans. Nature. 2004;431:562–6.
- Santos EJ, Epplen JT, Epplen C. Extensive gene flow in human populations as revealed by protein and microsatellite DNA markers. Hum Hered. 1997;47:165–72.
- Segurel L, Martinez-Cruz B, Quintana-Murci L, Balaresque P, Georges M, Hegay T, Aldashev A, Nasyrova F, Jobling MA, Heyer E, Vitalis R. Sex-specific genetic structure and social organization in Central Asia: insights from a multi-locus study. PLoS Genet. 2008;4:e1000200.
- Shoumatoff A. The Mountain of Names. Simon & Schuster; New York: 1985.
- Slatkin M. Inbreeding coefficients and coalescence times. Genet Res. 1991;58:167–75.
- Thomson R, Pritchard JK, Shen P, Oefner PJ, Feldman MW. Recent common ancestry of human Y chromosomes: Evidence from DNA sequence data. Proc Natl Acad Sci USA. 2000;97:7360–7365.
- Vajda S. Fibonacci and Lucas Numbers and the Golden Section: Theory and Application. Ellis Horwood Ltd; Chichester: 1989.
- Wright S. Systems of mating. II. The effects of inbreeding on the genetic composition of a population. Genetics. 1921;6:124–143.
- Zerjal T, Xue Y, Bertorelle G, Wells RS, Bao W, Zhu S, Qamar R, Ayub Q, Mohyuddin A, Fu S, Li P, Yuldasheva N, Ruzibakiev R, Xu J, Shu Q, Du R, Yang H, Hurles ME, Robinson E, Gerelsaikhan T, Dashnyam B, Mehdi SQ, Tyler-Smith C. The genetic legacy of the Mongols. Am J Hum Genet. 2003;72:717–21.
