Molecular dating has gained ever-increasing interest since the molecular clock hypothesis was proposed in the 1960s. Molecular dating provides detailed temporal frameworks for divergence events in phylogenetic trees, allowing diverse evolutionary questions to be addressed. The key aspect of the molecular clock hypothesis, namely that differences in DNA or protein sequence between two species are proportional to the time elapsed since they diverged, was soon shown to be untenable. Other approaches were proposed to take into account rate heterogeneity among lineages, but the calibration process, by which relative times are transformed into absolute ages, has received little attention until recently. New methods have now been proposed to resolve potential sources of error associated with the calibration of phylogenetic trees, particularly those involving use of the fossil record.
The use of the fossil record as a source of independent information in the calibration process is the main focus of this paper; other sources of calibration information are also discussed. Particularly error-prone aspects of fossil calibration are identified, such as fossil dating, the phylogenetic placement of the fossil and the incompleteness of the fossil record. Methods proposed to tackle one or more of these potential error sources are discussed (e.g. fossil cross-validation, prior distribution of calibration points and confidence intervals on the fossil record). In conclusion, the fossil record remains the most reliable source of information for the calibration of phylogenetic trees, although associated assumptions and potential bias must be taken into account.
The use of DNA sequences to estimate divergence times on phylogenetic trees (molecular dating) has gained increasing interest in the field of evolutionary biology in the past decade. The abundance of publications on the subject, the numerous alternative methods proposed and the often heated debates on various aspects of the discipline demonstrate the interest it generates. The molecular clock hypothesis was first proposed by Zuckerkandl and Pauling (1965); they proposed that differences in DNA (or protein) sequences between two species are proportional to the time elapsed since the divergence from their most recent common ancestor.
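The proportionality at the heart of the hypothesis can be made concrete with a short calculation. The sketch below assumes a strict clock in which both lineages accumulate substitutions at the same rate r, so that a pairwise distance d corresponds to a divergence time t = d / (2r). The function name and the numerical values are hypothetical and serve only as illustration.

```python
def strict_clock_divergence_time(pairwise_distance, rate_per_lineage):
    """Divergence time under a strict clock.

    Both lineages accumulate substitutions at the same rate r since the
    split, so the pairwise distance is d = 2 * r * t and t = d / (2 * r).
    """
    if rate_per_lineage <= 0:
        raise ValueError("rate_per_lineage must be positive")
    return pairwise_distance / (2.0 * rate_per_lineage)


# Hypothetical values: 0.04 substitutions per site between two species and
# a rate of 0.001 substitutions per site per million years per lineage.
divergence_time_myr = strict_clock_divergence_time(0.04, 0.001)
print(f"Estimated divergence time: {divergence_time_myr:.1f} Myr")
```

The estimate is only as good as the assumed rate, which is why rate heterogeneity and calibration both matter.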
The subsequent inclusion of temporal frameworks in many evolutionary studies has influenced the way results are interpreted and significantly modified the way in which conclusions are drawn from these findings. Linking the evolution of particular morphological characters or key ecological innovations to geological, climatic or biotic events is much improved in the light of an evolutionary timescale. The development of molecular dating tools became particularly valuable to the discipline of historical biogeography; it added a temporal gauge to the directionality of events demonstrated by the topology of phylogenetic trees. Inferences on observed distribution patterns were rendered significantly more plausible under a temporal framework, even if only descriptive. Furthermore, new methods of biogeographical reconstruction have been developed such as Lagrange, which uses a likelihood framework to infer the evolution of geographical ranges and incorporates divergence times as well as constraining the connections between areas to specific times (Ree and Smith, 2008).
The rationale of the molecular clock hypothesis, that evolutionary rates are constant, was shown to be invalid in the majority of examined cases; the clock does not tick regularly. The heterogeneity of substitution rates among different lineages in a phylogenetic tree explains this irregularity (Britten, 1986) and is a result of species-specific factors such as generation time, metabolic rate, effective population size and mutation rates (see Rutschmann, 2006). The extent of influence of some such factors, however, remains in dispute (e.g. Whittle and Johnston, 2003).
Rutschmann (2006) classified the most commonly employed methods for estimating divergence times into three categories depending on how they handle rate heterogeneity, namely (1) assuming a global substitution rate (standard molecular clock); (2) correcting for rate heterogeneity (e.g. by deleting branches or incorporating several rate categories before the dating procedure); and (3) incorporating rate heterogeneity (i.e. integrating rate heterogeneity into the dating procedure using rate change models; relaxed molecular clock). The four most commonly used methods in the literature all fall into the third category; these are non-parametric rate smoothing (NPRS; Sanderson, 1997), penalized likelihood (PL; Sanderson, 2002), the Bayesian method implemented in the Multidivtime package (Thorne et al., 1998) and Bayesian evolutionary analysis by sampling trees (BEAST; Drummond and Rambaut, 2007). The first three of these methods assume that rate changes between ancestral and descendant lineages are autocorrelated, i.e. that substitution rates in descendant lineages are to an extent inherited from ancestral lineages; these methods differ in the way that rate autocorrelation is handled. BEAST does not assume rate autocorrelation; instead, it samples rates from a distribution. BEAST offers additional flexibility in that a fixed tree topology is optional, so phylogenetic uncertainty can be incorporated, and in that prior distributions can be assigned to calibration points (see below). More details on these methods and several others are available elsewhere (Rutschmann, 2006, and references therein).
Two main topics have fuelled the controversy associated with molecular clocks: how to handle rate heterogeneity, and calibration. At its outset, the field of molecular dating was focused on circumvention of rate heterogeneity among lineages. Meanwhile calibration, the process by which relative time is transformed into absolute age (e.g. millions of years) using information independent of the phylogenetic tree and its underlying data, was somewhat trivialized. This situation has changed in recent years and many studies have now addressed the numerous difficulties associated with calibration. This paper is focused on molecular clock calibration (particularly based on palaeontological data), the potential problems and sources of error associated with it, and the various methods proposed to incorporate these uncertainties in molecular estimates of divergence times.
Information used to calibrate a phylogenetic tree is obtained from three principal sources: (1) geological events; (2) estimates from independent molecular dating studies; and (3) the fossil record. Information from palaeoclimatic data has also been used to calibrate trees (e.g. Baldwin and Sanderson, 1998), but its use is limited and will not be discussed further here. The fossil record is the most commonly employed source of information to calibrate phylogenetic trees and will receive most attention here.
Plate tectonics, the formation of oceanic islands of volcanic origin and the rise of mountain chains are examples of geological events that can be used to calibrate phylogenetic trees. The assignment of such calibration points to a given node assumes that the divergence at this node is the result of this new geographical barrier, through either vicariance (e.g. continental split) or dispersal (e.g. oceanic islands) events. This type of calibration must be used with care in studies examining biogeographical patterns to avoid circular reasoning. Despite appearing less prone to imprecision than the use of the fossil record, geological events have their own suite of potential and often intractable problems. The timings of continental splits are often reported as unique values, but the actual separation of two continental plates is a continuous process that occurred over millions of years. Furthermore, as two land masses drift apart, biological exchanges between them are likely to continue for several million years depending on the dispersal abilities of the organisms involved. These two points make the use of continental splits as calibration points difficult to justify. Similar problems can be attributed to the rise of mountain chains; these phenomena take place over a long period of time (several tens of millions of years in the case of the Andes; e.g. Garzione et al., 2008) and exchanges between each side of a new geographical barrier will continue for some time.
Species endemic to oceanic islands of volcanic origin and therefore of known age can be used to apply a maximum age constraint on the divergence between the endemic species and their closest continental relatives. This approach accounts for the likelihood that the ancestor of the island endemic species arrived at an unspecified time after the formation of the oceanic island. Present-day oceanic islands, however, might only be the most recent element of a series of oceanic island formation over time in a particular region (Heads, 2005), which would invalidate their use as reliable calibration constraints. For example, molecular dating of Galapagos endemic iguanas shows that their dispersal to the archipelago pre-dates the age of the current islands (Rassmann, 1997). Submerged islands found in the vicinity of the present day islands suggest that the Galapagos archipelago is in fact much older (10–15 Mya to 80–90 Mya) than the extant islands (Hickman and Lipps, 1985; Christie et al., 1992). A similar situation is observed in Hawaii, leading some to describe evolutionary history on this archipelago as on a ‘volcanic conveyer belt’. Calibration based on species endemism to volcanic islands can be difficult to justify and caution is advised.
The use of estimates derived from independent molecular dating studies (also referred to as secondary or indirect calibration points) is the only source of calibration information for many groups, particularly for those in which the fossil record is scarce or non-existent. The primary problem with this approach is that sources of error generated by the first dating exercise remain and are propagated and likely to be magnified in subsequent analyses. The use of secondary calibration points should be a last resort and, when used, care should be taken to include error associated with the primary molecular estimate in the subsequent analysis (e.g. using confidence intervals or standard deviation as minimum and maximum values on a given node, or using a prior distribution; see below). Propagating this error typically yields estimates of divergence times with broader uncertainty, which can limit their use or scientific value; ignoring it, however, gives a false impression of precision. The use of substitution rates from independent studies to calibrate a phylogenetic tree also falls under this category of calibration (e.g. Richardson et al., 2001) and suffers drawbacks similar to those described above.
There is general consensus that the fossil record provides by far the best information with which to transform relative time estimates into absolute ages (e.g. Magallón, 2004). As with other sources of calibration information, the use of fossilized remains has disadvantages and is subject to various sources of errors. Nevertheless, promising methods recently proposed attempt to tackle these issues. The focus of the following discussion is on calibration using palaeontological data, but many of the aspects addressed below are also applicable to geological events and secondary calibration points.
Sources of error in molecular inference of divergence time are numerous, including phylogenetic uncertainty, substitution noise and saturation, rate heterogeneity (among lineages, over time and between DNA regions), incomplete taxon sampling and incorrect branch length optimization (e.g. Sanderson and Doyle, 2001; Magallón and Sanderson, 2005). The calibration process is not exempt from potential sources of error either; these include erroneous fossil age estimates, the incompleteness of the fossil record and the placement of fossils on phylogenetic trees. Although often difficult to circumvent, much progress has recently been made in mitigating these factors.
Generally, a taxon's first appearance in the fossil record represents the time it became abundant rather than the time of its emergence (Magallón, 2004). Considering estimates from the fossil record as actual ages would underestimate the true age of the clade to which the fossil is assigned (Benton and Ayala, 2003; Conti et al., 2004). Older fossils assigned to a given group are likely to be discovered and to push back in time the earliest occurrence of a lineage; thus the age of a fossil is generally treated as a minimum constraint in calibration procedures (e.g. Benton and Ayala, 2003; Near et al., 2005). This means that the clade on which the constraint is applied cannot be younger than the fossil.
Fossil remains can be dated by use of stratigraphic correlations or radiometric dating. Uncertainty is introduced here as a result of any unreliability in the age assessment itself and the imprecision of the estimate when the fossil is assigned to a particular geological division (or stratum). For example, a fossil assigned to the Palaeocene can theoretically have any age between 55·8 and 65·5 Mya. Any age assigned to a calibration point within this epoch would be technically appropriate, but different choices would result in significantly different estimates for the other nodes in the tree. To counter this, because the fossil represents a minimum age, it is preferable to use the upper (i.e. youngest) boundary of the geological division (in this case 55·8 Mya) in a molecular dating study, once again as a minimum constraint. Some programs permit specification of a prior distribution on the age of a node which takes into account the uncertainty associated with the dating of a fossil (Drummond and Rambaut, 2007; see below).
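The sensitivity of the other nodes to the age chosen within a geological division can be illustrated with a simple calculation. The sketch below assumes an ultrametric tree with relative node depths and a single global rate, so that calibration amounts to a linear rescaling; the node names, relative depths and the `calibrate` function are hypothetical, and relaxed-clock methods do not rescale in this simple way.

```python
def calibrate(relative_depths, calibration_node, calibration_age):
    """Rescale relative node depths to absolute ages (Mya).

    Assumes a strict clock: every node age is multiplied by the same
    factor, chosen so that the calibration node takes the given age.
    """
    scale = calibration_age / relative_depths[calibration_node]
    return {node: depth * scale for node, depth in relative_depths.items()}


# Hypothetical relative depths (root = 1.0) for a tree with one calibrated node.
relative_depths = {"calibrated_node": 0.5, "root": 1.0}

# The same Palaeocene fossil, placed at either boundary of the epoch.
ages_youngest = calibrate(relative_depths, "calibrated_node", 55.8)
ages_oldest = calibrate(relative_depths, "calibrated_node", 65.5)

print(f"Root age with 55.8 Mya calibration: {ages_youngest['root']:.1f} Mya")
print(f"Root age with 65.5 Mya calibration: {ages_oldest['root']:.1f} Mya")
```

In this toy case a difference of 9·7 Myr at the calibrated node doubles to 19·4 Myr at the root, because the error scales with node depth.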
The fragmentary nature of the fossil record and lineage extinction have important consequences for the accurate placement of fossil calibration points. Once a fossil has been accurately assigned to a group of extant taxa based on one or more synapomorphies, it is placed on the phylogenetic tree either with the stem group or with the crown group (Fig. 1). The crown group comprises all the extant taxa of a clade and their most recent common ancestor plus all the extinct taxa that diverged after the origin of the most recent common ancestor of the living taxa. The stem group comprises all the members of the crown group (extinct and extant) plus all the extinct taxa that diverged since the split of the crown group from its closest living relative (Fig. 1). In any rooted phylogenetic tree, all internal nodes are both stem group nodes and crown group nodes; the definition of stem and crown group nodes is relative to the other nodes in the tree (e.g. in Fig. 1, node 2 is the crown group node of clade B and the stem group node of clade A). Because the fossil record is fragmentary, one can never be certain that a given fossil will possess features that place it in the crown group rather than along the stem lineage leading to the crown group. Consequently, there can be large and difficult to quantify discrepancies between the time of divergence of a lineage, the time of appearance of a synapomorphy (a particular feature characterizing a clade) and the age of the oldest known fossil exhibiting this feature (Magallón, 2004; Fig. 1). This highlights the importance of taking the most conservative options (i.e. options resistant to subsequent changes that would invalidate the assumptions regarding the position of a fossil) in calibration by use of fossils as minimum constraints on the stem group node.
An exception to the rule of using fossils as minimum constraints can be applied to fossilized pollen grains. Pollen grains have a much higher fossilization potential than any other plant organs, but not all plant groups will have an extensive pollen fossil record or possess palynological features assigned with confidence to extant taxa. Tricolpate pollen grains (those with three apertures or colpi), for example, are unique to the eudicots among plants, and age estimates for these fossils place them in the Barremian and Aptian of the early Cretaceous (130–112 Mya; e.g. Doyle and Hotton, 1991); earlier occurrence is thought to be very unlikely. The abundance and widespread distribution of early tricolpate pollen fossils, coupled with their easily identified features, has led to their frequent use as a maximum constraint or fixed age in molecular dating of angiosperms (e.g. Anderson et al., 2005; Magallón and Castillo, 2009). It is only in such rare cases that fossils can be used as maximum constraints or fixed ages without serious risk of underestimating molecular ages.
The incompleteness of the fossil record also leads inevitably to the underestimation of node ages in a phylogenetic tree (Springer, 1995), presenting significant discrepancies between estimates obtained from the fossil record and molecular dating (e.g. Benton and Ayala, 2003). The selectivity of fossilization is largely responsible for this situation. Different plant groups (e.g. deciduous anemophilous trees are better represented in the fossil record than entomophilous/zoophilous herbs) and structures (e.g. pollen is more easily preserved than flowers) have different preservation potential (Herendeen and Crane, 1995); thus the fossil record is biased towards groups and structures more conducive to fossilization.
Several methods have been developed by which to estimate the extent of incompleteness of the fossil record. Earlier studies proposed statistical approaches to calculate confidence intervals on stratigraphic ranges; the earliest occurrence of a given group in the fossil record is estimated by use of the number of known fossils and the number and size of gaps in the stratigraphic column (Strauss and Sadler, 1989; Marshall, 1990, 1994). Because these methods do not take into account the quality and density of the fossil record, Marshall (1997) proposed an additional function that allows for bias linked to collecting and preservation potential. Subsequently, Foote and colleagues (e.g. Foote, 1997; Foote et al., 1999) estimated rates of extinction, origination and preservation from the fossil record to produce a measure of completeness [see Magallón (2004) for more details on these methods]. Tavaré et al. (2002) proposed a method based on an estimate of the proportion of preserved species in the fossil record and the diversification patterns of the group. More recently, Marshall (2008) developed a quantitative approach to estimate maximum age constraints of lineages using uncalibrated ultrametric trees (i.e. with relative branch length optimization) and multiple fossil calibration points. Assessing the fossil record of a group using the procedures outlined above would theoretically produce a realistic age estimate for this group. Furthermore, the resulting estimate of earliest occurrence can be used as a fixed age or maximum constraint in subsequent molecular dating studies, minimizing the uncertainty associated with the age of a given fossil.
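As an illustration of the earliest of these approaches, the sketch below computes a one-sided confidence limit on the base of a stratigraphic range using the classical formula associated with Strauss and Sadler (1989) and Marshall (1990): the range is extended by a fraction (1 - C)^(-1/(H - 1)) - 1 of its observed length, where H is the number of fossil horizons and C the confidence level. The formula assumes that fossil horizons are randomly and uniformly distributed through the true range, which is precisely the assumption that the later methods relax. Function names and example values are hypothetical.

```python
def range_extension_fraction(n_horizons, confidence=0.95):
    """Fraction of the observed range to add at one end of the range.

    One-sided limit assuming randomly and uniformly distributed fossil
    horizons: alpha = (1 - C) ** (-1 / (H - 1)) - 1.
    """
    if n_horizons < 2:
        raise ValueError("at least two fossil horizons are required")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    return (1.0 - confidence) ** (-1.0 / (n_horizons - 1)) - 1.0


def oldest_plausible_age(oldest_fossil, youngest_fossil, n_horizons,
                         confidence=0.95):
    """Upper confidence limit (Mya) on the first appearance of a lineage."""
    observed_range = oldest_fossil - youngest_fossil
    extension = observed_range * range_extension_fraction(n_horizons, confidence)
    return oldest_fossil + extension


# Hypothetical lineage known from 10 horizons between 60 and 40 Mya.
limit = oldest_plausible_age(60.0, 40.0, n_horizons=10)
print(f"95% limit on first appearance: {limit:.1f} Mya")
```

The extension shrinks rapidly as the number of horizons grows, which reflects the intuition that a densely sampled record leaves less room for an undiscovered older fossil.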
Many early molecular dating studies used a single fossil as a calibration point; this practice is now believed to lead to strong bias in molecular age estimates (e.g. Graur and Martin, 2004; Reisz and Müller, 2004). Where possible, it is currently advocated that multiple fossils should be used in the calibration process (e.g. Conroy and Van Tuinen, 2003; Graur and Martin, 2004; Forest et al., 2005; Near et al., 2005; Benton and Donoghue, 2007; Rutschmann et al., 2007), although an extensive and reliable fossil record is not always available. However, if the intrinsic accuracy of a fossil is questionable (i.e. a doubtful age estimate, a doubtful assignment to an extant group, or a lineage with a large gap between its divergence and the first appearance of remains in the fossil record), it is better excluded from the analysis (see above; Near et al., 2005). Near et al. (2005) proposed a fossil cross-validation procedure that allows potentially inaccurate fossils to be identified when multiple fossils are used to calibrate a phylogenetic tree. This method calibrates the phylogenetic tree with one fossil at a time and compares the resulting molecular age estimates for the other calibration nodes with the ages of those nodes according to the fossil record. Individual fossils that produce age estimates inconsistent with the remainder of the fossils used as calibration points are removed (Near et al., 2005) and the analysis is repeated, including only reliable fossils as calibration points. Rutschmann et al. (2007) built on the fossil cross-validation method of Near et al. (2005) to address another problem: the multiple potential positions of a given fossil on the phylogenetic tree. They assessed the effect of the alternative positions of each fossil on the consistency of the age estimates in a set of calibration points. This allows the selection of the best position on the phylogenetic tree for each fossil given an a priori selected set of assignment possibilities. By use of this approach, no calibration information is removed from the dating procedure (contrary to the method of Near et al., 2005, in which ambiguous fossils are removed), and the method provides more precise estimates because the most coherent calibration sets produce lower standard deviation (Rutschmann et al., 2007). Concern has been raised that such cross-validation methods might lead to the exclusion or repositioning of fossils that are not necessarily misleading, but rather misinterpreted (e.g. placement, dating) or victims of bias in the dating procedure itself (Hugall et al., 2007; Parham and Irmis, 2008; Lee et al., 2009).
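The core idea of fossil cross-validation can be sketched in a few lines. The example below is a deliberate simplification and not the published procedure: it substitutes a strict-clock rescaling of relative node depths for a full dating analysis, and it only computes, for each fossil used alone, the sum of squared differences between molecular and fossil ages at the other calibration nodes. The statistical criteria used to decide which fossils to remove are not reproduced. All names and values are hypothetical.

```python
def cross_validation_scores(relative_depths, fossil_ages):
    """Sum of squared age differences produced by each single-fossil calibration.

    For each fossil x, the tree is calibrated with x alone (strict-clock
    rescaling, an assumption of this sketch) and the resulting molecular
    ages of the other fossil-dated nodes are compared with their fossil ages.
    """
    scores = {}
    for x, age_x in fossil_ages.items():
        scale = age_x / relative_depths[x]
        scores[x] = sum(
            (relative_depths[i] * scale - fossil_ages[i]) ** 2
            for i in fossil_ages
            if i != x
        )
    return scores


# Hypothetical relative node depths and fossil ages (Mya) for four nodes.
# Fossils A, B and C agree with each other; D is much older than expected.
relative_depths = {"A": 0.2, "B": 0.4, "C": 0.6, "D": 0.5}
fossil_ages = {"A": 20.0, "B": 40.0, "C": 60.0, "D": 90.0}

scores = cross_validation_scores(relative_depths, fossil_ages)
most_inconsistent = max(scores, key=scores.get)
print(f"Fossil most inconsistent with the others: {most_inconsistent}")
```

Note that the score is informative only when several mutually consistent fossils are available; with very few calibration points a single aberrant fossil can make the reliable ones look inconsistent.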
Lee et al. (2009) recognize the usefulness of the two methods mentioned above, but note that the imprecision surrounding the phylogenetic position of a given calibration is not calculated in these methods. They propose a new method that integrates this uncertainty. This approach allows the inclusion of fossils in a combined matrix of morphological and molecular characters analysed under a Bayesian framework, and the assessment of estimates among sampled trees based on the position of the fossil in each particular tree as determined by the analysis (Lee et al., 2009). They demonstrate that the uncertainty associated with the phylogenetic position of a fossil used as calibration point can result in molecular age estimates with large confidence intervals (Lee et al., 2009).
While the three methods above deal with uncertainty associated with the phylogenetic position of fossil calibration points, another recent method implemented in the program BEAST (Drummond and Rambaut, 2007) additionally allows the user to express uncertainty in the age of a given fossil through a prior distribution. The prior distribution of the age is assigned to the most recent common ancestor of a group of taxa circumscribed by the user (Drummond et al., 2006); several distributions are available (e.g. normal, lognormal or uniform). Applying a normal distribution assumes that the age of this node is equally likely to be older or younger than the fossil, with the spread determined by the standard deviation specified. A normal prior distribution is more appropriate for calibration points based on estimates from independent molecular studies (secondary calibrations) and geological events such as oceanic islands, in which it can be argued that uncertainty is equally distributed on either side of the age used (the mean of the distribution). For a calibration point based on fossil remains, a lognormal prior distribution covering a longer period of time towards the past is more appropriate, allowing for uncertainty of the age estimate of the fossil and for error associated with the incompleteness of the fossil record. The method implemented in the program BEAST provides a significant improvement over other methods, particularly as it considers uncertainty associated with both tree topology and calibration.
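The difference between the two kinds of prior can be seen by evaluating their densities directly. The sketch below uses only the Python standard library and is not BEAST's interface or parameterization: the function names, the offset convention and the parameter values are assumptions made for illustration. The lognormal is shifted by an offset equal to the fossil age, so that it gives zero density to node ages younger than the fossil and a long tail towards the past.

```python
import math


def normal_pdf(age, mean, sd):
    """Density of a normal prior centred on the calibration age."""
    z = (age - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def offset_lognormal_pdf(age, offset, mu, sigma):
    """Density of a lognormal prior shifted to start at `offset` (Mya).

    Ages at or below the offset (younger than the fossil) get zero density.
    `mu` and `sigma` are the mean and standard deviation of log(age - offset).
    """
    x = age - offset
    if x <= 0.0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


fossil_age = 55.8  # hypothetical minimum age of the calibrated node (Mya)

for age in (50.8, 55.8, 60.8, 70.8):
    normal_density = normal_pdf(age, mean=fossil_age, sd=5.0)
    lognormal_density = offset_lognormal_pdf(age, offset=fossil_age,
                                             mu=1.5, sigma=1.0)
    print(f"{age:5.1f} Mya  normal={normal_density:.4f}  "
          f"lognormal={lognormal_density:.4f}")
```

The normal prior treats 50·8 and 60·8 Mya as equally plausible, whereas the shifted lognormal excludes the younger age entirely, which matches the treatment of a fossil as a minimum constraint.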
In recent literature, some authors have voiced their concerns regarding molecular dating methods in general and the calibration procedure in particular (e.g. Graur and Martin, 2004; Heads, 2005; Pulquério and Nichols, 2007). The importance of a carefully designed calibration scheme in a molecular dating study cannot be overemphasized; it is one of the most fundamental aspects of the methodology. The identification of reliable fossils is a crucial step in this procedure, but finding unequivocal fossils may prove to be a tedious task in some plant groups. The lack of fossils in a given group may prevent the use of molecular dating completely. Fossils from closely related groups may nevertheless be used as calibration points if taxa representing them are included in the phylogenetic tree, but the further the calibration point is from the node(s) of interest, the greater will be the uncertainty of the resulting age estimates. A good starting point in the search for fossil calibration points is the Plant Fossil Record online database maintained by the International Organisation of Palaeobotany (www.biologie.uni-hamburg.de/b-online/library/iopaleo/pfr.htm), which contains several thousand extinct taxa from both modern and extinct genera.
The development of methods addressing the potential problems affecting calibration, particularly calibration based on fossil data, has answered some of the criticisms mentioned above and provided new opportunities and tools for more reliable calibration of phylogenetic trees. The program BEAST (Drummond and Rambaut, 2007) is one of the most promising tools on account of its flexibility regarding uncertainty in fossil age estimates, which arises mainly from the dating of the fossil and the incompleteness of the fossil record. The next phase in software development for molecular dating could include programs allowing better estimation of uncertainty by incorporating fossil cross-validation procedures, taking into consideration fossil abundance data and integrating the calculation of confidence intervals on the fossil record.
The assumptions and bias inherent to aspects of the methodology are not the only obstacles to reliable and plausible timescales; dating results must be viewed in light of the information that was used to obtain them, and uncertainty around resulting age estimates must be considered. Molecular dating is a powerful tool and its use continues unabated because it offers a tantalizing and otherwise unavailable glimpse into the evolutionary history of a group.
I thank E. Lucas and two anonymous reviewers for constructive comments on earlier versions of this manuscript.