The biological assumptions underlying autocorrelated relaxed clocks warrant closer examination. The first assumption, that mutation rates are closely linked to heritable traits, receives support from studies of mammalian data. Nevertheless, even these trends differ between mammalian mitochondrial and nuclear genomes (Welch et al. 2008). Studies of other taxa have indicated that the correlations observed in mammals cannot be readily extended to other metazoans (Thomas et al. 2006; Lanfear et al. 2007).
Another pertinent question, related to the first biological assumption, concerns the taxonomic scale of the sequence data that are being analysed with autocorrelated relaxed-clock models. In a study of the cytochrome b gene in mammals, Nabholz et al. (2008) found that family-level categorization explained the greatest amount of rate variation. Overall, one would predict the highest degree of autocorrelation to be observed at intermediate levels of the taxonomic hierarchy. At one extreme, we would expect a very high degree of underlying rate autocorrelation within a species, such that any rate variation among lineages would be primarily due to stochastic, uninherited factors (Drummond et al. 2006); indeed, many population genetic and coalescent-based approaches assume a strict molecular clock.
At the other end of the continuum, autocorrelation in life-history traits (or any other factor that might be strongly correlated with mutation/substitution rates) would inevitably break down at higher taxonomic levels (Gittleman & Kot 1990; Drummond et al. 2006). The magnitude of the differences among lineages would be amplified by very incomplete taxon sampling, and the degree of autocorrelation would decrease as taxon sampling becomes sparser. In cases where a dataset consists of distantly related taxa, there is little reason to expect any appreciable autocorrelation among the rates on different lineages. Consequently, it would be difficult to defend a priori assumptions about the manner in which the rates vary among lineages.
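The expected loss of autocorrelation under sparse sampling can be illustrated with a minimal simulation. The sketch below is not taken from any of the studies cited here; it assumes one simple formulation of an autocorrelated clock, in which the log rate at a descendant node equals the log rate at its ancestor plus a normal deviate with variance nu multiplied by the elapsed time. The function names, the parameter values (nu, sd_anc) and the omission of any mean-correction term are illustrative assumptions. Sparser taxon sampling is represented simply as a longer time between sampled nodes.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def simulate_pairs(branch_time, nu=0.5, sd_anc=0.5, n=20000, seed=1):
    """Simulate (ancestor, descendant) log rates for n independent branches.

    Assumed model (hypothetical parameter values):
      ancestral log rate   ~ Normal(0, sd_anc^2)
      descendant log rate  = ancestral log rate + Normal(0, nu * branch_time)
    """
    rng = random.Random(seed)
    anc, desc = [], []
    for _ in range(n):
        a = rng.gauss(0.0, sd_anc)
        d = a + rng.gauss(0.0, math.sqrt(nu * branch_time))
        anc.append(a)
        desc.append(d)
    return anc, desc


def expected_correlation(branch_time, nu=0.5, sd_anc=0.5):
    """Analytical correlation between ancestor and descendant log rates
    under the assumed model: sd_anc / sqrt(sd_anc^2 + nu * branch_time)."""
    return sd_anc / math.sqrt(sd_anc ** 2 + nu * branch_time)


if __name__ == "__main__":
    # Longer branch_time stands in for sparser taxon sampling.
    for t in (0.1, 1.0, 10.0):
        anc, desc = simulate_pairs(t)
        print(t, round(pearson(anc, desc), 3), round(expected_correlation(t), 3))
```

Under these assumptions, the correlation between ancestral and descendant log rates falls steadily as the time separating sampled nodes increases, which is consistent with the argument that little autocorrelation should be expected among distantly related, sparsely sampled taxa. The sketch says nothing about whether real rates follow such a model.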
Autocorrelated rate methods have been used to analyse sequences at various taxonomic scales, ranging from viral sequences obtained from a single host, to sequences acquired from representatives of different kingdoms of life. To investigate the trends in the application of autocorrelated relaxed clocks, a survey was conducted of all 46 studies that used such methods and were published in Royal Society journals prior to November 2008.
Summary of all 46 studies that have used autocorrelated relaxed clocks and have been published in Royal Society journals.
The sequence data examined in these studies spanned a broad range of taxonomic levels. Five studies analysed datasets in which the majority of nodes in the tree represented ordinal divergences or higher. At the other extreme, nine studies analysed datasets that included large numbers of sequences from conspecific individuals, with three conducted entirely at the population level.
Plot of the approximate taxonomic levels spanned by 48 datasets that have been analysed using autocorrelated relaxed-clock models. Details of the individual studies are given in table S1 in the electronic supplementary material.
For the methods of analysis to be applicable to all of these datasets, they would need to be flexible enough to accommodate widely varying levels of rate change and autocorrelation. For small, sparsely sampled datasets, it is doubtful whether there should be any expectation of rate autocorrelation at all.
The second assumption behind autocorrelated relaxed-clock models is that mutation and substitution rates are strongly correlated. This is reasonable for sequences that are evolving neutrally. In analyses of sequences under selection, however, such an assumption is far more questionable. This relates particularly to non-mammalian mitochondrial sequence data, whose evolutionary history appears to have been driven substantially by adaptive evolution (Bazin et al. 2006). If rates of adaptive substitution are not tied to inherited factors, then the presence of such substitutions can seriously weaken the link between life-history traits and substitution rates. As mentioned above, however, closely related species could experience similar selection intensities, as implied under covarion models of sequence evolution (e.g. Tuffley & Steel 1998). The extent to which such processes could lead to rate autocorrelation among lineages is not known.
In a comprehensive study of mammals, no correlation was found between non-synonymous mitochondrial rates and life-history traits (Welch et al. 2008). Indeed, this suggests that autocorrelated relaxed-clock models might be inappropriate for analyses of amino acid sequences. Thus, perhaps it would be desirable to employ separate autocorrelated and uncorrelated models of among-lineage rate variation for non-coding and coding or amino acid sequences, respectively.