Genomics is arguably one of the fastest-evolving branches of modern science. Emerging sequencing technologies make it possible to analyze large numbers of individuals from a species or to interrogate the genetic diversity of a complex biological community. Yet, aside from the fields of microbial genomics and human population genomics, only sparse genomic resources are available for most taxa on earth.
Developing genome-scale markers from the majority of the diversity of life will have a high payoff, facilitating ecological and evolutionary applications. Here we have illustrated that markers can be developed from specific genomic regions (such as non-coding, non-repetitive regions) using genomic resources from species that are moderately divergent from the target species. For example, studies estimate that the human/chimpanzee/macaque lineage and the ring-tailed lemur shared their last common ancestor up to 80 million years ago [32]. Here we have generated markers from potentially neutral genomic regions in ring-tailed lemurs using primers based upon human-chimpanzee-macaque genome alignments.
As far as we are aware, data from non-coding regions of strepsirrhines are rare, and the markers developed in this study have great potential for use in evolutionary studies of this group. Importantly, we demonstrated that genome-scale marker generation could potentially be achieved by sampling genomes across the tree of life at moderate divergence times.
We chose Primates as the test group for this study because their phylogenetic relationships are well characterized [33]. We reasoned that if we could replicate the known primate phylogeny using the markers we designed, we would have proof of principle that this type of marker is useful for phylogenetic analysis. Toward that end this study was successful. Using a subset of the new markers, we performed phylogenetic analyses. The resulting phylogenies from single markers were generally in accord with the accepted phylogenetic relationships among primate species. In particular, no marker supported an incorrect phylogeny with statistical significance. The only node that required several markers to be resolved was the relationship among human, chimpanzee, and gorilla, a notorious phylogenetic problem previously shown to require a large amount of data to resolve [34]. Given that several outstanding problems remain in the field of primate phylogeny (e.g., [13]), putatively neutral markers such as those developed in this study will potentially be useful toward resolving these issues.
Previous work focusing on identifying conserved, ultraconserved, or lineage-specific elements for their potential functionality (e.g. [37]) has recognized the usefulness of regions located far from annotated genes as putatively 'neutral' standards for generating a statistical 'background' for analysis. Our approach, while instead seeking to identify regions of the genome that are not under functional constraint, has adopted this underlying logic and applied it to species whose genomic sequences are not yet available. It should be cautioned, however, that computational approaches for identifying putatively 'neutral' markers do not guarantee true neutrality: experimentally establishing the neutrality or functionality of non-coding regions remains a prime challenge in modern genomics.
It has recently been demonstrated that protein-coding regions are subject to frequent and widespread parallel evolution [6]; we therefore attempted to choose regions that are less likely to be subject to homoplastic effects that could result in misleading phylogenies. That our results support the well-known topology of Primates (e.g. [33]) is encouraging, because our understanding of primate phylogeny has taken over a century to reach the point it is at today. Some phylogenetic controversies (i.e. the hominoid trichotomy, the relationships among neotropical platyrrhines, and the phylogenetic placement of tarsiers) still resist resolution despite decades of work.
Using the newly developed markers, we also observed substantial evolutionary rate variation among different primate lineages. We not only confirmed the phenomenon of the hominid rate slowdown [17], but also uncovered several other intriguing patterns. In particular, we observed that since the divergence of rhesus macaque and anubis baboon (estimated at approximately 6–8 million years ago [28]), the macaque lineage has accumulated almost twice as many mutations as the baboon lineage. Substantial rate variation between Old World monkey lineages could have contributed to earlier controversy over the degree of hominid rate slowdown. When the macaque lineage is used to compare evolutionary rates of hominids and Old World monkeys, the degree of rate slowdown is much stronger (Table ). We also observed a strong rate difference between marmoset and spider monkey, and to a lesser degree between tamarin and spider monkey. The marmoset lineage has accumulated almost 50% more mutations than the spider monkey lineage since the two lineages split (Table ). Thus, evolutionary rate variation is a common phenomenon in putatively neutral genomic regions [18
The relative rate test shows that mutations have accumulated in the Old World monkey and New World monkey lineages at a rate significantly higher than in the hominids. We found no significant evolutionary rate difference between Old World monkeys and New World monkeys, in contrast to an earlier finding [28]. However, we should be careful in drawing conclusions about patterns of rate variation between groups from data on only a few lineages. First, there is the issue of regional rate variation within a genome. Second, as we have seen above, rates can vary dramatically between closely related lineages (such as macaque and baboon). The earlier analysis of rate variation between marmoset and Old World monkeys [28] was based upon a single, albeit long (~59.8 kbp), orthologous region. The New World monkey species used in that analysis was marmoset, a fast-evolving lineage. Thus the previous finding of significant rate variation between Old World monkeys and New World monkeys [28] may reflect a molecular clock specific to that genomic region and that set of species. In this respect, extending the non-coding, non-repetitive markers developed here, drawn from many different genomic regions, to other primate lineages will be highly useful for reconciling these conflicting results and for elucidating the patterns and causes of neutral molecular clocks in primate genomes.
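To make the logic of a relative rate comparison concrete, the following is a minimal sketch of one common formulation, Tajima's one-degree-of-freedom relative rate test. This section does not state which relative rate test was used, so the choice of Tajima's test is an assumption made for illustration only. The function name `tajima_relative_rate` and the toy sequences are hypothetical and are not data from this study.

```python
import math


def tajima_relative_rate(seq_a, seq_b, outgroup):
    """Tajima's 1D relative rate test on three aligned sequences.

    Illustrative sketch only: counts sites where exactly one ingroup
    lineage differs from the other two sequences, then compares the two
    counts with a chi-square statistic (1 degree of freedom).
    Returns (m_a, m_b, chi2, p_value).
    """
    if not (len(seq_a) == len(seq_b) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")

    valid = set("ACGT")
    m_a = 0  # sites where only lineage A differs
    m_b = 0  # sites where only lineage B differs
    for a, b, o in zip(seq_a.upper(), seq_b.upper(), outgroup.upper()):
        # Skip gaps and ambiguous bases.
        if a not in valid or b not in valid or o not in valid:
            continue
        if a != b and b == o:
            m_a += 1
        elif a != b and a == o:
            m_b += 1

    total = m_a + m_b
    if total == 0:
        # No informative sites: no evidence of a rate difference.
        return m_a, m_b, 0.0, 1.0

    chi2 = (m_a - m_b) ** 2 / total
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return m_a, m_b, chi2, p_value


if __name__ == "__main__":
    # Toy alignment (hypothetical): lineage A carries four unique changes.
    result = tajima_relative_rate("AAAATTTTGG", "AAAACCCCGG", "AAAACCCCGG")
    print(result)
```

Under the null hypothesis of equal rates, unique substitutions should fall on the two ingroup lineages in roughly equal numbers; a large excess on one lineage yields a small p-value. With very few informative sites the chi-square approximation is unreliable, which echoes the caution above about drawing conclusions from a single region or a few lineages.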
The markers developed in this study [see Additional file 1] should thus be of great use for ecological and evolutionary applications in primates. In particular, a subset of markers from closely related species can be used as 'local' markers to elucidate recent evolutionary events, while the few markers that can be amplified across the primate phylogeny (such as the 10 markers analyzed here) could be used as 'global' markers to analyze underlying trends in primate evolution.