In total 84 amplicons from 15 genes were sequenced across 14 representatives of the Bovinae subfamily, including members of the Bovini and Tragelaphini tribes, which we used to infer the sequence from the most recent common ancestor (see Methods). From the 84 amplicons 83 amplified in at least one sample. The relative proportions from the amplicons that successfully amplified in each species were particularly high, ranging from 0.76 in Eland to 0.95 in Gaur (additional file 1
). Surprisingly some of the wild relatives showed better amplification (Gaur, Yak, Mithan, Banteng and Bison) than different breeds of B. taurus
(Holstein and Tuli). All sequences were submitted to Genbank and their primers and accession numbers are also available in additional file 1
Genetic distances, divergence times and phylogenetic reconstruction
Both dI and dS are generally considered neutral substitutions and are often used for discriminating between phylogenetic lineages. Table shows pairwise comparisons for dI and dS between 15 Bovinae representatives, lower and upper diagonal respectively. The minimum dI is within breeds of Indian river buffalo, followed by breeds of Domestic cow. The largest dI is between species like Eland and the Domestic cow (Table , lower diagonal). Similarly, the minimum dS is found between breeds, and the maximum dS is between Eland and other members of the Bovidae, such as Bison, Gaur, Mithan, and Domestic cow, respectively (Table , upper diagonal). Interestingly, dS is often greater than dI. Subramanian and Kumar [24
] suggest that an elevation in GC content at synonymous sites may be responsible for this phenomenon.
Pairwise comparisons between members of the Bovini tribe for the number of silent substitutions per site summarised for all genes, with intronic substitutions (dI) below the diagonal and synonymous substitutions (dS) above the diagonal.
Table presents the divergence times estimated from the pairwise comparisons between all major lineages and Hereford (B. taurus) using Nei's D (equation 5, Methods). Nei has shown that genetic distance (D) is directly related to divergence time. However, the method may be susceptible to inaccurate estimates given the incorrect mutation rate. As a result we also estimated divergence times using a relaxed molecular clock using calibration points based on best estimates of divergence times in the fossil record for Bison and Yak (2.0 and 1.7 MYA, respectively). Members of the Bovini subtribe appear to have diverged from B. taurus approximately 2–3 MYA. The Bubalina subtribe has diverged from B. taurus some 5–9 MYA, while Eland appears to have diverged from the Domestic cow some 8–14 MYA. The breeds of B. taurus show some interesting results with African cattle (Tuli) seeming to have diverged ~100,000–200,000 years ago, which may be possible if African and European cattle have in fact been separately domesticated, or if Tuli contains genes originating in B. indicus. However, as more polymorphism has been detected within Holstein than between Hereford and Holstein, there has been some difficulty estimating divergence times between these two closely related breeds, which were most likely separated only a few hundred years ago.
Table 3 Estimate of divergence times in millions of years (MY) for pairwise comparisons (dI) between Hereford and members of the Bovinae subtribe using estimates from Nei (TN) and a relaxed molecular clock method (TC) using Bison and Yak divergence dates from (more ...)
No significant differences were detected between overall base frequencies (A, T, C and G) for any member of the Bovini tribe, and substitution rates between Ancestral and Eland sequences with all members of the Bovini appear to be largely the same. A small bias was detected for transition (ti) to transversion (tv) substitutions, with a ratio of ti/tv = 2.1 (results not shown). Therefore, Kimura's two parameter model should be appropriate for phylogenetic reconstruction. However, the relatively small genetic distances detected (dS and dI < 0.05) suggest that the p-distances calculated in table should also be accurate despite the transition bias [25
Figure shows a monophyletic neighbour joining tree, deduced using Kimura's two parameter model from an alignment of ~30,000 bases and 1,800 variable sites (additional file 2
). The placement of the inferred ancestral sequence is not surprising and may indicate its accuracy. Strong support was detected for a bifurcation between representatives of the Bovina and Bubalina subtribes. Within the Bubalina subtribe there is strong support for separation of the Syncerus
genera. Within the water buffalo strong stratification is also apparent. Surprisingly, one of the river buffalo (BubB) has a closer relationship to swamp buffalo (BubC) than to other members of the same species, the separation between buffalo may be associated with geographic patterns as two of our buffalo samples (Ind and Mur) originated from India, while the swamp buffalo are generally restricted to East Asia and are known to hybridise with river buffalo. Within the Bovina subtribe a distinct separation between Bison, Yak and Domestic cattle with the cattle of Indochina (Gaur/Mithan/Banteng) was found. However, the nodes leading to a Bison/Yak/Domestic cattle clade show less support with the Yak/Domestic cattle clade showing less than 60% support by bootstrap resampling. The lack of certainty in the tree is not surprising given the short branch lengths leading to the divergences among Domestic cattle-Yak-Bison-Banteng-Mithan/Gaur. Previous work has placed the lineage leading to Bison and Yak as a separate group, which diverged from the Banteng/Gaur/Mithan clade [6
], and more recently as a divergence from the B. taurus
Figure 1 Neighbour joining analysis for members of the Bovini tribe (Anc: Ancient, Ban: Banteng, Bis: Bison, BubB & BubC: Asian buffalo, water and swamp type, Ela: Eland, Gau: Gaur, Her: Hereford, Ind & Mur: Indian water buffalo, Mit: Mithan, Hol: (more ...)
To detect if recent gene flow or ancient polymorphisms were responsible for differences between our and other recently published phylogenies, we analysed all of the segregating sites from figure for the presence of anomalies. We reconstructed a number of phylogenetic trees to identify the tree that minimises the number of aberrant sites, which we defined as any site that must have arisen due to a double mutation or lineage sorting of an ancient polymorphism (see Methods). We identified 133 aberrant sites, of which 111 were not tree specific and were aberrant in all trees leaving 22 informative sites. From the 111 non-tree specific aberrant sites 53 ancient sites are segregating within both the Bovina and Bubalina lineages. A legend to all aberrant sites, their position and the relative age (i.e. approximately how long they have been segregating in the clade) is provided in additional file 3
To examine these aberrant sites in more detail we have analysed a number of phylogenetic trees that have been suggested due to the poor bootstrapping confidence in figure , and trees presented in recent work by Verkaar et al and Hassanin and Ropiquet [6
]. The phylogenetic tree that generates the smallest number of anomalies (125) places Yak off the lineage leading to Domestic cattle and Bison off the lineage leading to the Indochinese cattle ((B. taurus, Yak), (Bison, Indochinese cattle)); however, similar numbers (126) are also seen for the tree that places Bison and Yak as a lineage of the Indochinese clade ((B. taurus), ((Yak, Bison) Indochinese cattle)), which is only worse by one aberrant site. Therefore, some evidence exists for the trees recently proposed by Hassanin and Ropiquet [6
] and Verkaar et al. [11
]. However, it appears that a number of aberrant sites are generating an association between Yak and Domestic cattle, which has confounded the phylogenetic reconstruction. Removing all aberrant sites from the analysis leaves fewer substitutions to accurately estimate genetic distances between members of the Bovina subtribe and the tree is less certain (data not shown).
Introgression and ancient polymorphism
To identify the source of the aberrant substitutions within the Bovini tribe, we compared all variable sites in the alignment to find sites shared with B. taurus at normal and aberrant positions (Table ). We examined these positions within Gaur, Mithan, Banteng, Bison and Yak for similarities to Domestic cattle over extended haplotypes resolved from aberrant and all variable sites. We found that Yak shares the largest number of alleles in common with Domestic cattle for all variable and aberrant sites (Table ). Perhaps the most interesting finding is the high number of identical haplotypes Yak and Domestic cattle share over haplotypes resolved from the aberrant sites per amplicon (Table , Table ). When compared to haplotypes resolved from the all variable sites, Yak and Domestic cattle share the largest number in common. However, there are relatively few haplotypes that are similar for any of the species (Gaur, Mithan, Banteng, Bison and Yak) and Domestic cattle, and this is probably a function of the species specific mutations that have accumulated per amplicon since divergence from the MRCA.
Summary of the number of substitutions in the total alignment that are the same as Bos taurus compared between species at all variable sites (n = 1786), and at all aberrant sites (n = 133).
Summary of taurine haplotypes at aberrant sites for each amplicon determined from Hereford sequence and the species sharing the same haplotype (*).
To determine if the high proportion of haplotypes, resolved from aberrant sites, shared between Yak and Domestic cattle may be evidence of recent introgression, we looked for extended haplotypes that are shared between species. Thus, the extent of linkage between neighbouring amplicons and genes on the same chromosome was examined. Table shows the B. taurus haplotype per amplicon from all aberrant sites and the species that share the same haplotype. A number of haplotypes are shared with all species. Many of the haplotypes that are based on one aberrant site (PIT1_e1) may be similar by chance and therefore may not be very informative. However, by extending haplotypes to adjacent exons and genes more power to detect recent hybridisation should be available. But as haplotypes are extended to closely linked amplicons or neighbouring genes on the same chromosome this similarity rapidly disappears (Table ).
In a further attempt to identify recent introgression, sequence diversity estimates were examined between neighbouring amplicons and genes. Figure shows plots of sequence divergence at noncoding sites (dI) for Hereford vs Yak (A), Hereford vs Holstein (B) and Yak vs Gaur (C). A similar pattern is seen to that described in table . A number of amplicons show similar amounts of sequence divergence between Hereford vs Yak as is seen between Hereford vs Holstein (NRIP1 and PIT1). However, genetic diversity at closely linked amplicons remains low between breeds of B. taurus for closely spaced genes on chromosome 1 (PIT1, ITGB5) and chromosome 2 (GMEB1 and EGF), which suggests that these animals have recently shared a common ancestor. In contrast, comparisons for Domestic cattle vs Yak, and Yak vs Gaur (Figure and , respectively) rarely show low sequence diversity over similar linked regions, and thus, genetic associations are most likely not the result of recent introgression.
Plots of sequence similarity for noncoding substitutions per site (dI) for pairwise comparisons between Hereford vs Yak (A), Hereford vs Holstein (B) and Yak vs Gaur (C) for all amplicons all plots are in chromosomal order.
An examination of the river and swamp buffalo samples from Southeast Asia (BubB and BubC) has identified large segments of chromosome 1 and 2 that are identical. Thus BubB has a closer relationship to swamp buffalo (BubC) than to other members of river buffalo that originate from India, which may suggest some level of introgression in these two samples as large portions of chromosomes 1 and 2 in BubB appear to be of swamp buffalo origin (Figure ).
Plot of sequence similarity (dI) for genes in chromosomal order between BubB (River buffalo) and BubC (Swamp buffalo), showing high degrees of similarity over large regions of chromosomes 1 and 2.
Classifying aberrant sites: age and role of selection
A number of aberrant sites were identified that were inconsistent with any of the phylogenetic trees suggested from either molecular or morphological characteristics. In total 33 aberrant sites occurred in coding regions, of which 6 are Ka and 27 are Ks from 11,563 synonymous and 3,482 nonsynonymous sites, which resulted in a dN/dS ratio of 0.065. From the 33 aberrant sites 10 appear to be polymorphic since the common ancestor with buffalo, while the remaining 23 would have been polymorphic in the common ancestor of the Bos and Bison genera. Of the 10 ancient polymorphisms 3 are Ka and 7 are Ks. Therefore these 10 ancient polymorphisms were maintained for millions of years. They must have been polymorphic in the MRCA to the Bovina and Bubalina subtribes some 5–8 MYA and appear to have persisted to at least the divergence of the Bovina subtribe around 2 MYA. The small values estimated for the dN/dS ratios calculated suggest that these polymorphisms may have been neutral.