As part of a survey of British Y chromosome diversity, we recruited a set of 421 males who described themselves as British, and whose paternal grandfathers were born in Britain. The Y chromosomes of these males were typed using a set of 11 binary markers [15], including M145 (defining superhaplogroup DE) and M89 (defining superhaplogroup F). All chromosomes carried the derived allele at one or other of these two markers, with a single exception: that of male GB1757, which could in principle belong to hgA, B, or C (see phylogeny). Further testing, including the markers M91 and M31, gave the surprising result that it belonged to hgA, within the sub-lineage A1.
Distribution of Y chromosomes belonging to haplogroup A
Haplogroup A is the deepest-rooting clade of the Y phylogeny, and shows a particularly specific localization to the African continent, which is compatible with an African origin for modern human Y chromosomes. It constitutes 5.4% of a composite sample of 3551 Africans [4,8,10,24-30], while in non-African indigenous populations only seven cases have been described, from Turkey [31]. Extensive surveys of western European populations have failed to find any examples of these chromosomes [13,36-39]. The sub-haplogroup A1 was first reported in a single individual among a sample of 44 males from Mali [4]. Subsequently, this scarce western African haplogroup has been found in only 25 more males: 2/64 Moroccan Berbers [24], 3/766 African-Americans [40,41], 2/39 Mandinka from Gambia/Senegal, 1/55 Malian Dogon [30], 1/201 Cape Verde Islanders, 14/276 males from Guinea-Bissau [10], and 2/39 males from Niger (F.C. and R.S., unpublished data).
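The tally of subsequent hgA1 reports can be checked with a few lines of Python. This is only a bookkeeping sketch: the counts are copied from the text above, while the variable names are illustrative, and the pooled sample size is printed for orientation only, since the samples come from heterogeneous studies.

```python
# (hgA1 carriers, males sampled) per population, copied from the text above.
a1_reports = {
    "Moroccan Berbers": (2, 64),
    "African-Americans": (3, 766),
    "Mandinka (Gambia/Senegal)": (2, 39),
    "Malian Dogon": (1, 55),
    "Cape Verde Islanders": (1, 201),
    "Guinea-Bissau": (14, 276),
    "Niger": (2, 39),
}

total_carriers = sum(carriers for carriers, _ in a1_reports.values())
total_sampled = sum(sampled for _, sampled in a1_reports.values())

# Per-population frequency of hgA1, as a percentage.
for population, (carriers, sampled) in a1_reports.items():
    print(f"{population}: {carriers}/{sampled} = {100 * carriers / sampled:.1f}%")

print(f"Total carriers: {total_carriers} among {total_sampled} males sampled")
```

The total of 25 carriers agrees with the figure given in the text.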
The British male carrying the hgA1 chromosome knew of no familial African connection, and he displays a typical European appearance. To investigate the relationship of his Y chromosome with African examples, we compared its 10-locus Y-STR (short tandem repeat) haplotype (see Supplementary information) with those of ten other available hgA1 chromosomes [10,24,40] (F.C. and R.S., unpublished data). A median-joining network of these haplotypes shows that they are diverse and that all are unique. Though the British haplotype is peripheral, it lies equidistant (4 mutational steps) from Niger and Guinea-Bissau haplotypes, and similar distances (2-4 steps) exist between other haplotypes in the network. This is compatible with a western African origin for the British chromosome, but does not point to a particular population. Searching the Y Chromosome Haplotype Reference Database (http://www.yhrd.org) with the British haplotype (11 loci) finds no matches among 15,815 chromosomes worldwide, emphasising its rarity. Also, when the haplotypes of the other hgA1 chromosomes are used in similar searches, they find only self-matches in the populations from which they derive, underlining the scarcity and African-specificity of hgA1.
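To make the notion of "mutational steps" concrete, the following Python sketch counts single-repeat differences between two Y-STR haplotypes under a simple stepwise model. It is an illustration only: the function name and both haplotypes are hypothetical (the repeat counts are invented, not taken from the Supplementary information), and a pairwise count of this kind approximates, rather than reproduces, path lengths in a median-joining network.

```python
from typing import Sequence


def step_distance(hap_a: Sequence[int], hap_b: Sequence[int]) -> int:
    """Sum of absolute repeat-count differences across loci (stepwise model)."""
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must be typed at the same loci")
    return sum(abs(a - b) for a, b in zip(hap_a, hap_b))


# Hypothetical 10-locus haplotypes (repeat counts), invented for illustration.
hap_x = [13, 12, 21, 10, 11, 14, 15, 11, 12, 19]
hap_y = [13, 13, 21, 10, 12, 14, 13, 11, 12, 19]

# Differences: 1 step at locus 2, 1 step at locus 5, 2 steps at locus 7.
print(step_distance(hap_x, hap_y))  # 4
```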
Diversity of Y-STR haplotypes of chromosomes belonging to haplogroup A1, and within the R surname
How long has this archetypically African Y chromosome been in England? To address this question, our strategy was to seek patrilineally related individuals who would share the haplogroup, but whose Y-STR haplotype diversity could be used to estimate a time-to-most-recent-common-ancestor (TMRCA). To do this we exploited the relationship between surnames and Y chromosome haplotypes [15,42-44], noting that the upper bound of any estimated age would be limited by the fact that hereditary English surnames did not exist prior to the eleventh century [45]. The male carrying hgA1 bears a locative surname, which we refer to here as R, deriving from an East Yorkshire village [46]. Only 121 people carried this name in 1998 (http://www.spatial-literacy.org/uclnames), and it still has a strong East Yorkshire focus. We recruited 18 apparently unrelated men carrying this name (or a close variant spelling, carried by 50 individuals) and typed a set of 11 binary markers and 17 Y-STRs [15], supplemented with the binary marker M31, allowing us to identify hgA1.
A median-joining network was constructed from the 17-locus Y-STR haplotypes (see Supplementary information) of the Y chromosomes carried by the 18 R-surnamed males. The chromosomes belong to three haplogroups, and include four clusters, indicating either multiple foundation or historical nonpaternity within the name. However, seven of the males carry hgA1 chromosomes belonging to three closely related Y-STR haplotypes; based on the rho statistic calculated within Network, these have a TMRCA of 440 ± 330 years.
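As a rough illustration of how a rho-based age is obtained: rho is the mean number of mutational steps separating each sampled chromosome from the inferred root haplotype, and dividing it by the per-generation mutation rate summed over all loci converts it to generations. The Python sketch below follows that logic under stated assumptions. The function name, the step counts, the per-locus mutation rate and the generation time are all hypothetical placeholders rather than the values used in this study, so the output is not expected to reproduce 440 ± 330 years, and the error term reported by Network is not computed here.

```python
from typing import Sequence


def rho_tmrca_years(
    steps_from_root: Sequence[int],
    n_loci: int,
    mu_per_locus_per_generation: float,
    years_per_generation: float,
) -> float:
    """Point estimate of TMRCA in years from the rho statistic.

    rho = mean number of mutational steps from the root haplotype;
    generations = rho / (number of loci * per-locus mutation rate).
    """
    if not steps_from_root:
        raise ValueError("at least one chromosome is required")
    rho = sum(steps_from_root) / len(steps_from_root)
    generations = rho / (n_loci * mu_per_locus_per_generation)
    return generations * years_per_generation


# Hypothetical example: seven chromosomes, four identical to the root
# haplotype and three that each differ from it by one step.
example_steps = [0, 0, 0, 0, 1, 1, 1]

estimate = rho_tmrca_years(
    example_steps,
    n_loci=17,                          # as in the 17-locus haplotypes
    mu_per_locus_per_generation=0.002,  # assumed placeholder rate
    years_per_generation=30.0,          # assumed placeholder generation time
)
print(f"Illustrative TMRCA: {estimate:.0f} years")
```

Because rho is an average of small integer counts, a single additional mutation in a sample of this size shifts the estimate substantially, which is consistent with the wide error range reported in the text.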
As an empirical adjunct to TMRCA calculations, we undertook extensive genealogical research to ask if the seven R-surnamed males carrying hgA1 chromosomes could be connected into a single genealogy with a historically verifiable MRCA. This research resolved the males into two well-supported genealogies (), with MRCAs born in 1788 and 1789 respectively. However, although both of these ancestors were resident in Yorkshire, evidence could not be found for a familial relationship between them. Patterns of forename usage in the two genealogies are quite distinct, which argues against a very recent connection. We recruited 12 unrelated R-surnamed men from the USA, hoping that presence of hgA1 would indicate that the chromosome had been associated with the surname prior to emigration from Britain: however, none of these men carried a chromosome from this haplogroup (data not shown), so the approach was uninformative.
Genealogical relationships and Y-STR haplotypes of hgA1 R-surnamed men
Finally, we exploited a new resource of multiple novel Y-STRs [47] in an attempt to refine TMRCA for the two genealogies. We typed the hgA1 chromosomes with an additional 60 Y-STRs (see Materials and Methods), bringing the total to 77. Surprisingly, this analysis revealed only one new mutation: a single repeat decrease at Y-STR DYS537 in male GB1758. Applying the rho statistic within Network, as described above, to the 73-locus Y-STR haplotypes (excluding the bilocal DYS385a/b and YCAIIa/b; see Supplementary information) in the two genealogical clusters yields a TMRCA of 140 ± 80 years, which, adding the average age of the living individuals (52 years), equates to a likely oldest date of 1734 for the coalescence of the two genealogies. The TMRCA range overlaps that obtained using 17 markers (see above), and suggests that only a small number of generations separates the two genealogies from their common ancestor.
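The conversion from a TMRCA to a calendar date can be sketched as simple arithmetic. In the Python snippet below, the TMRCA, its error and the mean age are taken from the text; the reference year of 2006 is an assumption, chosen because it is consistent with the reported date of 1734 when the upper bound of the estimate (140 + 80 years) is used, and it is not stated in this section. The function name is illustrative.

```python
def oldest_coalescence_year(
    reference_year: int,
    mean_age_years: int,
    tmrca_years: int,
    tmrca_error_years: int,
) -> int:
    """Earliest likely calendar year for the common ancestor.

    The TMRCA is counted back from the average birth year of the living
    men, and the upper bound (estimate + error) gives the oldest date.
    """
    mean_birth_year = reference_year - mean_age_years
    return mean_birth_year - (tmrca_years + tmrca_error_years)


# reference_year=2006 is an assumption, not a value given in the text.
print(oldest_coalescence_year(2006, 52, 140, 80))  # 1734
```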