The Y-chromosome haplogroup N-M231 (Hg N) is distributed widely in eastern and central Asia, Siberia, as well as in eastern and northern Europe. Previous studies suggested a counterclockwise prehistoric migration of Hg N from eastern Asia to eastern and northern Europe. However, the root of this Y chromosome lineage and its detailed dispersal pattern across eastern Asia are still unclear. We analyzed haplogroup profiles and phylogeographic patterns of 1,570 Hg N individuals from 20,826 males in 359 populations across Eurasia. We first genotyped 6,371 males from 169 populations in China and Cambodia, and generated data for 360 Hg N individuals, and then combined these with published data on 1,210 Hg N individuals from Japanese, Southeast Asian, Siberian, European and Central Asian populations. The results showed that the sub-haplogroups of Hg N have a distinct geographical distribution. The highest Y-STR diversity of the ancestral Hg N sub-haplogroups was observed in the southern part of mainland East Asia, and further phylogeographic analyses support an origin of Hg N in southern China. Combined with previous data, we propose that the early northward dispersal of Hg N started from southern China about 21 thousand years ago (kya), expanding into northern China 12–18 kya, and reaching further north to Siberia about 12–14 kya before a population expansion and westward migration into Central Asia and eastern/northern Europe around 8.0–10.0 kya. This northward migration of Hg N likely coincides with retreating ice sheets after the Last Glacial Maximum (22–18 kya) in mainland East Asia.
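The comparison of Y-STR diversity between regions can be illustrated with a minimal sketch. It assumes Nei's unbiased haplotype diversity as the summary statistic, which is one common choice and not necessarily the measure used in the study. The function name and the two sample sets are hypothetical and purely illustrative.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum(p_i^2)).

    `haplotypes` is a list of hashable haplotypes, e.g. tuples of
    repeat counts at each Y-STR locus.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical data: repeat counts at three Y-STR loci per male.
south = [(14, 12, 23), (14, 13, 23), (15, 12, 24), (14, 12, 22)]
north = [(14, 12, 23), (14, 12, 23), (14, 12, 23), (14, 13, 23)]

print(haplotype_diversity(south))  # every haplotype distinct
print(haplotype_diversity(north))  # one haplotype dominates
```

Under this measure, a sample in which every haplotype is distinct scores higher than one dominated by a single haplotype, which is the kind of contrast used to argue for an older, ancestral population.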
Sakha – an area connecting South and Northeast Siberia – is significant for understanding the history of peopling of Northeast Eurasia and the Americas. Previous studies have shown a genetic contiguity between Siberia and East Asia and the key role of South Siberia in the colonization of Siberia.
We report the results of a high-resolution phylogenetic analysis of 701 mtDNAs and 318 Y chromosomes from five native populations of Sakha (Yakuts, Evenks, Evens, Yukaghirs and Dolgans) and of the analysis of more than 500,000 autosomal SNPs of 758 individuals from 55 populations, including 40 previously unpublished samples from Siberia. Phylogenetically terminal clades of East Asian mtDNA haplogroups C and D and Y-chromosome haplogroups N1c, N1b and C3, constituting the core of the gene pool of the native populations from Sakha, connect Sakha and South Siberia. Analysis of autosomal SNP data confirms the genetic continuity between Sakha and South Siberia. Maternal lineages D5a2a2, C4a1c, C4a2, C5b1b and the Yakut-specific STR sub-clade of Y-chromosome haplogroup N1c can be linked to a migration of Yakut ancestors, while the paternal lineage C3c was most likely carried to Sakha by the expansion of the Tungusic people. MtDNA haplogroups Z1a1b and Z1a3, present in Yukaghirs, Evens and Dolgans, show traces of different and probably more ancient migration(s). Analysis of both haploid loci and autosomal SNP data revealed only minor genetic components shared between Sakha and the extreme Northeast Siberia. Although the major part of West Eurasian maternal and paternal lineages in Sakha could originate from recent admixture with East Europeans, mtDNA haplogroups H8, H20a and HV1a1a, as well as Y-chromosome haplogroup J, more probably reflect an ancient gene flow from West Eurasia through Central Asia and South Siberia.
Our high-resolution phylogenetic dissection of mtDNA and Y-chromosome haplogroups as well as analysis of autosomal SNP data suggests that Sakha was colonized by repeated expansions from South Siberia with minor gene flow from the Lower Amur/Southern Okhotsk region and/or Kamchatka. The minor West Eurasian component in Sakha attests to both recent and ongoing admixture with East Europeans and an ancient gene flow from West Eurasia.
mtDNA; Y chromosome; Autosomal SNPs; Sakha
Human origins and migration models proposing the Horn of Africa as a prehistoric exit route to Asia have stimulated molecular genetic studies in the region using uniparental loci. However, from a Y-chromosome perspective, Saudi Arabia, the largest country of the region, has not yet been surveyed. To address this gap, a sample of 157 Saudi males was analyzed at high resolution using 67 Y-chromosome binary markers. In addition, haplotypic diversity for its most prominent J1-M267 lineage was estimated using a set of 17 Y-specific STR loci.
Saudi Arabia differs from other Arabian Peninsula countries by a higher presence of J2-M172 lineages. It is significantly different from Yemen mainly due to a comparative reduction of sub-Saharan Africa E1-M123 and Levantine J1-M267 male lineages. Around 14% of the Saudi Arabia Y-chromosome pool is typical of African biogeographic ancestry, 17% arrived in the area from the East across Iran, while the remaining 69% could be considered of direct or indirect Levantine ascription. Interestingly, basal E-M96* (n = 2) and J-M304* (n = 3) lineages have been detected, for the first time, in the Arabian Peninsula. Coalescence time for the most prominent J1-M267 haplogroup in Saudi Arabia (11.6 ± 1.9 ky) is similar to that obtained previously for Yemen (11.3 ± 2) but significantly older than those estimated for Qatar (7.3 ± 1.8) and UAE (6.8 ± 1.5).
The Y-chromosome genetic structure of the Arabian Peninsula seems to be mainly modulated by geography. The data confirm that this area has mainly been a recipient of gene flow from its African and Asian surrounding areas, probably mainly since the Last Glacial Maximum onwards. Although rare deep-rooting lineages for Y-chromosome haplogroups E and J have been detected, more basal clades that would support the southern exit route of modern humans to Eurasia were not found.
North East Europe harbors a high diversity of cultures and languages, suggesting a complex genetic history. Archaeological, anthropological, and genetic research has revealed a series of influences from Western and Eastern Eurasia in the past. While genetic data from modern-day populations is commonly used to make inferences about their origins and past migrations, ancient DNA provides a powerful test of such hypotheses by giving a snapshot of the past genetic diversity. In order to better understand the dynamics that have shaped the gene pool of North East Europeans, we generated and analyzed 34 mitochondrial genotypes from the skeletal remains of three archaeological sites in northwest Russia. These sites were dated to the Mesolithic and the Early Metal Age (7,500 and 3,500 uncalibrated years Before Present). We applied a suite of population genetic analyses (principal component analysis, genetic distance mapping, haplotype sharing analyses) and compared past demographic models through coalescent simulations using Bayesian Serial SimCoal and Approximate Bayesian Computation. Comparisons of genetic data from ancient and modern-day populations revealed significant changes in the mitochondrial makeup of North East Europeans through time. Mesolithic foragers showed high frequencies and diversity of haplogroups U (U2e, U4, U5a), a pattern observed previously in European hunter-gatherers from Iberia to Scandinavia. In contrast, the presence of mitochondrial DNA haplogroups C, D, and Z in Early Metal Age individuals suggested discontinuity with Mesolithic hunter-gatherers and genetic influx from central/eastern Siberia. We identified remarkable genetic dissimilarities between prehistoric and modern-day North East Europeans/Saami, which suggests an important role of post-Mesolithic migrations from Western Europe and subsequent population replacement/extinctions. 
This work demonstrates how ancient DNA can improve our understanding of human population movements across Eurasia. It contributes to the description of the spatio-temporal distribution of mitochondrial diversity and will be of significance for future reconstructions of the history of Europeans.
The history of human populations can be retraced by studying the archaeological and anthropological record, but also by examining the current distribution of genetic markers, such as the maternally inherited mitochondrial DNA. Ancient DNA research allows the retrieval of DNA from ancient skeletal remains and contributes to the reconstruction of the human population history through the comparison of ancient and present-day genetic data. Here, we analysed the mitochondrial DNA of prehistoric remains from archaeological sites dated to 7,500 and 3,500 years Before Present. These sites are located in North East Europe, a region that displays a significant cultural and linguistic diversity today but for which no ancient human DNA was available before. We show that prehistoric hunter-gatherers of North East Europe were genetically similar to other European foragers. We also detected a prehistoric genetic input from Siberia, followed by migrations from Western Europe into North East Europe. Our research contributes to the understanding of the origins and past dynamics of human population in Europe.
To better define the structure and origin of the Bulgarian paternal gene pool, we have examined the Y-chromosome variation in 808 Bulgarian males. The analysis was performed by high-resolution genotyping of biallelic markers and by analyzing the STR variation within the most informative haplogroups. We found that the Y-chromosome gene pool in modern Bulgarians is primarily represented by Western Eurasian haplogroups with ∼ 40% belonging to haplogroups E-V13 and I-M423, and 20% to R-M17. Haplogroups common in the Middle East (J and G) and in South Western Asia (R-L23*) occur at frequencies of 19% and 5%, respectively. Haplogroups C, N and Q, distinctive for Altaic and Central Asian Turkic-speaking populations, occur at the negligible frequency of only 1.5%. Principal Component analyses group Bulgarians with European populations, apart from Central Asian Turkic-speaking groups and South Western Asia populations. Within the country, the genetic variation is structured in Western, Central and Eastern Bulgaria indicating that the Balkan Mountains have been permeable to human movements. The lineage analysis provided the following interesting results: (i) R-L23* is present in Eastern Bulgaria since the post glacial period; (ii) haplogroup E-V13 has a Mesolithic age in Bulgaria from where it expanded after the arrival of farming; (iii) haplogroup J-M241 probably reflects the Neolithic westward expansion of farmers from the earliest sites along the Black Sea. On the whole, in light of the most recent historical studies, which indicate a substantial proto-Bulgarian input to the contemporary Bulgarian people, our data suggest that a common paternal ancestry between the proto-Bulgarians and the Altaic and Central Asian Turkic-speaking populations either did not exist or was negligible.
The chamois, distributed over most of the medium to high altitude mountain ranges of southern Eurasia, provides an excellent model for exploring the effects of historical and evolutionary events on diversification. Populations have been grouped into two species, Rupicapra pyrenaica from southwestern Europe and R. rupicapra from eastern Europe. The study of matrilineal mitochondrial DNA (mtDNA) and biparentally inherited microsatellites showed that the two species are paraphyletic and indicated alternate events of population contraction and dispersal-hybridization in the diversification of chamois. Here we investigate the pattern of variation of the Y-chromosome to obtain information on the patrilineal phylogenetic position of the genus Rupicapra and on the male-specific dispersal of chamois across Europe.
We analyzed the Y-chromosome of 87 males covering the distribution range of the Rupicapra genus. We sequenced a fragment of the SRY gene promoter and characterized the male-specific microsatellites UMN2303 and SRYM18. The SRY promoter sequences of two samples of Barbary sheep (Ammotragus lervia) were also determined and compared with the sequences of Bovidae available in GenBank. Phylogenetic analysis of the alignment showed the clustering of Rupicapra with Capra and the Ammotragus sequence obtained in this study, different from the previously reported sequence of Ammotragus which groups with Ovis. Within Rupicapra, the combined data define 10 Y-chromosome haplotypes forming two haplogroups, which concur with taxonomic classification, instead of the three clades formed for mtDNA and nuclear microsatellites. The variation shows a west-to-east geographical cline of ancestral to derived alleles.
The phylogeny of the SRY-promoter shows an association between Rupicapra and Capra. The position of Ammotragus needs a reinvestigation. The study of ancestral and derived characters in the Y-chromosome suggests that, contrary to the presumed Asian origin, the paternal lineage of chamois originated in the Mediterranean, most probably in the Iberian Peninsula, and dispersed eastwards through serial founder events during the glacial-interglacial cycles of the Quaternary. The diversity of Y-chromosomes in chamois is very low. The differences in patterns of variation among Y-chromosome, mtDNA and biparental microsatellites reflect the evolutionary characteristics of the different markers as well as the effects of sex-biased dispersal and species phylogeography.
Although human Y chromosomes belonging to haplogroup R1b are quite rare in Africa, being found mainly in Asia and Europe, a group of chromosomes within the paragroup R-P25* are found concentrated in the central-western part of the African continent, where they can be detected at frequencies as high as 95%. Phylogenetic evidence and coalescence time estimates suggest that R-P25* chromosomes (or their phylogenetic ancestor) may have been carried to Africa by an Asia-to-Africa back migration in prehistoric times. Here, we describe six new mutations that define the relationships among the African R-P25* Y chromosomes and between these African chromosomes and earlier reported R-P25 Eurasian sub-lineages. The incorporation of these new mutations into a phylogeny of the R1b haplogroup led to the identification of a new clade (R1b1a or R-V88) encompassing all the African R-P25* and about half of the few European/west Asian R-P25* chromosomes. A worldwide phylogeographic analysis of the R1b haplogroup provided strong support to the Asia-to-Africa back-migration hypothesis. The analysis of the distribution of the R-V88 haplogroup in >1800 males from 69 African populations revealed a striking genetic contiguity between the Chadic-speaking peoples from the central Sahel and several other Afroasiatic-speaking groups from North Africa. The R-V88 coalescence time was estimated at 9.2–5.6 kya, in the early mid Holocene. We suggest that R-V88 is a paternal genetic record of the proposed mid-Holocene migration of proto-Chadic Afroasiatic speakers through the Central Sahara into the Lake Chad Basin, and geomorphological evidence is consistent with this view.
Y chromosome haplogroups; human migrations; Holocene; Africa; Chadic-speaking populations
The genetic impact associated with the Neolithic spread in Europe has been widely debated over the last 20 years. Within this context, ancient DNA studies have provided a more reliable picture by directly analyzing the protagonist populations at different regions in Europe. However, the lack of available data from the original Near Eastern farmers has limited the achieved conclusions, preventing the formulation of continental models of Neolithic expansion. Here we address this issue by presenting mitochondrial DNA data of the original Near Eastern Neolithic communities with the aim of providing an adequate background for the interpretation of Neolithic genetic data from European samples. Sixty-three skeletons from the Pre-Pottery Neolithic B (PPNB) sites of Tell Halula, Tell Ramad and Dja'de El Mughara dating between 8,700–6,600 cal. B.C. were analyzed, and 15 validated mitochondrial DNA profiles were recovered. In order to estimate the demographic contribution of the first farmers to both Central European and Western Mediterranean Neolithic cultures, haplotype and haplogroup diversities in the PPNB sample were compared using phylogeographic and population genetic analyses to available ancient DNA data from human remains belonging to the Linearbandkeramik-Alföldi Vonaldiszes Kerámia and Cardial/Epicardial cultures. We also searched for possible signatures of the original Neolithic expansion over the modern Near Eastern and South European genetic pools, and tried to infer possible routes of expansion by comparing the obtained results to a database of 60 modern populations from both regions. Comparisons performed among the 3 ancient datasets allowed us to identify K and N-derived mitochondrial DNA haplogroups as potential markers of the Neolithic expansion, whose genetic signature would have reached both the Iberian coasts and the Central European plain.
Moreover, the observed genetic affinities between the PPNB samples and the modern populations of Cyprus and Crete seem to suggest that the Neolithic was first introduced into Europe through pioneer seafaring colonization.
Since the original human expansions out of Africa 200,000 years ago, different prehistoric and historic migration events have taken place in Europe. Considering that the movement of the people implies a consequent movement of their genes, it is possible to estimate the impact of these migrations through the genetic analysis of human populations. Agricultural and husbandry practices originated 10,000 years ago in a region of the Near East known as the Fertile Crescent. According to the archaeological record this phenomenon, known as “Neolithic”, rapidly expanded from these territories into Europe. However, whether this diffusion was accompanied or not by human migrations is greatly debated. In the present work, mitochondrial DNA (a type of maternally inherited DNA located in the cell cytoplasm) from the first Near Eastern Neolithic populations was recovered and compared to available data from other Neolithic populations in Europe and also to modern populations from South Eastern Europe and the Near East. The obtained results show that substantial human migrations were involved in the Neolithic spread and suggest that the first Neolithic farmers entered Europe following a maritime route through Cyprus and the Aegean Islands.
Numerous studies of human populations in Europe and Asia have revealed a concordance between their extant genetic structure and the prevailing regional pattern of geography and language. For native South Americans, however, such evidence has been lacking so far. Therefore, we examined the relationship between Y-chromosomal genotype on the one hand, and male geographic origin and linguistic affiliation on the other, in the largest study of South American natives to date in terms of sampled individuals and populations. A total of 1,011 individuals, representing 50 tribal populations from 81 settlements, were genotyped for up to 17 short tandem repeat (STR) markers and 16 single nucleotide polymorphisms (Y-SNPs), the latter resolving phylogenetic lineages Q and C. Virtually no structure became apparent for the extant Y-chromosomal genetic variation of South American males that could sensibly be related to their inter-tribal geographic and linguistic relationships. This continent-wide decoupling is consistent with a rapid peopling of the continent followed by long periods of isolation in small groups. Furthermore, for the first time, we identified a distinct geographical cluster of Y-SNP lineages C-M217 (C3*) in South America. Such haplotypes are virtually absent from North and Central America, but occur at high frequency in Asia. Together with the locally confined Y-STR autocorrelation observed in our study as a whole, the available data therefore suggest a late introduction of C3* into South America no more than 6,000 years ago, perhaps via coastal or trans-Pacific routes. Extensive simulations revealed that the observed lack of haplogroup C3* among extant North and Central American natives is only compatible with low levels of migration between the ancestor populations of C3* carriers and non-carriers. 
In summary, our data highlight the fact that a pronounced correlation between genetic and geographic/cultural structure can only be expected under very specific conditions, most of which are likely not to have been met by the ancestors of native South Americans.
In the largest population genetic study of South Americans to date, we analyzed the Y-chromosomal makeup of more than 1,000 male natives. We found that the male-specific genetic variation of Native Americans lacks any clear structure that could sensibly be related to their geographic and/or linguistic relationships. This finding is consistent with a rapid initial peopling of South America, followed by long periods of isolation in small tribal groups. The observed continent-wide decoupling of geography, spoken language, and genetics contrasts strikingly with previous reports of such correlation from many parts of Europe and Asia. Moreover, we identified a cluster of Native American founding lineages of Y chromosomes, called C-M217 (C3*), within a restricted area of Ecuador in North-Western South America. The same haplogroup occurs at high frequency in Central, East, and North East Asia, but is virtually absent from North (except Alaska) and Central America. Possible scenarios for the introduction of C-M217 (C3*) into Ecuador may thus include a coastal or trans-Pacific route, an idea also supported by occasional archeological evidence and the recent coalescence of the C3* haplotypes, estimated from our data to have occurred some 6,000 years ago.
The Turkic peoples represent a diverse collection of ethnic groups defined by the Turkic languages. These groups have dispersed across a vast area, including Siberia, Northwest China, Central Asia, East Europe, the Caucasus, Anatolia, the Middle East, and Afghanistan. The origin and early dispersal history of the Turkic peoples is disputed, with candidates for their ancient homeland ranging from the Transcaspian steppe to Manchuria in Northeast Asia. Previous genetic studies have not identified a clear-cut unifying genetic signal for the Turkic peoples, which lends support to language replacement rather than demic diffusion as the model for the Turkic language’s expansion. We addressed the genetic origin of 373 individuals from 22 Turkic-speaking populations, representing their current geographic range, by analyzing genome-wide high-density genotype data. In agreement with the elite dominance model of language expansion, most of the Turkic peoples studied genetically resemble their geographic neighbors. However, western Turkic peoples sampled across West Eurasia shared an excess of long chromosomal tracts that are identical by descent (IBD) with populations from present-day South Siberia and Mongolia (SSM), an area where historians center a series of early Turkic and non-Turkic steppe polities. While SSM-matching IBD tracts (> 1 cM) are also observed in non-Turkic populations, Turkic peoples demonstrate a higher percentage of such tracts (p-values ≤ 0.01) compared to their non-Turkic neighbors. Finally, we used the ALDER method and inferred admixture dates (~9th–17th centuries) that overlap with the Turkic migrations of the 5th–16th centuries. Thus, our results indicate historical admixture among Turkic peoples, and the recent shared ancestry with modern populations in SSM supports one of the hypothesized homelands for their nomadic Turkic and related Mongolic ancestors.
Centuries of nomadic migrations have ultimately resulted in the distribution of Turkic languages over a large area ranging from Siberia, across Central Asia to Eastern Europe and the Middle East. Despite the profound cultural impact left by these nomadic peoples, little is known about their prehistoric origins. Moreover, because contemporary Turkic speakers tend to genetically resemble their geographic neighbors, it is not clear whether their nomadic ancestors left an identifiable genetic trace. In this study, we show that Turkic-speaking peoples sampled across the Middle East, Caucasus, East Europe, and Central Asia share varying proportions of Asian ancestry that originate in a single area, southern Siberia and Mongolia. Mongolic- and Turkic-speaking populations from this area bear an unusually high number of long chromosomal tracts that are identical by descent with Turkic peoples from across west Eurasia. Admixture induced linkage disequilibrium decay across chromosomes in these populations indicates that admixture occurred during the 9th–17th centuries, in agreement with the historically recorded Turkic nomadic migrations and later Mongol expansion. Thus, our findings reveal genetic traces of recent large-scale nomadic migrations and map their source to a previously hypothesized area of Mongolia and southern Siberia.
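The dating of admixture from linkage disequilibrium decay can be sketched as follows. In a simple pulse-admixture model, admixture-induced LD decays roughly as exp(-n·d), where d is genetic distance in Morgans and n is the number of generations since admixture. The code below is a simplified illustration of that idea and not the ALDER implementation: it ignores ALDER's weighting scheme and affine term, uses noise-free synthetic data, and assumes a generation time of 30 years and a sampling year of 2000, none of which come from the study.

```python
import math


def fit_decay_rate(distances_morgans, ld_values):
    """Estimate n (generations) as minus the least-squares slope of
    ln(LD) against genetic distance in Morgans."""
    xs = list(distances_morgans)
    ys = [math.log(v) for v in ld_values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx


def admixture_year(generations, sampling_year=2000, years_per_generation=30):
    """Convert generations before sampling into a calendar year (CE).
    Both defaults are illustrative assumptions."""
    return sampling_year - generations * years_per_generation


# Synthetic, noise-free LD curve generated with a known decay rate.
true_generations = 25
distances = [0.005 * k for k in range(1, 11)]  # 0.5 cM to 5 cM
ld_curve = [0.02 * math.exp(-true_generations * d) for d in distances]

n_hat = fit_decay_rate(distances, ld_curve)
print(round(n_hat, 3), admixture_year(round(n_hat)))
```

The estimated date scales directly with the assumed generation time, so any calendar-year interval obtained this way carries that assumption along with the statistical uncertainty of the fit.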
More than half of the northern Asian pool of human mitochondrial DNA (mtDNA) is fragmented into a number of subclades of haplogroups C and D, two of the most frequent haplogroups throughout northern, eastern, central Asia and America. While there has been considerable recent progress in studying mitochondrial variation in eastern Asia and America at the complete genome resolution, little comparable data is available for regions such as southern Siberia – the area where most of northern Asian haplogroups, including C and D, likely diversified. This gap in our knowledge poses a serious barrier to progress in understanding the demographic pre-history of northern Eurasia in general. Here we describe the phylogeography of haplogroups C and D in the populations of northern and eastern Asia. We have analyzed 770 samples from haplogroups C and D (174 and 596, respectively) at high resolution, including 182 novel complete mtDNA sequences representing haplogroups C and D (83 and 99, respectively). The present-day variation of haplogroups C and D suggests that these mtDNA clades expanded before the Last Glacial Maximum (LGM), with their oldest lineages being present in eastern Asia. Unlike in eastern Asia, most of the northern Asian variants of haplogroups C and D began the expansion after the LGM, thus pointing to post-glacial re-colonization of northern Asia. Our results show that both haplogroups were involved in migrations from eastern Asia and southern Siberia to eastern and northeastern Europe, likely during the middle Holocene.
The picture of dog mtDNA diversity, as obtained from geographically wide samplings but from a small number of individuals per region or breed, has revealed weak geographic correlation and a high degree of haplotype sharing between very distant breeds. We aimed at a more detailed picture through extensive sampling (n = 143) of four Portuguese autochthonous breeds – Castro Laboreiro Dog, Serra da Estrela Mountain Dog, Portuguese Sheepdog and Azores Cattle Dog – and comparatively reanalysing published worldwide data.
Fifteen haplotypes belonging to four major haplogroups were found in these breeds, of which five are newly reported. The Castro Laboreiro Dog presented a 95% frequency of a new A haplotype, while all other breeds contained a diverse pool of existing lineages. The Serra da Estrela Mountain Dog, the most heterogeneous of the four Portuguese breeds, shared haplotypes with the other mainland breeds, while Azores Cattle Dog shared no haplotypes with the other Portuguese breeds.
A review of mtDNA haplotypes in dogs across the world revealed that: (a) breeds tend to display haplotypes belonging to different haplogroups; (b) haplogroup A is present in all breeds, and even uncommon haplogroups are highly dispersed among breeds and continental areas; (c) haplotype sharing between breeds of the same region is lower than between breeds of different regions and (d) genetic distances between breeds do not correlate with geography.
MtDNA haplotype sharing occurred between Serra da Estrela Mountain dogs (with putative origin in the centre of Portugal) and two breeds in the north and south of the country: the Castro Laboreiro Dog (which behaves, at the mtDNA level, as a sub-sample of the Serra da Estrela Mountain Dog) and the southern Portuguese Sheepdog. In contrast, the Azores Cattle Dog did not share any haplotypes with the other Portuguese breeds, but did share haplotypes with dogs sampled in Northern Europe. This suggested that the Azores Cattle Dog descended maternally from Northern European dogs rather than Portuguese mainland dogs. A review of published mtDNA haplotypes identified thirteen non-Portuguese breeds with sufficient data for comparison. Comparisons between these thirteen breeds and the four Portuguese breeds demonstrated widespread haplotype sharing, with the greatest diversity among Asian dogs, in accordance with the central role of Asia in canine domestication.
Koreans are generally considered a Northeast Asian group, thought to be related to Altaic-language-speaking populations. However, recent findings have indicated that the peopling of Korea might have been more complex, involving dual origins from both southern and northern parts of East Asia. To understand the male lineage history of Korea, more data from informative genetic markers from Korea and its surrounding regions are necessary. In this study, 25 Y-chromosome single nucleotide polymorphism markers and 17 Y-chromosome short tandem repeat (Y-STR) loci were genotyped in 1,108 males from several populations in East Asia.
In general, we found East Asian populations to be characterized by male haplogroup homogeneity, showing major Y-chromosomal expansions of haplogroup O-M175 lineages. Interestingly, a high frequency (31.4%) of haplogroup O2b-SRY465 (and its sublineage) is characteristic of male Koreans, whereas the haplogroup distribution elsewhere in East Asian populations is patchy. The ages of the haplogroup O2b-SRY465 lineages (~9,900 years) and the pattern of variation within the lineages suggested an ancient origin in a nearby part of northeastern Asia, followed by an expansion in the vicinity of the Korean Peninsula. In addition, the coalescence time (~4,400 years) for the age of haplogroup O2b1-47z, and its Y-STR diversity, suggest that this lineage probably originated in Korea. Further studies with sufficiently large sample sizes to cover the vast East Asian region and using genomewide genotyping should provide further insights.
These findings are consistent with linguistic, archaeological and historical evidence, which suggest that the direct ancestors of Koreans were proto-Koreans who inhabited the northeastern region of China and the Korean Peninsula during the Neolithic (8,000-1,000 BC) and Bronze (1,500-400 BC) Ages.
South Asia possesses a significant amount of genetic diversity due to considerable intergroup differences in culture and language. There have been numerous reports on the genetic structure of Asian Indians, although these have mostly relied on genotyping microarrays or targeted sequencing of the mitochondria and Y chromosomes. Asian Indians in Singapore are primarily descendants of immigrants from Dravidian-language–speaking states in south India, and 38 individuals from the general population underwent deep whole-genome sequencing with a target coverage of 30X as part of the Singapore Sequencing Indian Project (SSIP). The genetic structure and diversity of these samples were compared against samples from the Singapore Sequencing Malay Project and populations in Phase 1 of the 1,000 Genomes Project (1 KGP). SSIP samples exhibited greater intra-population genetic diversity and possessed higher heterozygous-to-homozygous genotype ratio than other Asian populations. When compared against a panel of well-defined Asian Indians, the genetic makeup of the SSIP samples was closely related to South Indians. However, even though the SSIP samples clustered distinctly from the Europeans in the global population structure analysis with autosomal SNPs, eight samples were assigned to mitochondrial haplogroups that were predominantly present in Europeans and possessed higher European admixture than the remaining samples. An analysis of the relative relatedness between SSIP with two archaic hominins (Denisovan, Neanderthal) identified higher ancient admixture in East Asian populations than in SSIP. The data resource for these samples is publicly available and is expected to serve as a valuable complement to the South Asian samples in Phase 3 of 1 KGP.
Indians of South Asia have long been a population of interest to a wide audience, owing to their unique diversity. We have deep-sequenced 38 individuals of Indian descent residing in Singapore (SSIP) in an effort to illustrate their diversity from a whole-genome standpoint. Indeed, among Asians in our population panel, SSIP was the most diverse, followed by the Malays in Singapore (SSMP). This diversity is further observed in the population's Y-chromosome and mitochondrial haplogroup profiles; individuals with European-dominant haplogroups had a greater proportion of European admixture. Among variants (single nucleotide polymorphisms and small insertions/deletions) discovered in SSIP, 21.69% were novel with respect to previous sequencing projects. In addition, some 14 loss-of-function variants (LOFs) were associated with cancer, Type II diabetes, and cholesterol levels. Finally, a D statistic test with ancient hominins concurred that there was more archaic gene flow into East Asians than into South Asians.
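The D statistic mentioned above can be illustrated with a minimal sketch of the standard ABBA-BABA count. The function name `d_statistic`, the 0/1 allele coding, and the assignment of populations to the four positions (for example P1 = a South Asian genome, P2 = an East Asian genome, P3 = an archaic hominin, O = an outgroup) are assumptions made for illustration; the abstract does not describe the exact configuration or software used. Published analyses typically work from allele frequencies and assess significance with a block jackknife, which this sketch omits.

```python
from typing import Iterable, Tuple

Site = Tuple[int, int, int, int]  # alleles at one site for (P1, P2, P3, outgroup)


def d_statistic(sites: Iterable[Site]) -> float:
    """Return D = (ABBA - BABA) / (ABBA + BABA) over biallelic sites.

    ABBA: P2 and P3 share an allele that differs from P1 and the outgroup.
    BABA: P1 and P3 share an allele that differs from P2 and the outgroup.
    A positive D points to excess allele sharing between P2 and P3.
    """
    abba = 0
    baba = 0
    for p1, p2, p3, out in sites:
        if p1 == out and p2 == p3 and p1 != p2:
            abba += 1
        elif p2 == out and p1 == p3 and p1 != p2:
            baba += 1
        # All other patterns are uninformative for D and are skipped.
    informative = abba + baba
    if informative == 0:
        raise ValueError("no ABBA or BABA sites found")
    return (abba - baba) / informative


if __name__ == "__main__":
    # Made-up toy data: three ABBA sites, one BABA site, one uninformative site.
    toy_sites = [(0, 1, 1, 0)] * 3 + [(1, 0, 1, 0)] + [(0, 0, 1, 0)]
    print(d_statistic(toy_sites))
```

Under the assumed configuration, a significantly positive D would be read as more archaic allele sharing in the East Asian genome than in the South Asian one, which is the direction of the result reported above.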
The origins of the First Americans remain contentious. Although Native Americans seem to be genetically most closely related to east Asians [1–3], there is no consensus with regard to which specific Old World populations they are closest to [4–8]. Here we sequence the draft genome of an approximately 24,000-year-old individual (MA-1), from Mal’ta in south-central Siberia [9], to an average depth of 1×. To our knowledge this is the oldest anatomically modern human genome reported to date. The MA-1 mitochondrial genome belongs to haplogroup U, which has also been found at high frequency among Upper Palaeolithic and Mesolithic European hunter-gatherers [10–12], and the Y chromosome of MA-1 is basal to modern-day western Eurasians and near the root of most Native American lineages [5]. Similarly, we find autosomal evidence that MA-1 is basal to modern-day western Eurasians and genetically closely related to modern-day Native Americans, with no close affinity to east Asians. This suggests that populations related to contemporary western Eurasians had a more north-easterly distribution 24,000 years ago than commonly thought. Furthermore, we estimate that 14 to 38% of Native American ancestry may originate through gene flow from this ancient population. This is likely to have occurred after the divergence of Native American ancestors from east Asian ancestors, but before the diversification of Native American populations in the New World. Gene flow from the MA-1 lineage into Native American ancestors could explain why several crania from the First Americans have been reported as bearing morphological characteristics that do not resemble those of east Asians [2, 13]. Sequencing of another south-central Siberian, Afontova Gora-2, dating to approximately 17,000 years ago [14], revealed similar autosomal genetic signatures as MA-1, suggesting that the region was continuously occupied by humans throughout the Last Glacial Maximum. Our findings reveal that western Eurasian genetic signatures in modern-day Native Americans derive not only from post-Columbian admixture, as commonly thought, but also from a mixed ancestry of the First Americans.
Ethnic Belarusians make up more than 80% of the nine and a half million people inhabiting the Republic of Belarus. Belarusians together with Ukrainians and Russians represent the East Slavic linguistic group, the largest both in numbers and territory, inhabiting East Europe alongside Baltic-, Finno-Permic- and Turkic-speaking people. To date, only a limited number of low-resolution genetic studies have been performed on this population. Therefore, with the phylogeographic analysis of 565 Y-chromosomes and 267 mitochondrial DNAs from six well-covered geographic sub-regions of Belarus we strove to complement the existing genetic profile of eastern Europeans. Our results reveal that around 80% of the paternal Belarusian gene pool is composed of R1a, I2a and N1c Y-chromosome haplogroups – a profile which is very similar to the two other eastern European populations – Ukrainians and Russians. The maternal Belarusian gene pool encompasses a full range of West Eurasian haplogroups and agrees well with the genetic structure of central-east European populations. Our data attest that latitudinal gradients characterize the variation of the uniparentally transmitted gene pools of modern Belarusians. In particular, the Y-chromosome reflects movements of people in central-east Europe, starting probably as early as the beginning of the Holocene. Furthermore, the matrilineal legacy of Belarusians retains two rare mitochondrial DNA haplogroups, N1a3 and N3, whose phylogeographies were explored in detail after de novo sequencing of 20 and 13 complete mitogenomes, respectively, from all over Eurasia. Our phylogeographic analyses reveal that two mitochondrial DNA lineages, N3 and N1a3, both of Middle Eastern origin, might mark distinct events of matrilineal gene flow to Europe: during the mid-Holocene period and around the Pleistocene-Holocene transition, respectively.
Human Y chromosomes belonging to the haplogroup R1b1-P25, although very common in Europe, are usually rare in Africa. However, recently published studies have reported high frequencies of this haplogroup in the central-western region of the African continent and proposed that this represents a 'back-to-Africa' migration during prehistoric times. To obtain a deeper insight into the history of these lineages, we characterised the paternal genetic background of a population in Equatorial Guinea, a Central-West African country located near the region in which the highest frequencies of the R1b1 haplogroup in Africa have been found to date. In our sample, the large majority (78.6%) of the sequences belong to subclades in haplogroup E, which are the most frequent in Bantu groups. However, the frequency of the R1b1 haplogroup in our sample (17.0%) was higher than that previously observed for the majority of the African continent. Of these R1b1 samples, nine are defined by the V88 marker, which was recently discovered in Africa. As high microsatellite variance was found inside this haplogroup in Central-West Africa and a decrease in this variance was observed towards Northeast Africa, our findings do not support the previously hypothesised movement of Chadic-speaking people from the North across the Sahara as the explanation for these R1b1 lineages in Central-West Africa. The present findings are also compatible with an origin of the V88-derived allele in Central-West Africa, and its presence in North Africa may be better explained as the result of a migration from the south during the mid-Holocene.
Central-West Africa; Equatorial Guinea; human male lineages; Y chromosome; haplogroup R-V88; back to Africa hypothesis
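The microsatellite variance argument in the preceding abstract can be made concrete with a minimal sketch that averages the per-locus variance of Y-STR repeat counts within a haplogroup sample. The function name `mean_str_variance`, the choice of the sample variance, and the repeat counts below are illustrative assumptions; they are not taken from the study, which does not spell out its estimator in the abstract. DYS19 and DYS390 are commonly typed Y-STR loci, used here only as example keys.

```python
from statistics import mean, variance
from typing import Dict, List

Haplotype = Dict[str, int]  # locus name -> repeat count


def mean_str_variance(haplotypes: List[Haplotype]) -> float:
    """Average, across loci, of the sample variance in repeat number.

    Higher values indicate more accumulated STR diversity, which is often
    read as a sign of a longer local history of the lineage.
    """
    if len(haplotypes) < 2:
        raise ValueError("need at least two haplotypes to compute a variance")
    loci = list(haplotypes[0].keys())
    per_locus = [variance([h[locus] for h in haplotypes]) for locus in loci]
    return mean(per_locus)


if __name__ == "__main__":
    # Made-up repeat counts for two hypothetical regional samples.
    region_a = [
        {"DYS19": 14, "DYS390": 22},
        {"DYS19": 15, "DYS390": 24},
        {"DYS19": 16, "DYS390": 23},
    ]
    region_b = [
        {"DYS19": 15, "DYS390": 23},
        {"DYS19": 15, "DYS390": 23},
        {"DYS19": 15, "DYS390": 23},
    ]
    print(mean_str_variance(region_a), mean_str_variance(region_b))
```

In this toy comparison the first sample carries more STR variance than the second, which mirrors the kind of contrast the abstract draws between Central-West and Northeast Africa.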
Australia was one of the earliest regions outside Africa to be colonized by fully modern humans, with archaeological evidence for human presence by 47,000 years ago (47 kya) widely accepted [1, 2]. However, the extent of subsequent human entry before the European colonial age is less clear. The dingo reached Australia about 4 kya, indirectly implying human contact, which some have linked to changes in language and stone tool technology to suggest substantial cultural changes at the same time. Genetic data of two kinds have been proposed to support gene flow from the Indian subcontinent to Australia at this time, as well: first, signs of South Asian admixture in Aboriginal Australian genomes have been reported on the basis of genome-wide SNP data; and second, a Y chromosome lineage designated haplogroup C∗, present in both India and Australia, was estimated to have a most recent common ancestor around 5 kya and to have entered Australia from India. Here, we sequence 13 Aboriginal Australian Y chromosomes to re-investigate their divergence times from Y chromosomes in other continents, including a comparison of Aboriginal Australian and South Asian haplogroup C chromosomes. We find divergence times dating back to ∼50 kya, thus excluding the Y chromosome as providing evidence for recent gene flow from India into Australia.
• We have sequenced 13 Aboriginal Australian Y chromosomes
• These diverged from Y chromosomes in other continents around 50,000 years ago
• They diverged from Papua New Guinean Y chromosomes soon after this
• We find no evidence for Holocene male gene flow to Australia from South Asia
Bergström et al. show that Aboriginal Australian Y chromosomes diverged from Eurasian, including South Asian, Y chromosomes ∼50,000 years ago. This is around the time that Australia was first populated and thus disproves the previous hypothesis of prehistoric Y chromosome gene flow from India ∼5,000 years ago.
Linguistic and genetic studies on Roma populations living in Europe have unequivocally traced these populations to the Indian subcontinent. However, the exact parental population group and time of the out-of-India dispersal have remained disputed. In the absence of archaeological records and with only scanty historical documentation of the Roma, comparative linguistic studies were the first to identify their Indian origin. Recently, molecular studies on the basis of disease-causing mutations and haploid DNA markers (i.e. mtDNA and Y-chromosome) supported the linguistic view. The presence of Indian-specific Y-chromosome haplogroup H1a1a-M82 and mtDNA haplogroups M5a1, M18 and M35b among Roma has corroborated their South Asian origins and later admixture with Near Eastern and European populations. However, previous studies have left unanswered questions about the exact parental population groups in South Asia. Here we present a detailed phylogeographical study of Y-chromosomal haplogroup H1a1a-M82 in a data set of more than 10,000 global samples to discern a more precise ancestral source of European Romani populations. The phylogeographical patterns and diversity estimates indicate an early origin of this haplogroup in the Indian subcontinent and its further expansion to other regions. Tellingly, the short tandem repeat (STR) based network of H1a1a-M82 lineages displayed the closest connection of Romani haplotypes with the traditional scheduled caste and scheduled tribe population groups of northwestern India.
Human genetic diversity observed in the Indian subcontinent is second only to that of Africa. This implies an early settlement and demographic growth soon after the first 'Out-of-Africa' dispersal of anatomically modern humans in the Late Pleistocene. In contrast to this perspective, linguistic diversity in India has been thought to derive from more recent population movements and episodes of contact. With the exception of Dravidian, whose origin and relatedness to other language phyla are obscure, all the language families in India can be linked to language families spoken in different regions of Eurasia. Mitochondrial DNA and Y chromosome evidence has supported largely local evolution of the genetic lineages of the majority of Dravidian and Indo-European speaking populations, but there is no consensus yet on the question of whether the Munda (Austro-Asiatic) speaking populations originated in India or derive from a relatively recent migration from further East.
Here, we report the analysis of 35 novel complete mtDNA sequences from India which refine the structure of Indian-specific varieties of haplogroup R. Detailed analysis of haplogroup R7, coupled with a survey of ~12,000 mtDNAs from caste and tribal groups over the entire Indian subcontinent, reveals that one of its more recently derived branches (R7a1), is particularly frequent among Munda-speaking tribal groups. This branch is nested within diverse R7 lineages found among Dravidian and Indo-European speakers of India. We have inferred from this that a subset of Munda-speaking groups have acquired R7 relatively recently. Furthermore, we find that the distribution of R7a1 within the Munda-speakers is largely restricted to one of the sub-branches (Kherwari) of northern Munda languages. This evidence does not support the hypothesis that the Austro-Asiatic speakers are the primary source of the R7 variation. Statistical analyses suggest a significant correlation between genetic variation and geography, rather than between genes and languages.
Our high-resolution phylogeographic study, involving diverse linguistic groups in India, suggests that the high frequency of mtDNA haplogroup R7 among Munda-speaking populations of India can be explained best by gene flow from linguistically different populations of the Indian subcontinent. The conclusion is based on the observation that among Indo-Europeans, and particularly in Dravidians, the haplogroup is, despite its lower frequency, phylogenetically more divergent, while among the Munda speakers only one sub-clade of R7, i.e. R7a1, can be observed. It is noteworthy that though R7 is autochthonous to India, and arises from the root of hg R, its distribution and phylogeography in India are not uniform. This suggests the more ancient establishment of an autochthonous matrilineal genetic structure, and that isolation in the Pleistocene, lineage loss through drift, and endogamy of prehistoric and historic groups have greatly inhibited genetic homogenization and geographical uniformity.
Huntington disease (HD) results from CAG expansion in the huntingtin (HTT) gene. Although HD occurs worldwide, there are large geographic differences in its prevalence. The prevalence in populations derived from Europe is 10–100 times greater than in East Asia. The European general population chromosomes can be grouped into three major haplogroups (group of similar haplotypes): A, B and C. The majority of HD chromosomes in Europe are found on haplogroup A. However, in the East-Asian populations of China and Japan, we find the majority of HD chromosomes are associated with haplogroup C. The highest risk HD haplotypes (A1 and A2), are absent from the general and HD populations of China and Japan, and therefore provide an explanation for why HD prevalence is low in East Asia. Interestingly, both East-Asian and European populations share a similar low level of HD on haplogroup C. Our data are consistent with the hypothesis that different HTT haplotypes have different mutation rates, and geographic differences in HTT haplotypes explain the difference in HD prevalence. Further, the bias for expansion on haplogroup C in the East-Asian population cannot be explained by a higher average CAG size, as haplogroup C has a lower average CAG size in the general East-Asian population compared with other haplogroups. This finding suggests that CAG-tract size is not the only factor important for CAG instability. Instead, the expansion bias may be because of genetic cis-elements within the haplotype that influence CAG instability in HTT, possibly through different mutational mechanisms for the different haplogroups.
Huntington disease; prevalence; CAG expansion; CAG instability; haplotypes; Cis-elements
North Africa is considered a distinct geographic and ethnic entity within Africa. Although modern humans originated on this continent, studies of mitochondrial DNA (mtDNA) and Y-chromosome genealogical markers provide evidence that the North African gene pool has been shaped by the back-migration of several Eurasian lineages in Paleolithic and Neolithic times. More recent influences from sub-Saharan Africa and Mediterranean Europe are also evident. The presence of East-West and North-South haplogroup frequency gradients strongly reinforces the genetic complexity of this region. However, this genetic scenario is beset with a notable gap, which is the lack of consistent information for Algeria, the largest country in the Maghreb. To fill this gap, we analyzed a sample of 240 unrelated subjects from a northwest Algeria cosmopolitan population using mtDNA sequences and Y-chromosome biallelic polymorphisms, focusing on the fine dissection of haplogroups E and R, which are the most prevalent in North Africa and Europe respectively. The Eurasian component in Algeria reached 80% for mtDNA and 90% for Y-chromosome. However, within them, the North African genetic component for mtDNA (U6 and M1; 20%) is significantly smaller than the paternal (E-M81 and E-V65; 70%). The unexpected presence of the European-derived Y-chromosome lineages R-M412, R-S116, R-U152 and R-M529 in Algeria and the rest of the Maghreb could be the counterparts of the mtDNA H1, H3 and V subgroups, pointing to direct maritime contacts between the European and North African sides of the western Mediterranean. Female influx of sub-Saharan Africans into Algeria (20%) is also significantly greater than the male (10%). In spite of these sexual asymmetries, the Algerian uniparental profiles faithfully correlate with each other and with the geography.
R1a-M420 is one of the most widely spread Y-chromosome haplogroups; however, its substructure within Europe and Asia has remained poorly characterized. Using a panel of 16 244 male subjects from 126 populations sampled across Eurasia, we identified 2923 R1a-M420 Y-chromosomes and analyzed them to a highly granular phylogeographic resolution. Whole Y-chromosome sequence analysis of eight R1a and five R1b individuals suggests a divergence time of ∼25 000 (95% CI: 21 300–29 000) years ago and a coalescence time within R1a-M417 of ∼5800 (95% CI: 4800–6800) years. The spatial frequency distributions of R1a sub-haplogroups conclusively indicate two major groups, one found primarily in Europe and the other confined to Central and South Asia. Beyond the major European versus Asian dichotomy, we describe several younger sub-haplogroups. Based on spatial distributions and diversity patterns within the R1a-M420 clade, particularly rare basal branches detected primarily within Iran and eastern Turkey, we conclude that the initial episodes of haplogroup R1a diversification likely occurred in the vicinity of present-day Iran.
The Asian origin of Native Americans is largely accepted. However, uncertainties persist regarding the source population(s) within Asia, the divergence and arrival time(s) of the founder groups, the number of expansion events, and migration routes into the New World. mtDNA data, presented over the past two decades, have been used to suggest a single-migration model for which the Beringian land mass plays an important role.
In our analysis of 568 mitochondrial genomes, the coalescent age estimates of shared roots between Native American and Siberian-Asian lineages, calculated using two different mutation rates, are A4 (27.5 ± 6.8 kya/22.7 ± 7.4 kya), C1 (21.4 ± 2.7 kya/16.4 ± 1.5 kya), C4 (21.0 ± 4.6 kya/20.0 ± 6.4 kya), and D4e1 (24.1 ± 9.0 kya/17.9 ± 10.0 kya). The coalescent age estimates of pan-American haplogroups calculated using the same two mutation rates (A2: 19.5 ± 1.3 kya/16.1 ± 1.5 kya, B2: 20.8 ± 2.0 kya/18.1 ± 2.4 kya, C1: 21.4 ± 2.7 kya/16.4 ± 1.5 kya and D1: 17.2 ± 2.0 kya/14.9 ± 2.2 kya) and estimates of population expansions within America (~21–16 kya) support the pre-Clovis occupation of the New World. The phylogeography of sublineages within American haplogroups A2, B2, D1 and the C1b, C1c and C1d subhaplogroups of C1 is complex and largely specific to geographical North, Central and South America. However, some sub-branches (B2b, C1b, C1c, C1d and D1f) already existed in American founder haplogroups before expansion into America.
Our results suggest that Native American founders diverged from their Siberian-Asian progenitors sometime during the last glacial maximum (LGM) and expanded into America soon after the LGM peak (~20–16 kya). The phylogeography of haplogroup C1 suggests that this American founder haplogroup differentiated in Siberia-Asia. The situation is less clear for haplogroup B2; however, haplogroups A2 and D1 may have differentiated soon after the Native American founders' divergence. A moderate population bottleneck in American founder populations just before the expansion most plausibly resulted in few founder types in America. The similar estimates of the diversity indices and Bayesian skyline analysis in North America, Central America and South America suggest almost simultaneous (~2.0 ky from South to North America) colonization of these geographical regions with rapid population expansion differentiating into more or less regional branches across the pan-American haplogroups.
Most present-day European men inherited their Y chromosomes from the farmers who spread from the Near East 10,000 years ago, rather than from the hunter-gatherers of the Paleolithic.
The relative contributions to modern European populations of Paleolithic hunter-gatherers and Neolithic farmers from the Near East have been intensely debated. Haplogroup R1b1b2 (R-M269) is the commonest European Y-chromosomal lineage, increasing in frequency from east to west, and carried by 110 million European men. Previous studies suggested a Paleolithic origin, but here we show that the geographical distribution of its microsatellite diversity is best explained by spread from a single source in the Near East via Anatolia during the Neolithic. Taken with evidence on the origins of other haplogroups, this indicates that most European Y chromosomes originate in the Neolithic expansion. This reinterpretation makes Europe a prime example of how technological and cultural change is linked with the expansion of a Y-chromosomal lineage, and the contrast of this pattern with that shown by maternally inherited mitochondrial DNA suggests a unique role for males in the transition.
Arguably the most important cultural transition in the history of modern humans was the development of farming, since it heralded the population growth that culminated in our current massive population size. The genetic diversity of modern populations retains the traces of such past events, and can therefore be studied to illuminate the demographic processes involved in past events. Much debate has focused on the origins of agriculture in Europe some 10,000 years ago, and in particular whether its westerly spread from the Near East was driven by farmers themselves migrating, or by the transmission of ideas and technologies to indigenous hunter-gatherers. This study examines the diversity of the paternally inherited Y chromosome, focusing on the commonest lineage in Europe. The distribution of this lineage, the diversity within it, and estimates of its age all suggest that it spread with farming from the Near East. Taken with evidence on the origins of other lineages, this indicates that most European Y chromosomes descend from Near Eastern farmers. In contrast, most maternal lineages descend from hunter-gatherers, suggesting a reproductive advantage for farming males over indigenous hunter-gatherer males during the cultural transition from hunting-gathering to farming.