The Asian origin of Native Americans is largely accepted. However, uncertainties persist regarding the source population(s) within Asia, the divergence and arrival time(s) of the founder groups, the number of expansion events, and migration routes into the New World. mtDNA data, presented over the past two decades, have been used to suggest a single-migration model in which the Beringian land mass plays an important role.
In our analysis of 568 mitochondrial genomes, the coalescent age estimates of shared roots between Native American and Siberian-Asian lineages, calculated using two different mutation rates, are A4 (27.5 ± 6.8 kya/22.7 ± 7.4 kya), C1 (21.4 ± 2.7 kya/16.4 ± 1.5 kya), C4 (21.0 ± 4.6 kya/20.0 ± 6.4 kya), and D4e1 (24.1 ± 9.0 kya/17.9 ± 10.0 kya). The coalescent age estimates of pan-American haplogroups calculated using the same two mutation rates (A2: 19.5 ± 1.3 kya/16.1 ± 1.5 kya, B2: 20.8 ± 2.0 kya/18.1 ± 2.4 kya, C1: 21.4 ± 2.7 kya/16.4 ± 1.5 kya and D1: 17.2 ± 2.0 kya/14.9 ± 2.2 kya) and estimates of population expansions within America (~21-16 kya) support the pre-Clovis occupation of the New World. The phylogeography of sublineages within American haplogroups A2, B2, D1 and the C1b, C1c and C1d subhaplogroups of C1 is complex and largely specific to geographical North, Central and South America. However, some sub-branches (B2b, C1b, C1c, C1d and D1f) already existed in American founder haplogroups before expansion into America.
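The abstract does not say how the coalescent ages were computed. As a purely illustrative sketch, one widely used approach in mtDNA studies is the rho statistic: the mean number of mutations separating sampled sequences from the clade root, multiplied by a calibrated number of years per mutation. The function name `rho_age`, the mutation counts, and the rate of 4,000 years per mutation below are all hypothetical, and the error term uses the simplified star-like-genealogy approximation rather than a full tree-based variance.

```python
import math


def rho_age(mutation_counts, years_per_mutation):
    """Estimate a clade age from the rho statistic.

    mutation_counts: number of mutations between each sampled sequence
        and the inferred root haplotype of the clade.
    years_per_mutation: calibration (hypothetical here), i.e. the average
        number of years needed for one mutation to accumulate.
    Returns (rho, age_in_years, sd_in_years).
    """
    n = len(mutation_counts)
    if n == 0:
        raise ValueError("need at least one sequence")
    # rho is the average mutational distance to the root.
    rho = sum(mutation_counts) / n
    # Star-like approximation: every mutation sits on its own terminal
    # branch, so the standard error of rho reduces to sqrt(rho / n).
    sigma = math.sqrt(rho / n)
    return rho, rho * years_per_mutation, sigma * years_per_mutation


# Hypothetical example: four sequences, 4,000 years per mutation.
rho, age, sd = rho_age([4, 5, 6, 5], 4000)
print(f"rho = {rho:.2f}, age = {age / 1000:.1f} ± {sd / 1000:.1f} kya")
```

Applying two different calibrations to the same counts, as the study does with its two mutation rates, simply means calling the function twice with different `years_per_mutation` values.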
Our results suggest that Native American founders diverged from their Siberian-Asian progenitors sometime during the last glacial maximum (LGM) and expanded into America soon after the LGM peak (~20-16 kya). The phylogeography of haplogroup C1 suggests that this American founder haplogroup differentiated in Siberia-Asia. The situation is less clear for haplogroup B2; however, haplogroups A2 and D1 may have differentiated soon after the Native American founders' divergence. A moderate population bottleneck in American founder populations just before the expansion most plausibly resulted in few founder types in America. The similar estimates of the diversity indices and Bayesian skyline analysis in North America, Central America and South America suggest almost simultaneous (~2.0 ky from South to North America) colonization of these geographical regions with rapid population expansion differentiating into more or less regional branches across the pan-American haplogroups.
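The abstract does not name the diversity indices that were compared across regions. One standard index for haplotype data is Nei's unbiased haplotype (gene) diversity, h = n/(n-1) × (1 - Σ p_i²), where p_i is the frequency of haplotype i in a sample of n sequences. The sketch below assumes that index; the function name and the sample labels are hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sequences")
    # Sum of squared haplotype frequencies (probability of drawing
    # the same haplotype twice with replacement).
    freq_sq = sum((count / n) ** 2 for count in Counter(haplotypes).values())
    # The n / (n - 1) factor corrects for finite sample size.
    return n / (n - 1) * (1.0 - freq_sq)


# Hypothetical regional samples, one label per sequenced individual.
north = ["A2a", "A2a", "B2", "C1b", "D1"]
south = ["A2", "B2", "B2", "C1d", "D1"]
print(haplotype_diversity(north), haplotype_diversity(south))
```

Similar values across regions, as the abstract reports, would be consistent with the regions having been colonized at roughly the same time.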
Human genetic diversity observed in the Indian subcontinent is second only to that of Africa. This implies an early settlement and demographic growth soon after the first 'Out-of-Africa' dispersal of anatomically modern humans in the Late Pleistocene. In contrast to this perspective, linguistic diversity in India has been thought to derive from more recent population movements and episodes of contact. With the exception of Dravidian, whose origin and relatedness to other language phyla are obscure, all the language families in India can be linked to language families spoken in different regions of Eurasia. Mitochondrial DNA and Y chromosome evidence has supported largely local evolution of the genetic lineages of the majority of Dravidian and Indo-European speaking populations, but there is no consensus yet on the question of whether the Munda (Austro-Asiatic) speaking populations originated in India or derive from a relatively recent migration from further East.
Here, we report the analysis of 35 novel complete mtDNA sequences from India which refine the structure of Indian-specific varieties of haplogroup R. Detailed analysis of haplogroup R7, coupled with a survey of ~12,000 mtDNAs from caste and tribal groups over the entire Indian subcontinent, reveals that one of its more recently derived branches (R7a1) is particularly frequent among Munda-speaking tribal groups. This branch is nested within diverse R7 lineages found among Dravidian and Indo-European speakers of India. We have inferred from this that a subset of Munda-speaking groups has acquired R7 relatively recently. Furthermore, we find that the distribution of R7a1 within the Munda-speakers is largely restricted to one of the sub-branches (Kherwari) of northern Munda languages. This evidence does not support the hypothesis that the Austro-Asiatic speakers are the primary source of the R7 variation. Statistical analyses suggest a significant correlation between genetic variation and geography, rather than between genes and languages.
Our high-resolution phylogeographic study, involving diverse linguistic groups in India, suggests that the high frequency of mtDNA haplogroup R7 among Munda-speaking populations of India is best explained by gene flow from linguistically different populations of the Indian subcontinent. The conclusion is based on the observation that among Indo-Europeans, and particularly in Dravidians, the haplogroup is, despite its lower frequency, phylogenetically more divergent, while among the Munda speakers only one sub-clade of R7, i.e. R7a1, can be observed. It is noteworthy that though R7 is autochthonous to India, and arises from the root of hg R, its distribution and phylogeography in India are not uniform. This suggests the more ancient establishment of an autochthonous matrilineal genetic structure, and that isolation in the Pleistocene, lineage loss through drift, and endogamy of prehistoric and historic groups have greatly inhibited genetic homogenization and geographical uniformity.
Macrohaplogroups 'M' and 'N' have evolved almost in parallel from a founder haplogroup L3. Macrohaplogroup N in India has already been defined in previous studies, and recently macrohaplogroup M among the Indian populations has been characterized. In this study, we attempted to reconstruct and re-evaluate the phylogeny of macrohaplogroup M, which harbors more than 60% of the Indian mtDNA lineages, and to shed light on the origin of its deep-rooting haplogroups.
Using 11 whole mtDNA and 2231 partial coding sequences of Indian M lineages selected from 8670 HVS1 sequences across India, we have reconstructed the tree, including the Andamanese-specific lineage M31, and calculated the time depth of all the nodes. We defined one novel haplogroup, M41, and revised the classification of haplogroups M3, M18, and M31.
Our results indicate that the Indian mtDNA pool consists of several deep-rooting lineages of macrohaplogroup 'M', suggesting an in-situ origin of these haplogroups in South Asia, most likely in India. These deep-rooting lineages are not language specific and are spread over all the language groups in India. Moreover, our reanalysis of the Andamanese-specific lineage M31 suggests two clear-cut, population-specific subclades (M31a1 and M31a2). Onge and Jarwa share the M31a1 branch, while the M31a2 clade is present only in Great Andamanese individuals. Overall, our study supports the one-wave, rapid dispersal theory of modern humans along the Asian coast.
We have analyzed 7,137 samples from 125 different caste, tribal and religious groups of India and 99 samples from three populations of Nepal for the length variation in the COII/tRNALys region of mtDNA. Samples showing length variation were subjected to detailed phylogenetic analysis based on HVS-I and informative coding region sequence variation. The overall frequencies of the 9-bp deletion and insertion variants in South Asia were 1.9 and 0.6%, respectively. We have also defined a novel deep-rooting haplogroup M43 and identified the rare haplogroup H14 in Indian populations carrying the 9-bp deletion by complete mtDNA sequencing. Moreover, we redefined haplogroup M6 and dissected it into two well-defined subclades. The presence of haplogroups F1 and B5a in Uttar Pradesh suggests minor maternal contribution from Southeast Asia to Northern India. The occurrence of haplogroup F1 in the Nepalese sample implies that Nepal might have served as a bridge for the flow of eastern lineages to India. The presence of R6 in the Nepalese, on the other hand, suggests that the gene flow between India and Nepal has been reciprocal.
South Asia; 9bp indel; mtDNA; Haplogroup
We have analyzed 7137 samples from 125 different caste, tribal and religious groups of India and 99 samples from three populations of Nepal for the length variation in the COII/tRNALys region of mtDNA. Samples showing length variation were subjected to detailed phylogenetic analysis based on HVS-I and informative coding region sequence variation. The overall frequencies of the 9-bp deletion and insertion variants in South Asia were 1.8% and 0.5%, respectively. We have also defined a novel deep-rooting haplogroup M43 and identified the rare haplogroup H14 in Indian populations carrying the 9bp-deletion by complete mtDNA sequencing. Moreover, we redefined haplogroup M6 and dissected it into two well-defined subclades. The presence of haplogroups F1 and B5a in Uttar Pradesh suggests minor maternal contribution from Southeast Asia to Northern India. The occurrence of haplogroup F1 in the Nepalese sample implies that Nepal might have served as a bridge for the flow of eastern lineages to India. The presence of R6 in the Nepalese, on the other hand, suggests that the gene flow between India and Nepal has been reciprocal.
South Asia; 9bp indel; mtDNA; Haplogroup
The Austro-Asiatic linguistic family, which is considered to be the oldest of all the families in India, has a substantial presence in Southeast Asia. However, the possibility of any genetic link among the linguistic sub-families of the Indian Austro-Asiatics on the one hand and between the Indian and the Southeast Asian Austro-Asiatics on the other has not been explored till now. Therefore, to trace the origin and historic expansion of Austro-Asiatic groups of India, we analysed Y-chromosome SNP and STR data of 1222 individuals from 25 Indian populations, covering all three branches of Austro-Asiatic tribes, viz. Mundari, Khasi-Khmuic and Mon-Khmer, along with the previously published data on 214 relevant populations from Asia and Oceania.
Our results suggest a strong paternal genetic link, not only among the subgroups of Indian Austro-Asiatic populations but also with those of Southeast Asia. However, a maternal link based on mtDNA is not evident. The results also indicate that the haplogroup O-M95 originated in the Indian Austro-Asiatic populations ~65,000 yrs BP (95% C.I. 25,442 – 132,230) and that their ancestors carried it further to Southeast Asia via the Northeast Indian corridor. Subsequently, in the process of expansion, the Mon-Khmer populations from Southeast Asia seem to have migrated to and colonized the Andaman and Nicobar Islands at a much later point in time.
Our findings are consistent with the linguistic evidence, which suggests that the linguistic ancestors of the Austro-Asiatic populations originated in India and then migrated to Southeast Asia.
Recent advances in the understanding of the maternal and paternal heritage of south and southwest Asian populations have highlighted their role in the colonization of Eurasia by anatomically modern humans. Further understanding requires a deeper insight into the topology of the branches of the Indian mtDNA phylogenetic tree, which should be contextualized within the phylogeography of the neighboring regional mtDNA variation. Accordingly, we have analyzed mtDNA control and coding region variation in 796 Indian (including both tribal and caste populations from different parts of India) and 436 Iranian mtDNAs. The results were integrated and analyzed together with published data from South, Southeast Asia and West Eurasia.
Four new Indian-specific haplogroup M sub-clades were defined. These, in combination with two previously described haplogroups, encompass approximately one third of the haplogroup M mtDNAs in India. Their phylogeography and spread among different linguistic phyla and social strata was investigated in detail. Furthermore, the analysis of the Iranian mtDNA pool revealed patterns of limited reciprocal gene flow between Iran and the Indian sub-continent and allowed the identification of different assemblies of shared mtDNA sub-clades.
Since the initial peopling of South and West Asia by anatomically modern humans, when this region may well have provided the initial settlers who colonized much of the rest of Eurasia, the gene flow in and out of India of the maternally transmitted mtDNA has been surprisingly limited. Specifically, our analysis of the mtDNA haplogroups, which are shared between Indian and Iranian populations and exhibit coalescence ages corresponding to around the early Upper Paleolithic, indicates that they are present in India largely as Indian-specific sub-lineages. In contrast, other ancient Indian-specific variants of M and R are very rare outside the sub-continent.
Sakha – an area connecting South and Northeast Siberia – is significant for understanding the history of peopling of Northeast Eurasia and the Americas. Previous studies have shown a genetic contiguity between Siberia and East Asia and the key role of South Siberia in the colonization of Siberia.
We report the results of a high-resolution phylogenetic analysis of 701 mtDNAs and 318 Y chromosomes from five native populations of Sakha (Yakuts, Evenks, Evens, Yukaghirs and Dolgans) and of the analysis of more than 500,000 autosomal SNPs of 758 individuals from 55 populations, including 40 previously unpublished samples from Siberia. Phylogenetically terminal clades of East Asian mtDNA haplogroups C and D and Y-chromosome haplogroups N1c, N1b and C3, constituting the core of the gene pool of the native populations from Sakha, connect Sakha and South Siberia. Analysis of autosomal SNP data confirms the genetic continuity between Sakha and South Siberia. Maternal lineages D5a2a2, C4a1c, C4a2, C5b1b and the Yakut-specific STR sub-clade of Y-chromosome haplogroup N1c can be linked to a migration of Yakut ancestors, while the paternal lineage C3c was most likely carried to Sakha by the expansion of the Tungusic people. MtDNA haplogroups Z1a1b and Z1a3, present in Yukaghirs, Evens and Dolgans, show traces of different and probably more ancient migration(s). Analysis of both haploid loci and autosomal SNP data revealed only minor genetic components shared between Sakha and the extreme Northeast Siberia. Although the major part of West Eurasian maternal and paternal lineages in Sakha could originate from recent admixture with East Europeans, mtDNA haplogroups H8, H20a and HV1a1a, as well as Y-chromosome haplogroup J, more probably reflect an ancient gene flow from West Eurasia through Central Asia and South Siberia.
Our high-resolution phylogenetic dissection of mtDNA and Y-chromosome haplogroups as well as analysis of autosomal SNP data suggests that Sakha was colonized by repeated expansions from South Siberia with minor gene flow from the Lower Amur/Southern Okhotsk region and/or Kamchatka. The minor West Eurasian component in Sakha attests to both recent and ongoing admixture with East Europeans and an ancient gene flow from West Eurasia.
mtDNA; Y chromosome; Autosomal SNPs; Sakha
A Neolithic domestication of taurine cattle in the Fertile Crescent from local aurochsen (Bos primigenius) is generally accepted, but a genetic contribution from European aurochsen has been proposed. Here we performed a survey of a large number of taurine cattle mitochondrial DNA (mtDNA) control regions from numerous European breeds confirming the overall clustering within haplogroups (T1, T2 and T3) of Near Eastern ancestry, but also identifying eight mtDNAs (1.3%) that did not fit in haplogroup T. Sequencing of the entire mitochondrial genome showed that four mtDNAs formed a novel branch (haplogroup R) which, after the deep bifurcation that gave rise to the taurine and zebuine lineages, constitutes the earliest known split in the mtDNA phylogeny of B. primigenius. The remaining four mtDNAs were members of the recently discovered haplogroup Q. Phylogeographic data indicate that R mtDNAs were derived from female European aurochsen, possibly in the Italian Peninsula, and sporadically included in domestic herds. In contrast, the available data suggest that Q mtDNAs and T subclades were involved in the same Neolithic event of domestication in the Near East. Thus, the existence of novel (and rare) taurine haplogroups highlights a multifaceted genetic legacy from distinct B. primigenius populations. Taking into account that the maternally transmitted mtDNA tends to underestimate the extent of gene flow from European aurochsen, the detection of the R mtDNAs in autochthonous breeds, some of which are endangered, identifies an unexpected reservoir of genetic variation that should be carefully preserved.
The geographical position of Maharashtra state makes it essential to the study of the dispersal of modern humans in South Asia. Several hypotheses have been proposed to explain the cultural, linguistic and geographical affinity of the populations living in Maharashtra state with other South Asian populations. The genetic origin of populations living in this state is poorly understood and has hitherto been described only at a low level of molecular resolution.
To address this issue, we have analyzed the mitochondrial DNA (mtDNA) of 185 individuals and the NRY (non-recombining region of the Y chromosome) of 98 individuals belonging to two major tribal populations of Maharashtra, and compared their molecular variation with that of 54 contemporary South Asian populations of adjacent states. Inter- and intra-population comparisons reveal that the maternal gene pool of Maharashtra state populations is composed mainly of South Asian haplogroups with traces of east and west Eurasian haplogroups, while the paternal haplogroups comprise South Asian haplogroups as well as a signature of the Near Eastern-specific haplogroup J2a.
Our analysis suggests that Indian populations, including those of Maharashtra state, are largely derived from Paleolithic ancient settlers; however, a more recent (∼10 Ky older) detectable paternal gene flow from west Asia is well reflected in the present study. These findings reveal movement of populations to Maharashtra along the western coast rather than through the mainland, where the Western Ghats-Vindhya Mountains and Narmada-Tapti rivers might have acted as a natural barrier. Comparing the Maharashtrian populations with other South Asian populations reveals that they have a closer affinity with the South Indian than with the Central Indian populations.
Genetic affinities between aboriginal Taiwanese and populations from Oceania and Southeast Asia have previously been explored through analyses of mitochondrial DNA (mtDNA), Y chromosomal DNA, and human leukocyte antigen loci. Recent genetic studies have supported the “slow boat” and “entangled bank” models according to which the Polynesian migration can be seen as an expansion from Melanesia without any major direct genetic thread leading back to its initiation from Taiwan. We assessed mtDNA variation in 640 individuals from nine tribes of the central mountain ranges and east coast regions of Taiwan. In contrast to the Han populations, the tribes showed a low frequency of haplogroups D4 and G, and an absence of haplogroups A, C, Z, M9, and M10. Also, more than 85% of the maternal lineages were nested within haplogroups B4, B5a, F1a, F3b, E, and M7. Although indicating a common origin of the populations of insular Southeast Asia and Oceania, most mtDNA lineages in Taiwanese aboriginal populations are grouped separately from those found in China and the Taiwan general (Han) population, suggesting a prevalence in the Taiwanese aboriginal gene pool of its initial late Pleistocene settlers. Interestingly, from complete mtDNA sequencing information, most B4a lineages were associated with three coding region substitutions, defining a new subclade, B4a1a, that endorses the origin of Polynesian migration from Taiwan. Coalescence times of B4a1a were 13.2 ± 3.8 thousand years (or 9.3 ± 2.5 thousand years in Papuans and Polynesians). Considering the lack of a common specific Y chromosomal element shared by the Taiwanese aboriginals and Polynesians, the mtDNA evidence provided here is also consistent with the suggestion that the proto-Oceanic societies would have been mainly matrilocal.
An extensive phylogenetic analysis of mtDNA from nine Taiwanese tribes reveals an unambiguous genetic link between aboriginal Taiwanese and Polynesian populations, to the exclusion of mainland Asians.
R0 embraces the most common mitochondrial DNA (mtDNA) lineage in West Eurasia, namely, haplogroup H (∼40%). R0 sub-lineages are poorly defined in the control region, and therefore the analysis of diagnostic coding region polymorphisms is needed in order to gain resolution in population and medical studies.
We sequenced the first hypervariable segment (HVS-I) of 518 individuals from different North Iberian regions. The mtDNAs belonging to R0 (∼57%) were further genotyped for a set of 71 coding region SNPs characterizing major and minor branches of R0. We found that the North Iberian Peninsula shows moderate levels of population stratification; for instance, haplogroup V reaches the highest frequency in Cantabria (north-central Iberia), but lower in Galicia (northwest Iberia) and Catalonia (northeast Iberia). When compared to other European and Middle East populations, haplogroups H1, H3 and H5a show frequency peaks in the Franco-Cantabrian region, declining from West towards the East and South Europe. In addition, we have characterized, by way of complete genome sequencing, a new autochthonous clade of haplogroup H in the Basque country, named H2a5. Its coalescence age, 15.6±8 thousand years ago (kya), dates to the period immediately after the Last Glacial Maximum (LGM).
In contrast to other H lineages that experienced re-expansion outside the Franco-Cantabrian refuge after the LGM (e.g. H1 and H3), H2a5 has most likely remained confined to this area until the present day.
More than half of the northern Asian pool of human mitochondrial DNA (mtDNA) is fragmented into a number of subclades of haplogroups C and D, two of the most frequent haplogroups throughout northern, eastern and central Asia and America. While there has been considerable recent progress in studying mitochondrial variation in eastern Asia and America at complete genome resolution, little comparable data is available for regions such as southern Siberia – the area where most northern Asian haplogroups, including C and D, likely diversified. This gap in our knowledge is a serious barrier to progress in understanding the demographic pre-history of northern Eurasia in general. Here we describe the phylogeography of haplogroups C and D in the populations of northern and eastern Asia. We have analyzed 770 samples from haplogroups C and D (174 and 596, respectively) at high resolution, including 182 novel complete mtDNA sequences representing haplogroups C and D (83 and 99, respectively). The present-day variation of haplogroups C and D suggests that these mtDNA clades expanded before the Last Glacial Maximum (LGM), with their oldest lineages being present in eastern Asia. Unlike in eastern Asia, most of the northern Asian variants of haplogroups C and D began to expand after the LGM, thus pointing to post-glacial re-colonization of northern Asia. Our results show that both haplogroups were involved in migrations from eastern Asia and southern Siberia to eastern and northeastern Europe, likely during the middle Holocene.
The picture of dog mtDNA diversity, as obtained from geographically wide samplings but from a small number of individuals per region or breed, has revealed weak geographic correlation and a high degree of haplotype sharing between very distant breeds. We aimed at a more detailed picture through extensive sampling (n = 143) of four Portuguese autochthonous breeds – Castro Laboreiro Dog, Serra da Estrela Mountain Dog, Portuguese Sheepdog and Azores Cattle Dog – and comparatively reanalysing published worldwide data.
Fifteen haplotypes belonging to four major haplogroups were found in these breeds, of which five are newly reported. The Castro Laboreiro Dog presented a 95% frequency of a new A haplotype, while all other breeds contained a diverse pool of existing lineages. The Serra da Estrela Mountain Dog, the most heterogeneous of the four Portuguese breeds, shared haplotypes with the other mainland breeds, while Azores Cattle Dog shared no haplotypes with the other Portuguese breeds.
A review of mtDNA haplotypes in dogs across the world revealed that: (a) breeds tend to display haplotypes belonging to different haplogroups; (b) haplogroup A is present in all breeds, and even uncommon haplogroups are highly dispersed among breeds and continental areas; (c) haplotype sharing between breeds of the same region is lower than between breeds of different regions and (d) genetic distances between breeds do not correlate with geography.
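Haplotype sharing of the kind summarized in points (a) to (d) can be tabulated with simple set operations. The sketch below is only an illustration under assumed inputs: the breed names and haplotype labels are hypothetical placeholders, not the study's data, and `shared_haplotypes` is a helper introduced here for the example.

```python
from itertools import combinations


def shared_haplotypes(breeds):
    """Map each unordered pair of breeds to the set of haplotypes they share.

    breeds: dict of breed name -> set of haplotype labels observed in it.
    """
    return {
        (a, b): breeds[a] & breeds[b]
        for a, b in combinations(sorted(breeds), 2)
    }


# Hypothetical haplotype sets, for illustration only.
breeds = {
    "breed_a": {"h1", "h2"},
    "breed_b": {"h2", "h3", "h4"},
    "breed_c": {"h5"},
}

for pair, shared in shared_haplotypes(breeds).items():
    print(pair, sorted(shared))
```

Comparing the sizes of these shared sets for breed pairs within a region against pairs from different regions is one simple way to examine a pattern such as point (c).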
MtDNA haplotype sharing occurred between Serra da Estrela Mountain dogs (with putative origin in the centre of Portugal) and two breeds in the north and south of the country: the Castro Laboreiro Dog (which behaves, at the mtDNA level, as a sub-sample of the Serra da Estrela Mountain Dog) and the southern Portuguese Sheepdog. In contrast, the Azores Cattle Dog did not share any haplotypes with the other Portuguese breeds, but did share haplotypes with dogs sampled in Northern Europe. This suggested that the Azores Cattle Dog descended maternally from Northern European dogs rather than Portuguese mainland dogs. A review of published mtDNA haplotypes identified thirteen non-Portuguese breeds with sufficient data for comparison. Comparisons between these thirteen breeds and the four Portuguese breeds demonstrated widespread haplotype sharing, with the greatest diversity among Asian dogs, in accordance with the central role of Asia in canine domestication.
The out-of-Africa hypothesis has gained general consensus. However, many specific questions remain unsettled. Whether the two macrohaplogroups, M and N, that colonized Eurasia were already present in Africa before the exit remains unresolved. It has been proposed that the east African clade M1 supports a single origin of haplogroup M in Africa. To test the validity of that hypothesis, a phylogeographic analysis of 13 complete mitochondrial DNA (mtDNA) sequences and 261 partial sequences belonging to haplogroup M1 was carried out.
The coalescence age of the African haplogroup M1 is younger than those of other Asiatic M clades. In contradiction to the hypothesis of an eastern African origin for modern human expansions out of Africa, the most ancestral M1 lineages have been found in Northwest Africa and in the Near East, instead of in East Africa. The M1 geographic distribution and the relative ages of its different subclades clearly correlate with those of haplogroup U6, for which a Eurasian ancestor has been demonstrated.
This study provides evidence that M1, or its ancestor, had an Asiatic origin. The earliest M1 expansion into Africa occurred in northwestern instead of eastern areas; this early spread reached the Iberian Peninsula even affecting the Basques. The majority of the M1a lineages found outside and inside Africa had a more recent eastern Africa origin. Both western and eastern M1 lineages participated in the Neolithic colonization of the Sahara. The striking parallelism between subclade ages and geographic distribution of M1 and its North African U6 counterpart strongly reinforces this scenario. Finally, a relevant fraction of M1a lineages present today in the European Continent and nearby islands possibly had a Jewish instead of the commonly proposed Arab/Berber maternal ascendance.
To shed more light on the processes leading to crystallization of a Slavic identity, we investigated variability of complete mitochondrial genomes belonging to haplogroups H5 and H6 (63 mtDNA genomes) from the populations of Eastern and Western Slavs, including new samples of Poles, Ukrainians and Czechs presented here. Molecular dating implies formation of H5 approximately 11.5–16 thousand years ago (kya) in the areas of southern Europe. Within ancient haplogroup H6, dated at around 15–28 kya, there is a subhaplogroup H6c, which probably survived the last glaciation in Europe and has undergone expansion only 3–4 kya, together with the ancestors of some European groups, including the Slavs, because H6c has been detected in Czechs, Poles and Slovaks. Detailed analysis of complete mtDNAs allowed us to identify a number of lineages that seem specific for Central and Eastern Europe (H5a1f, H5a2, H5a1r, H5a1s, H5b4, H5e1a, H5u1, some subbranches of H5a1a and H6a1a9). Some of them could possibly be traced back to at least ∼4 kya, which indicates that some of the ancestors of today's Slavs (Poles, Czechs, Slovaks, Ukrainians and Russians) inhabited areas of Central and Eastern Europe much earlier than it was estimated on the basis of archaeological and historical data. We also sequenced entire mitochondrial genomes of several non-European lineages (A, C, D, G, L) found in contemporary populations of Poland and Ukraine. The analysis of these haplogroups confirms the presence of Siberian (C5c1, A8a1) and Ashkenazi-specific (L2a1l2a) mtDNA lineages in Slavic populations. Moreover, we were able to pinpoint some lineages which could possibly reflect the relatively recent contacts of Slavs with nomadic Altaic peoples (C4a1a, G2a, D5a2a1a1).
The European genetic landscape has been shaped by several human migrations that have occurred since Paleolithic times. The accumulation of archaeological records and the concordance of different lines of genetic evidence during the last two decades have triggered an interesting debate concerning the role of ancient settlers from the Franco-Cantabrian region in the postglacial resettlement of Europe. Among the Franco-Cantabrian populations, Basques are regarded as one of the oldest and most intriguing human groups of Europe. Recent data on complete mitochondrial DNA genomes focused on macrohaplogroup R0 revealed that Basques harbor some autochthonous lineages, suggesting a genetic continuity since pre-Neolithic times. However, excluding haplogroup H, the most representative lineage of macrohaplogroup R0, the majority of maternal lineages of this area remain virtually unexplored, so that further refinement of the mtDNA phylogeny based on analyses at the highest level of resolution is crucial for a better understanding of European prehistory. We thus explored the maternal ancestry of 548 autochthonous individuals from various Franco-Cantabrian populations and sequenced 76 mitogenomes of the most representative lineages. Interestingly, we identified three mtDNA haplogroups, U5b1f, J1c5c1 and V22, that proved to be representative of Franco-Cantabria, notably of the Basque population. The seclusion and diversity of these female genetic lineages support a local origin in the Franco-Cantabrian area during the Mesolithic of southwestern Europe, ∼10,000 years before present (YBP), with signals of expansions at ∼3,500 YBP. These findings provide robust evidence of a partial genetic continuity between contemporary autochthonous populations from the Franco-Cantabrian region, specifically the Basques, and Paleolithic/Mesolithic hunter-gatherer groups.
Furthermore, our results raise the current proportion (≈15%) of the Franco-Cantabrian maternal gene pool with a putative pre-Neolithic origin to ≈35%, further supporting the notion of a predominant Paleolithic genetic substrate in extant European populations.
Human settlement and migrations along the shores of the Bay of Bengal have played a vital role in shaping the genetic landscape of Bangladesh, Eastern India and Southeast Asia. Bangladesh and Northeast India form the vital land bridge between South and Southeast Asia. To reconstruct the population history of this region and to see whether this geographically diverse region acted as a corridor or a barrier for human interaction between South Asia and Southeast Asia, we analyzed, for the first time, high-resolution uniparental (mtDNA and Y chromosome) and biparental autosomal genetic markers among aboriginal Bangladeshi tribes currently speaking Tibeto-Burman languages. All three studied populations from Bangladesh (Chakma, Marma and Tripura) showed strikingly high homogeneity among themselves and strong affinities to Northeast Indian Tibeto-Burman groups. However, they show substantially higher molecular diversity than Northeast Indian populations. Unlike for the Austroasiatic (Munda) speakers of India, we observed an equal role of both males and females in shaping the Tibeto-Burman expansion in Southern Asia. Moreover, it is noteworthy that, in admixture proportions, Tibeto-Burman (TB) populations of Bangladesh carry a substantially higher mainland Indian ancestry component than Northeast Indian Tibeto-Burmans. The largely similar expansion ages of the two major paternal haplogroups (O2a and O3a3c) suggest that they arose before the differentiation of any language group and at approximately the same time. Contrary to the scenario proposed for the colonization of Northeast India as a male founder effect that occurred within the past 4,000 years, we suggest a significantly deeper colonization of this region. Overall, our extensive analysis revealed that the population history of South Asian Tibeto-Burman speakers is more complex than previously suggested.
A phylogenetic tree constructed from sequencing information of 24 whole human mtDNA genomes revealed novel substitutions in the previously defined M2a and M6 lineages in East Asia. Seven new basal mutations and fourteen lineages that substantially contribute to the present understanding of superhaplogroup M were identified.
Phylogenetic analysis of complete human mitochondrial DNA sequences has contributed greatly to resolving the phylogenies and antiquity of different lineages belonging to the major haplogroups L, N and M (East-Asian lineages). In the absence of whole mtDNA sequence information for the M lineages reported in India, which exhibit the highest diversity within the sub-continent, the present study was undertaken to provide a detailed analysis of this haplogroup, to precisely characterize the lineages and to unravel their intricate phylogeny.
The phylogenetic tree constructed from the sequencing information of twenty-four whole mtDNA genomes revealed novel substitutions in the previously defined M2a and M6 lineages. The most striking feature of this phylogenetic tree is the formulation of a new lineage, M30, which is distinguished by the presence of the 12007 transition and comprises the recently defined M18 and a potential new sub-lineage possessing substitutions at 16223 and 16300. M30 further branches into the M30a sub-lineage, defined by the 15431 and 195A substitutions. The age of the M30 lineage was estimated at 33,042 YBP, indicating a more recent expansion time than M2 (49,686 YBP). Contrary to earlier reports, the M5 lineage does not always include a 12477 substitution, and is more appropriately defined by a transversion at 10986A. The phylogenetic tree also identifies a potential new lineage M* with the HVSI sequence 16223, 16325. No new substitutions were found in M25, and the M3 mtDNA genome could only be tentatively rooted by the 16126 mutation. The M4 and M* (16251, 16267) lineages could not be resolved distinctly.
This study describes seven new basal mutations and fourteen lineages that substantially contribute to the present understanding of superhaplogroup M. The phylogenetic tree, supported by a median-joining network, helps in distinctly identifying the genetic relationships between different M lineages, which could not be achieved from control region sequence information alone. Although high control region diversity has been reported in the different M lineages distributed in India, complete sequencing of M* and defined lineages suggests that these mtDNA genomes emerged from a limited number of branches arising from the M trunk.
The Y-chromosome haplogroup N-M231 (Hg N) is distributed widely in eastern and central Asia, Siberia, as well as in eastern and northern Europe. Previous studies suggested a counterclockwise prehistoric migration of Hg N from eastern Asia to eastern and northern Europe. However, the root of this Y chromosome lineage and its detailed dispersal pattern across eastern Asia are still unclear. We analyzed haplogroup profiles and phylogeographic patterns of 1,570 Hg N individuals from 20,826 males in 359 populations across Eurasia. We first genotyped 6,371 males from 169 populations in China and Cambodia, generating data for 360 Hg N individuals, and then combined these with published data on 1,210 Hg N individuals from Japanese, Southeast Asian, Siberian, European and Central Asian populations. The results showed that the sub-haplogroups of Hg N have a distinct geographical distribution. The highest Y-STR diversity of the ancestral Hg N sub-haplogroups was observed in the southern part of mainland East Asia, and further phylogeographic analyses support an origin of Hg N in southern China. Combined with previous data, we propose that the early northward dispersal of Hg N started from southern China about 21 thousand years ago (kya), expanding into northern China 12–18 kya, and reaching further north to Siberia about 12–14 kya before a population expansion and westward migration into Central Asia and eastern/northern Europe around 8.0–10.0 kya. This northward migration of Hg N likewise coincides with retreating ice sheets after the Last Glacial Maximum (22–18 kya) in mainland East Asia.
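The sample counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: it uses the figures given in the text, the variable names are assumptions introduced here, and the two derived shares are not reported in the abstract itself.

```python
# Sample counts quoted in the abstract (variable names are illustrative).
total_males = 20_826        # males surveyed across 359 populations
newly_genotyped = 6_371     # males newly genotyped in China and Cambodia
new_hg_n = 360              # Hg N individuals among the newly genotyped males
published_hg_n = 1_210      # Hg N individuals taken from published data

# The two sources should add up to the 1,570 Hg N individuals analyzed.
hg_n_total = new_hg_n + published_hg_n

# Derived shares (not stated in the abstract; simple ratios of its counts).
overall_share = hg_n_total / total_males
new_sample_share = new_hg_n / newly_genotyped

print(f"Hg N individuals analyzed: {hg_n_total}")
print(f"Share of all surveyed males: {overall_share:.1%}")
print(f"Share within newly genotyped males: {new_sample_share:.1%}")
```

These ratios describe the pooled sample only. They are not population frequencies, since the published datasets were presumably assembled with differing sampling schemes.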
To construct the maternal phylogeny and prehistoric dispersals of modern humans in the Indian subcontinent, a diverse subset of 641 complete mitochondrial DNA (mtDNA) genomes belonging to macrohaplogroup M was chosen from a total collection of 2,783 control-region sequences, sampled from 26 selected tribal populations of India. On the basis of complete mtDNA sequencing, we identified 12 new haplogroups (M53 to M64) and redefined/ascertained and characterized haplogroups M2, M3, M4, M5, M6, M8′C′Z, M9, M10, M11, M12-G, D, M18, M30, M33, M35, M37, M38, M39, M40, M41, M43, M45 and M49, which were previously described by control and/or coding-region polymorphisms. Our results indicate that the mtDNA lineages reported in the present study (except the East Asian lineages M8′C′Z, M9, M10, M11, M12-G and D) are restricted to the Indian region. The deep-rooted lineages of macrohaplogroup M suggest an in-situ origin of these haplogroups in India. Most of these deep-rooting lineages are represented by multiple ethnic/linguistic groups of India. Hierarchical analysis of molecular variance (AMOVA) shows substantial subdivisions among the tribes of India (Fst = 0.16164). The current Indian mtDNA gene pool was shaped by the initial settlers and was galvanized by minor events of gene flow from the east and west to the restricted zones. The Northeast Indian mtDNA pool harbors region-specific lineages, other Indian lineages and East Asian lineages. We also suggest the establishment of an East Asian gene in North East India through admixture rather than replacement.
Only a few genetic studies have been carried out to date in Bolivia. However, some of the most important (pre)historical enclaves of South America were located in these territories. Thus, the (sub)-Andean region of Bolivia was part of the Inca Empire, the largest state in Pre-Columbian America. We have genotyped the first hypervariable region (HVS-I) of 720 samples representing the main regions in Bolivia, and these data have been analyzed in the context of other pan-American samples (>19,000 HVS-I mtDNAs). Entire mtDNA genome sequencing was also undertaken on selected Native American lineages. Additionally, a panel of 46 Ancestry Informative Markers (AIMs) was genotyped in a sub-set of samples. The vast majority of the Bolivian mtDNAs (98.4%) were found to belong to the main Native American haplogroups (A: 14.3%, B: 52.6%, C: 21.9%, D: 9.6%), with little indication of sub-Saharan and/or European lineages; however, marked differences in haplogroup frequencies exist between the main regions (e.g. haplogroup B: Andean [71%], Sub-Andean [61%], Llanos [32%]). Analysis of entire genomes unraveled the phylogenetic characteristics of three Native haplogroups: the pan-American haplogroup B2b (originated ∼21.4 thousand years ago [kya]), A2ah (∼5.2 kya), and B2o (∼2.6 kya). The data suggest that B2b could have arisen in North California (an origin even in the northernmost region of the American continent cannot be disregarded), moved southward following the Pacific coastline and crossed Meso-America. Then, it most likely spread into South America following two routes: the Pacific path towards Peru and Bolivia (arriving here ∼15.2 kya), and the Amazonian route of Venezuela and Brazil southwards. In contrast to the mtDNA, Ancestry Informative Markers (AIMs) reveal a higher (although geographically variable) European introgression in Bolivians (25%). Bolivia shows a decreasing autosomal molecular diversity pattern along the longitudinal axis, from the Altiplano to the lowlands.
Both autosomes and mtDNA revealed a low impact (1–2%) of a sub-Saharan component in Bolivians.
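The haplogroup percentages quoted for the Bolivian mtDNAs can be checked for internal consistency. The following is a minimal sketch using only the figures given in the abstract; the variable names are assumptions introduced here for illustration.

```python
# Haplogroup shares (percent) quoted in the abstract for Bolivian mtDNAs.
haplogroup_pct = {"A": 14.3, "B": 52.6, "C": 21.9, "D": 9.6}

# The four main Native American haplogroups should sum to the quoted 98.4%.
native_total = round(sum(haplogroup_pct.values()), 1)

# Whatever remains is attributable to other (non-A/B/C/D) lineages.
other_pct = round(100.0 - native_total, 1)

# Regional haplogroup B frequencies (percent), also from the abstract.
hg_b_by_region = {"Andean": 71, "Sub-Andean": 61, "Llanos": 32}
hg_b_spread = max(hg_b_by_region.values()) - min(hg_b_by_region.values())

print(f"Main Native American haplogroups: {native_total}%")
print(f"Other lineages: {other_pct}%")
print(f"Haplogroup B range across regions: {hg_b_spread} percentage points")
```

The residual share is in line with the low sub-Saharan component (1–2%) reported for both autosomes and mtDNA, although the abstract does not break that residual down further.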
For millennia, the southern part of Mesopotamia has been a wetland region generated by the Tigris and Euphrates rivers before they flow into the Gulf. This area has been occupied by human communities since ancient times and the present-day inhabitants, the Marsh Arabs, are considered the population with the strongest link to ancient Sumerians. Popular tradition, however, considers the Marsh Arabs as a foreign group, of unknown origin, which arrived in the marshlands when the rearing of water buffalo was introduced to the region.
To shed some light on the paternal and maternal origin of this population, Y chromosome and mitochondrial DNA (mtDNA) variation was surveyed in 143 Marsh Arabs and in a large sample of Iraqi controls. Analyses of the haplogroups and sub-haplogroups observed in the Marsh Arabs revealed a prevalent autochthonous Middle Eastern component for both male and female gene pools, with weak South-West Asian and African contributions, more evident in mtDNA. A higher male than female homogeneity is characteristic of the Marsh Arab gene pool, likely due to a strong male genetic drift determined by socio-cultural factors (patrilocality, polygamy, unequal male and female migration rates).
Evidence of genetic stratification ascribable to the Sumerian development was provided by the Y-chromosome data, where the J1-Page08 branch reveals a local expansion almost contemporary with the Sumerian City State period that characterized Southern Mesopotamia. On the other hand, a more ancient background shared with Northern Mesopotamia is revealed by the less represented Y-chromosome lineage J1-M267*. Overall, our results indicate that the introduction of water buffalo breeding and rice farming, most likely from the Indian sub-continent, only marginally affected the gene pool of the autochthonous people of the region. Furthermore, the prevalent Middle Eastern ancestry of the modern population of the marshes of southern Iraq implies that, if the Marsh Arabs are descendants of the ancient Sumerians, then the Sumerians were also most likely autochthonous and not of Indian or South Asian ancestry.
A fine-grained mitochondrial DNA phylogenomic analysis was conducted in domestic pigs and wild boars, revealing that pig domestication in East Asia occurred in the Mekong and the middle and downstream regions of the Yangtze river.
Previously reported evidence indicates that pigs were independently domesticated in multiple places throughout the world. However, a detailed picture of the origin and dispersal of domestic pigs in East Asia has not yet been reported.
Population phylogenomic analysis was conducted in domestic pigs and wild boars by screening the haplogroup-specific mutation motifs inferred from a phylogenetic tree of complete pig mitochondrial DNA (mtDNA) sequences. All domestic pigs cluster into a single clade, D (which contains subclades D1, D2, D3, and D4), with wild boars from East Asia interspersed. Three haplogroups within D1 are dominant in the Mekong region (D1a2 and D1b) and the middle and downstream regions of the Yangtze River (D1a1a), and may represent independent founders of domestic pigs. None of the domestic pig samples from North East Asia, the Yellow River region, and the upstream region of the Yangtze River share the same haplogroup status with the local wild boars. The limited regional distributions of haplogroups D1 (including its subhaplogroups), D2, D3, and D4 in domestic pigs suggest at least two different in situ domestication events.
The use of fine-grained mtDNA phylogenomic analysis of wild boars and domestic pigs is a powerful tool with which to discern the origin of domestic pigs. Our findings show that pig domestication in East Asia mainly occurred in the Mekong region and the middle and downstream regions of the Yangtze River.
The domestic pig currently indigenous to the Tibetan highlands is supposed to have been introduced during a continuous period of colonization by the ancestors of modern Tibetans. However, there is no direct genetic evidence of either the local origin or exotic migration of the Tibetan pig.
Methods and Findings
We analyzed mtDNA hypervariable segment I (HVI) variation of 218 individuals from seven Tibetan pig populations and 1,737 reported mtDNA sequences from domestic pigs and wild boars across Asia. The Bayesian consensus tree revealed a main haplogroup M and twelve minor haplogroups, which suggested a large number of small-scale in situ domestication episodes. In particular, haplogroups D1 and D6 represented two highly divergent lineages in the Tibetan highlands and Island Southeastern Asia, respectively. Network analysis of haplogroup M further revealed one main subhaplogroup, M1, and two minor subhaplogroups, M2 and M3. Intriguingly, M2 was mainly distributed in Southeastern Asia, suggesting a local origin. Similar to haplogroup D6, M3 was mainly restricted to Island Southeastern Asia. This pattern suggested that Island Southeastern Asia, but not Southeastern Asia, might be the center of domestication of the so-called Pacific clade (M3 and D6 here) described in previous studies. Diversity gradient analysis of the major subhaplogroup M1 suggested three local origins: in Southeastern Asia, in the middle and downstream regions of the Yangtze River, and in the Tibetan highlands.
We identified two new origin centers for domestic pigs in the Tibetan highlands and in the Island Southeastern Asian region.