We have analyzed 7137 samples from 125 different caste, tribal and religious groups of India and 99 samples from three populations of Nepal for the length variation in the COII/tRNALys region of mtDNA. Samples showing length variation were subjected to detailed phylogenetic analysis based on HVS-I and informative coding region sequence variation. The overall frequencies of the 9-bp deletion and insertion variants in South Asia were 1.8% and 0.5%, respectively. By complete mtDNA sequencing, we have also defined a novel deep-rooting haplogroup, M43, and identified the rare haplogroup H14 in Indian populations carrying the 9-bp deletion. Moreover, we redefined haplogroup M6 and dissected it into two well-defined subclades. The presence of haplogroups F1 and B5a in Uttar Pradesh suggests a minor maternal contribution from Southeast Asia to Northern India. The occurrence of haplogroup F1 in the Nepalese sample implies that Nepal might have served as a bridge for the flow of eastern lineages to India. The presence of R6 in the Nepalese, on the other hand, suggests that the gene flow between India and Nepal has been reciprocal.
South Asia; 9bp indel; mtDNA; Haplogroup
Recent advances in the understanding of the maternal and paternal heritage of south and southwest Asian populations have highlighted their role in the colonization of Eurasia by anatomically modern humans. Further understanding requires a deeper insight into the topology of the branches of the Indian mtDNA phylogenetic tree, which should be contextualized within the phylogeography of the neighboring regional mtDNA variation. Accordingly, we have analyzed mtDNA control and coding region variation in 796 Indian (including both tribal and caste populations from different parts of India) and 436 Iranian mtDNAs. The results were integrated and analyzed together with published data from South, Southeast Asia and West Eurasia.
Four new Indian-specific haplogroup M sub-clades were defined. These, in combination with two previously described haplogroups, encompass approximately one third of the haplogroup M mtDNAs in India. Their phylogeography and spread among different linguistic phyla and social strata was investigated in detail. Furthermore, the analysis of the Iranian mtDNA pool revealed patterns of limited reciprocal gene flow between Iran and the Indian sub-continent and allowed the identification of different assemblies of shared mtDNA sub-clades.
Since the initial peopling of South and West Asia by anatomically modern humans, when this region may well have provided the initial settlers who colonized much of the rest of Eurasia, the gene flow in and out of India of the maternally transmitted mtDNA has been surprisingly limited. Specifically, our analysis of the mtDNA haplogroups, which are shared between Indian and Iranian populations and exhibit coalescence ages corresponding to around the early Upper Paleolithic, indicates that they are present in India largely as Indian-specific sub-lineages. In contrast, other ancient Indian-specific variants of M and R are very rare outside the sub-continent.
Central Asia and the Indian subcontinent represent an area considered a source and a reservoir of human genetic diversity, with many markers taking root here, most of which are the ancestral state of eastern and western haplogroups, while others are local. Between these two regions, Terai (Nepal) is a pivotal passageway that has allowed, at different times, multiple population interactions, although, because of its highly malarial environment, it was scarcely inhabited until a few decades ago, when malaria was eradicated. One of the oldest and largest indigenous peoples of Terai is the malaria-resistant Tharus, whose gene pool could still retain traces of ancient complex interactions. Until now, however, investigations of their genetic structure have been scarce, mainly identifying East Asian signatures.
High-resolution analyses of mitochondrial-DNA (including 34 complete sequences) and Y-chromosome (67 SNPs and 12 STRs) variations carried out in 173 Tharus (two groups from Central and one from Eastern Terai), and 104 Indians (Hindus from Terai and New Delhi and tribals from Andhra Pradesh) allowed the identification of three principal components: East Asian, West Eurasian and Indian, the last including both local and inter-regional sub-components, at least for the Y chromosome.
Although remarkable quantitative and qualitative differences appear among the various population groups and also between sexes within the same group, many mitochondrial-DNA and Y-chromosome lineages are shared or derived from ancient Indian haplogroups, thus revealing a deep shared ancestry between Tharus and Indians. Interestingly, the local Y-chromosome Indian component observed in the Andhra-Pradesh tribals is present in all Tharu groups, whereas the inter-regional component strongly prevails in the two Hindu samples and other Nepalese populations.
The complete sequencing of mtDNAs from unresolved haplogroups also provided informative markers that greatly improved the mtDNA phylogeny and allowed the identification of ancient relationships between Tharus and Malaysia, the Andaman Islands and Japan as well as between India and North and East Africa. Overall, this study gives a paradigmatic example of the importance of genetic isolates in revealing variants not easily detectable in the general population.
To reconstruct the maternal phylogeny and prehistoric dispersals of modern humans in the Indian subcontinent, a diverse subset of 641 complete mitochondrial DNA (mtDNA) genomes belonging to macrohaplogroup M was chosen from a total collection of 2,783 control-region sequences, sampled from 26 selected tribal populations of India. On the basis of complete mtDNA sequencing, we identified 12 new haplogroups, M53 to M64, and redefined/ascertained and characterized haplogroups M2, M3, M4, M5, M6, M8′C′Z, M9, M10, M11, M12-G, D, M18, M30, M33, M35, M37, M38, M39, M40, M41, M43, M45 and M49, which were previously described by control and/or coding-region polymorphisms. Our results indicate that the mtDNA lineages reported in the present study (except East Asian lineages M8′C′Z, M9, M10, M11, M12-G, D) are restricted to the Indian region. The deep-rooted lineages of macrohaplogroup ‘M’ suggest in-situ origin of these haplogroups in India. Most of these deep-rooting lineages are represented by multiple ethnic/linguistic groups of India. Hierarchical analysis of molecular variance (AMOVA) shows substantial subdivisions among the tribes of India (Fst = 0.16164). The current Indian mtDNA gene pool was shaped by the initial settlers and was galvanized by minor events of gene flow from the east and west to the restricted zones. The Northeast Indian mtDNA pool harbors region-specific lineages, other Indian lineages and East Asian lineages. We also suggest the establishment of an East Asian gene in North East India through admixture rather than replacement.
Genetic affinities between aboriginal Taiwanese and populations from Oceania and Southeast Asia have previously been explored through analyses of mitochondrial DNA (mtDNA), Y chromosomal DNA, and human leukocyte antigen loci. Recent genetic studies have supported the “slow boat” and “entangled bank” models according to which the Polynesian migration can be seen as an expansion from Melanesia without any major direct genetic thread leading back to its initiation from Taiwan. We assessed mtDNA variation in 640 individuals from nine tribes of the central mountain ranges and east coast regions of Taiwan. In contrast to the Han populations, the tribes showed a low frequency of haplogroups D4 and G, and an absence of haplogroups A, C, Z, M9, and M10. Also, more than 85% of the maternal lineages were nested within haplogroups B4, B5a, F1a, F3b, E, and M7. Although indicating a common origin of the populations of insular Southeast Asia and Oceania, most mtDNA lineages in Taiwanese aboriginal populations are grouped separately from those found in China and the Taiwan general (Han) population, suggesting a prevalence in the Taiwanese aboriginal gene pool of its initial late Pleistocene settlers. Interestingly, from complete mtDNA sequencing information, most B4a lineages were associated with three coding region substitutions, defining a new subclade, B4a1a, that endorses the origin of Polynesian migration from Taiwan. Coalescence times of B4a1a were 13.2 ± 3.8 thousand years (or 9.3 ± 2.5 thousand years in Papuans and Polynesians). Considering the lack of a common specific Y chromosomal element shared by the Taiwanese aboriginals and Polynesians, the mtDNA evidence provided here is also consistent with the suggestion that the proto-Oceanic societies would have been mainly matrilocal.
An extensive phylogenetic analysis of mtDNA from nine Taiwanese tribes reveals an unambiguous genetic link between aboriginal Taiwanese and Polynesian populations, to the exclusion of mainland Asians.
The highly structured distribution of Y-chromosome haplogroups suggests that current patterns of variation may be informative of past population processes. However, limited phylogenetic resolution, particularly of subclades within haplogroup K, has obscured the relationships of lineages that are common across Eurasia. Here we genotype 13 new highly informative single-nucleotide polymorphisms in a worldwide sample of 4413 males that carry the derived allele at M526, and reconstruct an NRY haplogroup tree with significantly higher resolution for the major clade within haplogroup K, K-M526. Although K-M526 was previously characterized by a single polytomy of eight major branches, the phylogenetic structure of haplogroup K-M526 is now resolved into four major subclades (K2a–d). The largest of these subclades, K2b, is divided into two clusters: K2b1 and K2b2. K2b1 combines the previously known haplogroups M, S, K-P60 and K-P79, whereas K2b2 comprises haplogroup P and its subhaplogroups Q and R. Interestingly, the monophyletic group formed by haplogroups R and Q, which make up the majority of paternal lineages in Europe, Central Asia and the Americas, represents the only subclade within K2b that is not geographically restricted to Southeast Asia and Oceania. Estimates of the interval times for the branching events between M9 and P295 point to an initial rapid diversification process of K-M526 that likely occurred in Southeast Asia, with subsequent westward expansions of the ancestors of haplogroups R and Q.
More than half of the northern Asian pool of human mitochondrial DNA (mtDNA) is fragmented into a number of subclades of haplogroups C and D, two of the most frequent haplogroups throughout northern, eastern and central Asia and America. While there has been considerable recent progress in studying mitochondrial variation in eastern Asia and America at the complete genome resolution, little comparable data is available for regions such as southern Siberia – the area where most northern Asian haplogroups, including C and D, likely diversified. This gap in our knowledge poses a serious barrier to progress in understanding the demographic prehistory of northern Eurasia in general. Here we describe the phylogeography of haplogroups C and D in the populations of northern and eastern Asia. We have analyzed 770 samples from haplogroups C and D (174 and 596, respectively) at high resolution, including 182 novel complete mtDNA sequences representing haplogroups C and D (83 and 99, respectively). The present-day variation of haplogroups C and D suggests that these mtDNA clades expanded before the Last Glacial Maximum (LGM), with their oldest lineages being present in eastern Asia. Unlike in eastern Asia, most of the northern Asian variants of haplogroups C and D began their expansion after the LGM, thus pointing to post-glacial re-colonization of northern Asia. Our results show that both haplogroups were involved in migrations from eastern Asia and southern Siberia to eastern and northeastern Europe, likely during the middle Holocene.
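The sample counts quoted in this abstract can be cross-checked with simple arithmetic. The following is a minimal sketch in Python; the dictionary and function names are hypothetical and chosen only for illustration, and the only figures used are those stated in the text (174 and 596 samples; 83 and 99 novel complete sequences).

```python
# Per-haplogroup counts quoted in the text. All names are hypothetical.
SAMPLES = {"C": 174, "D": 596}        # samples analyzed at high resolution
NOVEL_COMPLETE = {"C": 83, "D": 99}   # novel complete mtDNA sequences


def total(counts: dict) -> int:
    """Sum the per-haplogroup counts."""
    return sum(counts.values())


def novel_share(haplogroup: str) -> float:
    """Fraction of a haplogroup's samples that are novel complete sequences.

    Assumes, as the wording "including" suggests, that the complete
    sequences are a subset of the analyzed samples.
    """
    return NOVEL_COMPLETE[haplogroup] / SAMPLES[haplogroup]


if __name__ == "__main__":
    print("samples analyzed:", total(SAMPLES))
    print("novel complete sequences:", total(NOVEL_COMPLETE))
    for hg in SAMPLES:
        print(f"haplogroup {hg}: {novel_share(hg):.1%} completely sequenced")
```

The totals agree with the 770 samples and 182 complete sequences reported above; the per-haplogroup shares are a derived illustration, not a figure given by the authors.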
Hainan Island is located at the junction of East Asia and Southeast Asia, and during the Last Glacial Maximum (LGM) it was connected with the mainland. This provided an opportunity for the colonization of Hainan Island by modern humans in the Upper Pleistocene. Whether the ancient dispersal left any footprints in the contemporary gene pool of Hainan islanders is debatable.
We collected samples from 285 Li individuals and analyzed mitochondrial DNA (mtDNA) variations of hypervariable sequence I and II (HVS-I and II), as well as partial coding regions. By incorporating previously reported data, the phylogeny of Hainan islanders was reconstructed. We found that Hainan islanders showed a close relationship with the populations in mainland southern China, especially from Guangxi. Haplotype sharing analyses suggested that recent gene flow from the mainland might have played an important role in shaping the maternal pool of Hainan islanders. More importantly, haplogroups M12, M7e, and M7c1* might represent the genetic relics of the ancient population that populated this region; thus, 14 representative complete mtDNA genomes were further sequenced.
The detailed phylogeographic analyses of haplogroups M12, M7e, and M7c1* indicated that the early peopling of Hainan Island by modern humans could be traced back to the early Holocene and/or even the late Upper Pleistocene, around 7–27 kya. These results correspond to both Y-chromosome and archaeological studies.
Human genetic diversity observed in the Indian subcontinent is second only to that of Africa. This implies an early settlement and demographic growth soon after the first 'Out-of-Africa' dispersal of anatomically modern humans in the Late Pleistocene. In contrast to this perspective, linguistic diversity in India has been thought to derive from more recent population movements and episodes of contact. With the exception of Dravidian, whose origin and relatedness to other language phyla are obscure, all the language families in India can be linked to language families spoken in different regions of Eurasia. Mitochondrial DNA and Y chromosome evidence has supported largely local evolution of the genetic lineages of the majority of Dravidian and Indo-European speaking populations, but there is no consensus yet on the question of whether the Munda (Austro-Asiatic) speaking populations originated in India or derive from a relatively recent migration from further East.
Here, we report the analysis of 35 novel complete mtDNA sequences from India which refine the structure of Indian-specific varieties of haplogroup R. Detailed analysis of haplogroup R7, coupled with a survey of ~12,000 mtDNAs from caste and tribal groups over the entire Indian subcontinent, reveals that one of its more recently derived branches (R7a1) is particularly frequent among Munda-speaking tribal groups. This branch is nested within diverse R7 lineages found among Dravidian and Indo-European speakers of India. We have inferred from this that a subset of Munda-speaking groups has acquired R7 relatively recently. Furthermore, we find that the distribution of R7a1 within the Munda speakers is largely restricted to one of the sub-branches (Kherwari) of northern Munda languages. This evidence does not support the hypothesis that the Austro-Asiatic speakers are the primary source of the R7 variation. Statistical analyses suggest a significant correlation between genetic variation and geography, rather than between genes and languages.
Our high-resolution phylogeographic study, involving diverse linguistic groups in India, suggests that the high frequency of mtDNA haplogroup R7 among Munda-speaking populations of India can be explained best by gene flow from linguistically different populations of the Indian subcontinent. The conclusion is based on the observation that among Indo-Europeans, and particularly in Dravidians, the haplogroup is, despite its lower frequency, phylogenetically more divergent, while among the Munda speakers only one sub-clade of R7, i.e. R7a1, can be observed. It is noteworthy that though R7 is autochthonous to India, and arises from the root of hg R, its distribution and phylogeography in India are not uniform. This suggests the more ancient establishment of an autochthonous matrilineal genetic structure, and that isolation in the Pleistocene, lineage loss through drift, and endogamy of prehistoric and historic groups have greatly inhibited genetic homogenization and geographical uniformity.
The phylogeny of the indigenous Indian-specific mitochondrial DNA (mtDNA) haplogroups has been determined and refined in previous reports. As with mtDNA superhaplogroups M and N, a profusion of reports is also available for superhaplogroup R. However, there is a dearth of information on South Asian subhaplogroups in particular, including R8. Therefore, we sought to assess the genealogy and prehistoric expansion of haplogroup R8, which is considered one of the autochthonous lineages of South Asia.
Upon screening the mtDNA of 5,836 individuals belonging to 104 distinct ethnic populations of the Indian subcontinent, we found 54 individuals with the HVS-I motif that defines the R8 haplogroup. Complete mtDNA sequencing of these 54 individuals revealed two deep-rooted subclades: R8a and R8b. Furthermore, these subclades split into several fine subclades. An isofrequency contour map detected the highest frequency of R8 in the state of Orissa. Spearman's rank correlation analysis suggests significant correlation of R8 occurrence with geography.
The coalescent ages of the newly characterized subclades of R8, R8a (15.4±7.2 Kya) and R8b (25.7±10.2 Kya), indicate that the initial maternal colonization of this haplogroup occurred during the middle and upper Paleolithic period, roughly around 40 to 45 Kya. These results signify that the southern part of Orissa currently inhabited by Munda speakers is likely the origin of these autochthonous maternal deep-rooted haplogroups. Our high-resolution study on the genesis of the R8 haplogroup provides ample evidence of its deep-rooted ancestry among the Orissa (Austro-Asiatic) tribes.
Macrohaplogroups 'M' and 'N' have evolved almost in parallel from a founder haplogroup L3. Macrohaplogroup N in India has already been defined in previous studies, and recently macrohaplogroup M among the Indian populations has been characterized. In this study, we attempted to reconstruct and re-evaluate the phylogeny of macrohaplogroup M, which harbors more than 60% of the Indian mtDNA lineages, and to shed light on the origin of its deep-rooting haplogroups.
Using 11 whole mtDNA sequences and 2231 partial coding sequences of Indian M lineages selected from 8670 HVS1 sequences across India, we have reconstructed the tree, including the Andamanese-specific lineage M31, and calculated the time depth of all the nodes. We defined one novel haplogroup, M41, and revised the classification of haplogroups M3, M18, and M31.
Our results indicate that the Indian mtDNA pool consists of several deep-rooting lineages of macrohaplogroup 'M', suggesting in-situ origin of these haplogroups in South Asia, most likely in India. These deep-rooting lineages are not language specific and are spread over all the language groups in India. Moreover, our reanalysis of the Andamanese-specific lineage M31 suggests two clear-cut, population-specific subclades (M31a1 and M31a2). Onge and Jarwa share the M31a1 branch, while the M31a2 clade is present only in Great Andamanese individuals. Overall, our study supports the one-wave, rapid dispersal theory of modern humans along the Asian coast.
Myanmar is the largest country in mainland Southeast Asia with a population of 55 million people subdivided into more than 100 ethnic groups. Ruled by changing kingdoms and dynasties and lying on the trade route between India and China, Myanmar was influenced by numerous cultures. Since its independence from British occupation, tensions between the ruling Bamar and ethnic minorities increased.
Our aim was to search for genetic footprints of Myanmar’s geographic, historic and sociocultural characteristics and to contribute to the picture of human colonization by describing and dating new mitochondrial DNA (mtDNA) haplogroups. Therefore, we sequenced the mtDNA control region of 327 unrelated donors and the complete mitochondrial genome of 44 selected individuals according to the highest quality standards.
Phylogenetic analyses of the entire mtDNA genomes uncovered eight new haplogroups and three unclassified basal M-lineages. The multi-ethnic population and the complex history of Myanmar were reflected in its mtDNA heterogeneity. Population genetic analyses of Burmese control region sequences combined with population data from neighboring countries revealed that the Myanmar haplogroup distribution showed a typical Southeast Asian pattern, but also Northeast Asian and Indian influences. The population structure of the extraordinarily diverse Bamar differed from that of the Karen people who displayed signs of genetic isolation. Migration analyses indicated a considerable genetic exchange with an overall positive migration balance from Myanmar to neighboring countries. Age estimates of the newly described haplogroups point to the existence of evolutionary windows where climatic and cultural changes gave rise to mitochondrial haplogroup diversification in Asia.
Haplogroup; Complete mtDNA genome; Control region; Population genetics; Migration; Gene flow; Burma; Southeast Asia; Karen; Bamar; Demographic history
Sakha – an area connecting South and Northeast Siberia – is significant for understanding the history of peopling of Northeast Eurasia and the Americas. Previous studies have shown a genetic contiguity between Siberia and East Asia and the key role of South Siberia in the colonization of Siberia.
We report the results of a high-resolution phylogenetic analysis of 701 mtDNAs and 318 Y chromosomes from five native populations of Sakha (Yakuts, Evenks, Evens, Yukaghirs and Dolgans) and of the analysis of more than 500,000 autosomal SNPs of 758 individuals from 55 populations, including 40 previously unpublished samples from Siberia. Phylogenetically terminal clades of East Asian mtDNA haplogroups C and D and Y-chromosome haplogroups N1c, N1b and C3, constituting the core of the gene pool of the native populations from Sakha, connect Sakha and South Siberia. Analysis of autosomal SNP data confirms the genetic continuity between Sakha and South Siberia. Maternal lineages D5a2a2, C4a1c, C4a2, C5b1b and the Yakut-specific STR sub-clade of Y-chromosome haplogroup N1c can be linked to a migration of Yakut ancestors, while the paternal lineage C3c was most likely carried to Sakha by the expansion of the Tungusic people. MtDNA haplogroups Z1a1b and Z1a3, present in Yukaghirs, Evens and Dolgans, show traces of different and probably more ancient migration(s). Analysis of both haploid loci and autosomal SNP data revealed only minor genetic components shared between Sakha and the extreme Northeast Siberia. Although the major part of West Eurasian maternal and paternal lineages in Sakha could originate from recent admixture with East Europeans, mtDNA haplogroups H8, H20a and HV1a1a, as well as Y-chromosome haplogroup J, more probably reflect an ancient gene flow from West Eurasia through Central Asia and South Siberia.
Our high-resolution phylogenetic dissection of mtDNA and Y-chromosome haplogroups as well as analysis of autosomal SNP data suggests that Sakha was colonized by repeated expansions from South Siberia with minor gene flow from the Lower Amur/Southern Okhotsk region and/or Kamchatka. The minor West Eurasian component in Sakha attests to both recent and ongoing admixture with East Europeans and an ancient gene flow from West Eurasia.
mtDNA; Y chromosome; Autosomal SNPs; Sakha
The Maldives are an 850 km-long string of atolls located centrally in the northern Indian Ocean basin. Because of this geographic situation, the present-day Maldivian population has potential for uncovering genetic signatures of historic migration events in the region. We therefore studied autosomal DNA-, mitochondrial DNA-, and Y-chromosomal DNA markers in a representative sample of 141 unrelated Maldivians, with 119 from six major settlements. We found a total of 63 different mtDNA haplotypes that could be allocated to 29 mtDNA haplogroups, mostly within the M, R, and U clades. We found 66 different Y-STR haplotypes in 10 Y-chromosome haplogroups, predominantly H1, J2, L, R1a1a, and R2. Parental admixture analysis for mtDNA- and Y-haplogroup data indicates a strong genetic link between the Maldive Islands and mainland South Asia, and excludes significant gene flow from Southeast Asia. Paternal admixture from West Asia is detected, but cannot be distinguished from admixture from South Asia. Maternal admixture from West Asia is excluded. Within the Maldives, we find a subtle genetic substructure in all marker systems that is not directly related to geographic distance or linguistic dialect. We found reduced Y-STR diversity and reduced male-mediated gene flow between atolls, suggesting independent male founder effects for each atoll. Detected reduced female-mediated gene flow between atolls confirms a Maldives-specific history of matrilocality. In conclusion, our new genetic data agree with the commonly reported Maldivian ancestry in South Asia, but furthermore suggest multiple, independent immigration events and asymmetrical migration of females and males across the archipelago. Am J Phys Anthropol 151:58–67, 2013. © 2013 Wiley Periodicals, Inc.
Y chromosome; mitochondrial DNA; migration; Indo-Aryan languages; South Asia
A Neolithic domestication of taurine cattle in the Fertile Crescent from local aurochsen (Bos primigenius) is generally accepted, but a genetic contribution from European aurochsen has been proposed. Here we performed a survey of a large number of taurine cattle mitochondrial DNA (mtDNA) control regions from numerous European breeds confirming the overall clustering within haplogroups (T1, T2 and T3) of Near Eastern ancestry, but also identifying eight mtDNAs (1.3%) that did not fit in haplogroup T. Sequencing of the entire mitochondrial genome showed that four mtDNAs formed a novel branch (haplogroup R) which, after the deep bifurcation that gave rise to the taurine and zebuine lineages, constitutes the earliest known split in the mtDNA phylogeny of B. primigenius. The remaining four mtDNAs were members of the recently discovered haplogroup Q. Phylogeographic data indicate that R mtDNAs were derived from female European aurochsen, possibly in the Italian Peninsula, and sporadically included in domestic herds. In contrast, the available data suggest that Q mtDNAs and T subclades were involved in the same Neolithic event of domestication in the Near East. Thus, the existence of novel (and rare) taurine haplogroups highlights a multifaceted genetic legacy from distinct B. primigenius populations. Taking into account that the maternally transmitted mtDNA tends to underestimate the extent of gene flow from European aurochsen, the detection of the R mtDNAs in autochthonous breeds, some of which are endangered, identifies an unexpected reservoir of genetic variation that should be carefully preserved.
The geographical position of Maharashtra state makes it essential for studying the dispersal of modern humans in South Asia. Several hypotheses have been proposed to explain the cultural, linguistic and geographical affinity of the populations living in Maharashtra state with other South Asian populations. The genetic origin of populations living in this state is poorly understood and has hitherto been described at a low level of molecular resolution.
To address this issue, we have analyzed the mitochondrial DNA (mtDNA) of 185 individuals and NRY (non-recombining region of Y chromosome) of 98 individuals belonging to two major tribal populations of Maharashtra, and compared their molecular variations with those of 54 South Asian contemporary populations of adjacent states. Inter- and intra-population comparisons reveal that the maternal gene pool of Maharashtra state populations is composed mainly of South Asian haplogroups with traces of east and west Eurasian haplogroups, while the paternal haplogroups comprise South Asian haplogroups as well as a signature of the Near Eastern-specific haplogroup J2a.
Our analysis suggests that Indian populations, including those of Maharashtra state, are largely derived from Paleolithic ancient settlers; however, a more recent (∼10 Ky older) detectable paternal gene flow from west Asia is well reflected in the present study. These findings reveal movement of populations to Maharashtra through the western coast rather than the mainland, where the Western Ghats-Vindhya Mountains and Narmada-Tapti rivers might have acted as a natural barrier. Comparing the Maharashtrian populations with other South Asian populations reveals that they have a closer affinity with the South Indian than with the Central Indian populations.
South Asia possesses a significant amount of genetic diversity due to considerable intergroup differences in culture and language. There have been numerous reports on the genetic structure of Asian Indians, although these have mostly relied on genotyping microarrays or targeted sequencing of the mitochondria and Y chromosomes. Asian Indians in Singapore are primarily descendants of immigrants from Dravidian-language–speaking states in south India, and 38 individuals from the general population underwent deep whole-genome sequencing with a target coverage of 30X as part of the Singapore Sequencing Indian Project (SSIP). The genetic structure and diversity of these samples were compared against samples from the Singapore Sequencing Malay Project and populations in Phase 1 of the 1,000 Genomes Project (1 KGP). SSIP samples exhibited greater intra-population genetic diversity and possessed higher heterozygous-to-homozygous genotype ratio than other Asian populations. When compared against a panel of well-defined Asian Indians, the genetic makeup of the SSIP samples was closely related to South Indians. However, even though the SSIP samples clustered distinctly from the Europeans in the global population structure analysis with autosomal SNPs, eight samples were assigned to mitochondrial haplogroups that were predominantly present in Europeans and possessed higher European admixture than the remaining samples. An analysis of the relative relatedness between SSIP with two archaic hominins (Denisovan, Neanderthal) identified higher ancient admixture in East Asian populations than in SSIP. The data resource for these samples is publicly available and is expected to serve as a valuable complement to the South Asian samples in Phase 3 of 1 KGP.
Indians of South Asia have long been a population of interest to a wide audience, due to their unique diversity. We have deep-sequenced 38 individuals of Indian descent residing in Singapore (SSIP) in an effort to illustrate their diversity from a whole-genome standpoint. Indeed, among Asians in our population panel, SSIP was the most diverse, followed by the Malays in Singapore (SSMP). Their diversity is further observed in the population's Y-chromosome haplogroup and mitochondrial haplogroup profiles; individuals with European-dominant haplogroups had a greater proportion of European admixture. Among variants (single nucleotide polymorphisms and small insertions/deletions) discovered in SSIP, 21.69% were novel with respect to previous sequencing projects. In addition, some 14 loss-of-function variants (LOFs) were associated with cancer, Type II diabetes, and cholesterol levels. Finally, a D statistic test with ancient hominids indicated more gene flow to East Asians than to South Asians.
Although the functional consequences of mitochondrial DNA (mtDNA) genetic backgrounds (haplotypes, haplogroups) have been demonstrated by both disease association studies and cell culture experiments, it is not clear which of the mutations within the haplogroup carry functional implications and which are “evolutionary silent hitchhikers”. We set forth to study the functionality of haplogroup-defining mutations within the mtDNA transcription/replication regulatory region by in vitro transcription, hypothesizing that haplogroup-defining mutations occurring within regulatory motifs of mtDNA could affect these processes. We thus screened >2500 complete human mtDNAs representing all major populations worldwide for natural variation in experimentally established protein binding sites and regulatory regions comprising a total of 241 bp in each mtDNA. Our screen revealed 77/241 sites showing point mutations that could be divided into non-fixed (57/77, 74%) and haplogroup/sub-haplogroup-defining changes (i.e., population fixed changes, 20/77, 26%). The variant defining Caucasian haplogroup J (C295T) increased the binding of TFAM (electrophoretic mobility shift assay) and the capacity of in vitro L-strand transcription, especially of a shorter transcript that maps immediately upstream of conserved sequence block 1 (CSB1), a region associated with RNA priming of mtDNA replication. Consistent with this finding, cybrids (i.e., cells sharing the same nuclear genetic background but differing in their mtDNA backgrounds) harboring haplogroup J mtDNA had a >2 fold increase in mtDNA copy number, as compared to cybrids containing haplogroup H, with no apparent differences in steady state levels of mtDNA-encoded transcripts. Hence, a haplogroup J regulatory region mutation affects mtDNA replication or stability, which may partially account for the phenotypic impact of this haplogroup.
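The site counts reported for the screen can be verified with a few lines of arithmetic. The following is a minimal sketch in Python; the constant and function names are hypothetical and not part of the original study, and only the counts (241, 77, 57 and 20) are taken from the text.

```python
# Counts quoted in the text. All names are hypothetical, for illustration only.
REGULATORY_SITES = 241     # bp of binding sites/regulatory regions screened per mtDNA
VARIABLE_SITES = 77        # sites showing point mutations
NON_FIXED = 57             # non-fixed changes
HAPLOGROUP_DEFINING = 20   # haplogroup/sub-haplogroup-defining (fixed) changes


def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


def summarize() -> dict:
    """Recompute the proportions from the raw counts."""
    # The two classes are expected to partition the variable sites.
    assert NON_FIXED + HAPLOGROUP_DEFINING == VARIABLE_SITES
    return {
        "non-fixed among variable sites": percent(NON_FIXED, VARIABLE_SITES),
        "haplogroup-defining among variable sites": percent(HAPLOGROUP_DEFINING, VARIABLE_SITES),
        "variable among screened sites": percent(VARIABLE_SITES, REGULATORY_SITES),
    }


if __name__ == "__main__":
    for label, value in summarize().items():
        print(f"{label}: {value}%")
```

The first two values reproduce the 74% and 26% given in the text. The third (about 32% of the 241 screened positions being variable) is derived here from the quoted counts and is not a figure stated by the authors.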
Our analysis thus demonstrates, for the first time, the functional impact of particular mtDNA haplogroup-defining control region mutations, paving the path towards assessing the functionality of both fixed and un-fixed genetic variants in the mitochondrial genome.
Mitochondria, the ‘power plant’ of the cell, have their own distinct genome (mtDNA), whose sequence varies among individuals around the globe. This variation, which was formed by the accumulation of mutations (variants) during the course of evolution, appears to alter the susceptibility to common complex diseases (such as Parkinson's disease and diabetes). However, since the accumulation of mtDNA mutations over time results in the formation of new combinations (genetic backgrounds), it is not clear which of the mutations are functional and which are “evolutionary silent hitchhikers”. Thus we aimed at assessing the functionality of mtDNA genetic variants, focusing on variants within the mtDNA regulatory region, hypothesizing that they could affect mtDNA activity and maintenance. We found that a variant defining mtDNA genetic background ‘J’ significantly increased the transcriptional efficiency and elevated mtDNA copy numbers in cells, as compared to other genetic backgrounds. Hence, mtDNA regulatory region variants can affect mtDNA maintenance, which may partially account for the involvement of this genetic background in disease susceptibility. Our analysis demonstrates, for the first time, the functional impact of a particular mtDNA variant that was fixed during evolution. Moreover, our findings underline the functionality of mtDNA variants in the evolutionarily variable regulatory region.
Domestic chickens (Gallus gallus domesticus) fulfill various roles ranging from food and entertainment to religion and ornamentation. To survey their genetic diversity and trace the history of domestication, we investigated a total of 4938 mitochondrial DNA (mtDNA) fragments including 2843 previously published and 2095 de novo units from 2044 domestic chickens and 51 red junglefowl (Gallus gallus). To obtain the highest possible level of molecular resolution, 50 representative samples were further selected for total mtDNA genome sequencing. A fine-grained mtDNA phylogeny was investigated by defining haplogroups A–I and W–Z. Common haplogroups A–G were shared by domestic chickens and red junglefowl. Rare haplogroups H–I and W–Z were specific to domestic chickens and red junglefowl, respectively. We re-evaluated the global mtDNA profiles of chickens. The geographic distribution of each of the major haplogroups was examined. Our results revealed new complexities in the history of chicken domestication because, in the phylogeny, lineages from the red junglefowl were mingled with those of the domestic chickens. Several local domestication events in South Asia, Southwest China and Southeast Asia were identified. The assessment of chicken mtDNA data also facilitated our understanding of the Austronesian settlement in the Pacific.
chicken; mtDNA; domestication; phylogeny; Austronesian
Mitochondrial DNA (mtDNA) haplogroups are valuable for investigations in forensic science, molecular anthropology, and human genetics. In this study, we developed a custom panel of 61 mtDNA markers for high-throughput classification of European, African, and Native American/Asian mitochondrial haplogroup lineages. Using these mtDNA markers we constructed a mitochondrial haplogroup classification tree and classified 18,832 participants from the National Health and Nutrition Examination Surveys (NHANES). To our knowledge, this is the largest study to date characterizing mitochondrial haplogroups in a population-based sample from the United States, and the first study characterizing mitochondrial haplogroup distributions in self-identified Mexican Americans separately from Hispanic Americans of other descent. We observed clear differences in the distribution of maternal genetic ancestry consistent with proposed admixture models for these subpopulations, underscoring the genetic heterogeneity of the United States Hispanic population. The mitochondrial haplogroup distributions in the other self-identified racial/ethnic groups within NHANES were largely comparable to previous studies. Mitochondrial haplogroup classification was highly concordant with self-identified race/ethnicity (SIRE) in non-Hispanic whites (94.8%), but concordance was considerably lower in admixed populations including non-Hispanic blacks (88.3%), Mexican Americans (81.8%), and other Hispanics (61.6%), suggesting that SIRE does not accurately reflect maternal genetic ancestry, particularly in populations with greater proportions of admixture. Thus, it is important to consider inconsistencies between SIRE and genetic ancestry when performing genetic association studies. The mitochondrial haplogroup data that we have generated, coupled with the epidemiologic variables in NHANES, is a valuable resource for future studies investigating the contribution of mtDNA variation to human health and disease.
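The concordance percentages above can be read as the share of participants in each SIRE group whose haplogroup lineage matches the lineage expected for that group. The sketch below illustrates that calculation under stated assumptions: the `expected` mapping, the function name `concordance_by_group`, and the toy records are hypothetical and are not the study's definitions or data.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def concordance_by_group(
    records: Iterable[Tuple[str, str]],
    expected: Dict[str, Set[str]],
) -> Dict[str, float]:
    """Fraction of records per SIRE group whose lineage is in the expected set.

    records:  (sire_label, haplogroup_lineage) pairs.
    expected: sire_label -> lineages counted as concordant (an assumption).
    """
    totals = defaultdict(int)
    matches = defaultdict(int)
    for sire, lineage in records:
        totals[sire] += 1
        if lineage in expected.get(sire, set()):
            matches[sire] += 1
    return {sire: matches[sire] / totals[sire] for sire in totals}


# Hypothetical mapping and toy records, for illustration only.
expected = {
    "non-Hispanic white": {"European"},
    "non-Hispanic black": {"African"},
}
toy_records = [
    ("non-Hispanic white", "European"),
    ("non-Hispanic white", "European"),
    ("non-Hispanic white", "European"),
    ("non-Hispanic white", "African"),
    ("non-Hispanic black", "African"),
    ("non-Hispanic black", "European"),
]
result = concordance_by_group(toy_records, expected)
```

For admixed groups the choice of which lineages count as concordant drives the result, so any real analysis would need to state that mapping explicitly.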
mitochondrial haplogroups; NHANES; mitochondrial genetic variation; Sequenom; multiplex genotyping
The Austro-Asiatic linguistic family, which is considered to be the oldest of all the families in India, has a substantial presence in Southeast Asia. However, the possibility of a genetic link among the linguistic sub-families of the Indian Austro-Asiatics on the one hand, and between the Indian and the Southeast Asian Austro-Asiatics on the other, has not been explored until now. Therefore, to trace the origin and historic expansion of the Austro-Asiatic groups of India, we analysed Y-chromosome SNP and STR data of 1222 individuals from 25 Indian populations, covering all three branches of Austro-Asiatic tribes, viz. Mundari, Khasi-Khmuic and Mon-Khmer, along with previously published data on 214 relevant populations from Asia and Oceania.
Our results suggest a strong paternal genetic link, not only among the subgroups of Indian Austro-Asiatic populations but also with those of Southeast Asia. However, a maternal link based on mtDNA is not evident. The results also indicate that haplogroup O-M95 originated in the Indian Austro-Asiatic populations ~65,000 yrs BP (95% C.I. 25,442 – 132,230) and that their ancestors carried it further to Southeast Asia via the Northeast Indian corridor. Subsequently, in the process of expansion, the Mon-Khmer populations from Southeast Asia seem to have migrated to and colonized the Andaman and Nicobar Islands at a much later point in time.
Our findings are consistent with the linguistic evidence, which suggests that the linguistic ancestors of the Austro-Asiatic populations originated in India and then migrated to Southeast Asia.
The resolution of data on haplogroup O of the haploid non-recombining Y chromosome (NRY) in East Asia is still rudimentary, which could partly explain current debates on the settlement history of Island Southeast Asia (ISEA). Here, 81 slowly evolving markers (mostly SNPs) and 17 Y-chromosomal short tandem repeats were used to achieve higher molecular resolution. Our aim is to investigate whether the distribution of NRY DNA variation in Taiwan and ISEA is consistent with a single pre-Neolithic expansion scenario from Southeast China to all ISEA, whether it better fits an expansion model from Taiwan (the OOT model), or whether a more complex history of settlement and dispersals throughout ISEA should be envisioned.
We examined DNA samples from 1658 individuals from Vietnam, Thailand, Fujian, Taiwan (Han, plain tribes and 14 indigenous groups), the Philippines and Indonesia. While haplogroups O1a*-M119, O1a1*-P203, O1a2-M50 and O3a2-P201 follow a decreasing cline from Taiwan towards Western Indonesia, O2a1-M95/M88, O3a*-M324, O3a1c-IMS-JST002611 and O3a2c1a-M133 decline northward from Western Indonesia towards Taiwan. Compared to the Taiwan plain tribe minority groups, the Taiwanese Austronesian-speaking groups show little paternal genetic contribution from Han. They are also characterized by low Y-chromosome diversity, thus testifying to fast drift in these populations. However, in contrast to data from other regions of the genome, Y-chromosome gene diversity in Taiwan mountain tribes significantly increases from North to South.
The geographic distribution and the diversity accumulated in the O1a*-M119, O1a1*-P203, O1a2-M50 and O3a2-P201 haplogroups on one hand, and in the O2a1-M95/M88, O3a*-M324, O3a1c-IMS-JST002611 and O3a2c1a-M133 haplogroups on the other, support a pincer model of dispersals and gene flow from the mainland to the islands which likely started during the late upper Paleolithic, 18,000 to 15,000 years ago. The branches of the pincer contributed separately to the paternal gene pool of the Philippines and conjointly to the gene pools of Madagascar and the Solomon Islands. The North to South increase in diversity found for Taiwanese Austronesian-speaking groups contrasts with observations based on mitochondrial DNA, thus hinting at a differentiated demographic history of men and women in these populations.
Y chromosome; Y-STR; Y-SNP; Austronesian migration; Taiwan; Island Southeast Asia; Haplogroup O1a
Hypertrophic cardiomyopathy (HCM) is a genetic disorder caused by mutations in genes coding for proteins involved in sarcomere function. The disease is associated with mitochondrial dysfunction. Evolutionarily developed variation in mitochondrial DNA (mtDNA), defining mtDNA haplogroups and haplogroup clusters, is associated with functional differences in mitochondrial function and susceptibility to various diseases, including ischemic cardiomyopathy. We hypothesized that mtDNA haplogroups, in particular H, J and K, might modify disease susceptibility to HCM. Mitochondrial DNA, isolated from blood, was sequenced and haplogroups identified in 91 probands with HCM. The association with HCM was ascertained using two Danish control populations. Haplogroup H was more prevalent in HCM patients, 60% versus 46% (p = 0.006) and 41% (p = 0.003), in the two control populations. Haplogroup J was less prevalent, 3% vs. 12.4% (p = 0.017) and 9.1% (p = 0.06). Likewise, the UK haplogroup cluster was less prevalent in HCM, 11% vs. 22.1% (p = 0.02) and 22.8% (p = 0.04). These results indicate that haplogroup H constitutes a susceptibility factor and that haplogroup J and haplogroup cluster UK are protective factors in the development of HCM. Thus, constitutive differences in mitochondrial function may influence the occurrence and clinical presentation of HCM. This could explain some of the phenotypic variability in HCM. The fact that haplogroups H and J are also modifying factors in ischemic cardiomyopathy suggests that mtDNA haplotypes may be of significance in determining whether a physiological hypertrophy develops into myopathy. mtDNA haplotypes may have the potential to become significant biomarkers in cardiomyopathy.
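Comparisons of haplogroup prevalence between patients and controls, such as those reported above, reduce to a test on a 2x2 table of carrier counts. The abstract does not state which test produced its p-values, so the sketch below swaps in a two-sided Fisher's exact test as one common choice. The function name and the toy counts are hypothetical; the control group sizes are not given in the abstract, so the reported p-values cannot be reproduced from it.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows: cases / controls. Columns: carriers / non-carriers.
    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k: int) -> float:
        # Probability that k of the col1 carriers fall in the first row.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    total = sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Toy counts only (hypothetical): 3 of 4 cases and 1 of 4 controls are carriers.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

With real data, the four arguments would be the numbers of carriers and non-carriers of a given haplogroup among the probands and among one control population.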
Archaeological studies have revealed a series of cultural changes around the Last Glacial Maximum in East Asia; whether these changes left any signatures in the gene pool of East Asians remains unclear. To achieve deeper insights into the demographic history of modern humans in East Asia around the Last Glacial Maximum, we extensively analyzed mitochondrial DNA haplogroup M9a'b, a specific haplogroup that was suggested to have some potential for tracing the migration around the Last Glacial Maximum in East Eurasia.
A total of 837 M9a'b mitochondrial DNAs (583 from the literature, while the remaining 254 were newly collected in this study) pinpointed from over 28,000 subjects residing across East Eurasia were studied here. Fifty-nine representative samples were further selected for total mitochondrial DNA sequencing so we could better understand the phylogeny within M9a'b. Based on the updated phylogeny, an extensive phylogeographic analysis was carried out to reveal the differentiation of haplogroup M9a'b and to reconstruct the dispersal histories.
Our results indicated that southern China and/or Southeast Asia likely served as the source of some post-Last Glacial Maximum dispersal(s). The detailed dissection of haplogroup M9a'b revealed the existence of an inland dispersal in mainland East Asia during the post-glacial period. It was this dispersal that expanded not only to western China but also to northeast India and the south Himalaya region. A similar phylogeographic distribution pattern was also observed for haplogroup F1c, thus substantiating our proposition. This inland post-glacial dispersal was in agreement with the spread of the Mesolithic culture originating in South China and northern Vietnam.
The modern human colonization of Eurasia and Australia is mostly explained by a single out-of-Africa exit following a southern coastal route through Arabia and India. However, dispersal across the Levant would better explain the introgression with Neanderthals, and more than one exit would fit better with the different ancient genomic components discovered in indigenous Australians and in ancient Europeans. The existence of an additional northern route used by modern humans to reach Australia was previously deduced from the phylogeography of mtDNA macrohaplogroup N. Here, we present new mtDNA data and new multidisciplinary information that add more support to this northern route.
MtDNA hypervariable segments and haplogroup diagnostic coding positions were analyzed in 2,278 Saudi Arabs, of which 1,725 are new samples. In addition, we used 623 published mtDNA genomes belonging to macrohaplogroup N, but not R, to build updated phylogenetic trees and calculate their coalescence ages, and more than 70,000 partial mtDNA sequences were screened to establish their respective geographic ranges.
The Saudi mtDNA profile confirms the absence of autochthonous mtDNA lineages in Arabia with coalescence ages deep enough to support population continuity in the region since the out-of-Africa episode. In contrast to Australia, where N(xR) haplogroups are found in high frequency and with deep coalescence ages, there are no autochthonous N(xR) lineages in India, nor N(xR) branches with coalescence ages as deep as those found in Australia. These patterns are at odds with the supposition that Australian colonizers harboring N(xR) lineages used a route involving India as a stage. The most ancient N(xR) lineages in Eurasia are found in China, and, inconsistent with the coastal route, the N(xR) haplogroups with the southernmost geographical range all have more recent radiations than the Australian ones.
Apart from a single migration event via a southern route, the phylogeny and phylogeography of N(xR) lineages support the view that people carrying mtDNA N lineages could have reached Australia following a northern route through Asia. Data from other disciplines also support this scenario.