In agreement with historical documentation, several genetic studies have revealed ancestral links between the European Romani and India. The entire mitochondrial DNA (mtDNA) of 27 Spanish Romani was sequenced in order to shed further light on the origins of this population. The data were analyzed together with a large published dataset (mainly hypervariable region I [HVS-I] haplotypes) of Romani (N = 1,353) and non-Romani worldwide populations (N>150,000). Analysis of mitogenomes allowed the characterization of various Romani-specific clades. M5a1b1a1 is the most distinctive European Romani haplogroup; it is present in all Romani groups at variable frequencies (with only sporadic findings in non-Romani) and represents 18% of their mtDNA pool. Its phylogeographic features indicate that M5a1b1a1 originated 1.5 thousand years ago (kya; 95% CI: 1.3–1.8) in a proto-Romani population living in Northwest India. U3 represents the most characteristic Romani haplogroup of European/Near Eastern origin (12.4%); it appears at dissimilar frequencies across the continent (Iberia: ∼31%; Eastern/Central Europe: ∼13%). All U3 mitogenomes of our Iberian Romani sample fall within a new sub-clade, U3b1c, which can be dated to 0.5 kya (95% CI: 0.3–0.7), thereby signaling a lower bound for the founder event that followed admixture in Europe/Near East. Other minor European/Near Eastern haplogroups (e.g. H24, H88a) were also assimilated into the Romani by introgression with neighboring populations during their diaspora into Europe; yet some show a differentiation from the phylogenetically closest non-Romani counterpart. The phylogeny of Romani mitogenomes shows clear signatures of low effective population sizes and founder effects.
Overall, these results are in good agreement with historical documentation, suggesting that cultural identity and relative isolation have allowed the Romani to preserve a distinctive mtDNA heritage, with some features linking them unequivocally to their ancestral Indian homeland.
Human genetic diversity observed in the Indian subcontinent is second only to that of Africa. This implies an early settlement and demographic growth soon after the first 'Out-of-Africa' dispersal of anatomically modern humans in the Late Pleistocene. In contrast to this perspective, linguistic diversity in India has been thought to derive from more recent population movements and episodes of contact. With the exception of Dravidian, whose origin and relatedness to other language phyla are obscure, all the language families in India can be linked to language families spoken in different regions of Eurasia. Mitochondrial DNA and Y chromosome evidence has supported largely local evolution of the genetic lineages of the majority of Dravidian and Indo-European speaking populations, but there is no consensus yet on the question of whether the Munda (Austro-Asiatic) speaking populations originated in India or derive from a relatively recent migration from further East.
Here, we report the analysis of 35 novel complete mtDNA sequences from India which refine the structure of Indian-specific varieties of haplogroup R. Detailed analysis of haplogroup R7, coupled with a survey of ~12,000 mtDNAs from caste and tribal groups over the entire Indian subcontinent, reveals that one of its more recently derived branches (R7a1), is particularly frequent among Munda-speaking tribal groups. This branch is nested within diverse R7 lineages found among Dravidian and Indo-European speakers of India. We have inferred from this that a subset of Munda-speaking groups have acquired R7 relatively recently. Furthermore, we find that the distribution of R7a1 within the Munda-speakers is largely restricted to one of the sub-branches (Kherwari) of northern Munda languages. This evidence does not support the hypothesis that the Austro-Asiatic speakers are the primary source of the R7 variation. Statistical analyses suggest a significant correlation between genetic variation and geography, rather than between genes and languages.
Our high-resolution phylogeographic study, involving diverse linguistic groups in India, suggests that the high frequency of mtDNA haplogroup R7 among Munda-speaking populations of India can be explained best by gene flow from linguistically different populations of the Indian subcontinent. The conclusion is based on the observation that among Indo-Europeans, and particularly in Dravidians, the haplogroup is, despite its lower frequency, phylogenetically more divergent, while among the Munda speakers only one sub-clade of R7, i.e. R7a1, can be observed. It is noteworthy that though R7 is autochthonous to India, and arises from the root of hg R, its distribution and phylogeography in India are not uniform. This suggests the more ancient establishment of an autochthonous matrilineal genetic structure, and that isolation in the Pleistocene, lineage loss through drift, and endogamy of prehistoric and historic groups have greatly inhibited genetic homogenization and geographical uniformity.
The Roma people, living throughout Europe and West Asia, are a diverse population linked by the Romani language and culture. Previous linguistic and genetic studies have suggested that the Roma migrated into Europe from South Asia about 1,000–1,500 years ago. Genetic inferences about Roma history have mostly focused on the Y chromosome and mitochondrial DNA. To explore what additional information can be learned from genome-wide data, we analyzed data from six Roma groups that we genotyped at hundreds of thousands of single nucleotide polymorphisms (SNPs). We estimate that the Roma harbor about 80% West Eurasian ancestry, derived from a combination of European and South Asian sources, and that the date of admixture of South Asian and European ancestry was about 850 years before present. We provide evidence for Eastern Europe being a major source of European ancestry, and North-west India being a major source of the South Asian ancestry in the Roma. By computing allele sharing as a measure of linkage disequilibrium, we estimate that the migration of Roma out of the Indian subcontinent was accompanied by a severe founder event, which appears to have been followed by a major demographic expansion after the arrival in Europe.
Previous genetic, anthropological and linguistic studies have shown that Roma (Gypsies) constitute a founder population dispersed throughout Europe whose origins might be traced to the Indian subcontinent. Linguistic and anthropological evidence points to Indo-Aryan ethnic groups from North-western India as the ancestral parental population of Roma. Recently, a strong genetic hint supporting this theory came from a study of a private mutation causing primary congenital glaucoma. In the present study, complete mitochondrial control sequences of Iberian Roma and previously published maternal lineages of other European Roma were analyzed in order to establish the genetic affinities among Roma groups, determine the degree of admixture with neighbouring populations, infer the migration routes followed since the first arrival in Europe, and survey the origin of Roma within the Indian subcontinent. Our results show that the maternal lineage composition in the Roma groups follows a pattern of different migration routes, with several founder effects, and low effective population sizes along their dispersal. Our data allowed the confirmation of a North/West migration route shared by Polish, Lithuanian and Iberian Roma. Additionally, eleven Roma founder lineages were identified and degrees of admixture with host populations were estimated. Finally, the comparison with an extensive database of Indian sequences allowed us to identify the Punjab state, in North-western India, as the putative ancestral homeland of the European Roma, in agreement with previous linguistic and anthropological studies.
India is a country with enormous social and cultural diversity due to its positioning on the crossroads of many historic and pre-historic human migrations. The hierarchical caste system in the Hindu society dominates the social structure of the Indian populations. The origin of the caste system in India is a matter of debate with many linguists and anthropologists suggesting that it began with the arrival of Indo-European speakers from Central Asia about 3500 years ago. Previous genetic studies based on Indian populations failed to achieve a consensus in this regard. We analysed the Y-chromosome and mitochondrial DNA of three tribal populations of southern India, compared the results with available data from the Indian subcontinent and tried to reconstruct the evolutionary history of Indian caste and tribal populations.
No significant difference was observed in the mitochondrial DNA between Indian tribal and caste populations, except for the presence of a higher frequency of west Eurasian-specific haplogroups in the higher castes, mostly in the north western part of India. On the other hand, the study of the Indian Y lineages revealed distinct distribution patterns among caste and tribal populations. The paternal lineages of Indian lower castes showed significantly closer affinity to the tribal populations than to the upper castes. The frequencies of deep-rooted Y haplogroups such as M89, M52, and M95 were higher in the lower castes and tribes, compared to the upper castes.
The present study suggests that the vast majority (>98%) of the Indian maternal gene pool, consisting of Indo-European and Dravidian speakers, is genetically more or less uniform. Invasions after the late Pleistocene settlement might have been mostly male-mediated. However, Y-SNP data provide compelling genetic evidence for a tribal origin of the lower caste populations in the subcontinent. Lower caste groups might have originated with the hierarchical divisions that arose within the tribal groups with the spread of Neolithic agriculturalists, much earlier than the arrival of Aryan speakers. The Indo-Europeans established themselves as upper castes among this already developed caste-like class structure within the tribes.
Y-chromosomal haplogroup (Y-HG) Q is suggested to have originated in Asia and to represent the recent founder paternal Native American radiation into the Americas. This group is delineated into Q1, Q2 and Q3 subgroups defined by biallelic markers M120, M25/M143 and M3, respectively. Recently, a novel subgroup, Q4, defined by the biallelic marker M346, was identified; it represents HG Q (0.41%, 3/728) in the Indian population. Given the scant information on HG Q in Asia, especially India, it was pertinent to explore the status of Y-HG Q in the Indian population to gain insight into the extent of its diversity within this region.
We observed 15/630 (2.38%) Y-HG Q individuals in India with an ancestral state at the M120, M25, M3 and M346 markers, indicating an absence of the already known Q1, Q2, Q3 and Q4 sub-haplogroups. Interestingly, we further observed a novel 4 bp deletion/insertion polymorphism (ss4 bp, rs41352448) at position 72,314 of the human arylsulfatase D pseudogene, defining a novel sub-lineage Q5 (in 5/15 individuals, i.e., 33.3% of the observed Y-HG Q) with distributions independent of the social, cultural, linguistic and geographical affiliations in India.
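The frequencies quoted above (15/630 and 5/15) come from small counts, so it can help to look at them together with an interval estimate. The following is a minimal sketch, not the authors' method: it recomputes the two proportions and attaches a Wilson score interval, which is an assumption introduced here for illustration (the abstract does not state that any confidence interval was calculated). The function name `proportion_with_wilson_ci` is hypothetical.

```python
import math


def proportion_with_wilson_ci(successes, total, z=1.96):
    """Return (proportion, lower, upper) using the Wilson score interval.

    z=1.96 corresponds to an approximate 95% interval; this choice is an
    illustrative assumption, not something stated in the abstract.
    """
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need 0 <= successes <= total and total > 0")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return p, centre - half_width, centre + half_width


# Counts taken from the abstract: Y-HG Q carriers in the sample, and Q5 among them.
hg_q = proportion_with_wilson_ci(15, 630)
q5_within_q = proportion_with_wilson_ci(5, 15)

for label, (p, lo, hi) in [("HG Q", hg_q), ("Q5 within HG Q", q5_within_q)]:
    print(f"{label}: {100 * p:.2f}% (approx. 95% interval {100 * lo:.2f}-{100 * hi:.2f}%)")
```

With only 15 HG Q carriers, the interval around the Q5 share is necessarily wide, which fits the abstract's own caution that these lineages occur at low frequency.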
The study adds another sublineage, Q5, to the existing arrangement of Y-HG Q in the literature. It was quite interesting to observe an ancestral state Q* and a novel sub-branch Q5, not reported elsewhere, in the Indian subcontinent, though at low frequency. A novel subgroup Q4, identified recently, is also restricted to the Indian subcontinent. The most plausible explanation for these observations could be an ancestral migration of individuals bearing the ancestral lineage Q* to the Indian subcontinent, followed later by an autochthonous differentiation into the Q4 and Q5 sublineages. However, other explanations, such as the presence of both sub-haplogroups (Q4 and Q5) in the ancestral migrants or recent migrations from Central Asia, cannot be ruled out until the distribution and diversity of these subgroups are explored extensively in Central Asia and other regions.
The genetic structure, affinities, and diversity of the 1 billion Indians hold important keys to numerous unanswered questions regarding the evolution of human populations and the forces shaping contemporary patterns of genetic variation. Although there have been several recent studies of South Indian caste groups, North Indian caste groups, and South Indian Muslims using Y-chromosomal markers, overall, the Indian population has still not been well studied compared to other geographical populations. In particular, no genetic study has been conducted on Shias and Sunnis from North India.
This study aims to investigate genetic variation and the gene pool in North Indians.
Subjects and methods
A total of 32 Y-chromosomal markers were genotyped in 560 North Indian males collected from three higher caste groups (Brahmins, Chaturvedis and Bhargavas) and two Muslim groups (Shia and Sunni).
Three distinct lineages were revealed based upon 13 haplogroups. The first was a Central Asian lineage harbouring haplogroups R1 and R2. The second lineage was of Middle-Eastern origin represented by haplogroups J2*, Shia-specific E1b1b1, and to some extent G* and L*. The third was the indigenous Indian Y-lineage represented by haplogroups H1*, F*, C* and O*. Haplogroup E1b1b1 was observed in Shias only.
The results revealed that a substantial part of today’s North Indian paternal gene pool was contributed by Central Asian lineages associated with Indo-European speakers, suggesting that extant Indian caste groups are primarily the descendants of Indo-European migrants. The presence of haplogroup E in Shias, first reported in this study, suggests a genetic distinction between the two Indian Muslim sects. The findings of the present study provide insights into prehistoric and early historic patterns of migration into India and the evolution of Indian populations in recent history.
Paternal lineages; Y-chromosomal markers; North Indians; migration
The geographical position of Maharashtra state makes it rather essential to study the dispersal of modern humans in South Asia. Several hypotheses have been proposed to explain the cultural, linguistic and geographical affinity of the populations living in Maharashtra state with other South Asian populations. The genetic origin of populations living in this state is poorly understood and has hitherto been described only at a low level of molecular resolution.
To address this issue, we have analyzed the mitochondrial DNA (mtDNA) of 185 individuals and NRY (non-recombining region of Y chromosome) of 98 individuals belonging to two major tribal populations of Maharashtra, and compared their molecular variation with that of 54 contemporary South Asian populations of adjacent states. Inter- and intra-population comparisons reveal that the maternal gene pool of Maharashtra state populations is composed mainly of South Asian haplogroups with traces of East and West Eurasian haplogroups, while the paternal haplogroups comprise South Asian lineages as well as a signature of the Near Eastern-specific haplogroup J2a.
Our analysis suggests that Indian populations, including those of Maharashtra state, are largely derived from ancient Paleolithic settlers; however, a more recent (∼10 Ky older) detectable paternal gene flow from West Asia is well reflected in the present study. These findings reveal movement of populations to Maharashtra through the western coast rather than the mainland, where the Western Ghats-Vindhya Mountains and Narmada-Tapti rivers might have acted as a natural barrier. Comparing the Maharashtrian populations with other South Asian populations reveals that they have a closer affinity with the South Indian than with the Central Indian populations.
Recent advances in the understanding of the maternal and paternal heritage of south and southwest Asian populations have highlighted their role in the colonization of Eurasia by anatomically modern humans. Further understanding requires a deeper insight into the topology of the branches of the Indian mtDNA phylogenetic tree, which should be contextualized within the phylogeography of the neighboring regional mtDNA variation. Accordingly, we have analyzed mtDNA control and coding region variation in 796 Indian (including both tribal and caste populations from different parts of India) and 436 Iranian mtDNAs. The results were integrated and analyzed together with published data from South, Southeast Asia and West Eurasia.
Four new Indian-specific haplogroup M sub-clades were defined. These, in combination with two previously described haplogroups, encompass approximately one third of the haplogroup M mtDNAs in India. Their phylogeography and spread among different linguistic phyla and social strata was investigated in detail. Furthermore, the analysis of the Iranian mtDNA pool revealed patterns of limited reciprocal gene flow between Iran and the Indian sub-continent and allowed the identification of different assemblies of shared mtDNA sub-clades.
Since the initial peopling of South and West Asia by anatomically modern humans, when this region may well have provided the initial settlers who colonized much of the rest of Eurasia, the gene flow in and out of India of the maternally transmitted mtDNA has been surprisingly limited. Specifically, our analysis of the mtDNA haplogroups, which are shared between Indian and Iranian populations and exhibit coalescence ages corresponding to around the early Upper Paleolithic, indicates that they are present in India largely as Indian-specific sub-lineages. In contrast, other ancient Indian-specific variants of M and R are very rare outside the sub-continent.
The phylogeny of the indigenous Indian-specific mitochondrial DNA (mtDNA) haplogroups has been determined and refined in previous reports. As with mtDNA superhaplogroups M and N, a profusion of reports is also available for superhaplogroup R. However, there is a dearth of information on South Asian subhaplogroups in particular, including R8. Therefore, we sought to assess the genealogy and pre-historic expansion of haplogroup R8, which is considered one of the autochthonous lineages of South Asia.
Upon screening the mtDNA of 5,836 individuals belonging to 104 distinct ethnic populations of the Indian subcontinent, we found 54 individuals with the HVS-I motif that defines the R8 haplogroup. Complete mtDNA sequencing of these 54 individuals revealed two deep-rooted subclades: R8a and R8b. Furthermore, these subclades split into several fine subclades. An isofrequency contour map detected the highest frequency of R8 in the state of Orissa. Spearman's rank correlation analysis suggests significant correlation of R8 occurrence with geography.
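Spearman's rank correlation, mentioned above, measures how consistently one variable rises or falls with another, using only their ranks. The sketch below shows the general technique in plain Python; it is not a reconstruction of the authors' analysis. The abstract does not say which geographic variable was used, so the sample values (haplogroup frequency against distance from a reference point) are invented purely for illustration, and the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical illustration only: these numbers are NOT from the study.
distance_km = [0, 150, 300, 600, 900, 1200]
haplogroup_freq = [0.09, 0.07, 0.05, 0.05, 0.02, 0.01]
print(spearman_rho(distance_km, haplogroup_freq))
```

A value near -1 in such a setup would mean frequency falls steadily with distance; a significance test, which this sketch omits, would still be needed before calling the correlation significant.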
The coalescent ages of the newly characterized subclades of R8, R8a (15.4±7.2 Kya) and R8b (25.7±10.2 Kya), indicate that the initial maternal colonization of this haplogroup occurred during the middle and upper Paleolithic period, roughly around 40 to 45 Kya. These results signify that the southern part of Orissa currently inhabited by Munda speakers is likely the origin of these autochthonous maternal deep-rooted haplogroups. Our high-resolution study on the genesis of the R8 haplogroup provides ample evidence of its deep-rooted ancestry among the Orissa (Austro-Asiatic) tribes.
Sakha – an area connecting South and Northeast Siberia – is significant for understanding the history of peopling of Northeast Eurasia and the Americas. Previous studies have shown a genetic contiguity between Siberia and East Asia and the key role of South Siberia in the colonization of Siberia.
We report the results of a high-resolution phylogenetic analysis of 701 mtDNAs and 318 Y chromosomes from five native populations of Sakha (Yakuts, Evenks, Evens, Yukaghirs and Dolgans) and of the analysis of more than 500,000 autosomal SNPs of 758 individuals from 55 populations, including 40 previously unpublished samples from Siberia. Phylogenetically terminal clades of East Asian mtDNA haplogroups C and D and Y-chromosome haplogroups N1c, N1b and C3, constituting the core of the gene pool of the native populations from Sakha, connect Sakha and South Siberia. Analysis of autosomal SNP data confirms the genetic continuity between Sakha and South Siberia. Maternal lineages D5a2a2, C4a1c, C4a2, C5b1b and the Yakut-specific STR sub-clade of Y-chromosome haplogroup N1c can be linked to a migration of Yakut ancestors, while the paternal lineage C3c was most likely carried to Sakha by the expansion of the Tungusic people. MtDNA haplogroups Z1a1b and Z1a3, present in Yukaghirs, Evens and Dolgans, show traces of different and probably more ancient migration(s). Analysis of both haploid loci and autosomal SNP data revealed only minor genetic components shared between Sakha and the extreme Northeast Siberia. Although the major part of West Eurasian maternal and paternal lineages in Sakha could originate from recent admixture with East Europeans, mtDNA haplogroups H8, H20a and HV1a1a, as well as Y-chromosome haplogroup J, more probably reflect an ancient gene flow from West Eurasia through Central Asia and South Siberia.
Our high-resolution phylogenetic dissection of mtDNA and Y-chromosome haplogroups as well as analysis of autosomal SNP data suggests that Sakha was colonized by repeated expansions from South Siberia with minor gene flow from the Lower Amur/Southern Okhotsk region and/or Kamchatka. The minor West Eurasian component in Sakha attests to both recent and ongoing admixture with East Europeans and an ancient gene flow from West Eurasia.
mtDNA; Y chromosome; Autosomal SNPs; Sakha
Macrohaplogroups 'M' and 'N' have evolved almost in parallel from a founder haplogroup L3. Macrohaplogroup N in India has already been defined in previous studies and recently the macrohaplogroup M among the Indian populations has been characterized. In this study, we attempted to reconstruct and re-evaluate the phylogeny of Macrohaplogroup M, which harbors more than 60% of the Indian mtDNA lineages, and to shed light on the origin of its deep-rooting haplogroups.
Using 11 whole mtDNA and 2231 partial coding sequences of Indian M lineages selected from 8670 HVS1 sequences across India, we have reconstructed the tree including the Andamanese-specific lineage M31 and calculated the time depth of all the nodes. We defined one novel haplogroup M41, and revised the classification of haplogroups M3, M18, and M31.
Our results indicate that the Indian mtDNA pool consists of several deep-rooting lineages of macrohaplogroup 'M', suggesting an in-situ origin of these haplogroups in South Asia, most likely in India. These deep-rooting lineages are not language specific and are spread over all the language groups in India. Moreover, our reanalysis of the Andamanese-specific lineage M31 suggests two clear-cut population-specific subclades (M31a1 and M31a2). Onge and Jarwa share the M31a1 branch, while the M31a2 clade is present only in Great Andamanese individuals. Overall, our study supported the one-wave, rapid dispersal theory of modern humans along the Asian coast.
To construct the maternal phylogeny and prehistoric dispersals of modern humans in the Indian subcontinent, a diverse subset of 641 complete mitochondrial DNA (mtDNA) genomes belonging to macrohaplogroup M was chosen from a total collection of 2,783 control-region sequences, sampled from 26 selected tribal populations of India. On the basis of complete mtDNA sequencing, we identified 12 new haplogroups - M53 to M64; redefined/ascertained and characterized haplogroups M2, M3, M4, M5, M6, M8′C′Z, M9, M10, M11, M12-G, D, M18, M30, M33, M35, M37, M38, M39, M40, M41, M43, M45 and M49, which were previously described by control and/or coding-region polymorphisms. Our results indicate that the mtDNA lineages reported in the present study (except East Asian lineages M8′C′Z, M9, M10, M11, M12-G, D) are restricted to the Indian region. The deep-rooted lineages of macrohaplogroup ‘M’ suggest an in-situ origin of these haplogroups in India. Most of these deep-rooting lineages are represented by multiple ethnic/linguistic groups of India. Hierarchical analysis of molecular variation (AMOVA) shows substantial subdivisions among the tribes of India (Fst = 0.16164). The current Indian mtDNA gene pool was shaped by the initial settlers and was galvanized by minor events of gene flow from the east and west to the restricted zones. The Northeast Indian mtDNA pool harbors region-specific lineages, other Indian lineages and East Asian lineages. We also suggest the establishment of an East Asian gene in North East India through admixture rather than replacement.
Central Asia and the Indian subcontinent represent an area considered as a source and a reservoir for human genetic diversity, with many markers taking root here, most of which are the ancestral state of eastern and western haplogroups, while others are local. Between these two regions, Terai (Nepal) is a pivotal passageway allowing, in different times, multiple population interactions, although because of its highly malarial environment, it was scarcely inhabited until a few decades ago, when malaria was eradicated. One of the oldest and largest indigenous peoples of Terai is the malaria-resistant Tharus, whose gene pool could still retain traces of ancient complex interactions. Until now, however, investigations of their genetic structure have been scarce, mainly identifying East Asian signatures.
High-resolution analyses of mitochondrial-DNA (including 34 complete sequences) and Y-chromosome (67 SNPs and 12 STRs) variations carried out in 173 Tharus (two groups from Central and one from Eastern Terai), and 104 Indians (Hindus from Terai and New Delhi and tribals from Andhra Pradesh) allowed the identification of three principal components: East Asian, West Eurasian and Indian, the last including both local and inter-regional sub-components, at least for the Y chromosome.
Although remarkable quantitative and qualitative differences appear among the various population groups and also between sexes within the same group, many mitochondrial-DNA and Y-chromosome lineages are shared or derived from ancient Indian haplogroups, thus revealing a deep shared ancestry between Tharus and Indians. Interestingly, the local Y-chromosome Indian component observed in the Andhra-Pradesh tribals is present in all Tharu groups, whereas the inter-regional component strongly prevails in the two Hindu samples and other Nepalese populations.
The complete sequencing of mtDNAs from unresolved haplogroups also provided informative markers that greatly improved the mtDNA phylogeny and allowed the identification of ancient relationships between Tharus and Malaysia, the Andaman Islands and Japan as well as between India and North and East Africa. Overall, this study gives a paradigmatic example of the importance of genetic isolates in revealing variants not easily detectable in the general population.
The Maldives are an 850 km-long string of atolls located centrally in the northern Indian Ocean basin. Because of this geographic situation, the present-day Maldivian population has potential for uncovering genetic signatures of historic migration events in the region. We therefore studied autosomal DNA-, mitochondrial DNA-, and Y-chromosomal DNA markers in a representative sample of 141 unrelated Maldivians, with 119 from six major settlements. We found a total of 63 different mtDNA haplotypes that could be allocated to 29 mtDNA haplogroups, mostly within the M, R, and U clades. We found 66 different Y-STR haplotypes in 10 Y-chromosome haplogroups, predominantly H1, J2, L, R1a1a, and R2. Parental admixture analysis for mtDNA- and Y-haplogroup data indicates a strong genetic link between the Maldive Islands and mainland South Asia, and excludes significant gene flow from Southeast Asia. Paternal admixture from West Asia is detected, but cannot be distinguished from admixture from South Asia. Maternal admixture from West Asia is excluded. Within the Maldives, we find a subtle genetic substructure in all marker systems that is not directly related to geographic distance or linguistic dialect. We found reduced Y-STR diversity and reduced male-mediated gene flow between atolls, suggesting independent male founder effects for each atoll. Detected reduced female-mediated gene flow between atolls confirms a Maldives-specific history of matrilocality. In conclusion, our new genetic data agree with the commonly reported Maldivian ancestry in South Asia, but furthermore suggest multiple, independent immigration events and asymmetrical migration of females and males across the archipelago. Am J Phys Anthropol 151:58–67, 2013. © 2013 Wiley Periodicals, Inc.
Y chromosome; mitochondrial DNA; migration; Indo-Aryan languages; South Asia
The Chad Basin, lying within the bidirectional corridor of the African Sahel, is one of the most populated places in Sub-Saharan Africa today. The origin of its settlement appears connected with Holocene climatic ameliorations (aquatic resources) that started ~10,000 years before present (YBP). Although both Nilo-Saharan and Niger-Congo language families are encountered here, the most diversified group is the Chadic branch belonging to the Afro-Asiatic language phylum. In this article, we investigate the proposed ancient migration of Chadic pastoralists from Eastern Africa based on linguistic data and test for genetic traces of this migration in extant Chadic-speaking populations.
We performed whole mitochondrial genome sequencing of 16 L3f haplotypes, focused on clade L3f3 that occurs almost exclusively in Chadic speaking people living in the Chad Basin. These data supported the reconstruction of a L3f phylogenetic tree and calculation of times to the most recent common ancestor for all internal clades. A date ~8,000 YBP was estimated for the L3f3 sub-haplogroup, which is in good agreement with the supposed migration of Chadic speaking pastoralists and their linguistic differentiation from other Afro-Asiatic groups of East Africa. As a whole, the Afro-Asiatic language family presents low population structure, as 92.4% of mtDNA variation is found within populations and only 3.4% of variation can be attributed to diversity among language branches. The Chadic speaking populations form a relatively homogenous cluster, exhibiting lower diversification than the other Afro-Asiatic branches (Berber, Semitic and Cushitic).
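The two variance percentages quoted above (92.4% within populations, 3.4% among language branches) do not add up to 100%. If one assumes a standard three-level hierarchical AMOVA, the remainder would be the component among populations within language branches. The sketch below works through that arithmetic and the conventional fixation indices derived from the three components. The three-level design is an assumption introduced here, since the abstract does not describe the AMOVA structure, and the function name is hypothetical.

```python
def amova_phi_statistics(among_groups_pct, within_populations_pct):
    """Derive the middle variance component and Phi-statistics.

    Assumes a three-level AMOVA whose components sum to 100%:
      among groups + among populations within groups + within populations.
    This structure is an illustrative assumption, not stated in the abstract.
    """
    among_pops_within_groups_pct = 100.0 - among_groups_pct - within_populations_pct
    if among_pops_within_groups_pct < 0:
        raise ValueError("components exceed 100%")
    a = among_groups_pct / 100.0
    b = among_pops_within_groups_pct / 100.0
    c = within_populations_pct / 100.0
    return {
        "among_pops_within_groups_pct": among_pops_within_groups_pct,
        "phi_ct": a,            # among groups relative to the total
        "phi_sc": b / (b + c),  # among populations within groups
        "phi_st": a + b,        # all between-population variance relative to the total
    }


# Percentages quoted in the abstract for the Afro-Asiatic language family.
result = amova_phi_statistics(among_groups_pct=3.4, within_populations_pct=92.4)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions, about 4.2% of the variation would lie among populations within language branches, which is consistent with the abstract's description of low population structure.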
The results of our study support an East African origin of the mitochondrial L3f3 clade that is present almost exclusively within Chadic-speaking people living in the Chad Basin. Whole genome sequence-based dates show that the ancestral haplogroup L3f must have emerged soon after the Out-of-Africa migration (around 57,100 ± 9,400 YBP), but the "Chadic" L3f3 clade has much less internal variation, suggesting an expansion during the Holocene period about 8,000 ± 2,500 YBP. This time period in the Chad Basin is known to have been particularly favourable for the expansion of pastoralists coming from northeastern Africa, as suggested by archaeological, linguistic and climatic data.
The Austro-Asiatic linguistic family, which is considered to be the oldest of all the families in India, has a substantial presence in Southeast Asia. However, the possibility of any genetic link among the linguistic sub-families of the Indian Austro-Asiatics on the one hand and between the Indian and the Southeast Asian Austro-Asiatics on the other has not been explored until now. Therefore, to trace the origin and historic expansion of Austro-Asiatic groups of India, we analysed Y-chromosome SNP and STR data of 1222 individuals from 25 Indian populations, covering all three branches of Austro-Asiatic tribes, viz. Mundari, Khasi-Khmuic and Mon-Khmer, along with the previously published data on 214 relevant populations from Asia and Oceania.
Our results suggest a strong paternal genetic link, not only among the subgroups of Indian Austro-Asiatic populations but also with those of Southeast Asia. However, a maternal link based on mtDNA is not evident. The results also indicate that haplogroup O-M95 originated in the Indian Austro-Asiatic populations ~65,000 yrs BP (95% C.I. 25,442 – 132,230) and that their ancestors carried it further to Southeast Asia via the Northeast Indian corridor. Subsequently, in the process of expansion, the Mon-Khmer populations from Southeast Asia seem to have migrated to and colonized the Andaman and Nicobar Islands at a much later point in time.
Our findings are consistent with the linguistic evidence, which suggests that the linguistic ancestors of the Austro-Asiatic populations originated in India and then migrated to Southeast Asia.
Keratoconus is characterized by thinning of the corneal stroma, resulting in reduced vision. The exact etiology of keratoconus (KC) is still unknown. The involvement of oxidative stress (OS) in this disease has been reported. However, the exact mechanism of OS in keratoconus is still unknown. Thus, we planned this study to screen mitochondrial complex I genes for sequence changes in keratoconus patients and controls, as mitochondrial complex I is the chief source of reactive oxygen species (ROS) production.
A total of 20 keratoconus cases and 20 healthy controls without any ocular disorder were enrolled in this study. Mitochondrial complex I genes (ND1, 2, 3, 4, 4L, 5, and 6) were amplified in all patients and controls using 12 pairs of primers by PCR. After sequencing, DNA sequences were analyzed against the mitochondrial reference sequence NC_012920. A haplogroup frequency-based Principal Component Analysis (PCA) was performed to determine whether the gene pool of keratoconus patients is close to that of the major populations in India.
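As an illustration of what a haplogroup frequency-based PCA involves, the following is a minimal sketch in Python using NumPy. It is not the authors' pipeline: the function name `haplogroup_pca`, the population labels and all frequency values are hypothetical and chosen only to show the mechanics (centre the frequency matrix, then take a singular value decomposition).

```python
import numpy as np


def haplogroup_pca(freqs, n_components=2):
    """PCA of a populations x haplogroups frequency matrix via SVD.

    Returns the population scores on the leading components and the
    fraction of total variance each of those components explains.
    """
    X = np.asarray(freqs, dtype=float)
    Xc = X - X.mean(axis=0)  # centre each haplogroup column
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


# Hypothetical haplogroup frequencies (rows sum to 1); NOT data from the study.
populations = ["cases", "pop_A", "pop_B", "pop_C"]
freqs = [
    [0.30, 0.10, 0.40, 0.20],
    [0.05, 0.50, 0.25, 0.20],
    [0.10, 0.45, 0.25, 0.20],
    [0.08, 0.40, 0.30, 0.22],
]

scores, explained = haplogroup_pca(freqs)
for name, (pc1, pc2) in zip(populations, scores):
    print(f"{name}: PC1={pc1:.3f} PC2={pc2:.3f}")
```

Populations with similar haplogroup profiles end up close together in the space of the first components, so a patient group that plots far from the reference populations has a haplogroup composition unlike theirs.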
DNA sequencing revealed a total of 84 nucleotide variations in patients and 29 in controls. Of the 84 nucleotide changes, 18 variations were non-synonymous, and two novel frame-shift mutations were detected in cases. Non-synonymous mtDNA sequence variations may account for increased ROS and decreased ATP production. This ultimately leads to OS, which is a known cause of a variety of corneal abnormalities. Haplotype analysis showed that most of the patients were clustered under the haplogroups T, C4a2a, R2’TJ, M21’Q1a, M12’G2a2a, M8’CZ and M7a2a, which are present at negligible frequency in the normal Indian population, whereas only a few patients were found to be part of other haplogroups such as U7 (Indo-European), R2 and R31, whose origin is contentious.
Mitochondrial complex I sequence variations are the main cause of elevated ROS production, which leads to oxidative stress. This oxidative stress then starts a cascade of events which ultimately can lead to keratoconus. Prompt antioxidant therapy should be initiated in keratoconus patients to minimize ROS-related damage.
Ethnic Belarusians make up more than 80% of the nine and a half million people inhabiting the Republic of Belarus. Belarusians together with Ukrainians and Russians represent the East Slavic linguistic group, largest both in numbers and territory, inhabiting East Europe alongside Baltic-, Finno-Permic- and Turkic-speaking people. To date, only a limited number of low-resolution genetic studies have been performed on this population. Therefore, with the phylogeographic analysis of 565 Y-chromosomes and 267 mitochondrial DNAs from six well-covered geographic sub-regions of Belarus we strove to complement the existing genetic profile of eastern Europeans. Our results reveal that around 80% of the paternal Belarusian gene pool is composed of R1a, I2a and N1c Y-chromosome haplogroups – a profile which is very similar to the two other eastern European populations – Ukrainians and Russians. The maternal Belarusian gene pool encompasses a full range of West Eurasian haplogroups and agrees well with the genetic structure of central-east European populations. Our data attest that latitudinal gradients characterize the variation of the uniparentally transmitted gene pools of modern Belarusians. In particular, the Y-chromosome reflects movements of people in central-east Europe, starting probably as early as the beginning of the Holocene. Furthermore, the matrilineal legacy of Belarusians retains two rare mitochondrial DNA haplogroups, N1a3 and N3, whose phylogeographies were explored in detail after de novo sequencing of 20 and 13 complete mitogenomes, respectively, from all over Eurasia. Our phylogeographic analyses reveal that two mitochondrial DNA lineages, N3 and N1a3, both of Middle Eastern origin, might mark distinct events of matrilineal gene flow to Europe: during the mid-Holocene period and around the Pleistocene-Holocene transition, respectively.
The geographic origin and time of dispersal of Austroasiatic (AA) speakers, presently settled in south and southeast Asia, remains disputed. Two rival hypotheses, both assuming a demic component to the language dispersal, have been proposed. The first of these places the origin of Austroasiatic speakers in southeast Asia with a later dispersal to south Asia during the Neolithic, whereas the second hypothesis advocates pre-Neolithic origins and dispersal of this language family from south Asia. To test the two alternative models, this study combines the analysis of uniparentally inherited markers with 610,000 common single nucleotide polymorphism loci from the nuclear genome. Indian AA speakers have high frequencies of Y chromosome haplogroup O2a; our results show that this haplogroup has significantly higher diversity and coalescent time (17–28 thousand years ago) in southeast Asia, strongly supporting the first of the two hypotheses. Nevertheless, the results of principal component and “structure-like” analyses on autosomal loci also show that the population history of AA speakers in India is more complex, being characterized by two ancestral components—one represented in the pattern of Y chromosomal and EDAR results and the other by mitochondrial DNA diversity and genomic structure. We propose that AA speakers in India today are derived from dispersal from southeast Asia, followed by extensive sex-specific admixture with local Indian populations.
Austroasiatic; mtDNA; Y chromosome; autosomes; admixture
South Asia possesses a significant amount of genetic diversity due to considerable intergroup differences in culture and language. There have been numerous reports on the genetic structure of Asian Indians, although these have mostly relied on genotyping microarrays or targeted sequencing of the mitochondria and Y chromosomes. Asian Indians in Singapore are primarily descendants of immigrants from Dravidian-language–speaking states in south India, and 38 individuals from the general population underwent deep whole-genome sequencing with a target coverage of 30X as part of the Singapore Sequencing Indian Project (SSIP). The genetic structure and diversity of these samples were compared against samples from the Singapore Sequencing Malay Project and populations in Phase 1 of the 1,000 Genomes Project (1 KGP). SSIP samples exhibited greater intra-population genetic diversity and possessed higher heterozygous-to-homozygous genotype ratio than other Asian populations. When compared against a panel of well-defined Asian Indians, the genetic makeup of the SSIP samples was closely related to South Indians. However, even though the SSIP samples clustered distinctly from the Europeans in the global population structure analysis with autosomal SNPs, eight samples were assigned to mitochondrial haplogroups that were predominantly present in Europeans and possessed higher European admixture than the remaining samples. An analysis of the relative relatedness between SSIP with two archaic hominins (Denisovan, Neanderthal) identified higher ancient admixture in East Asian populations than in SSIP. The data resource for these samples is publicly available and is expected to serve as a valuable complement to the South Asian samples in Phase 3 of 1 KGP.
Indians of South Asia have long been a population of interest to a wide audience, due to their unique diversity. We have deep-sequenced 38 individuals of Indian descent residing in Singapore (SSIP) in an effort to illustrate their diversity from a whole-genome standpoint. Indeed, among Asians in our population panel, SSIP was the most diverse, followed by the Malays in Singapore (SSMP). Their diversity is further observed in the population's Y-chromosome and mitochondrial haplogroup profiles; individuals with European-dominant haplogroups had a greater proportion of European admixture. Among variants (single nucleotide polymorphisms and small insertions/deletions) discovered in SSIP, 21.69% were novel with respect to previous sequencing projects. In addition, some 14 loss-of-function variants (LOFs) were associated with cancer, Type II diabetes, and cholesterol levels. Finally, a D statistic test with ancient hominids concurred that there was more gene flow to East Asians than to South Asians.
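The D statistic mentioned above compares two site patterns, usually called ABBA and BABA, on an assumed tree (((P1, P2), P3), Outgroup). The sketch below is a minimal Python illustration, not the study's pipeline: the function name `d_statistic`, the assignment of populations to P1/P2/P3 and the toy sequences are all assumptions made for the example. Published analyses typically work from genome-wide allele frequencies and assess significance with a block jackknife, which is omitted here.

```python
def d_statistic(p1, p2, p3, outgroup):
    """Count ABBA/BABA sites in four aligned haploid sequences.

    Assumed tree: (((P1, P2), P3), Outgroup). The outgroup allele is
    treated as ancestral ("A"); the other allele is derived ("B").
    Returns (D, n_abba, n_baba) with D = (ABBA - BABA) / (ABBA + BABA).
    """
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        if len({a, b, c, o}) != 2:
            continue  # keep biallelic sites only
        if a == o and b == c and b != o:
            abba += 1  # P2 and P3 share the derived allele
        elif b == o and a == c and a != o:
            baba += 1  # P1 and P3 share the derived allele
    if abba + baba == 0:
        raise ValueError("no ABBA or BABA sites found")
    return (abba - baba) / (abba + baba), abba, baba


# Toy sequences, for illustration only (hypothetical, not real genomes).
south_asian = "ACTAGC"  # P1 (assumed role)
east_asian = "GCCTGT"   # P2 (assumed role)
archaic = "GCTTAT"      # P3 (assumed role)
outgroup = "ACCAAC"     # ancestral state (assumed)

d, n_abba, n_baba = d_statistic(south_asian, east_asian, archaic, outgroup)
print(d, n_abba, n_baba)  # 0.5 3 1
```

With this ordering, a positive D indicates that P2 shares more derived alleles with P3 than P1 does, which is the direction of the East Asian versus South Asian comparison described in the text.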
Ancient DNA methodology was applied to analyse sequences extracted from freshly unearthed remains (teeth) of 4 individuals deeply deposited in slightly alkaline soil of the Tell Ashara (ancient Terqa) and Tell Masaikh (ancient Kar-Assurnasirpal) Syrian archaeological sites, both in the middle Euphrates valley. Dated to the period between 2.5 Kyrs BC and 0.5 Kyrs AD, the studied individuals carried mtDNA haplotypes corresponding to the M4b1, M49 and/or M61 haplogroups, which are believed to have arisen in the area of the Indian subcontinent during the Upper Paleolithic and are absent in people living today in Syria. However, they are present in people inhabiting today’s Tibet, the Himalayas, India and Pakistan. We anticipate that the analysed remains from Mesopotamia belonged to people with genetic affinity to the Indian subcontinent, since the distribution of identified ancient haplotypes indicates a solid link with populations from the region of South Asia-Tibet (Trans-Himalaya). They may have been descendants of migrants from much earlier times, spreading the clades of the macrohaplogroup M throughout Eurasia and founding regional Mesopotamian groups like that of Terqa, or just merchants moving along trade routes passing near or through the region. None of the successfully identified nuclear alleles turned out to be ΔF508 CFTR, LCT-13910T or Δ32 CCR5.
The geographic distribution of genetic diversity in domestic animals, particularly of mitochondrial DNA, has often been used to infer centers of domestication. The underlying presumption is that phylogeographic patterns among domesticates were established during, or shortly after, domestication. Human activities are assumed not to have altered the haplogroup frequencies to any great extent. We studied this hypothesis by analyzing 24 mtDNA sequences in ancient Scandinavian dogs. Breeds originating in northern Europe are characterized by a high frequency of mtDNA sequences belonging to a haplogroup rare in other populations (HgD). This has been suggested to indicate a possible origin of the haplogroup (perhaps even a separate domestication) in central or northern Europe.
The sequences observed in the ancient samples do not include the haplogroup indicative of northern European breeds (HgD). Instead, several of them correspond to haplogroups that are uncommon in the region today and that are supposed to have an Asian origin.
We find no evidence for local domestication. We conclude that interpretation of the processes responsible for current domestic haplogroup frequencies should be carried out with caution if based only on contemporary data. They do not only tell their own story, but also that of humans.
A comprehensive review of uniparental systems in South Amerindians was undertaken. Variability in the Y-chromosome haplogroups was assessed in 68 populations and 1,814 individuals, whereas that of Y-STR markers was assessed in 29 populations and 590 subjects. Variability in the mitochondrial DNA (mtDNA) haplogroups was examined in 108 populations and 6,697 persons, and sequencing studies used either the complete mtDNA genome or the highly variable segments 1 and 2. The diversity of the markers made it difficult to establish a general picture of Y-chromosome variability in the populations studied. However, haplogroup Q1a3a* was almost always the most prevalent whereas Q1a3* occurred equally in all regions, which suggested its prevalence among the early colonizers. The STR allele frequencies were used to derive a possible ancient Native American Q-clade chromosome haplotype, and five of six STR loci showed significant geographic variation. Geographic and linguistic factors moderately influenced the mtDNA distributions (6% and 7%, respectively), and mtDNA haplogroups A and D correlated positively and negatively, respectively, with latitude. The data analyzed here provide rich material for understanding the biological history of South Amerindians and can serve as a basis for comparative studies involving other types of data, such as cultural data.
genetics; language and geography; mitochondrial DNA; Native Americans; South Amerindians; Y-chromosome
The genetic impact associated with the Neolithic spread in Europe has been widely debated over the last 20 years. Within this context, ancient DNA studies have provided a more reliable picture by directly analyzing the protagonist populations at different regions in Europe. However, the lack of available data from the original Near Eastern farmers has limited the conclusions that could be reached, preventing the formulation of continental models of Neolithic expansion. Here we address this issue by presenting mitochondrial DNA data of the original Near Eastern Neolithic communities with the aim of providing an adequate background for the interpretation of Neolithic genetic data from European samples. Sixty-three skeletons from the Pre-Pottery Neolithic B (PPNB) sites of Tell Halula, Tell Ramad and Dja'de El Mughara dating between 8,700–6,600 cal. B.C. were analyzed, and 15 validated mitochondrial DNA profiles were recovered. In order to estimate the demographic contribution of the first farmers to both Central European and Western Mediterranean Neolithic cultures, haplotype and haplogroup diversities in the PPNB sample were compared using phylogeographic and population genetic analyses to available ancient DNA data from human remains belonging to the Linearbandkeramik-Alföldi Vonaldiszes Kerámia and Cardial/Epicardial cultures. We also searched for possible signatures of the original Neolithic expansion over the modern Near Eastern and South European genetic pools, and tried to infer possible routes of expansion by comparing the obtained results to a database of 60 modern populations from both regions. Comparisons performed among the 3 ancient datasets allowed us to identify K and N-derived mitochondrial DNA haplogroups as potential markers of the Neolithic expansion, whose genetic signature would have reached both the Iberian coasts and the Central European plain.
Moreover, the observed genetic affinities between the PPNB samples and the modern populations of Cyprus and Crete seem to suggest that the Neolithic was first introduced into Europe through pioneer seafaring colonization.
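The haplotype and haplogroup diversities compared in this abstract can be illustrated with a short Python sketch. The abstract does not state which estimator was used; the code below assumes Nei's unbiased gene diversity, H = n/(n-1) * (1 - sum of squared frequencies), which is a common choice. The function name `haplotype_diversity` and the sample labels are hypothetical and do not correspond to the 15 recovered PPNB profiles.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype (gene) diversity for a list of labels.

    H = n / (n - 1) * (1 - sum(p_i ** 2)), where p_i is the sample
    frequency of the i-th distinct haplotype and n is the sample size.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sampled haplotypes are required")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical haplogroup labels for six individuals (illustration only).
sample = ["K", "K", "K", "N", "N", "H"]
h = haplotype_diversity(sample)
print(round(h, 4))  # 0.7333
```

The value ranges from 0, when every individual carries the same haplotype, to 1, when every sampled haplotype is different, which makes it a convenient summary for comparing small ancient samples with each other and with modern populations.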
Since the original human expansions out of Africa 200,000 years ago, different prehistoric and historic migration events have taken place in Europe. Considering that the movement of people implies a consequent movement of their genes, it is possible to estimate the impact of these migrations through the genetic analysis of human populations. Agricultural and husbandry practices originated 10,000 years ago in a region of the Near East known as the Fertile Crescent. According to the archaeological record, this phenomenon, known as the “Neolithic”, rapidly expanded from these territories into Europe. However, whether or not this diffusion was accompanied by human migrations is much debated. In the present work, mitochondrial DNA (a type of maternally inherited DNA located in the cell cytoplasm) from the first Near Eastern Neolithic populations was recovered and compared to available data from other Neolithic populations in Europe and also to modern populations from South Eastern Europe and the Near East. The obtained results show that substantial human migrations were involved in the Neolithic spread and suggest that the first Neolithic farmers entered Europe following a maritime route through Cyprus and the Aegean Islands.