Humans have more copies of amylase genes than other primates. It is still poorly understood, however, when the copy number expansion occurred and whether its spread was enhanced by selection. Here we assess amylase copy numbers in a global sample of 480 high coverage genomes and find that regions flanking the amylase locus show notable depression of genetic diversity both in African and non-African populations. Analysis of genetic variation in these regions supports the model of an early selective sweep in the human lineage after the split of humans from Neanderthals which led to the fixation of multiple copies of AMY1 in place of a single copy. We find evidence of multiple secondary losses of copy number with the highest frequency (52%) of a deletion of AMY2A and associated low copy number of AMY1 in Northeast Siberian populations whose diet has been low in starch content.
The history of human settlement in Southeast Asia has been complex and
involved several distinct dispersal events. Here we report the analyses of 1825
individuals from Southeast Asia including new genome-wide genotype data for 146
individuals from three Mainland Southeast Asian (Burmese, Malay and Vietnamese)
and four Island Southeast Asian (Dusun, Filipino, Kankanaey and Murut)
populations. While confirming the presence of previously recognized major
ancestry components in the Southeast Asian population structure, we highlight
the Kankanaey Igorots from the highlands of the Philippine Mountain Province as
likely the closest living representatives of the source population that may have
given rise to the Austronesian expansion. This conclusion rests on independent
evidence from various analyses of autosomal data and uniparental markers.
Given the extensive presence of trade goods, cultural and linguistic
evidence of Indian influence in Southeast Asia starting from 2.5kya we also
detect traces of a South Asian signature in different populations in the region
dating to the last couple of thousand years.
Medieval era encounters of nomadic groups of the Eurasian Steppe and largely sedentary East Europeans had a variety of demographic and cultural consequences. Amongst these outcomes was the emergence of the Lipka Tatars—a Slavic-speaking Sunni-Muslim minority residing in modern Belarus, Lithuania and Poland, whose ancestors arrived in these territories via several migration waves, mainly from the Golden Horde. Our results show that Belarusian Lipka Tatars share a substantial part of their gene pool with Europeans as indicated by their Y-chromosomal, mitochondrial and autosomal DNA variation. Nevertheless, Belarusian Lipkas still retain a strong genetic signal of their nomadic ancestry, witnessed by the presence of common Y-chromosomal and mitochondrial DNA variants as well as autosomal segments identical by descent between Lipkas and East Eurasians from temperate and northern regions. Hence, we document Lipka Tatars as a unique example of former Medieval migrants into Central Europe, who became sedentary, changed language to Slavic, yet preserved their faith and retained, both uni- and bi-parentally, a clear genetic echo of a complex population interplay throughout the Eurasian Steppe Belt, extending from Central Europe to northern China.
Clovis, with its distinctive biface, blade and osseous technologies, is the oldest widespread archaeological complex defined in North America, dating from 11,100 to 10,700 14C years BP (13,000 to 12,600 calendar years BP)1,2. Nearly fifty years of archaeological research point to the Clovis complex as having developed south of the North American ice sheets from an ancestral technology3. However, both the origins and genetic legacy of the people who manufactured Clovis tools remain debated. It is argued that these people ultimately derived from Asia and were directly related to contemporary Native Americans2. An alternative, Solutrean, hypothesis posits that the Clovis predecessors immigrated from Southwestern Europe during the Last Glacial Maximum (LGM)4. Here, we report the genome sequence of a male infant (Anzick-1) recovered from the Anzick burial site in western Montana. The human bones date to 10,705±35 14C years BP (CAMS-80538; c. 12,707–12,556 calendar years BP) and were directly associated with Clovis tools. We sequenced the genome to an average depth of 14.4× and show that the gene flow from the Siberian Upper Palaeolithic Mal′ta individual5 into Native American ancestors is also shared by the Anzick-1 individual and thus happened prior to 12,600 years BP. We also show that the Anzick-1 individual is more closely related to all indigenous American populations than to any other group. Our data are compatible with the hypothesis that Anzick-1 belonged to a population directly ancestral to many contemporary Native Americans. Finally, we find evidence of a deep divergence in Native American populations that pre-dates the Anzick-1 individual.
How and when the Americas were populated remains contentious. Using ancient and modern genome-wide data, we find that the ancestors of all present-day Native Americans, including Athabascans and Amerindians, entered the Americas as a single migration wave from Siberia no earlier than 23 thousand years ago (KYA), and after no more than 8,000-year isolation period in Beringia. Following their arrival to the Americas, ancestral Native Americans diversified into two basal genetic branches around 13 KYA, one that is now dispersed across North and South America and the other is restricted to North America. Subsequent gene flow resulted in some Native Americans sharing ancestry with present-day East Asians (including Siberians) and, more distantly, Australo-Melanesians. Putative ‘Paleoamerican’ relict populations, including the historical Mexican Pericúes and South American Fuego-Patagonians, are not directly related to modern Australo-Melanesians as suggested by the Paleoamerican Model.
The northern region of the Indian subcontinent is a vast landscape interlaced by diverse ecologies, for example, the Gangetic Plain and the Himalayas. A great number of ethnic groups are found there, displaying a multitude of languages and cultures. The Tharu is one of the largest and most linguistically diverse of such groups, scattered across the Tarai region of Nepal and bordering Indian states. Their origins are uncertain. Hypotheses have been advanced postulating shared ancestry with Austroasiatic, or Tibeto-Burman-speaking populations as well as aboriginal roots in the Tarai. Several Tharu groups speak a variety of Indo-Aryan languages, but have traditionally been described by ethnographers as representing East Asian phenotype. Their ancestry and intra-population diversity has previously been tested only for haploid (mitochondrial DNA and Y-chromosome) markers in a small portion of the population. This study presents the first systematic genetic survey of the Tharu from both Nepal and two Indian states of Uttarakhand and Uttar Pradesh, using genome-wide SNPs and haploid markers. We show that the Tharu have dual genetic ancestry as up to one-half of their gene pool is of East Asian origin. Within the South Asian proportion of the Tharu genetic ancestry, we see vestiges of their common origin in the north of the South Asian Subcontinent manifested by mitochondrial DNA haplogroup M43.
The bacteria Yersinia pestis is the etiological agent of plague and has caused human pandemics with millions of deaths in historic times. How and when it originated remains contentious. Here, we report the oldest direct evidence of Yersinia pestis identified by ancient DNA in human teeth from Asia and Europe dating from 2,800 to 5,000 years ago. By sequencing the genomes, we find that these ancient plague strains are basal to all known Yersinia pestis. We find the origins of the Yersinia pestis lineage to be at least two times older than previous estimates. We also identify a temporal sequence of genetic changes that lead to increased virulence and the emergence of the bubonic plague. Our results show that plague infection was endemic in the human populations of Eurasia at least 3,000 years before any historical recordings of pandemics.
•Yersinia pestis was common across Eurasia in the Bronze Age•The most recent common ancestor of all Y. pestis was 5,783 years ago•The ymt gene was acquired before 951 cal BC, giving rise to transmission via fleas•Bronze Age Y. pestis was not capable of causing bubonic plague
The plague-causing bacteria Yersinia pestis infected humans in Bronze Age Eurasia, three millennia earlier than any historical records of plague, but only acquired the genetic changes making it a highly virulent, flea-borne bubonic strain ∼3,000 years ago.
In order to explore the diversity and selective signatures of duplication and deletion human copy number variants (CNVs), we sequenced 236 individuals from 125 distinct human populations. We observed that duplications exhibit fundamentally different population genetic and selective signatures than deletions and are more likely to be stratified between human populations. Through reconstruction of the ancestral human genome, we identify megabases of DNA lost in different human lineages and pinpoint large duplications that introgressed from the extinct Denisova lineage now found at high frequency exclusively in Oceanic populations. We find that the proportion of CNV base pairs to single nucleotide variant base pairs is greater among non-Africans than it is among African populations, but we conclude that this difference is likely due to unique aspects of non-African population history as opposed to differences in CNV load.
The Slavic branch of the Balto-Slavic sub-family of Indo-European languages underwent rapid divergence as a result of the spatial expansion of its speakers from Central-East Europe, in early medieval times. This expansion–mainly to East Europe and the northern Balkans–resulted in the incorporation of genetic components from numerous autochthonous populations into the Slavic gene pools. Here, we characterize genetic variation in all extant ethnic groups speaking Balto-Slavic languages by analyzing mitochondrial DNA (n = 6,876), Y-chromosomes (n = 6,079) and genome-wide SNP profiles (n = 296), within the context of other European populations. We also reassess the phylogeny of Slavic languages within the Balto-Slavic branch of Indo-European. We find that genetic distances among Balto-Slavic populations, based on autosomal and Y-chromosomal loci, show a high correlation (0.9) both with each other and with geography, but a slightly lower correlation (0.7) with mitochondrial DNA and linguistic affiliation. The data suggest that genetic diversity of the present-day Slavs was predominantly shaped in situ, and we detect two different substrata: ‘central-east European’ for West and East Slavs, and ‘south-east European’ for South Slavs. A pattern of distribution of segments identical by descent between groups of East-West and South Slavs suggests shared ancestry or a modest gene flow between those two groups, which might derive from the historic spread of Slavic people.
The Turkic peoples represent a diverse collection of ethnic groups defined by the Turkic languages. These groups have dispersed across a vast area, including Siberia, Northwest China, Central Asia, East Europe, the Caucasus, Anatolia, the Middle East, and Afghanistan. The origin and early dispersal history of the Turkic peoples is disputed, with candidates for their ancient homeland ranging from the Transcaspian steppe to Manchuria in Northeast Asia. Previous genetic studies have not identified a clear-cut unifying genetic signal for the Turkic peoples, which lends support for language replacement rather than demic diffusion as the model for the Turkic language’s expansion. We addressed the genetic origin of 373 individuals from 22 Turkic-speaking populations, representing their current geographic range, by analyzing genome-wide high-density genotype data. In agreement with the elite dominance model of language expansion most of the Turkic peoples studied genetically resemble their geographic neighbors. However, western Turkic peoples sampled across West Eurasia shared an excess of long chromosomal tracts that are identical by descent (IBD) with populations from present-day South Siberia and Mongolia (SSM), an area where historians center a series of early Turkic and non-Turkic steppe polities. While SSM matching IBD tracts (> 1cM) are also observed in non-Turkic populations, Turkic peoples demonstrate a higher percentage of such tracts (p-values ≤ 0.01) compared to their non-Turkic neighbors. Finally, we used the ALDER method and inferred admixture dates (~9th–17th centuries) that overlap with the Turkic migrations of the 5th–16th centuries. Thus, our results indicate historical admixture among Turkic peoples, and the recent shared ancestry with modern populations in SSM supports one of the hypothesized homelands for their nomadic Turkic and related Mongolic ancestors.
Centuries of nomadic migrations have ultimately resulted in the distribution of Turkic languages over a large area ranging from Siberia, across Central Asia to Eastern Europe and the Middle East. Despite the profound cultural impact left by these nomadic peoples, little is known about their prehistoric origins. Moreover, because contemporary Turkic speakers tend to genetically resemble their geographic neighbors, it is not clear whether their nomadic ancestors left an identifiable genetic trace. In this study, we show that Turkic-speaking peoples sampled across the Middle East, Caucasus, East Europe, and Central Asia share varying proportions of Asian ancestry that originate in a single area, southern Siberia and Mongolia. Mongolic- and Turkic-speaking populations from this area bear an unusually high number of long chromosomal tracts that are identical by descent with Turkic peoples from across west Eurasia. Admixture induced linkage disequilibrium decay across chromosomes in these populations indicates that admixture occurred during the 9th–17th centuries, in agreement with the historically recorded Turkic nomadic migrations and later Mongol expansion. Thus, our findings reveal genetic traces of recent large-scale nomadic migrations and map their source to a previously hypothesized area of Mongolia and southern Siberia.
We sequenced the genomes of a ~7,000 year old farmer from Germany and eight
~8,000 year old hunter-gatherers from Luxembourg and Sweden. We analyzed these and other
ancient genomes1–4 with 2,345 contemporary humans to show that most
present Europeans derive from at least three highly differentiated populations: West
European Hunter-Gatherers (WHG), who contributed ancestry to all Europeans but not to Near
Easterners; Ancient North Eurasians (ANE) related to Upper Paleolithic Siberians3, who contributed to both Europeans and Near
Easterners; and Early European Farmers (EEF), who were mainly of Near Eastern origin but
also harbored WHG-related ancestry. We model these populations’ deep relationships
and show that EEF had ~44% ancestry from a “Basal Eurasian”
population that split prior to the diversification of other non-African lineages.
The origins of the First Americans remain contentious. Although Native Americans
seem to be genetically most closely related to east Asians1–3, there is no
consensus with regard to which specific Old World populations they are closest
to4–8. Here we sequence the draft genome of an approximately 24,000-year-old
individual (MA-1), from Mal’ta in south-central Siberia9, to an average depth of 13. To our knowledge this is the
oldest anatomically modern human genome reported to date. The MA-1 mitochondrial genome
belongs to haplogroup U, which has also been found at high frequency among Upper
Palaeolithic and Mesolithic European hunter-gatherers10–12, and the Y
chromosome of MA-1 is basal to modern-day western Eurasians and near the root of most
Native American lineages5. Similarly, we
find autosomal evidence that MA-1 is basal to modern-day western Eurasians and genetically
closely related to modern-day Native Americans, with no close affinity to east Asians.
This suggests that populations related to contemporary western Eurasians had a more
north-easterly distribution 24,000 years ago than commonly thought. Furthermore, we
estimate that 14 to 38% of Native American ancestry may originate through gene
flow from this ancient population. This is likely to have occurred after the divergence of
Native American ancestors from east Asian ancestors, but before the diversification of
Native American populations in the New World. Gene flow from the MA-1 lineage into Native
American ancestors could explain why several crania from the First Americans have been
reported as bearing morphological characteristics that do not resemble those of east
Asians2,13. Sequencing of another south-central Siberian, Afontova Gora-2 dating
to approximately 17,000 years ago14,
revealed similar autosomal genetic signatures as MA-1, suggesting that the region was
continuously occupied by humans throughout the Last Glacial Maximum. Our findings reveal
that western Eurasian genetic signatures in modern-day Native Americans derive not only
from post-Columbian admixture, as commonly thought, but also from a mixed ancestry of the
Following the dispersal out of Africa, where hominins evolved in warm environments for millions of years, our species has colonised different climate zones of the world, including high latitudes and cold environments. The extent to which human habitation in (sub-)Arctic regions has been enabled by cultural buffering, short-term acclimatization and genetic adaptations is not clearly understood. Present day indigenous populations of Siberia show a number of phenotypic features, such as increased basal metabolic rate, low serum lipid levels and increased blood pressure that have been attributed to adaptation to the extreme cold climate. In this study we introduce a dataset of 200 individuals from ten indigenous Siberian populations that were genotyped for 730,525 SNPs across the genome to identify genes and non-coding regions that have undergone unusually rapid allele frequency and long-range haplotype homozygosity change in the recent past. At least three distinct population clusters could be identified among the Siberians, each of which showed a number of unique signals of selection. A region on chromosome 11 (chr11:66–69 Mb) contained the largest amount of clustering of significant signals and also the strongest signals in all the different selection tests performed. We present a list of candidate cold adaption genes that showed significant signals of positive selection with our strongest signals associated with genes involved in energy regulation and metabolism (CPT1A, LRP5, THADA) and vascular smooth muscle contraction (PRKG1). By employing a new method that paints phased chromosome chunks by their ancestry we distinguish local Siberian-specific long-range haplotype signals from those introduced by admixture.
We present an Aboriginal Australian genomic sequence obtained from a 100-year-old lock of hair donated by an Aboriginal man from southern Western Australia in the early 20th century. We detect no evidence of European admixture and estimate contamination levels to be below 0.5%. We show that Aboriginal Australians are descendants of an early human dispersal into eastern Asia, possibly 62,000 to 75,000 years ago. This dispersal is separate from the one that gave rise to modern Asians 25,000 to 38,000 years ago. We also find evidence of gene flow between populations of the two dispersal waves prior to the divergence of Native Americans from modern Asian ancestors. Our findings support the hypothesis that present-day Aboriginal Australians descend from the earliest humans to occupy Australia, likely representing one of the oldest continuous populations outside Africa.
We report here the genome sequence of an ancient human. Obtained from ∼4,000-year-old permafrost-preserved hair, the genome represents a male individual from the first known culture to settle in Greenland. Sequenced to an average depth of 20×, we recover 79% of the diploid genome, an amount close to the practical limit of current sequencing technologies. We identify 353,151 high-confidence single-nucleotide polymorphisms (SNPs), of which 6.8% have not been reported previously. We estimate raw read contamination to be no higher than 0.8%. We use functional SNP assessment to assign possible phenotypic characteristics of the individual that belonged to a culture whose location has yielded only trace human remains. We compare the high-confidence SNPs to those of contemporary populations to find the populations most closely related to the individual. This provides evidence for a migration from Siberia into the New World some 5,500 years ago, independent of that giving rise to the modern Native Americans and Inuit.
Skin pigmentation is one of the most variable phenotypic traits in humans. A non-synonymous substitution (rs1426654) in the third exon of SLC24A5 accounts for lighter skin in Europeans but not in East Asians. A previous genome-wide association study carried out in a heterogeneous sample of UK immigrants of South Asian descent suggested that this gene also contributes significantly to skin pigmentation variation among South Asians. In the present study, we have quantitatively assessed skin pigmentation for a largely homogeneous cohort of 1228 individuals from the Southern region of the Indian subcontinent. Our data confirm significant association of rs1426654 SNP with skin pigmentation, explaining about 27% of total phenotypic variation in the cohort studied. Our extensive survey of the polymorphism in 1573 individuals from 54 ethnic populations across the Indian subcontinent reveals wide presence of the derived-A allele, although the frequencies vary substantially among populations. We also show that the geospatial pattern of this allele is complex, but most importantly, reflects strong influence of language, geography and demographic history of the populations. Sequencing 11.74 kb of SLC24A5 in 95 individuals worldwide reveals that the rs1426654-A alleles in South Asian and West Eurasian populations are monophyletic and occur on the background of a common haplotype that is characterized by low genetic diversity. We date the coalescence of the light skin associated allele at 22–28 KYA. Both our sequence and genome-wide genotype data confirm that this gene has been a target for positive selection among Europeans. However, the latter also shows additional evidence of selection in populations of the Middle East, Central Asia, Pakistan and North India but not in South India.
Human skin color is one of the most visible aspects of human diversity. The genetic basis of pigmentation in Europeans has been understood to some extent, but our knowledge about South Asians has been restricted to a handful of studies. It has been suggested that a single nucleotide difference in SLC24A5 accounts for 25–38% European-African pigmentation differences and correlates with lighter skin. This genetic variant has also been associated with skin color variation among South Asians living in the UK. Here, we report a study based on a homogenous cohort of South India. Our results confirm that SLC24A5 plays a key role in pigmentation diversity of South Asians. Country-wide screening of the variant reveals that the light skin associated allele is widespread in the Indian subcontinent and its complex patterning is shaped by a combination of processes involving selection and demographic history of the populations. By studying the variation of SLC24A5 sequences among a diverse set of individuals, we show that the light skin associated allele in South Asians is identical by descent to that found in Europeans. Our study also provides new insights into positive selection acting on the gene and the evolutionary history of light skin in humans.
Despite being located at the crossroads of Asia, genetics of the Afghanistan populations have been largely overlooked. It is currently inhabited by five major ethnic populations: Pashtun, Tajik, Hazara, Uzbek and Turkmen. Here we present autosomal from a subset of our samples, mitochondrial and Y- chromosome data from over 500 Afghan samples among these 5 ethnic groups. This Afghan data was supplemented with the same Y-chromosome analyses of samples from Iran, Kyrgyzstan, Mongolia and updated Pakistani samples (HGDP-CEPH). The data presented here was integrated into existing knowledge of pan-Eurasian genetic diversity. The pattern of genetic variation, revealed by structure-like and Principal Component analyses and Analysis of Molecular Variance indicates that the people of Afghanistan are made up of a mosaic of components representing various geographic regions of Eurasian ancestry. The absence of a major Central Asian-specific component indicates that the Hindu Kush, like the gene pool of Central Asian populations in general, is a confluence of gene flows rather than a source of distinctly autochthonous populations that have arisen in situ: a conclusion that is reinforced by the phylogeography of both haploid loci.
Sakha – an area connecting South and Northeast Siberia – is significant for understanding the history of peopling of Northeast Eurasia and the Americas. Previous studies have shown a genetic contiguity between Siberia and East Asia and the key role of South Siberia in the colonization of Siberia.
We report the results of a high-resolution phylogenetic analysis of 701 mtDNAs and 318 Y chromosomes from five native populations of Sakha (Yakuts, Evenks, Evens, Yukaghirs and Dolgans) and of the analysis of more than 500,000 autosomal SNPs of 758 individuals from 55 populations, including 40 previously unpublished samples from Siberia. Phylogenetically terminal clades of East Asian mtDNA haplogroups C and D and Y-chromosome haplogroups N1c, N1b and C3, constituting the core of the gene pool of the native populations from Sakha, connect Sakha and South Siberia. Analysis of autosomal SNP data confirms the genetic continuity between Sakha and South Siberia. Maternal lineages D5a2a2, C4a1c, C4a2, C5b1b and the Yakut-specific STR sub-clade of Y-chromosome haplogroup N1c can be linked to a migration of Yakut ancestors, while the paternal lineage C3c was most likely carried to Sakha by the expansion of the Tungusic people. MtDNA haplogroups Z1a1b and Z1a3, present in Yukaghirs, Evens and Dolgans, show traces of different and probably more ancient migration(s). Analysis of both haploid loci and autosomal SNP data revealed only minor genetic components shared between Sakha and the extreme Northeast Siberia. Although the major part of West Eurasian maternal and paternal lineages in Sakha could originate from recent admixture with East Europeans, mtDNA haplogroups H8, H20a and HV1a1a, as well as Y-chromosome haplogroup J, more probably reflect an ancient gene flow from West Eurasia through Central Asia and South Siberia.
Our high-resolution phylogenetic dissection of mtDNA and Y-chromosome haplogroups as well as analysis of autosomal SNP data suggests that Sakha was colonized by repeated expansions from South Siberia with minor gene flow from the Lower Amur/Southern Okhotsk region and/or Kamchatka. The minor West Eurasian component in Sakha attests to both recent and ongoing admixture with East Europeans and an ancient gene flow from West Eurasia.
mtDNA; Y chromosome; Autosomal SNPs; Sakha
Population genetic studies on European populations have highlighted Italy as one of genetically most diverse regions. This is possibly due to the country's complex demographic history and large variability in terrain throughout the territory. This is the reason why Italy is enriched for population isolates, Sardinia being the best-known example. As the population isolates have a great potential in disease-causing genetic variants identification, we aimed to genetically characterize a region from northeastern Italy, which is known for isolated communities. Total of 1310 samples, collected from six geographically isolated villages, were genotyped at >145 000 single-nucleotide polymorphism positions. Newly genotyped data were analyzed jointly with the available genome-wide data sets of individuals of European descent, including several population isolates. Despite the linguistic differences and geographical isolation the village populations still show the greatest genetic similarity to other Italian samples. The genetic isolation and small effective population size of the village populations is manifested by higher levels of genomic homozygosity and elevated linkage disequilibrium. These estimates become even more striking when the detected substructure is taken into account. The observed level of genetic isolation in Friuli-Venezia Giulia region is more extreme according to several measures of isolation compared with Sardinians, French Basques and northern Finns, thus proving the status of an isolate.
population genetics; isolated population; genetic distance
A Southwest Asian origin and dispersal to North Africa in the Early Upper Palaeolithic era has been inferred in previous studies for mtDNA haplogroups M1 and U6. Both haplogroups have been proposed to show similar geographic patterns and shared demographic histories.
We report here 24 M1 and 33 U6 new complete mtDNA sequences that allow us to refine the existing phylogeny of these haplogroups. The resulting phylogenetic information was used to genotype a further 131 M1 and 91 U6 samples to determine the geographic spread of their sub-clades. No southwest Asian specific clades for M1 or U6 were discovered. U6 and M1 frequencies in North Africa, the Middle East and Europe do not follow similar patterns, and their sub-clade divisions do not appear to be compatible with their shared history reaching back to the Early Upper Palaeolithic. The Bayesian Skyline Plots testify to non-overlapping phases of expansion, and the haplogroups’ phylogenies suggest that there are U6 sub-clades that expanded earlier than those in M1. Some M1 and U6 sub-clades could be linked with certain events. For example, U6a1 and M1b, with their coalescent ages of ~20,000–22,000 years ago and earliest inferred expansion in northwest Africa, could coincide with the flourishing of the Iberomaurusian industry, whilst U6b and M1b1 appeared at the time of the Capsian culture.
Our high-resolution phylogenetic dissection of both haplogroups and coalescent time assessments suggest that the extant main branching pattern of both haplogroups arose and diversified in the mid-later Upper Palaeolithic, with some sub-clades concomitantly with the expansion of the Iberomaurusian industry. Carriers of these maternal lineages have been later absorbed into and diversified further during the spread of Afro-Asiatic languages in North and East Africa.
mtDNA haplogroups M1 and U6; Afro-Asiatic languages; North Africa
The geographic origin and time of dispersal of Austroasiatic (AA) speakers, presently settled in south and southeast Asia, remains disputed. Two rival hypotheses, both assuming a demic component to the language dispersal, have been proposed. The first of these places the origin of Austroasiatic speakers in southeast Asia with a later dispersal to south Asia during the Neolithic, whereas the second hypothesis advocates pre-Neolithic origins and dispersal of this language family from south Asia. To test the two alternative models, this study combines the analysis of uniparentally inherited markers with 610,000 common single nucleotide polymorphism loci from the nuclear genome. Indian AA speakers have high frequencies of Y chromosome haplogroup O2a; our results show that this haplogroup has significantly higher diversity and coalescent time (17–28 thousand years ago) in southeast Asia, strongly supporting the first of the two hypotheses. Nevertheless, the results of principal component and “structure-like” analyses on autosomal loci also show that the population history of AA speakers in India is more complex, being characterized by two ancestral components—one represented in the pattern of Y chromosomal and EDAR results and the other by mitochondrial DNA diversity and genomic structure. We propose that AA speakers in India today are derived from dispersal from southeast Asia, followed by extensive sex-specific admixture with local Indian populations.
Austroasiatic; mtDNA; Y chromosome; autosomes; admixture
Human Y-chromosome haplogroup structure is largely circumscribed by continental boundaries. One notable exception to this general pattern is the young haplogroup R1a that exhibits post-Glacial coalescent times and relates the paternal ancestry of more than 10% of men in a wide geographic area extending from South Asia to Central East Europe and South Siberia. Its origin and dispersal patterns are poorly understood as no marker has yet been described that would distinguish European R1a chromosomes from Asian. Here we present frequency and haplotype diversity estimates for more than 2000 R1a chromosomes assessed for several newly discovered SNP markers that introduce the onset of informative R1a subdivisions by geography. Marker M434 has a low frequency and a late origin in West Asia bearing witness to recent gene flow over the Arabian Sea. Conversely, marker M458 has a significant frequency in Europe, exceeding 30% in its core area in Eastern Europe and comprising up to 70% of all M17 chromosomes present there. The diversity and frequency profiles of M458 suggest its origin during the early Holocene and a subsequent expansion likely related to a number of prehistoric cultural developments in the region. Its primary frequency and diversity distribution correlates well with some of the major Central and East European river basins where settled farming was established before its spread further eastward. Importantly, the virtual absence of M458 chromosomes outside Europe speaks against substantial patrilineal gene flow from East Europe to Asia, including to India, at least since the mid-Holocene.
Y chromosome; haplogroup R1a; human evolution; population genetics
We have analyzed 7137 samples from 125 different caste, tribal and religious groups of India and 99 samples from three populations of Nepal for the length variation in the COII/tRNALys region of mtDNA. Samples showing length variation were subjected to detailed phylogenetic analysis based on HVS-I and informative coding region sequence variation. The overall frequencies of the 9-bp deletion and insertion variants in South Asia were 1.8% and 0.5%, respectively. We have also defined a novel deep-rooting haplogroup M43 and identified the rare haplogroup H14 in Indian populations carrying the 9bp-deletion by complete mtDNA sequencing. Moreover, we redefined haplogroup M6 and dissected it into two well-defined subclades. The presence of haplogroups F1 and B5a in Uttar Pradesh suggests minor maternal contribution from Southeast Asia to Northern India. The occurrence of haplogroup F1 in the Nepalese sample implies that Nepal might have served as a bridge for the flow of eastern lineages to India. The presence of R6 in the Nepalese, on the other hand, suggests that the gene flow between India and Nepal has been reciprocal.
South Asia; 9bp indel; mtDNA; Haplogroup