To better define the structure and origin of the Bulgarian paternal gene pool, we have examined the Y-chromosome variation in 808 Bulgarian males. The analysis was performed by high-resolution genotyping of biallelic markers and by analyzing the STR variation within the most informative haplogroups. We found that the Y-chromosome gene pool in modern Bulgarians is primarily represented by Western Eurasian haplogroups with ~40% belonging to haplogroups E-V13 and I-M423, and 20% to R-M17. Haplogroups common in the Middle East (J and G) and in South Western Asia (R-L23*) occur at frequencies of 19% and 5%, respectively. Haplogroups C, N and Q, distinctive for Altaic and Central Asian Turkic-speaking populations, occur at the negligible frequency of only 1.5%. Principal Component analyses group Bulgarians with European populations, apart from Central Asian Turkic-speaking groups and South Western Asia populations. Within the country, the genetic variation is structured in Western, Central and Eastern Bulgaria, indicating that the Balkan Mountains have been permeable to human movements. The lineage analysis provided the following interesting results: (i) R-L23* has been present in Eastern Bulgaria since the postglacial period; (ii) haplogroup E-V13 has a Mesolithic age in Bulgaria, from where it expanded after the arrival of farming; (iii) haplogroup J-M241 probably reflects the Neolithic westward expansion of farmers from the earliest sites along the Black Sea. On the whole, in light of the most recent historical studies, which indicate a substantial proto-Bulgarian input to the contemporary Bulgarian people, our data suggest that a common paternal ancestry between the proto-Bulgarians and the Altaic and Central Asian Turkic-speaking populations either did not exist or was negligible.
Sakha – an area connecting South and Northeast Siberia – is significant for understanding the history of peopling of Northeast Eurasia and the Americas. Previous studies have shown a genetic contiguity between Siberia and East Asia and the key role of South Siberia in the colonization of Siberia.
We report the results of a high-resolution phylogenetic analysis of 701 mtDNAs and 318 Y chromosomes from five native populations of Sakha (Yakuts, Evenks, Evens, Yukaghirs and Dolgans) and of the analysis of more than 500,000 autosomal SNPs of 758 individuals from 55 populations, including 40 previously unpublished samples from Siberia. Phylogenetically terminal clades of East Asian mtDNA haplogroups C and D and Y-chromosome haplogroups N1c, N1b and C3, constituting the core of the gene pool of the native populations from Sakha, connect Sakha and South Siberia. Analysis of autosomal SNP data confirms the genetic continuity between Sakha and South Siberia. Maternal lineages D5a2a2, C4a1c, C4a2, C5b1b and the Yakut-specific STR sub-clade of Y-chromosome haplogroup N1c can be linked to a migration of Yakut ancestors, while the paternal lineage C3c was most likely carried to Sakha by the expansion of the Tungusic people. MtDNA haplogroups Z1a1b and Z1a3, present in Yukaghirs, Evens and Dolgans, show traces of different and probably more ancient migration(s). Analysis of both haploid loci and autosomal SNP data revealed only minor genetic components shared between Sakha and the extreme Northeast Siberia. Although the major part of West Eurasian maternal and paternal lineages in Sakha could originate from recent admixture with East Europeans, mtDNA haplogroups H8, H20a and HV1a1a, as well as Y-chromosome haplogroup J, more probably reflect an ancient gene flow from West Eurasia through Central Asia and South Siberia.
Our high-resolution phylogenetic dissection of mtDNA and Y-chromosome haplogroups as well as analysis of autosomal SNP data suggests that Sakha was colonized by repeated expansions from South Siberia with minor gene flow from the Lower Amur/Southern Okhotsk region and/or Kamchatka. The minor West Eurasian component in Sakha attests to both recent and ongoing admixture with East Europeans and an ancient gene flow from West Eurasia.
mtDNA; Y chromosome; Autosomal SNPs; Sakha
Huntington disease (HD) results from CAG expansion in the huntingtin (HTT) gene. Although HD occurs worldwide, there are large geographic differences in its prevalence. The prevalence in populations derived from Europe is 10–100 times greater than in East Asia. The European general population chromosomes can be grouped into three major haplogroups (group of similar haplotypes): A, B and C. The majority of HD chromosomes in Europe are found on haplogroup A. However, in the East-Asian populations of China and Japan, we find the majority of HD chromosomes are associated with haplogroup C. The highest risk HD haplotypes (A1 and A2), are absent from the general and HD populations of China and Japan, and therefore provide an explanation for why HD prevalence is low in East Asia. Interestingly, both East-Asian and European populations share a similar low level of HD on haplogroup C. Our data are consistent with the hypothesis that different HTT haplotypes have different mutation rates, and geographic differences in HTT haplotypes explain the difference in HD prevalence. Further, the bias for expansion on haplogroup C in the East-Asian population cannot be explained by a higher average CAG size, as haplogroup C has a lower average CAG size in the general East-Asian population compared with other haplogroups. This finding suggests that CAG-tract size is not the only factor important for CAG instability. Instead, the expansion bias may be because of genetic cis-elements within the haplotype that influence CAG instability in HTT, possibly through different mutational mechanisms for the different haplogroups.
Huntington disease; prevalence; CAG expansion; CAG instability; haplotypes; Cis-elements
Kazakh populations have traditionally lived as nomadic pastoralists that seasonally migrate across the steppe and surrounding mountain ranges in Kazakhstan and southern Siberia. To clarify their population history from a paternal perspective, we analyzed the non-recombining portion of the Y-chromosome from Kazakh populations living in southern Altai Republic, Russia, using a high-resolution analysis of 60 biallelic markers and 17 STRs. We noted distinct differences in the patterns of genetic variation between maternal and paternal genetic systems in the Altaian Kazakhs. While they possess a variety of East and West Eurasian mtDNA haplogroups, only three East Eurasian paternal haplogroups appear at significant frequencies (C3*, C3c and O3a3c*). In addition, the Y-STR data revealed low genetic diversity within these lineages. Analysis of the combined biallelic and STR data also demonstrated genetic differences among Kazakh populations from across Central Asia. The observed differences between Altaian Kazakhs and indigenous Kazakhs were not the result of admixture between Altaian Kazakhs and indigenous Altaians. Overall, the shared paternal ancestry of Kazakhs differentiates them from other Central Asian populations. In addition, all of them showed evidence of genetic influence by the 13th century CE Mongol Empire. Ultimately, the social and cultural traditions of the Kazakhs shaped their current pattern of genetic variation.
We analyzed 40 SNP and 19 STR Y-chromosomal markers in a large sample of 1,525 indigenous individuals from 14 populations in the Caucasus and 254 additional individuals representing potential source populations. We also employed a lexicostatistical approach to reconstruct the history of the languages of the North Caucasian family spoken by the Caucasus populations. We found a different major haplogroup to be prevalent in each of four sets of populations that occupy distinct geographic regions and belong to different linguistic branches. The haplogroup frequencies correlated with geography and, even more strongly, with language. Within haplogroups, a number of haplotype clusters were shown to be specific to individual populations and languages. The data suggested a direct origin of Caucasus male lineages from the Near East, followed by high levels of isolation, differentiation and genetic drift in situ. Comparison of genetic and linguistic reconstructions covering the last few millennia showed striking correspondences between the topology and dates of the respective gene and language trees, and with documented historical events. Overall, in the Caucasus region, unmatched levels of gene-language co-evolution occurred within geographically isolated populations, probably due to its mountainous terrain.
Y chromosome; glottochronology; Caucasus; gene geography
The geographic origin and time of dispersal of Austroasiatic (AA) speakers, presently settled in south and southeast Asia, remain disputed. Two rival hypotheses, both assuming a demic component to the language dispersal, have been proposed. The first of these places the origin of Austroasiatic speakers in southeast Asia with a later dispersal to south Asia during the Neolithic, whereas the second hypothesis advocates pre-Neolithic origins and dispersal of this language family from south Asia. To test the two alternative models, this study combines the analysis of uniparentally inherited markers with 610,000 common single nucleotide polymorphism loci from the nuclear genome. Indian AA speakers have high frequencies of Y chromosome haplogroup O2a; our results show that this haplogroup has significantly higher diversity and coalescent time (17–28 thousand years ago) in southeast Asia, strongly supporting the first of the two hypotheses. Nevertheless, the results of principal component and “structure-like” analyses on autosomal loci also show that the population history of AA speakers in India is more complex, being characterized by two ancestral components: one represented in the pattern of Y chromosomal and EDAR results and the other by mitochondrial DNA diversity and genomic structure. We propose that AA speakers in India today are derived from dispersal from southeast Asia, followed by extensive sex-specific admixture with local Indian populations.
Austroasiatic; mtDNA; Y chromosome; autosomes; admixture
This study aims to establish the likely origin of EEJ (Eastern European Jews) by genetic distance analysis of autosomal markers and haplogroups on the X and Y chromosomes and mtDNA.
According to the autosomal polymorphisms, the investigated Jewish populations do not share a common origin, and EEJ are closer to Italians in particular and to Europeans in general than to the other Jewish populations. The similarity of EEJ to Italians and Europeans is also supported by the X chromosomal haplogroups. In contrast, according to the Y-chromosomal haplogroups, EEJ are closest to the non-Jewish populations of the Eastern Mediterranean. MtDNA shows a mixed pattern, but overall EEJ are more distant from most populations and hold a marginal rather than a central position. The autosomal genetic distance matrix has a very high correlation (0.789) with geography, whereas the X-chromosomal, Y-chromosomal and mtDNA matrices have lower correlations (0.540, 0.395 and 0.641, respectively).
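The correlation between a genetic distance matrix and geography reported above can be illustrated with a minimal sketch. It assumes a Mantel-type statistic, that is, a Pearson correlation over the paired off-diagonal entries of two symmetric distance matrices; the abstract does not state which procedure the authors used, so this is an assumption. The function names and the four-population toy matrices are hypothetical, and the permutation step normally used to assess significance is omitted.

```python
from itertools import combinations
from math import sqrt


def upper_triangle(matrix):
    """Return the entries above the diagonal of a square matrix, row by row."""
    n = len(matrix)
    return [matrix[i][j] for i, j in combinations(range(n), 2)]


def matrix_correlation(dist_a, dist_b):
    """Pearson correlation between the paired upper-triangle entries."""
    xs = upper_triangle(dist_a)
    ys = upper_triangle(dist_b)
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("matrices must have the same size, with at least 3 populations")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data for four populations (not taken from the study).
geographic_km = [
    [0, 500, 1200, 2500],
    [500, 0, 900, 2100],
    [1200, 900, 0, 1500],
    [2500, 2100, 1500, 0],
]
genetic = [
    [0.000, 0.012, 0.020, 0.041],
    [0.012, 0.000, 0.018, 0.030],
    [0.020, 0.018, 0.000, 0.027],
    [0.041, 0.030, 0.027, 0.000],
]

r = matrix_correlation(genetic, geographic_km)
print(f"matrix correlation r = {r:.3f}")
```

Only the upper triangle is used because a distance matrix is symmetric with a zero diagonal, so the remaining entries carry no extra information.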
The close genetic resemblance to Italians accords with the historical presumption that Ashkenazi Jews started their migrations across Europe in Italy and with historical evidence that conversion to Judaism was common in ancient Rome. The reasons for the discrepancy between the biparental markers and the uniparental markers are discussed.
This article was reviewed by Damian Labuda (nominated by Jerzy Jurka), Kateryna Makova and Qasim Ayub (nominated by Dan Graur).
The Y-chromosome haplogroup N-M231 (Hg N) is distributed widely in eastern and central Asia, Siberia, as well as in eastern and northern Europe. Previous studies suggested a counterclockwise prehistoric migration of Hg N from eastern Asia to eastern and northern Europe. However, the root of this Y chromosome lineage and its detailed dispersal pattern across eastern Asia are still unclear. We analyzed haplogroup profiles and phylogeographic patterns of 1,570 Hg N individuals from 20,826 males in 359 populations across Eurasia. We first genotyped 6,371 males from 169 populations in China and Cambodia, and generated data for 360 Hg N individuals, and then combined published data on 1,210 Hg N individuals from Japanese, Southeast Asian, Siberian, European and Central Asian populations. The results showed that the sub-haplogroups of Hg N have a distinct geographical distribution. The highest Y-STR diversity of the ancestral Hg N sub-haplogroups was observed in the southern part of mainland East Asia, and further phylogeographic analyses support an origin of Hg N in southern China. Combined with previous data, we propose that the early northward dispersal of Hg N started from southern China about 21 thousand years ago (kya), expanding into northern China 12–18 kya, and reaching further north to Siberia about 12–14 kya before a population expansion and westward migration into Central Asia and eastern/northern Europe around 8.0–10.0 kya. This northward migration of Hg N likewise coincides with retreating ice sheets after the Last Glacial Maximum (22–18 kya) in mainland East Asia.
While it is generally accepted that patterns of intra-specific genetic differentiation are substantially affected by glacial history, population genetic processes occurring during Pleistocene glaciations are still poorly understood. In this study, we address the question of the genetic consequences of Pleistocene glaciations for European grey wolves. Combining our data with data from published studies, we analysed phylogenetic relationships and geographic distribution of mitochondrial DNA haplotypes for 947 contemporary European wolves. We also compared the contemporary wolf sequences with published sequences of 24 ancient European wolves.
We found that haplotypes representing two haplogroups, 1 and 2, overlap geographically, but substantially differ in frequency between populations from south-western and eastern Europe. A comparison between haplotypes from Europe and other continents showed that both haplogroups are spread throughout Eurasia, while only haplogroup 1 occurs in contemporary North American wolves. All ancient wolf samples from western Europe that dated from between 44,000 and 1,200 years B.P. belonged to haplogroup 2, suggesting the long-term predominance of this haplogroup in this region. Moreover, a comparison of current and past frequencies and distributions of the two haplogroups in Europe suggested that haplogroup 2 became outnumbered by haplogroup 1 during the last several thousand years.
Parallel haplogroup replacement, with haplogroup 2 being totally replaced by haplogroup 1, has been reported for North American grey wolves. Taking into account the similarity of diets reported for the late Pleistocene wolves from Europe and North America, the correspondence between these haplogroup frequency changes may suggest that they were associated with ecological changes occurring after the Last Glacial Maximum.
Evenks and Evens, Tungusic-speaking reindeer herders and hunter-gatherers, are spread over a wide area of northern Asia, whereas their linguistic relatives the Udegey, sedentary fishermen and hunter-gatherers, are settled to the south of the lower Amur River. The prehistory and relationships of these Tungusic peoples are as yet poorly investigated, especially with respect to their interactions with neighbouring populations. In this study, we analyse over 500 complete mtDNA genome sequences from nine different Evenk and Even subgroups as well as their geographic neighbours from Siberia and their linguistic relatives the Udegey from the Amur-Ussuri region in order to investigate the prehistory of the Tungusic populations. These data are supplemented with analyses of Y-chromosomal haplogroups and STR haplotypes in the Evenks, Evens, and neighbouring Siberian populations. We demonstrate that whereas the North Tungusic Evenks and Evens show evidence of shared ancestry both in the maternal and in the paternal line, this signal has been attenuated by genetic drift and differential gene flow with neighbouring populations, with isolation by distance further shaping the maternal genepool of the Evens. The Udegey, in contrast, appear quite divergent from their linguistic relatives in the maternal line, with a mtDNA haplogroup composition characteristic of populations of the Amur-Ussuri region. Nevertheless, they show affinities with the Evenks, indicating that they might be the result of admixture between local Amur-Ussuri populations and Tungusic populations from the north.
Knowledge of high-resolution Y-chromosome haplogroup diversification within Iran provides important geographic context regarding the spread and compartmentalization of male lineages in the Middle East and southwestern Asia. At present, the Iranian population is characterized by an extraordinary mix of different ethnic groups speaking a variety of Indo-Iranian, Semitic and Turkic languages. Despite these features, only a few studies have investigated the multiethnic components of the Iranian gene pool. In this survey, 938 Iranian male DNAs belonging to 15 ethnic groups from 14 Iranian provinces were analyzed for 84 Y-chromosome biallelic markers and 10 STRs. The results show an autochthonous but non-homogeneous ancient background mainly composed of J2a sub-clades with different external contributions. The phylogeography of the main haplogroups allowed the identification of post-glacial and Neolithic expansions toward western Eurasia but also recent movements towards the Iranian region from western Eurasia (R1b-L23), Central Asia (Q-M25), Asia Minor (J2a-M92) and southern Mesopotamia (J1-Page08). In spite of the presence of important geographic barriers (Zagros and Alborz mountain ranges, and the Dasht-e Kavir and Dasht-e Lut deserts) which may have limited gene flow, AMOVA analysis revealed that language, in addition to geography, has played an important role in shaping the present-day Iranian gene pool. Overall, this study provides a portrait of the Y-chromosomal variation in Iran, useful for depicting a more comprehensive history of the peoples of this area as well as for reconstructing ancient migration routes. In addition, our results highlight the important role of the Iranian plateau as source and recipient of gene flow between culturally and genetically distinct populations.
Koreans are generally considered a Northeast Asian group, thought to be related to Altaic-language-speaking populations. However, recent findings have indicated that the peopling of Korea might have been more complex, involving dual origins from both southern and northern parts of East Asia. To understand the male lineage history of Korea, more data from informative genetic markers from Korea and its surrounding regions are necessary. In this study, 25 Y-chromosome single nucleotide polymorphism markers and 17 Y-chromosome short tandem repeat (Y-STR) loci were genotyped in 1,108 males from several populations in East Asia.
In general, we found East Asian populations to be characterized by male haplogroup homogeneity, showing major Y-chromosomal expansions of haplogroup O-M175 lineages. Interestingly, a high frequency (31.4%) of haplogroup O2b-SRY465 (and its sublineage) is characteristic of male Koreans, whereas the haplogroup distribution elsewhere in East Asian populations is patchy. The ages of the haplogroup O2b-SRY465 lineages (~9,900 years) and the pattern of variation within the lineages suggested an ancient origin in a nearby part of northeastern Asia, followed by an expansion in the vicinity of the Korean Peninsula. In addition, the coalescence time (~4,400 years) for the age of haplogroup O2b1-47z, and its Y-STR diversity, suggest that this lineage probably originated in Korea. Further studies with sufficiently large sample sizes to cover the vast East Asian region and using genomewide genotyping should provide further insights.
These findings are consistent with linguistic, archaeological and historical evidence, which suggest that the direct ancestors of Koreans were proto-Koreans who inhabited the northeastern region of China and the Korean Peninsula during the Neolithic (8,000-1,000 BC) and Bronze (1,500-400 BC) Ages.
At the southern entrance to East Asia, early population migration has affected most of the Y-chromosome variations of East Asians.
To assess the isolated genetic structure of Hainan Island and the original genetic structure at the southern entrance, we studied the Y chromosome diversity of 405 Hainan Island aborigines from all six populations, who have been little influenced by recent mainland population relocations and admixture. Here we report that haplogroups O1a* and O2a* are dominant among Hainan aborigines. In addition, the frequency of the mainland dominant haplogroup O3 is quite low among these aborigines, indicating that they have lived in relative isolation. Clustering analyses suggest that the Hainan aborigines have been segregated since about 20 thousand years ago, after two dominant haplogroups entered East Asia (31 to 36 thousand years ago).
Our results suggest that Hainan aborigines have been isolated at the entrance to East Asia for about 20 thousand years; their distinctive genetic characteristics could serve as important controls in many population genetic studies.
Phylogenetic mitochondrial DNA haplogroups are highly partitioned across global geographic regions. A unique exception is the X haplogroup, which has a widespread global distribution without major regions of distinct localization.
We have examined mitochondrial DNA sequence variation together with Y-chromosome-based haplogroup structure among the Druze, a religious minority with a unique socio-demographic history residing in the Near East. We observed a striking overall pattern of heterogeneous parental origins, consistent with Druze oral tradition, together with both a high frequency and a high diversity of the mitochondrial DNA (mtDNA) X haplogroup within a confined regional subpopulation. Furthermore, demographic modeling indicated low migration rates with nearby populations.
These findings were enabled through the use of a paternal kindred based sampling approach, and suggest that the Galilee Druze represent a population isolate, and that the combination of a high frequency and diversity of the mtDNA X haplogroup signifies a phylogenetic refugium, providing a sample snapshot of the genetic landscape of the Near East prior to the modern age.
Diversity patterns of livestock species are informative about the history of agriculture and indicate the uniqueness of breeds, which is relevant for conservation. So far, most studies on cattle have focused on mitochondrial and autosomal DNA variation. Previous studies of Y-chromosomal variation, with limited breed panels, identified two Bos taurus (taurine) haplogroups (Y1 and Y2; both composed of several haplotypes) and one Bos indicus (indicine/zebu) haplogroup (Y3), as well as a strong phylogeographic structuring of paternal lineages.
Methodology and Principal Findings
Haplogroup data were collected for 2087 animals from 138 breeds. For 111 breeds, these were resolved further by genotyping microsatellites INRA189 (10 alleles) and BM861 (2 alleles). European cattle carry exclusively taurine haplotypes, with the zebu Y-chromosomes having appreciable frequencies in Southwest Asian populations. Y1 is predominant in northern and north-western Europe, but is also observed in several Iberian breeds, as well as in Southwest Asia. A single Y1 haplotype is predominant in north-central Europe and a single Y2 haplotype in central Europe. In contrast, we found both Y1 and Y2 haplotypes in Britain, the Nordic region and Russia, with the highest Y-chromosomal diversity seen in the Iberian Peninsula.
We propose that the homogeneous Y1 and Y2 regions reflect founder effects associated with the development and expansion of two groups of dairy cattle, the pied or red breeds from the North Sea and Baltic coasts and the spotted, yellow or brown breeds from Switzerland, respectively. The present Y1-Y2 contrast in central Europe coincides with historic, linguistic, religious and cultural boundaries.
The phylogenetic relationships of numerous branches within the core Y-chromosome haplogroup R-M207 support a West Asian origin of haplogroup R1b, its initial differentiation there followed by a rapid spread of one of its sub-clades carrying the M269 mutation to Europe. Here, we present phylogeographically resolved data for 2043 M269-derived Y-chromosomes from 118 West Asian and European populations assessed for the M412 SNP that largely separates the majority of Central and West European R1b lineages from those observed in Eastern Europe, the Circum-Uralic region, the Near East, the Caucasus and Pakistan. Within the M412 dichotomy, the major S116 sub-clade shows a frequency peak in the upper Danube basin and the Paris area, with declining frequency toward Italy, Iberia, Southern France and the British Isles. Although this frequency pattern closely approximates the spread of the Linearbandkeramik (LBK) Neolithic culture, an advent leading to a number of pre-historic cultural developments during the past ≤10 thousand years, more complex pre-Neolithic scenarios remain possible for the L23(xM412) components in Southeast Europe and elsewhere.
Y-chromosome; haplogroup R1b; human evolution; population genetics
Linguistic and genetic studies on Roma populations inhabiting Europe have unequivocally traced these populations to the Indian subcontinent. However, the exact parental population group and time of the out-of-India dispersal have remained disputed. In the absence of archaeological records and with only scanty historical documentation of the Roma, comparative linguistic studies were the first to identify their Indian origin. Recently, molecular studies on the basis of disease-causing mutations and haploid DNA markers (i.e. mtDNA and Y-chromosome) supported the linguistic view. The presence of Indian-specific Y-chromosome haplogroup H1a1a-M82 and mtDNA haplogroups M5a1, M18 and M35b among Roma has corroborated their South Asian origins and later admixture with Near Eastern and European populations. However, previous studies have left unanswered questions about the exact parental population groups in South Asia. Here we present a detailed phylogeographical study of Y-chromosomal haplogroup H1a1a-M82 in a data set of more than 10,000 global samples to discern a more precise ancestral source of European Romani populations. The phylogeographical patterns and diversity estimates indicate an early origin of this haplogroup in the Indian subcontinent and its further expansion to other regions. Tellingly, the short tandem repeat (STR) based network of H1a1a-M82 lineages displayed the closest connection of Romani haplotypes with the traditional scheduled caste and scheduled tribe population groups of northwestern India.
Central Asia has served as a corridor for human migrations providing trading routes since ancient times. It has functioned as a conduit connecting Europe and the Middle East with South Asia and Far Eastern civilizations. Therefore, the study of populations in this region is essential for a comprehensive understanding of early human dispersal on the Eurasian continent. Although Y-chromosome distributions in Central Asia have been widely surveyed, present-day Afghanistan remains poorly characterized genetically. The present study addresses this lacuna by analyzing 190 Pathan males from Afghanistan using high-resolution Y-chromosome binary markers. In addition, haplotype diversity for its most common lineages (haplogroups R1a1a*-M198 and L3-M357) was estimated using a set of 15 Y-specific STR loci. The observed haplogroup distribution suggests some degree of genetic isolation of the northern population, likely due to the Hindu Kush mountain range separating it from the southern Afghans who have had greater contact with neighboring Pathans from Pakistan and migrations from the Indian subcontinent. Our study demonstrates genetic similarities between Pathans from Afghanistan and Pakistan, both of which are characterized by the predominance of haplogroup R1a1a*-M198 (>50%) and the sharing of the same modal haplotype. Furthermore, the high frequencies of R1a1a-M198 and the presence of G2c-M377 chromosomes in Pathans might represent phylogenetic signals from Khazars, a common link between Pathans and Ashkenazi groups, whereas the absence of E1b1b1a2-V13 lineage does not support their professed Greek ancestry.
Afghanistan; Pathans/Pashtuns; Y-SNP; phylogenetic analyses; haplogroup; haplotype
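The haplotype diversity estimate mentioned in the abstract above can be illustrated with a minimal sketch. It assumes Nei's unbiased estimator, H = n/(n-1) * (1 - sum of squared haplotype frequencies); the abstract does not name the estimator, so this choice is an assumption. The function name and the three-locus toy haplotypes are hypothetical (the study typed 15 Y-STR loci).

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum(p_i**2))."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two chromosomes")
    counts = Counter(haplotypes)
    # p_i is the sample frequency of the i-th distinct haplotype.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical haplotypes: each tuple holds repeat counts at three STR loci.
sample = [
    (15, 12, 30),
    (15, 12, 30),
    (16, 12, 30),
    (15, 13, 31),
    (15, 12, 30),
]
print(round(haplotype_diversity(sample), 3))
```

The estimator equals 0 when every chromosome carries the same haplotype and 1 when all sampled haplotypes are distinct, which is why low values within a lineage are read as low diversity.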
Archaeological studies have revealed a series of cultural changes around the Last Glacial Maximum in East Asia; whether these changes left any signatures in the gene pool of East Asians remains poorly understood. To achieve deeper insights into the demographic history of modern humans in East Asia around the Last Glacial Maximum, we extensively analyzed mitochondrial DNA haplogroup M9a'b, a specific haplogroup that was suggested to have some potential for tracing the migration around the Last Glacial Maximum in East Eurasia.
A total of 837 M9a'b mitochondrial DNAs (583 from the literature, while the remaining 254 were newly collected in this study) pinpointed from over 28,000 subjects residing across East Eurasia were studied here. Fifty-nine representative samples were further selected for total mitochondrial DNA sequencing so we could better understand the phylogeny within M9a'b. Based on the updated phylogeny, an extensive phylogeographic analysis was carried out to reveal the differentiation of haplogroup M9a'b and to reconstruct the dispersal histories.
Our results indicated that southern China and/or Southeast Asia likely served as the source of some post-Last Glacial Maximum dispersal(s). The detailed dissection of haplogroup M9a'b revealed the existence of an inland dispersal in mainland East Asia during the post-glacial period. It was this dispersal that expanded not only to western China but also to northeast India and the south Himalaya region. A similar phylogeographic distribution pattern was also observed for haplogroup F1c, thus substantiating our proposition. This inland post-glacial dispersal was in agreement with the spread of the Mesolithic culture originating in South China and northern Vietnam.
Haplogroup G, together with J2 clades, has been associated with the spread of agriculture, especially in the European context. However, interpretations based on simple haplogroup frequency clines do not recognize underlying patterns of genetic diversification. Although progress has been recently made in resolving the haplogroup G phylogeny, a comprehensive survey of the geographic distribution patterns of the significant sub-clades of this haplogroup has not been conducted yet. Here we present the haplogroup frequency distribution and STR variation of 16 informative G sub-clades by evaluating 1472 haplogroup G chromosomes belonging to 98 populations ranging from Europe to Pakistan. Although no basal G-M201* chromosomes were detected in our data set, the homeland of this haplogroup has been estimated to be somewhere near eastern Anatolia, Armenia or western Iran, the only areas characterized by the co-presence of deep basal branches as well as the occurrence of high sub-haplogroup diversity. The P303 SNP defines the most frequent and widespread G sub-haplogroup. However, its sub-clades have more localized distributions, with the U1-defined branch largely restricted to the Near/Middle East and the Caucasus, whereas L497 lineages essentially occur in Europe, where they likely originated. In contrast, the only U1 representative in Europe is the G-M527 lineage, whose distribution pattern is consistent with regions of Greek colonization. No clinal patterns were detected, suggesting that the distributions are rather indicative of isolation by distance and demographic complexities.
Y-chromosome; haplogroup G; human evolution; population genetics
To determine the human Y-chromosome haplogroup backgrounds of non-consensus DYS458.2 short tandem repeat alleles and evaluate their phylogenetic substructure and frequency in representative samples from the Middle East, Europe, and Pakistan.
Molecular characterization of lineages was achieved using a combination of Y-chromosome haplogroup-defining binary polymorphisms and up to 37 short tandem repeat loci, including DYS388, to construct haplotypes. DNA sequencing of the DYS458 locus and median-joining network analyses were used to evaluate Y-chromosome lineages displaying the DYS458.2 motif.
We showed that the DYS458.2 allelic innovation arose independently on at least two distinctive binary haplogroup backgrounds and possibly a third as well. The partial allele length pattern was fixed in all haplogroup J1 chromosomes examined, including its known rare sub-haplogroups. Within the alternative R1b3-associated, M405-defined sub-haplogroup, both DYS458.0 and DYS458.2 allele classes occurred. A single chromosome was also allocated to the R1b3-M269*(xM405) classification. The physical position of the partial insertion/deletion occurrence within the normal tetramer tract differed distinctly in each haplogroup context.
While unusual DYS458.2 alleles are informative, additional information for other linked polymorphic loci is required when using such non-conforming alleles to infer haplogroup background and common ancestry.
Most present-day European men inherited their Y chromosomes from the farmers who spread from the Near East 10,000 years ago, rather than from the hunter-gatherers of the Paleolithic.
The relative contributions to modern European populations of Paleolithic hunter-gatherers and Neolithic farmers from the Near East have been intensely debated. Haplogroup R1b1b2 (R-M269) is the commonest European Y-chromosomal lineage, increasing in frequency from east to west, and carried by 110 million European men. Previous studies suggested a Paleolithic origin, but here we show that the geographical distribution of its microsatellite diversity is best explained by spread from a single source in the Near East via Anatolia during the Neolithic. Taken with evidence on the origins of other haplogroups, this indicates that most European Y chromosomes originate in the Neolithic expansion. This reinterpretation makes Europe a prime example of how technological and cultural change is linked with the expansion of a Y-chromosomal lineage, and the contrast of this pattern with that shown by maternally inherited mitochondrial DNA suggests a unique role for males in the transition.
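The argument above rests on how microsatellite diversity within a lineage varies across geography. As a minimal sketch of one common way to summarize such diversity, the snippet below computes the mean per-locus variance of STR repeat counts for a set of haplotypes. This statistic is an assumption chosen for illustration, not necessarily the measure used in the study, and the function name `mean_str_variance` and the example repeat counts are hypothetical.

```python
from statistics import variance


def mean_str_variance(haplotypes):
    """Mean per-locus sample variance of STR repeat counts.

    `haplotypes` is a list of equal-length tuples; each tuple holds the
    repeat count at every typed locus for one chromosome.
    """
    if len(haplotypes) < 2:
        raise ValueError("need at least two haplotypes")
    loci = list(zip(*haplotypes))  # regroup values locus by locus
    per_locus = [variance(locus) for locus in loci]
    return sum(per_locus) / len(per_locus)


# Illustrative (made-up) two-locus haplotypes from one sampling location.
sample = [(12, 14), (13, 14), (14, 14)]
print(mean_str_variance(sample))
```

Computed separately for each sampling location, a value like this can be compared across regions; under the usual reasoning, higher diversity is expected closer to a lineage's source.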
Arguably the most important cultural transition in the history of modern humans was the development of farming, since it heralded the population growth that culminated in our current massive population size. The genetic diversity of modern populations retains the traces of such past events, and can therefore be studied to illuminate the demographic processes involved. Much debate has focused on the origins of agriculture in Europe some 10,000 years ago, and in particular whether its westerly spread from the Near East was driven by farmers themselves migrating, or by the transmission of ideas and technologies to indigenous hunter-gatherers. This study examines the diversity of the paternally inherited Y chromosome, focusing on the commonest lineage in Europe. The distribution of this lineage, the diversity within it, and estimates of its age all suggest that it spread with farming from the Near East. Taken with evidence on the origins of other lineages, this indicates that most European Y chromosomes descend from Near Eastern farmers. In contrast, most maternal lineages descend from hunter-gatherers, suggesting a reproductive advantage for farming males over indigenous hunter-gatherer males during the cultural transition from hunting-gathering to farming.
A Southwest Asian origin and dispersal to North Africa in the Early Upper Palaeolithic era has been inferred in previous studies for mtDNA haplogroups M1 and U6. Both haplogroups have been proposed to show similar geographic patterns and shared demographic histories.
We report here 24 new M1 and 33 new U6 complete mtDNA sequences that allow us to refine the existing phylogeny of these haplogroups. The resulting phylogenetic information was used to genotype a further 131 M1 and 91 U6 samples to determine the geographic spread of their sub-clades. No southwest Asian specific clades for M1 or U6 were discovered. U6 and M1 frequencies in North Africa, the Middle East and Europe do not follow similar patterns, and their sub-clade divisions do not appear to be compatible with a shared history reaching back to the Early Upper Palaeolithic. The Bayesian Skyline Plots testify to non-overlapping phases of expansion, and the haplogroups’ phylogenies suggest that there are U6 sub-clades that expanded earlier than those in M1. Some M1 and U6 sub-clades could be linked with certain events. For example, U6a1 and M1b, with their coalescent ages of ~20,000–22,000 years ago and earliest inferred expansion in northwest Africa, could coincide with the flourishing of the Iberomaurusian industry, whilst U6b and M1b1 appeared at the time of the Capsian culture.
Our high-resolution phylogenetic dissection of both haplogroups and coalescent time assessments suggest that the extant main branching pattern of both haplogroups arose and diversified in the mid-later Upper Palaeolithic, with some sub-clades arising concomitantly with the expansion of the Iberomaurusian industry. Carriers of these maternal lineages were later absorbed into, and diversified further during, the spread of Afro-Asiatic languages in North and East Africa.
mtDNA haplogroups M1 and U6; Afro-Asiatic languages; North Africa
The geographic and ethnolinguistic differentiation of many African Y-chromosomal lineages provides an opportunity to evaluate human migration episodes and admixture processes, in a pan-continental context. The analysis of the paternal genetic structure of Equatorial West Africans carried out to date leaves their origins and relationships unclear, and raises questions about the existence of major demographic phenomena analogous to the large-scale Bantu expansions. To address this, we have analysed the variation of 31 binary and 11 microsatellite markers on the non-recombining portion of the Y chromosome in Guinea-Bissau samples of diverse ethnic affiliations, some not studied before.
The Guinea-Bissau Y chromosome pool is characterized by low haplogroup diversity (D = 0.470, sd 0.033), with the predominant haplogroup E3a*-M2 shared among the ethnic clusters and reaching a maximum of 82.2% in the Mandenka people. The Felupe-Djola and Papel groups exhibit the highest diversity of lineages and harbor the deep-rooting haplogroups A-M91, E2-M75 and E3*-PN2, typical of the Sahel's more central and eastern areas. Their genetic distinction from other groups is statistically significant (P = 0.01) though not attributable to linguistic, geographic or religious criteria. Non-sub-Saharan influences were associated with the presence of haplogroup R1b-P25 and particular lineages of E3b1-M78.
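The haplogroup diversity value D reported above is a summary of how evenly chromosomes are spread across haplogroups. As a minimal sketch, assuming D follows Nei's unbiased gene diversity estimator, D = n/(n-1) × (1 − Σ pᵢ²), which is a widely used definition but is not stated explicitly in the text, it could be computed as follows. The function name `haplogroup_diversity` and the sample labels are hypothetical and are not data from the study.

```python
from collections import Counter


def haplogroup_diversity(labels):
    """Nei's unbiased gene diversity for a list of haplogroup labels.

    D = n / (n - 1) * (1 - sum(p_i ** 2)), where p_i is the sample
    frequency of haplogroup i and n is the number of chromosomes.
    """
    n = len(labels)
    if n < 2:
        raise ValueError("need at least two chromosomes")
    counts = Counter(labels)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Illustrative (made-up) sample dominated by one haplogroup.
sample = ["E3a*-M2"] * 8 + ["A-M91"] + ["R1b-P25"]
print(round(haplogroup_diversity(sample), 3))
```

A sample dominated by a single haplogroup yields a low D, while a sample in which every chromosome carries a different haplogroup yields D = 1.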
The predominance and high diversity of haplogroup E3a*-M2 suggest a demographic expansion in the equatorial western fringe, possibly supported by a local agricultural center. The paternal pool of the Mandenka and Balanta displays evidence of a particularly marked population growth among the Guineans, possibly reflecting the demographic effects of the agriculturalist lifestyle and their putative relationship to the people that introduced early cultivation practices into West Africa. The paternal background of the Felupe-Djola and Papel ethnic groups suggests a better conserved ancestral pool deriving from East Africa, from where they have supposedly migrated in recent times. Despite the overall homogeneity in a multiethnic sample, which contrasts with their social structure, minor clusters suggest the imprints of multiple peoples at different timescales: traces of ancestral inhabitants in haplogroups A-M91 and B-M60, today typical of hunter-gatherers; North African influence in E3b1-M78 Y chromosomes, probably due to trans-Saharan contacts; and R1b-P25 lineages reflecting European admixture via the North Atlantic slave trade.
Goats (Capra hircus) are one of the oldest domesticated species, and they are kept all over the world as an essential resource for meat, milk, and fiber. Although recent archeological and molecular biological studies suggested that they originated in West Asia, their domestication processes, such as the timing of population expansion and the dynamics of their selection pressures, are little known. With the aim of addressing these issues, nearly complete sequences of the mitochondrial protein-encoding genes were determined for goats from East, Southeast, and South Asian populations. Our coalescent time estimations suggest that the timing of their major population expansions was in the Late Pleistocene and significantly predates the beginning of their domestication in the Neolithic era (≈10,000 years ago). The ω (ratio of the non-synonymous to the synonymous substitution rate) for each lineage was also estimated. We found that the ω of the globally distributed haplogroup A, which is inherited by more than 90% of the goats examined, was extremely low, suggesting that these lineages are under severe selection pressure, probably due to their large population size. Conversely, the ω of the Asian-specific haplogroup B, inherited by about 5% of goats, was relatively high. Although recent molecular studies suggest that domestication of animals may tend to relax selective constraints, the opposite pattern observed in our goat mitochondrial genome data indicates that the process of domestication is more complex than may be presently appreciated and cannot be explained only by a simple relaxation model.
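The ω statistic discussed above compares the non-synonymous substitution rate (dN) with the synonymous rate (dS). The sketch below shows one simple way such a ratio could be computed from counted differences and sites, using a Jukes-Cantor correction in the style of counting methods such as Nei-Gojobori. The estimation method, the function names, and the counts are all assumptions for illustration; the text does not say how ω was estimated, and lineage-specific estimates are typically obtained with likelihood-based codon models.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for a proportion p of differing sites."""
    if not 0.0 <= p < 0.75:
        raise ValueError("proportion must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def omega(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return omega = dN / dS from difference and site counts."""
    d_n = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    d_s = jukes_cantor(syn_diffs / syn_sites)
    if d_s == 0.0:
        raise ValueError("omega is undefined when dS is zero")
    return d_n / d_s


# Illustrative (made-up) counts: few non-synonymous changes relative to
# synonymous ones give omega well below 1.
print(omega(nonsyn_diffs=5, nonsyn_sites=700, syn_diffs=20, syn_sites=300))
```

Values of ω well below 1 are conventionally read as purifying selection, and values near 1 as relaxed constraint, which is the contrast drawn between haplogroups A and B.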