The origins of the First Americans remain contentious. Although Native Americans
seem to be genetically most closely related to east Asians1–3, there is no
consensus with regard to which specific Old World populations they are closest
to4–8. Here we sequence the draft genome of an approximately 24,000-year-old
individual (MA-1), from Mal’ta in south-central Siberia9, to an average depth of 1×. To our knowledge this is the
oldest anatomically modern human genome reported to date. The MA-1 mitochondrial genome
belongs to haplogroup U, which has also been found at high frequency among Upper
Palaeolithic and Mesolithic European hunter-gatherers10–12, and the Y
chromosome of MA-1 is basal to modern-day western Eurasians and near the root of most
Native American lineages5. Similarly, we
find autosomal evidence that MA-1 is basal to modern-day western Eurasians and genetically
closely related to modern-day Native Americans, with no close affinity to east Asians.
This suggests that populations related to contemporary western Eurasians had a more
north-easterly distribution 24,000 years ago than commonly thought. Furthermore, we
estimate that 14 to 38% of Native American ancestry may originate through gene
flow from this ancient population. This is likely to have occurred after the divergence of
Native American ancestors from east Asian ancestors, but before the diversification of
Native American populations in the New World. Gene flow from the MA-1 lineage into Native
American ancestors could explain why several crania from the First Americans have been
reported as bearing morphological characteristics that do not resemble those of east
Asians2,13. Sequencing of another south-central Siberian, Afontova Gora-2 dating
to approximately 17,000 years ago14,
revealed autosomal genetic signatures similar to those of MA-1, suggesting that the region was
continuously occupied by humans throughout the Last Glacial Maximum. Our findings reveal
that western Eurasian genetic signatures in modern-day Native Americans derive not only
from post-Columbian admixture, as commonly thought, but also from a mixed ancestry of the First Americans.
Following the dispersal out of Africa, where hominins evolved in warm environments for millions of years, our species has colonised different climate zones of the world, including high latitudes and cold environments. The extent to which human habitation in (sub-)Arctic regions has been enabled by cultural buffering, short-term acclimatization and genetic adaptations is not clearly understood. Present-day indigenous populations of Siberia show a number of phenotypic features, such as increased basal metabolic rate, low serum lipid levels and increased blood pressure, that have been attributed to adaptation to the extreme cold climate. In this study we introduce a dataset of 200 individuals from ten indigenous Siberian populations that were genotyped for 730,525 SNPs across the genome to identify genes and non-coding regions that have undergone unusually rapid allele frequency and long-range haplotype homozygosity change in the recent past. At least three distinct population clusters could be identified among the Siberians, each of which showed a number of unique signals of selection. A region on chromosome 11 (chr11:66–69 Mb) contained the densest clustering of significant signals and also the strongest signals in all the different selection tests performed. We present a list of candidate cold adaptation genes that showed significant signals of positive selection, with our strongest signals associated with genes involved in energy regulation and metabolism (CPT1A, LRP5, THADA) and vascular smooth muscle contraction (PRKG1). By employing a new method that paints phased chromosome chunks by their ancestry, we distinguish local Siberian-specific long-range haplotype signals from those introduced by admixture.
We present an Aboriginal Australian genomic sequence obtained from a 100-year-old lock of hair donated by an Aboriginal man from southern Western Australia in the early 20th century. We detect no evidence of European admixture and estimate contamination levels to be below 0.5%. We show that Aboriginal Australians are descendants of an early human dispersal into eastern Asia, possibly 62,000 to 75,000 years ago. This dispersal is separate from the one that gave rise to modern Asians 25,000 to 38,000 years ago. We also find evidence of gene flow between populations of the two dispersal waves prior to the divergence of Native Americans from modern Asian ancestors. Our findings support the hypothesis that present-day Aboriginal Australians descend from the earliest humans to occupy Australia, likely representing one of the oldest continuous populations outside Africa.
During their migrations out of Africa, humans successfully colonised and adapted to a wide range of habitats, including extreme high altitude environments, where reduced atmospheric oxygen (hypoxia) imposes a number of physiological challenges. This study evaluates genetic and phenotypic variation in the Colla population living in the Argentinean Andes above 3500 m and compares it to the nearby lowland Wichí group in an attempt to pinpoint evolutionary mechanisms underlying adaptation to high altitude hypoxia. We genotyped 730,525 SNPs in 25 individuals from each population. In genome-wide scans of extended haplotype homozygosity, Collas showed the strongest signal around VEGFB, which plays an essential role in the ischemic heart, and ELTD1, another gene crucial for heart development and prevention of cardiac hypertrophy. Moreover, pathway enrichment analysis showed an overrepresentation of pathways associated with cardiac morphology. Taken together, these findings suggest that Colla highlanders may have evolved a toolkit of adaptive mechanisms resulting in cardiac reinforcement, most likely to counteract the adverse effects of the permanently increased haematocrit and associated shear forces that characterise the Andean response to hypoxia. Regulation of cerebral vascular flow also appears to be part of the adaptive response in Collas. These findings are not only relevant to understanding the evolution of hypoxia protection in high altitude populations but may also suggest new avenues for medical research into conditions where hypoxia constitutes a detrimental factor.
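The extended haplotype homozygosity (EHH) statistic behind such scans can be illustrated with a minimal sketch. It assumes the standard definition of EHH as the probability that two randomly chosen chromosomes carrying the same core allele are identical over the interval from the core site to a given marker. The function name `ehh` and the toy haplotypes are hypothetical and are not taken from the study, whose actual software and parameters are not stated here.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core_index, core_allele, end_index):
    """Extended haplotype homozygosity from the core site out to end_index.

    haplotypes: sequences of alleles (e.g. strings of '0'/'1'), one per
    phased chromosome. Only chromosomes carrying core_allele at core_index
    are considered.
    """
    carriers = [h for h in haplotypes if h[core_index] == core_allele]
    n = len(carriers)
    if n < 2:
        raise ValueError("need at least two chromosomes carrying the core allele")
    lo, hi = sorted((core_index, end_index))
    # Group carriers by their haplotype over the interval [lo, hi].
    groups = Counter(tuple(h[lo:hi + 1]) for h in carriers)
    # Fraction of chromosome pairs that are identical over the interval.
    return sum(comb(count, 2) for count in groups.values()) / comb(n, 2)


# Hypothetical toy data: six phased chromosomes, four markers each.
toy_haplotypes = ["1000", "1000", "1001", "1010", "0111", "0101"]
decay = [ehh(toy_haplotypes, 0, "1", end) for end in range(4)]
```

In this toy example EHH starts at 1 at the core site and decays as markers further away split the carriers into distinct haplotypes. A selection scan looks for core alleles whose EHH decays unusually slowly given their frequency.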
We report here the genome sequence of an ancient human. Obtained from ∼4,000-year-old permafrost-preserved hair, the genome represents a male individual from the first known culture to settle in Greenland. Sequenced to an average depth of 20×, we recover 79% of the diploid genome, an amount close to the practical limit of current sequencing technologies. We identify 353,151 high-confidence single-nucleotide polymorphisms (SNPs), of which 6.8% have not been reported previously. We estimate raw read contamination to be no higher than 0.8%. We use functional SNP assessment to assign possible phenotypic characteristics of the individual that belonged to a culture whose location has yielded only trace human remains. We compare the high-confidence SNPs to those of contemporary populations to find the populations most closely related to the individual. This provides evidence for a migration from Siberia into the New World some 5,500 years ago, independent of that giving rise to the modern Native Americans and Inuit.
Haplogroup G, together with J2 clades, has been associated with the spread of agriculture, especially in the European context. However, interpretations based on simple haplogroup frequency clines do not recognize underlying patterns of genetic diversification. Although progress has been recently made in resolving the haplogroup G phylogeny, a comprehensive survey of the geographic distribution patterns of the significant sub-clades of this haplogroup has not been conducted yet. Here we present the haplogroup frequency distribution and STR variation of 16 informative G sub-clades by evaluating 1472 haplogroup G chromosomes belonging to 98 populations ranging from Europe to Pakistan. Although no basal G-M201* chromosomes were detected in our data set, the homeland of this haplogroup has been estimated to be somewhere nearby eastern Anatolia, Armenia or western Iran, the only areas characterized by the co-presence of deep basal branches as well as the occurrence of high sub-haplogroup diversity. The P303 SNP defines the most frequent and widespread G sub-haplogroup. However, its sub-clades have more localized distribution with the U1-defined branch largely restricted to Near/Middle Eastern and the Caucasus, whereas L497 lineages essentially occur in Europe where they likely originated. In contrast, the only U1 representative in Europe is the G-M527 lineage whose distribution pattern is consistent with regions of Greek colonization. No clinal patterns were detected suggesting that the distributions are rather indicative of isolation by distance and demographic complexities.
Y-chromosome; haplogroup G; human evolution; population genetics
Skin pigmentation is one of the most variable phenotypic traits in humans. A non-synonymous substitution (rs1426654) in the third exon of SLC24A5 accounts for lighter skin in Europeans but not in East Asians. A previous genome-wide association study carried out in a heterogeneous sample of UK immigrants of South Asian descent suggested that this gene also contributes significantly to skin pigmentation variation among South Asians. In the present study, we have quantitatively assessed skin pigmentation for a largely homogeneous cohort of 1228 individuals from the Southern region of the Indian subcontinent. Our data confirm significant association of rs1426654 SNP with skin pigmentation, explaining about 27% of total phenotypic variation in the cohort studied. Our extensive survey of the polymorphism in 1573 individuals from 54 ethnic populations across the Indian subcontinent reveals wide presence of the derived-A allele, although the frequencies vary substantially among populations. We also show that the geospatial pattern of this allele is complex, but most importantly, reflects strong influence of language, geography and demographic history of the populations. Sequencing 11.74 kb of SLC24A5 in 95 individuals worldwide reveals that the rs1426654-A alleles in South Asian and West Eurasian populations are monophyletic and occur on the background of a common haplotype that is characterized by low genetic diversity. We date the coalescence of the light skin associated allele at 22–28 KYA. Both our sequence and genome-wide genotype data confirm that this gene has been a target for positive selection among Europeans. However, the latter also shows additional evidence of selection in populations of the Middle East, Central Asia, Pakistan and North India but not in South India.
Human skin color is one of the most visible aspects of human diversity. The genetic basis of pigmentation in Europeans has been understood to some extent, but our knowledge about South Asians has been restricted to a handful of studies. It has been suggested that a single nucleotide difference in SLC24A5 accounts for 25–38% of European-African pigmentation differences and correlates with lighter skin. This genetic variant has also been associated with skin color variation among South Asians living in the UK. Here, we report a study based on a homogeneous cohort from South India. Our results confirm that SLC24A5 plays a key role in pigmentation diversity of South Asians. Country-wide screening of the variant reveals that the light skin associated allele is widespread in the Indian subcontinent and its complex patterning is shaped by a combination of processes involving selection and demographic history of the populations. By studying the variation of SLC24A5 sequences among a diverse set of individuals, we show that the light skin associated allele in South Asians is identical by descent to that found in Europeans. Our study also provides new insights into positive selection acting on the gene and the evolutionary history of light skin in humans.
Many efforts have been made to detect signatures of positive selection in the human genome, especially those associated with expansion from Africa and subsequent colonization of all other continents. However, most approaches have not directly probed the relationship between the environment and patterns of variation among humans. We have designed a method to identify regions of the genome under selection based on Mantel tests conducted within a general linear model framework, which we call MAntel-GLM to Infer Clinal Selection (MAGICS). MAGICS explicitly incorporates population-specific and genome-wide patterns of background variation as well as information from environmental values to provide an improved picture of selection and its underlying causes in human populations.
Our results significantly overlap with those obtained by other published methodologies, but MAGICS has several advantages. These include improvements that: limit false positives by reducing the number of independent tests conducted and by correcting for geographic distance, which we found to be a major contributor to selection signals; yield absolute rather than relative estimates of significance; identify specific geographic regions linked most strongly to particular signals of selection; and detect recent balancing as well as directional selection.
We find evidence of selection associated with climate (P < 10⁻⁵) in 354 genes, and among these observe a highly significant enrichment for directional positive selection. Two of our strongest ‘hits’, however, ADRA2A and ADRA2C, implicated in vasoconstriction in response to cold and pain stimuli, show evidence of balancing selection. Our results clearly demonstrate evidence of climate-related signals of directional and balancing selection.
Climate; Adaptation; Human evolution; Natural selection; Environmental adaptation; Population genetics
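The Mantel test on which MAGICS builds can be sketched as follows. This is a plain permutation Mantel test between two distance matrices, not the MAGICS procedure itself, which embeds such tests in a general linear model and corrects for geographic distance. The function names and the toy matrices are hypothetical.

```python
import random
from itertools import combinations
from math import sqrt


def upper_triangle(matrix, order):
    """Upper-triangle entries of a square matrix under a row/column ordering."""
    return [matrix[order[i]][order[j]]
            for i, j in combinations(range(len(order)), 2)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel_test(dist_a, dist_b, permutations=999, seed=0):
    """One-sided permutation Mantel test; returns (correlation, p-value)."""
    rng = random.Random(seed)
    identity = list(range(len(dist_a)))
    b_values = upper_triangle(dist_b, identity)
    observed = pearson(upper_triangle(dist_a, identity), b_values)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute rows and columns of dist_a together
        if pearson(upper_triangle(dist_a, order), b_values) >= observed:
            hits += 1
    # The observed arrangement counts as one of the permutations.
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical toy data: five populations placed along a single gradient.
positions = [0, 1, 3, 6, 10]
genetic = [[abs(a - b) for b in positions] for a in positions]
climate = [[2 * abs(a - b) for b in positions] for a in positions]
r, p = mantel_test(genetic, climate)
```

Rows and columns are permuted jointly because the entries of a distance matrix are not independent observations, which is the reason a Mantel test is used here instead of an ordinary correlation test.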
Global variation in skin pigmentation is one of the most striking examples of environmental adaptation in humans. More than two hundred loci have been identified as candidate genes in model organisms and a few tens of these have been found to be significantly associated with human skin pigmentation in genome-wide association studies. However, the evolutionary history of different pigmentation genes is rather complex: some loci have been subjected to strong positive selection, while others evolved under the relaxation of functional constraints in low UV environment. Here we report the results of a global study of the human tyrosinase gene, which is one of the key enzymes in melanin production, to assess the role of its variation in the evolution of skin pigmentation differences among human populations. We observe a higher rate of non-synonymous polymorphisms in the European sample consistent with the relaxation of selective constraints. A similar pattern was previously observed in the MC1R gene and concurs with UV radiation-driven model of skin color evolution by which mutations leading to lower melanin levels and decreased photoprotection are subject to purifying selection at low latitudes while being tolerated or even favored at higher latitudes because they facilitate UV-dependent vitamin D production. Our coalescent date estimates suggest that the non-synonymous variants, which are frequent in Europe and North Africa, are recent and have emerged after the separation of East and West Eurasian populations.
The Tibetan and Andean Plateaus and Ethiopian highlands are the largest regions to have long-term high-altitude residents. Such populations are exposed to lower barometric pressures and hence atmospheric partial pressures of oxygen. Such “hypobaric hypoxia” may limit physical functional capacity, reproductive health, and even survival. As such, selection of genetic variants advantageous to hypoxic adaptation is likely to have occurred. Identifying signatures of such selection is likely to help understanding of hypoxic adaptive processes. Here, we seek evidence of such positive selection using five Ethiopian populations, three of which are from high-altitude areas in Ethiopia. As these populations may have been recipients of Eurasian gene flow, we correct for this admixture. Using single-nucleotide polymorphism genotype data from multiple populations, we find the strongest signal of selection in BHLHE41 (also known as DEC2 or SHARP1). Remarkably, a major role of this gene is regulation of the same hypoxia response pathway on which selection has most strikingly been observed in both Tibetan and Andean populations. Because it is also an important player in the circadian rhythm pathway, BHLHE41 might also provide insights into the mechanisms underlying the recognized impacts of hypoxia on the circadian clock. These results support the view that Ethiopian, Andean, and Tibetan populations living at high altitude have adapted to hypoxia differently, with convergent evolution affecting different genes from the same pathway.
adaptation to high altitude; natural selection
A Southwest Asian origin and dispersal to North Africa in the Early Upper Palaeolithic era has been inferred in previous studies for mtDNA haplogroups M1 and U6. Both haplogroups have been proposed to show similar geographic patterns and shared demographic histories.
We report here 24 M1 and 33 U6 new complete mtDNA sequences that allow us to refine the existing phylogeny of these haplogroups. The resulting phylogenetic information was used to genotype a further 131 M1 and 91 U6 samples to determine the geographic spread of their sub-clades. No southwest Asian specific clades for M1 or U6 were discovered. U6 and M1 frequencies in North Africa, the Middle East and Europe do not follow similar patterns, and their sub-clade divisions do not appear to be compatible with their shared history reaching back to the Early Upper Palaeolithic. The Bayesian Skyline Plots testify to non-overlapping phases of expansion, and the haplogroups’ phylogenies suggest that there are U6 sub-clades that expanded earlier than those in M1. Some M1 and U6 sub-clades could be linked with certain events. For example, U6a1 and M1b, with their coalescent ages of ~20,000–22,000 years ago and earliest inferred expansion in northwest Africa, could coincide with the flourishing of the Iberomaurusian industry, whilst U6b and M1b1 appeared at the time of the Capsian culture.
Our high-resolution phylogenetic dissection of both haplogroups and coalescent time assessments suggest that the extant main branching pattern of both haplogroups arose and diversified in the mid-later Upper Palaeolithic, with some sub-clades concomitantly with the expansion of the Iberomaurusian industry. Carriers of these maternal lineages have been later absorbed into and diversified further during the spread of Afro-Asiatic languages in North and East Africa.
mtDNA haplogroups M1 and U6; Afro-Asiatic languages; North Africa
Linguistic and genetic studies on Roma populations inhabiting Europe have unequivocally traced these populations to the Indian subcontinent. However, the exact parental population group and time of the out-of-India dispersal have remained disputed. In the absence of archaeological records and with only scanty historical documentation of the Roma, comparative linguistic studies were the first to identify their Indian origin. Recently, molecular studies on the basis of disease-causing mutations and haploid DNA markers (i.e. mtDNA and Y-chromosome) supported the linguistic view. The presence of Indian-specific Y-chromosome haplogroup H1a1a-M82 and mtDNA haplogroups M5a1, M18 and M35b among Roma has corroborated their South Asian origins and later admixture with Near Eastern and European populations. However, previous studies have left unanswered questions about the exact parental population groups in South Asia. Here we present a detailed phylogeographical study of Y-chromosomal haplogroup H1a1a-M82 in a data set of more than 10,000 global samples to discern a more precise ancestral source of European Romani populations. The phylogeographical patterns and diversity estimates indicate an early origin of this haplogroup in the Indian subcontinent and its further expansion to other regions. Tellingly, the short tandem repeat (STR) based network of H1a1a-M82 lineages displayed the closest connection of Romani haplotypes with the traditional scheduled caste and scheduled tribe population groups of northwestern India.
The geographic origin and time of dispersal of Austroasiatic (AA) speakers, presently settled in south and southeast Asia, remains disputed. Two rival hypotheses, both assuming a demic component to the language dispersal, have been proposed. The first of these places the origin of Austroasiatic speakers in southeast Asia with a later dispersal to south Asia during the Neolithic, whereas the second hypothesis advocates pre-Neolithic origins and dispersal of this language family from south Asia. To test the two alternative models, this study combines the analysis of uniparentally inherited markers with 610,000 common single nucleotide polymorphism loci from the nuclear genome. Indian AA speakers have high frequencies of Y chromosome haplogroup O2a; our results show that this haplogroup has significantly higher diversity and coalescent time (17–28 thousand years ago) in southeast Asia, strongly supporting the first of the two hypotheses. Nevertheless, the results of principal component and “structure-like” analyses on autosomal loci also show that the population history of AA speakers in India is more complex, being characterized by two ancestral components—one represented in the pattern of Y chromosomal and EDAR results and the other by mitochondrial DNA diversity and genomic structure. We propose that AA speakers in India today are derived from dispersal from southeast Asia, followed by extensive sex-specific admixture with local Indian populations.
Austroasiatic; mtDNA; Y chromosome; autosomes; admixture
The phylogenetic relationships of numerous branches within the core Y-chromosome haplogroup R-M207 support a West Asian origin of haplogroup R1b, its initial differentiation there followed by a rapid spread of one of its sub-clades carrying the M269 mutation to Europe. Here, we present phylogeographically resolved data for 2043 M269-derived Y-chromosomes from 118 West Asian and European populations assessed for the M412 SNP that largely separates the majority of Central and West European R1b lineages from those observed in Eastern Europe, the Circum-Uralic region, the Near East, the Caucasus and Pakistan. Within the M412 dichotomy, the major S116 sub-clade shows a frequency peak in the upper Danube basin and Paris area with declining frequency toward Italy, Iberia, Southern France and British Isles. Although this frequency pattern closely approximates the spread of the Linearbandkeramik (LBK), Neolithic culture, an advent leading to a number of pre-historic cultural developments during the past ≤10 thousand years, more complex pre-Neolithic scenarios remain possible for the L23(xM412) components in Southeast Europe and elsewhere.
Y-chromosome; haplogroup R1b; human evolution; population genetics
Human Y-chromosome haplogroup structure is largely circumscribed by continental boundaries. One notable exception to this general pattern is the young haplogroup R1a that exhibits post-Glacial coalescent times and relates the paternal ancestry of more than 10% of men in a wide geographic area extending from South Asia to Central East Europe and South Siberia. Its origin and dispersal patterns are poorly understood as no marker has yet been described that would distinguish European R1a chromosomes from Asian. Here we present frequency and haplotype diversity estimates for more than 2000 R1a chromosomes assessed for several newly discovered SNP markers that introduce the onset of informative R1a subdivisions by geography. Marker M434 has a low frequency and a late origin in West Asia bearing witness to recent gene flow over the Arabian Sea. Conversely, marker M458 has a significant frequency in Europe, exceeding 30% in its core area in Eastern Europe and comprising up to 70% of all M17 chromosomes present there. The diversity and frequency profiles of M458 suggest its origin during the early Holocene and a subsequent expansion likely related to a number of prehistoric cultural developments in the region. Its primary frequency and diversity distribution correlates well with some of the major Central and East European river basins where settled farming was established before its spread further eastward. Importantly, the virtual absence of M458 chromosomes outside Europe speaks against substantial patrilineal gene flow from East Europe to Asia, including to India, at least since the mid-Holocene.
Y chromosome; haplogroup R1a; human evolution; population genetics
Islam is the second most practiced religion in India, next to Hinduism. It is still unclear whether the spread of Islam in India has been only a cultural transformation or is associated with detectable levels of gene flow. To estimate the contribution of West Asian and Arabian admixture to Indian Muslims, we assessed genetic variation in mtDNA, Y-chromosomal and LCT/MCM6 markers in 472, 431 and 476 samples, respectively, representing six Muslim communities from different geographical regions of India. We found that most of the Indian Muslim populations received their major genetic input from geographically close non-Muslim populations. However, low levels of likely sub-Saharan African, Arabian and West Asian admixture were also observed among Indian Muslims in the form of L0a2a2 mtDNA and E1b1b1a and J*(xJ2) Y-chromosomal lineages. The distinction between Iranian and Arabian sources was difficult to make with mtDNA and the Y chromosome, as the estimates were highly correlated because of similar gene pool compositions in the sources. In contrast, the LCT/MCM6 locus, which shows a clear distinction between the two sources, enabled us to rule out significant gene flow from Arabia. Overall, our results support a model according to which the spread of Islam in India was predominantly cultural conversion associated with minor but still detectable levels of gene flow from outside, primarily from Iran and Central Asia, rather than directly from the Arabian Peninsula.
Indian Muslims; mtDNA; Y chromosome; Middle East; sub-Saharan; gene flow
Haplogroup J1 is a prevalent Y-chromosome lineage within the Near East. We report the frequency and YSTR diversity data for its major sub-clade (J1e). The overall expansion time estimated from 453 chromosomes is 10 000 years. Moreover, the previously described J1 (DYS388=13) chromosomes, frequently found in the Caucasus and eastern Anatolian populations, were ancestral to J1e and displayed an expansion time of 9000 years. For J1e, the Zagros/Taurus mountain region displays the highest haplotype diversity, although the J1e frequency increases toward the peripheral Arabian Peninsula. The southerly pattern of decreasing expansion time estimates is consistent with the serial drift and founder effect processes. The first such migration is predicted to have occurred at the onset of the Neolithic, and accordingly J1e parallels the establishment of rain-fed agriculture and semi-nomadic herders throughout the Fertile Crescent. Subsequently, J1e lineages might have been involved in episodes of the expansion of pastoralists into arid habitats coinciding with the spread of Arabic and other Semitic-speaking populations.
Y-chromosome haplogroup J1e; Neolithic; Arabic languages; pastoralism
The geographical position of Maharashtra state makes it essential to the study of the dispersal of modern humans in South Asia. Several hypotheses have been proposed to explain the cultural, linguistic and geographical affinity of the populations living in Maharashtra state with other South Asian populations. The genetic origin of populations living in this state is poorly understood and has hitherto been described only at a low level of molecular resolution.
To address this issue, we have analyzed the mitochondrial DNA (mtDNA) of 185 individuals and NRY (non-recombining region of Y chromosome) of 98 individuals belonging to two major tribal populations of Maharashtra, and compared their molecular variations with that of 54 South Asian contemporary populations of adjacent states. Inter and intra population comparisons reveal that the maternal gene pool of Maharashtra state populations is composed of mainly South Asian haplogroups with traces of east and west Eurasian haplogroups, while the paternal haplogroups comprise the South Asian as well as signature of near eastern specific haplogroup J2a.
Our analysis suggests that Indian populations, including Maharashtra state, are largely derived from Paleolithic ancient settlers; however, a more recent (∼10 Ky older) detectable paternal gene flow from west Asia is well reflected in the present study. These findings reveal movement of populations to Maharashtra through the western coast rather than mainland where Western Ghats-Vindhya Mountains and Narmada-Tapti rivers might have acted as a natural barrier. Comparing the Maharastrian populations with other South Asian populations reveals that they have a closer affinity with the South Indian than with the Central Indian populations.
Using established criteria for work with fossil DNA we have analysed mitochondrial DNA from 92 individuals from 18 locations in Denmark ranging in time from the Mesolithic to the Medieval Age. Unequivocal assignment of mtDNA haplotypes was possible for 56 of the ancient individuals; however, the success rate varied substantially between sites. The highest rates were obtained with untouched, freshly excavated material, whereas heavy handling, archeological preservation and storage for many years influenced the ability to obtain authentic endogenous DNA. While the nucleotide diversity at two locations was similar to that among extant Danes, the diversity at four sites was considerably higher. This supports previous observations for ancient Britons. The overall occurrence of haplogroups did not deviate from that of extant Scandinavians; however, haplogroup I was significantly more frequent among the ancient Danes (average 13%) than among extant Danes and Scandinavians (∼2.5%) as well as among other ancient population samples reported. Haplogroup I could therefore have been an ancient Southern Scandinavian type “diluted” by later immigration events. Interestingly, the two Neolithic samples (4,200 YBP, Bell Beaker culture) that were typed were haplogroup U4 and U5a, respectively, and the single Bronze Age sample (3,300–3,500 YBP) was haplogroup U4. These two haplogroups have been associated with the Mesolithic populations of Central and Northern Europe. Therefore, at least for Southern Scandinavia, our findings do not support a possible replacement of a haplogroup U dominated hunter-gatherer population by a more haplogroup diverse Neolithic Culture.
The molecular clock of mitochondrial DNA has been extensively used to date various genetic events. However, its substitution rate among humans appears to be higher than rates inferred from human-chimpanzee comparisons, limiting the potential of interspecies clock calibrations for intraspecific dating. It is not well understood how and why the substitution rate accelerates. We have analyzed a phylogenetic tree of 3057 publicly available human mitochondrial DNA coding region sequences for changes in the ratios of mutations belonging to different functional classes. The proportion of non-synonymous and RNA gene substitutions has decreased over hundreds of thousands of years. The highest mutation ratios, corresponding to fast acceleration in the apparent substitution rate of the coding sequence, have occurred since the end of the Last Ice Age. We recalibrate the molecular clock of human mtDNA as 7990 years per synonymous mutation over the mitochondrial genome. However, the distribution of substitutions at synonymous sites in human data departs significantly from a model assuming a single rate parameter and implies at least three different subclasses of sites. A neutral model with three synonymous substitution rates can explain most, if not all, of the apparent molecular clock difference between the intra- and interspecies levels. Our findings imply that purifying selection is slow to remove slightly deleterious mutations from human as well as Neandertal and chimpanzee populations. For humans, however, the weakness of purifying selection has been further exacerbated by the population expansions associated with the out-of-Africa migration and the end of the Last Ice Age.
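As a rough illustration of how a calibration of 7990 years per synonymous mutation might be applied, the sketch below converts an average count of synonymous mutations into an age. It assumes a strict linear clock, which the abstract itself cautions is a simplification given the evidence for several rate classes among synonymous sites. The function name and the example count are hypothetical and do not come from the study.

```python
# Rate quoted in the abstract: one synonymous mutation per 7990 years
# over the whole mitochondrial genome.
YEARS_PER_SYNONYMOUS_MUTATION = 7990


def age_from_synonymous_mutations(mean_mutations: float) -> float:
    """Convert the average number of synonymous mutations separating a
    clade's root from its sampled lineages into an age in years.

    Assumption: a single-rate, strictly linear clock.
    """
    if mean_mutations < 0:
        raise ValueError("mutation count cannot be negative")
    return mean_mutations * YEARS_PER_SYNONYMOUS_MUTATION


if __name__ == "__main__":
    # Hypothetical clade whose lineages carry, on average, 2.5 synonymous
    # mutations relative to the clade root.
    print(age_from_synonymous_mutations(2.5))
```

A point estimate of this kind carries no uncertainty; a fuller treatment would also report a confidence interval and account for rate heterogeneity among sites.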
Polymorphic Y chromosome short tandem repeats (STRs) have been widely used in population genetic and evolutionary studies. Compared to di-, tri-, and tetranucleotide repeats, STRs with longer repeat units occur more rarely and are far less commonly used.
To study the evolutionary dynamics of STRs according to repeat unit size, we analysed variation at 24 Y chromosome repeat loci: 1 tri-, 14 tetra-, 7 penta-, and 2 hexanucleotide loci. According to our results, penta- and hexanucleotide repeats have approximately half the repeat variance and diversity of tri- and tetranucleotide repeats, indicating that their mutation rate is about half that of tri- and tetranucleotide repeats. Thus, STR markers with longer repeat units are more robust in distinguishing Y chromosome haplogroups and, in some cases, phylogenetic splits within established haplogroups.
Our findings suggest that Y chromosome STRs of increased repeat unit size have a lower rate of evolution, which has significant relevance in population genetic and evolutionary studies.
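The two summary statistics compared above, repeat variance and diversity, can be sketched for a single locus as follows. This is a minimal illustration, assuming that "diversity" means Nei's unbiased gene diversity and that each chromosome is scored as an integer repeat count. The function names and the sample data are hypothetical and are not taken from the study.

```python
from collections import Counter
from statistics import variance


def repeat_variance(repeat_counts):
    """Sample variance of repeat counts observed at one STR locus."""
    return variance(repeat_counts)


def gene_diversity(repeat_counts):
    """Nei's unbiased gene diversity: n / (n - 1) * (1 - sum(p_i ** 2)),
    where p_i is the frequency of the i-th allele (repeat count)."""
    n = len(repeat_counts)
    if n < 2:
        raise ValueError("need at least two chromosomes")
    freqs = [count / n for count in Counter(repeat_counts).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


if __name__ == "__main__":
    # Made-up repeat counts for two loci scored in the same eight men.
    tetra_locus = [10, 11, 11, 12, 13, 13, 14, 12]
    penta_locus = [8, 8, 9, 9, 9, 8, 9, 10]
    for name, locus in (("tetra", tetra_locus), ("penta", penta_locus)):
        print(name, repeat_variance(locus), gene_diversity(locus))
```

Comparing these values between loci grouped by repeat unit size, ideally within the same haplogroups, is one way to express the kind of contrast the abstract describes.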
Human genetic diversity observed in the Indian subcontinent is second only to that of Africa. This implies an early settlement and demographic growth soon after the first 'Out-of-Africa' dispersal of anatomically modern humans in the Late Pleistocene. In contrast to this perspective, linguistic diversity in India has been thought to derive from more recent population movements and episodes of contact. With the exception of Dravidian, whose origin and relatedness to other language phyla are obscure, all the language families in India can be linked to language families spoken in different regions of Eurasia. Mitochondrial DNA and Y chromosome evidence has supported largely local evolution of the genetic lineages of the majority of Dravidian and Indo-European speaking populations, but there is no consensus yet on whether the Munda (Austro-Asiatic) speaking populations originated in India or derive from a relatively recent migration from farther east.
Here, we report the analysis of 35 novel complete mtDNA sequences from India which refine the structure of Indian-specific varieties of haplogroup R. Detailed analysis of haplogroup R7, coupled with a survey of ~12,000 mtDNAs from caste and tribal groups over the entire Indian subcontinent, reveals that one of its more recently derived branches (R7a1) is particularly frequent among Munda-speaking tribal groups. This branch is nested within diverse R7 lineages found among Dravidian and Indo-European speakers of India. We have inferred from this that a subset of Munda-speaking groups acquired R7 relatively recently. Furthermore, we find that the distribution of R7a1 within the Munda speakers is largely restricted to one of the sub-branches (Kherwari) of northern Munda languages. This evidence does not support the hypothesis that the Austro-Asiatic speakers are the primary source of the R7 variation. Statistical analyses suggest a significant correlation between genetic variation and geography, rather than between genes and languages.
Our high-resolution phylogeographic study, involving diverse linguistic groups in India, suggests that the high frequency of mtDNA haplogroup R7 among Munda-speaking populations of India can be explained best by gene flow from linguistically different populations of the Indian subcontinent. The conclusion is based on the observation that among Indo-Europeans, and particularly among Dravidians, the haplogroup is, despite its lower frequency, phylogenetically more divergent, while among the Munda speakers only one sub-clade of R7, i.e. R7a1, can be observed. It is noteworthy that although R7 is autochthonous to India and arises from the root of hg R, its distribution and phylogeography in India are not uniform. This suggests the more ancient establishment of an autochthonous matrilineal genetic structure, and that isolation in the Pleistocene, lineage loss through drift, and endogamy of prehistoric and historic groups have greatly inhibited genetic homogenization and geographical uniformity.
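The reported correlation between genetic variation and geography is the kind of result often tested with a Mantel test on two distance matrices. The abstract does not name the statistical method used, so the sketch below is an assumption about one plausible approach, not a reconstruction of the authors' analysis. All function names are hypothetical, and the matrices are assumed to be square, symmetric and ordered by the same populations.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """One-sided Mantel test: returns (r, p) for the correlation between
    two distance matrices, permuting the population labels of dist_a."""
    rng = random.Random(seed)
    n = len(dist_a)
    flat_b = upper_triangle(dist_b)
    observed = pearson(upper_triangle(dist_a), flat_b)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        # Reorder rows and columns together to keep the matrix symmetric.
        permuted = [[dist_a[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(upper_triangle(permuted), flat_b) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)
```

In use, `dist_a` would hold pairwise genetic distances and `dist_b` pairwise geographic (or linguistic) distances between the same populations; comparing the two results indicates which factor tracks genetic variation more closely.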