Africa is the birthplace of modern humans, and is the source of the geographic expansion of ancestral populations into other regions of the world. Indigenous Africans are characterized by high levels of genetic diversity within and between populations. The pattern of genetic variation in these populations has been shaped by demographic events occurring over the last 200,000 years. The dramatic variation in climate, diet, and exposure to infectious disease across the continent has also resulted in novel genetic and phenotypic adaptations in extant Africans. This review summarizes some recent advances in our understanding of the demographic history and selective pressures that have influenced levels and patterns of diversity in African populations.
Modern humans evolved in Africa around 200 kya (thousand years ago), and have lived continuously on the African continent longer than in any other geographic region. Africa not only has the highest levels of human genetic variation in the world but also contains a considerable amount of linguistic, environmental and cultural diversity. For example, more than 2,000 distinct ethno-linguistic groups, representing nearly a third of the world’s languages, currently exist in Africa (http://www.ethnologue.com/) (Figure 1). Africans live in a wide range of environments, such as deserts, tropical rainforests, savannas, swamps, and mountain highlands [1, 2]. Furthermore, some of these environments have undergone dramatic changes over the course of modern human evolution [1, 3, 4]. African populations also practice a wide array of subsistence strategies, including various forms of hunting-gathering, agriculture and pastoralism, across the continent perhaps in response to this environmental variability over time and geographic space.
African demographic history has consisted of fluctuations in population size, short- and long-range migration, admixture and extensive population structure which have resulted in complex patterns of variation in modern populations [1, 5]. The timing and duration of some of these demographic events were often correlated with known major environmental changes and/or cultural developments in Africa. A number of novel genetic and phenotypic adaptations have also evolved in Africans in response to dramatic variation in environment, diet, and exposure to infectious disease across the continent. In some cases, these adaptations have occurred in the last several thousand years, exemplifying the ongoing evolution of human populations. Thus, present-day patterns of variation in African genomes are a product of both demographic and selective events.
The characterization of extant genetic diversity in Africa will be critical for reconstructing modern human origins and African demographic history. In addition, this genetic information, together with phenotype data on variable traits, will be informative for identifying population-specific variants that play a role in gene function, phenotypic adaptation and complex disease susceptibility in Africans and populations of African descent.
Current paleontological data suggest that the transition to anatomically modern Homo sapiens occurred in Africa, supporting the ‘Recent African Origin’ model of human evolution (Figure 2). The earliest known suite of derived traits associated with anatomically modern humans was identified in fossil remains from East Africa dating to around 195–150 kya [7–9]. Thus, the basic morphology of modern humans was established in Africa about 200 kya. Other early anatomically modern humans, with a fuller set of modern features, also appear in Africa before 100 kya and in the Near East around 100 kya [11–13], followed by the more recent expansion of anatomically modern humans into Eurasia within the past 40,000–80,000 years [1, 2] (Figure 2). Although the mode of evolution is still unclear, it has been suggested that the emergence of modern humans was not a sudden event, but rather a continuous process of gradual morphological change from archaic to modern H. sapiens. However, it has also been argued that modern human origins likely involved episodes of sudden morphological change, leading to the appearance of anatomically modern H. sapiens as a species distinct from archaic humans. Regardless of the mode of evolution, current fossil and chronological evidence indicates that modern humans existed in Africa for a relatively long period of time before their migration across much of the globe.
Two main migratory routes out of Africa have been hypothesized for anatomically modern humans. The traditionally favored model involves a northern route of migration via North Africa and the Nile valley into the Levant with subsequent dispersal into both Europe and Asia. Alternatively, an earlier southern coastal route has also been proposed in which modern humans first left Africa by crossing the Bab-el-Mandeb strait at the mouth of the Red Sea and then rapidly migrated along the South Asia coastline to Australia/Melanesia where evidence of human settlement dating to around 55 kya can be found [15–17]. A recent genetic study that correlated levels of microsatellite diversity and the geographic position of sampled populations inferred a waypoint of dispersal of anatomically modern humans out of Africa centered on the Red Sea, strongly supporting an East African origin of migration of modern humans. Although this study was not able to rule out the possibility of multiple migrations out of Africa, prior analysis of autosomal haplotype variability suggests that migration events originating from multiple genetically distinct source populations in Africa are unlikely.
The geographic expansion of a small number of anatomically modern humans out of Africa resulted in a population bottleneck. The size of the ancestral population(s) that left Africa is estimated to be around 1000 effective founding males and females based on autosomal microsatellite loci or around 1500 effective founding males and females based on combined mtDNA, Y-chromosome, and X-chromosome resequencing data. Recent studies comparing X-chromosome and autosomal diversity also suggest that the effective population sizes of founding males and females were not equal due to sex-biased migration. However, whether the sex-ratio of migrating individuals was male-biased or female-biased is currently under debate [22–24].
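The logic behind comparing X-chromosome and autosomal diversity can be illustrated with the standard population-genetic expectations for effective size under an unequal breeding sex ratio. The sketch below is a minimal illustration, not the method used in the cited studies: the function names and founder counts are hypothetical, and real analyses must also account for differences in mutation rate, selection and demographic history between the X chromosome and the autosomes.

```python
def autosomal_ne(n_males: float, n_females: float) -> float:
    """Expected effective size at autosomal loci for given breeding males and females."""
    return 4.0 * n_males * n_females / (n_males + n_females)


def x_linked_ne(n_males: float, n_females: float) -> float:
    """Expected effective size at X-linked loci (females carry two X copies, males one)."""
    return 9.0 * n_males * n_females / (4.0 * n_males + 2.0 * n_females)


def x_to_autosome_ratio(n_males: float, n_females: float) -> float:
    """Ratio of X-linked to autosomal effective size; 0.75 when the sexes are equal."""
    return x_linked_ne(n_males, n_females) / autosomal_ne(n_males, n_females)


# Hypothetical founder counts, chosen only to show the direction of the effect.
equal_ratio = x_to_autosome_ratio(500, 500)
female_biased_ratio = x_to_autosome_ratio(250, 750)
male_biased_ratio = x_to_autosome_ratio(750, 250)
```

Under these idealized assumptions, a ratio above 0.75 points to more breeding females than males, and a ratio below 0.75 points to the reverse.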
Genome-wide data have indicated higher levels of genetic diversity in Africans compared to non-Africans, confirming the results of previous mtDNA, X-chromosome, and Y-chromosome studies [1, 25–33]. For example, a survey of 1327 nuclear microsatellite, insertion/deletion (INDEL) and single nucleotide polymorphism (SNP) markers showed that African populations, as well as African Americans, have the highest levels of within population genetic diversity relative to non-Africans. In addition, more private alleles are present in Africa than in other geographic regions. The high level of genetic diversity in Africa is consistent with a larger long-term effective population size (Ne) which is estimated to be ~15,000 for Africans (and ~7,500 for non-Africans) based on resequencing data from a 10-kb autosomal non-coding region of the genome. However, a recent study of 98-kb of sequence data from 20 loci on the X-chromosome indicated that effective population sizes of individual populations may be smaller (ranging from 2,300 to 9,000 for African populations and from 300 to 3,300 in non-African populations). Non-African populations also appear to have fewer private alleles and a subset of the genetic diversity present in sub-Saharan Africa, as expected under the ‘Recent African Origin’ model (Figure 2).
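Estimates of long-term effective population size of this kind typically rest on the neutral expectation that nucleotide diversity at autosomal loci equals 4Neμ, where μ is the mutation rate per site per generation. As a rough sketch of that arithmetic, assuming a neutral equilibrium population and using purely hypothetical values for diversity and mutation rate (they are not taken from the studies cited above):

```python
def effective_size_from_diversity(pi: float, mu: float) -> float:
    """Solve pi = 4 * Ne * mu for Ne (autosomal, neutral, equilibrium assumptions)."""
    if mu <= 0:
        raise ValueError("mutation rate must be positive")
    return pi / (4.0 * mu)


# Hypothetical inputs, for illustration only.
example_pi = 0.0012  # average pairwise differences per site
example_mu = 2.0e-8  # mutations per site per generation
example_ne = effective_size_from_diversity(example_pi, example_mu)
```

With the mutation rate held fixed, halving the diversity halves the estimate, which matches the qualitative contrast between the African and non-African figures quoted above.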
A number of recent studies have also identified a considerable amount of structural variation in the human genome. ‘Structural variation’ commonly refers to genomic alterations that may be as small as a single nucleotide (excluding SNPs) or as large as millions of nucleotides in size, such as INDELs, copy number variation, translocations and inversions [36, 37]. Although the population genetics of structural variation are still in their infancy, information regarding the distribution of such variation within and between populations is emerging [1, 29, 38–40]. For example, a recent phylogenetic analysis based on 396 non-singleton copy number variants demonstrated that globally diverse populations clustered roughly by geographic region, and a study of 67 common copy number variants in the HapMap populations also noted that average FST, a classical measure of population divergence, of these loci was 0.11, comparable to previous estimates based on nucleotide and haplotype data [1, 38, 40]. Overall, results from these different studies suggest that levels and patterns of copy number variation in global populations are influenced by demographic history. However, technical limitations associated with a number of methods commonly used for structural variation detection [36, 37], including bias in structural variation ascertainment, can possibly distort estimates of demographic parameters. Advances in next-generation sequencing and other technologies, as well as the identification of structural variation in ethnically diverse Africans, will be needed to better determine patterns of structural variation in Africa and their possible role in phenotypic variability.
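For readers unfamiliar with FST, the following minimal sketch computes Wright's FST for a single biallelic locus as (HT − HS)/HT, where HS is the mean expected heterozygosity within subpopulations and HT is the expected heterozygosity of the pooled population. It assumes equally weighted subpopulations and takes made-up allele frequencies as input; the studies cited above used their own estimators and data.

```python
from typing import Sequence


def fst_biallelic(freqs: Sequence[float]) -> float:
    """Wright's FST at one biallelic locus from per-population allele frequencies."""
    if not freqs:
        raise ValueError("at least one subpopulation frequency is required")
    n = len(freqs)
    # Mean expected heterozygosity within subpopulations.
    h_s = sum(2.0 * p * (1.0 - p) for p in freqs) / n
    # Expected heterozygosity of the pooled population (equal weights assumed).
    p_bar = sum(freqs) / n
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    if h_t == 0.0:
        return 0.0  # locus is monomorphic overall, so there is no divergence to measure
    return (h_t - h_s) / h_t
```

Two populations with allele frequencies of 0.2 and 0.8 give FST = 0.36 under this definition, whereas identical frequencies give 0; an average of 0.11 therefore indicates modest divergence among populations at the loci surveyed.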
Several studies have indicated that ancestral populations were geographically structured before modern humans migrated out of Africa. For example, it has been suggested that the deep coalescence times of mtDNA [41, 42] and X-chromosome [43, 44] lineages are consistent with a demographic scenario of ancient population structure in Africa. A recent analysis of cranial shape variability in anatomically modern human fossils (dating to 200–60 kya) from Africa and the Middle East also reported a high level of morphological divergence among these fossil hominids which was interpreted as evidence for ancestral population structure in Pleistocene Africa. Thus, arguably, a considerable amount of genetic and phenotypic diversity may have been present at an early stage of modern human evolution.
Studies of global population structure in samples from the Human Genome Diversity Panel (HGDP-CEPH) also identified substructure in Africa, particularly between hunter-gatherers and other Africans [29, 46]. However, because the HGDP-CEPH contains a small number of African populations, some of which likely share recent common ancestry, these results do not reflect the full extent of population structure in Africa. A recent genome-wide study of a much larger set of diverse Africans detected more extensive population structure within Africa than had been previously observed. Specifically, an analysis of 848 short tandem repeat polymorphisms (STRPs), 476 INDELs and 3 SNPs genotyped in ~2,400 individuals from 121 geographically diverse populations revealed the presence of 14 genetically distinct ancestral population clusters in Africa. Each cluster consisted of populations that shared genetic similarity, as well as cultural and/or linguistic properties (for example, Pygmies, Khoesan-speaking hunter-gatherers, Bantu-speakers, Cushitic-speakers). Thus, there is a general correlation between genetic relatedness and linguistic/cultural similarity in Africa, although some exceptions exist. This study also observed fine-scale genetic structure among populations speaking languages that belong to the same linguistic family, such as between western African and East African Bantu Niger-Kordofanian-speakers, among other examples. East Africa was characterized by high levels of population differentiation, likely resulting from the historical migration of linguistically distinct populations into this geographic region and the long-term presence of indigenous East African Khoesan-speakers, namely the Hadza and Sandawe. A pattern of strong genetic differentiation was also observed in East Africa based on mtDNA data.
Additionally, the above genome-wide analysis in geographically diverse Africans showed that Central African Pygmies share common ancestry with several southern and eastern African Khoesan-speaking populations, who speak a language with click consonants, suggesting that these contemporary hunter-gatherers may have descended from a proto-Khoesan-Pygmy population of hunter-gatherers that diverged more than 35 kya. This finding also raises the intriguing possibility that the original language spoken by African Pygmies, who are known to have lost their indigenous language, may have contained click consonants.
Several studies have also identified genetic structure within Central Africa, particularly between Pygmy and non-Pygmy populations [18, 31, 48–51], as well as subtle substructure among Pygmies [18, 48, 50, 51]. Specifically, data have shown that the inferred ancestors of modern Pygmy hunter-gatherers and Bantu-speaking agriculturalists could have diverged as long as 70 kya, and that ancestral western and eastern Pygmy populations separated more than 18 kya with subsequent genetic differentiation among the western Pygmies within the past 2,800 years. The subtle structure among western Pygmies may be due to recent geographic isolation, genetic drift, and differential levels of admixture between Pygmies and neighboring Bantu-speaking populations [18, 49, 50, 52–54]. Overall, these patterns of genetic and phenotypic variation suggest that African populations have maintained a large and subdivided population structure throughout much of their evolutionary history. This population subdivision may have been facilitated by a number of factors, including ethnicity, culture, language, geography, as well as past fluctuations in geology and climate which may have affected population growth, contraction, fragmentation, and gene-flow in Africa [18, 55].
Both short- and long-range migration events, and subsequent admixture between migrating and indigenous populations, have also influenced the current genetic landscape of sub-Saharan Africa. One of the most significant migration events in recent African history was the geographic expansion of the Bantu Niger-Kordofanian-speakers from Nigeria and Cameroon first into the rainforests of equatorial Africa, and then into eastern and southern Africa within the past 5,000 years [18, 56] (Figure 1). Indeed, the presence of Bantu Niger-Kordofanian ancestry in many African populations and the widespread distribution of Bantu-related languages are signatures of the historical migration of Bantu-speakers across Africa and their subsequent admixture with other indigenous populations [18, 57]. Genetic evidence also indicates that independent waves of migration of western African and East African Bantu-speakers into southern Africa occurred, in agreement with previous linguistic and archaeological studies [18, 58]. Although the reasons for the radiation of Bantu-speakers across sub-Saharan Africa are not entirely known, it has been suggested that a shift from a humid to a drier climate around 5,000 years ago and the adoption of new crops suited to this drier environment may have contributed to the widespread movement of Bantu farmers throughout Africa [58, 59].
Genetic signatures of both historic and prehistoric migration events are also observed in other regions of Africa [18, 42, 57, 60–62]. For example, an analysis of microsatellite, INDEL and SNP polymorphisms in the nuclear genome showed that populations from central/southern Sudan, such as the Nuer and Dinka, have the highest proportion of Nilo-Saharan ancestry, with decreasing frequency observed in populations from northern Kenya to northern Tanzania in East Africa. These data suggest a Sudanese origin of Nilo-Saharan-speaking populations, with subsequent migration(s) southeastward to East Africa. In addition, Nilo-Saharan-speakers from Sudan, Tanzania, Kenya and Chad also clustered closely with Afroasiatic Chadic-speaking populations from the southern Lake Chad Basin in genetic structure analyses, suggesting that these Chadic-speakers of Nilo-Saharan ancestry likely migrated westward from a Sudanese homeland to Lake Chad and adopted an Afroasiatic language at some point in their history without significant genetic exchange. This shift in language may have occurred through interactions with proto-Chadic Afroasiatic-speakers who migrated from the central Sahara to the Lake Chad Basin around 8 kya [6, 56, 58, 63]. These genetic data are in general agreement with archaeological and linguistic studies that advocate a common origin of Nilo-Saharan populations in eastern Sudan, with subsequent migration events northward to the eastern Sahara, westward to the Chad Basin, and southeastward into Kenya and Tanzania [6, 64].
The migration of Nilo-Saharan-speakers may have been associated with past changes in environmental conditions. For example, archaeological data suggest that following a climatic shift from dry to more humid conditions in the early Holocene around 10.5 kya, several Nilo-Saharan-speaking populations expanded westward from the Middle Nile Basin to Lake Chad and southeastward to northern Kenya to exploit newly created aquatic food resources [6, 65]. However, a small number of Nilo-Saharan-speakers, collectively referred to as Northern Sudanians, migrated northward to the eastern Sahara where they engaged in cattle domestication [6, 65] (Figure 3). During the Mid-Holocene arid phase around 8,500–7,500 years ago, aquatic food collecting Nilo-Saharans continued their mode of subsistence mainly along permanent rivers, such as the middle Niger and the Nile, and remaining lakes such as Lake Chad. However, during the return of more humid conditions after 7,500 years ago, Northern Sudanic cattle raisers expanded westward across the Sahara [6, 56, 65] (Figure 3). A subset of Nilo-Saharans, namely Nilotic pastoralists, originating from eastern Sudan is also known to have migrated southeastward to Kenya and northern Tanzania within the past 3,000 years.
Additionally, many Nilo-Saharan-speaking populations from East Africa have high levels of Afroasiatic Cushitic ancestry (likely of Ethiopian origin), suggesting a long history of gene-flow between Nilo-Saharans and Cushites. Archaeological studies have indicated a shared cultural practice of cattle herding between Nilo-Saharans and Cushites in northern Kenya which likely brought these linguistically distinct populations into repeated contact with one another over the past several thousand years [6, 56]. Cushitic agropastoralists are also thought to have expanded into southern Kenya and central northern Tanzania, engaging in cultural exchange and inter-marriage with southern and eastern Nilotic pastoralists (Figure 1). This archaeological and genetic evidence of gene-flow between Nilo-Saharans and Cushites is consistent with other genetic data that showed the shared presence of an East African-specific mutation associated with lactose tolerance in these linguistically distinct populations.
The reverse migration of non-Africans into Africa was also shown to contribute to the gene-pool of modern African populations. For example, high levels of both Middle Eastern/European and eastern African Cushitic ancestry were detected in the Saharan African Beja, indicative of possible gene-flow from non-African populations. These genetic patterns correlate well with linguistic and archaeological data that suggest that modern-day Beja pastoralists descended from northern Cushitic-speakers who migrated from Ethiopia to the Red Sea coast of Sudan. Furthermore, the Beja have also had more intensive contact with the Middle East through commercial trade across the Red Sea as early as the 9th century A.D. and with nomadic camel herders of Arab Bedouin origin who settled in Sudan beginning in the 14th century A.D. (Figure 1). These studies demonstrate that migration in Africa occurred at different points in time and over a range of geographic areas, resulting in complex patterns of genetic variation.
Natural selection, the process by which favorable heritable traits become more common in successive generations and unfavorable heritable traits become less common, operates to either increase or decrease the frequency of mutations that have an effect on an individual’s fitness. Selectively advantageous mutations associated with diet have evolved in human populations, likely enabling ancestral humans to adapt to their environment, including changes in cultural practices. This section will focus on a few case studies of dietary adaptation primarily in African populations (for a detailed review of additional genetic adaptations in Africa, see reference ).
Lactase persistence, the ability to digest fresh milk and other dairy products into adulthood, varies in frequency in different human populations. However, the lactase persistence trait is common in pastoralist and dairying populations, such as northern Europeans, and certain African and Arabic nomadic groups. Previous studies identified the T allele at a C/T SNP located 13910 bp upstream of the lactase gene (LCT) as the likely causal mutation of lactase persistence in Europeans and a few West African pastoralist populations, such as the Fulani. However, this mutation was not found to be a strong predictor of lactase persistence in other African populations that practice pastoralism [69, 70]. A recent study of 43 populations from Tanzania, Kenya, and the Sudan identified three polymorphisms (G/C-14010, common in Tanzania and Kenya, T/G-13915 and C/G-13907, common in northern Sudan and Kenya) located ~14 kb upstream of LCT that are significantly associated with lactase persistence in East African populations. Further analysis revealed evidence consistent with a recent ongoing selective sweep over the past 3,000–7,000 years. Recent resequencing studies have also found additional variants within this genomic region upstream of LCT in other African populations [71, 72]. However, the functional impact of these polymorphisms on lactose tolerance has not been firmly established.
Genetic data have indicated that past migration events may have resulted in the presence of shared genetic variation among pastoralist populations in different regions of Africa. For example, a survey of SNPs associated with lactase persistence in Africa found the C-14010 mutation at low frequency in several southern African Bantu-speaking pastoralists. This study argued for the spread of this mutation into southern Africa by migrating Khoe pastoralists who may have admixed with migrating Nilotic or Cushitic herders from East Africa, with subsequent admixture occurring between Khoe and Bantu-speakers in southern Angola around 2 kya. Similar results have been observed in other East African and southern African populations (A. Ranciaro & S.A. Tishkoff, unpublished data, ). Recent Y-chromosome evidence has also suggested that eastern African Nilotic pastoralists migrated to southern-central Africa and admixed with local populations within the last few thousand years, supporting the hypothesis of an East African origin of southern African pastoralism. Overall, these findings are a striking example of convergent evolution, local adaptation due to strong selective pressure resulting from shared cultural practices in Europeans and Africans, and gene-culture co-evolution [66, 74].
Similarly, starch consumption is variable among populations with different subsistence patterns, but is more notably elevated in agricultural and certain hunter-gatherer groups. It has been suggested that a genetic adaptation arose in distinct populations in response to an increase in dietary starch intake. Perry and colleagues examined the number of copies of the salivary amylase gene (AMY1), which plays a role in starch hydrolysis, in distinct populations with contrasting levels of starch consumption, including several African populations. This study observed a greater number of AMY1 gene copies in populations with a high-starch diet (consisting of European Americans, Japanese, Hadza (African)) compared to populations with a low-starch diet (consisting of Biaka and Mbuti Pygmies (African), Datog (African), and Yakut (Asian)), arguing that dietary differences rather than ancestry influenced gene copy number. Perry and coworkers suggest that positive selection has been the dominant force affecting the divergence in gene copy number among populations at the AMY1 locus. However, targeted resequencing and long-range haplotype analyses of the AMY1 gene in additional populations with distinct subsistence patterns, including culturally diverse Africans, might be informative for confirming signatures of recent selection, providing further evidence for adaptive evolution at this locus.
Another important dietary adaptation is bitter taste perception which may have evolved in humans to prevent the ingestion of plant toxins. The ability to taste the bitter synthetic compound phenylthiocarbamide (PTC) is a highly variable trait in humans, and is correlated with both the ability to taste naturally bitter substances in food and food preference [76, 77]. A large proportion of the phenotypic variance in PTC sensitivity has been attributed to genetic variability at TAS2R38, a bitter taste receptor gene, located on chromosome 7. Specifically, three amino acid substitutions at TAS2R38 that form two common amino acid haplotypes have been shown to influence PTC sensitivity: a dominant taster haplotype, PAV and a non-taster haplotype, AVI. A genetic analysis of TAS2R38 in geographically diverse populations, including a small number of Africans, uncovered evidence of balancing selection at this locus, such as an excess of intermediate-frequency variants and an ancient divergence between the major taster and nontaster haplotypes (estimated to be 1.5 million years ago). These patterns of diversity suggest that variants at this locus are selectively advantageous, likely playing a role in food choice [77, 79–81], and may represent long-term genetic adaptations in humans. More recently, genetic data have also shown that additional haplotypes associated with a broader range of PTC sensitivity are present in culturally and linguistically distinct African populations (M.C. Campbell and S.A. Tishkoff, unpublished data), raising the possibility that a wider range of sensitivity to naturally bitter compounds among Africans might be advantageous in Africa. These studies provide further information regarding the evolution of genetic variation associated with bitter taste perception and the role that genetic/phenotypic variability may play in dietary preference in human populations.
To date, only a fraction of the many ethno-linguistic groups in Africa has been extensively studied for genome-wide variation. Analyses of genetic diversity in more geographically and ethnically distinct African populations will be critical for testing models of modern human origins and dispersal out of Africa, as well as for inferring African demographic history. With the recent development of both “next generation” sequencing technology and methods for targeted sequencing, together with their rapidly decreasing costs, it may be feasible to resequence large portions of the genome, and conceivably the entire genome, in diverse African populations to more accurately test evolutionary models and to infer demographic history. The integration of archaeological, linguistic, paleoclimatic, and geographical information with these genetic data will be particularly useful in this reconstruction, providing a more unified account of historic and prehistoric events in Africa. Indeed, the “1000 genomes” project, which aims to discover novel variants through targeted and whole-genome resequencing in a subset of the extended HapMap populations, represents a first step in identifying novel genome-wide variation in more diverse Africans. However, given the extensive population structure present in Africa, additional linguistically and culturally distinct Africans will likely be needed to more fully characterize levels and patterns of variation which can then be used to infer population history.
Finally, African populations have been severely underrepresented in studies of genetic adaptation. Given that Africans possess high levels of genetic and phenotypic diversity, and live in distinct environments, it is likely that ethnically and geographically diverse populations in Africa have undergone local adaptation. To gain a better understanding of genetic and phenotypic adaptations in human populations, it is imperative to include a wider range of ethnically diverse African populations living in distinct environments in genetic studies. Moreover, genome-wide resequencing in a number of African populations is likely to lead to the identification of novel population-specific variation associated with variable traits. The development of statistical and computational methods for detecting selection across the genome that distinguish the effects of selection and demography will also be critical for identifying the genetic basis of adaptation in Africa. Overall, studies of adaptation combined with information on African demographic history will provide a more robust and accurate view of the processes that have shaped the genomes of African populations.
We thank A. Ranciaro, J. Hirbo, F. Gomez, and S. Soi for critical review of the manuscript. We are especially grateful to C. Ehret for his careful review of the manuscript and figures, and for his suggestions regarding African population history. The authors are funded by the U.S. National Science Foundation (NSF) grant BCS-0552486, U.S. NSF grant BCS-0827436, U.S. National Institutes of Health (NIH) grant R01GM076637, U.S. NIH Director's Pioneer Award Program DP1-OD-006445, and a David and Lucile Packard Career Award to S.A.T.