As expected, PCA on our entire sample revealed the greatest genetic differentiation between the US Caucasians and the Africans, with the African Americans intermediate between them, reflecting recent admixture between ancestors from Europe and Africa. Our estimate of European individual admixture (IA) in the African Americans was also roughly consistent with prior studies [3], with an average of 21.9%. We found considerable variation among individuals in European IA, and a number of individuals had particularly high European IA values (8 of 136 individuals, or 6%, had values greater than 45%).
Prior studies focusing on mtDNA and Y chromosomes have found a greater African and lesser European representation of mtDNA haplotypes compared with Y chromosome haplotypes in African Americans, suggesting a greater contribution of African matrilineal descent compared with patrilineal descent [6, 7]. For example, Kayser and colleagues [6] estimated that 27.5% to 33.6% of Y chromosomes in African Americans are of European origin, compared with 9.0% to 15.4% of mtDNA haplotypes.
One study of nine short tandem repeat (STR) loci compared the Y chromosomes of African Americans with those of various African populations, including West Africans, West Central Africans (Cameroon), South Africans, Mbuti Pygmies, Mali, San, and Ethiopians [6]. In a multidimensional scaling analysis, these authors placed the African Americans in the middle of these African groups, suggesting origins from multiple African populations. However, they also found that they could not differentiate the Y-chromosome distributions of West African and West Central African groups, presumably a major source of ancestry for African Americans.
Another study of mtDNA haplotypes in African Americans and different African populations found that more than 50% of the African-American mtDNAs exactly matched common haplotypes shared among multiple African ethnic groups, whereas 40% matched no sequences in the African database they referenced [26]. Fewer than 10% of African-American mtDNA haplotypes matched exactly to a single African ethnic group. The haplotypes that did match were more often found in ethnic groups of West African or Central West African than of East or South African origin.
The most extensive examination of mtDNA haplotypes in Africans and African Americans [13] used mtDNA data from a large number of African ethnic groups spread around the continent. These authors observed large similarities in mtDNA profiles among ethnic groups from West, Central West, and South West Africa, with a continuous geographic gradient. As observed previously [26], these authors also found that many mtDNA haplotypes were widely distributed across Africa, making it impossible to trace African ancestry to a particular region or group, based on mtDNA data alone. These authors also estimated the proportionate ancestry within Africa based on African American mtDNA haplotypes as 60% from West Africa, 9% from Central West Africa, 30% from South West Africa, and minimal ancestry from North, East, Southeast, or South Africa.
These studies all suggest close genetic kinship among various West African, Central West African, and South West African ethnic groups. A prior analysis of genetic structure among the African populations included in the HGDP based on 377 autosomal STR loci was able to define distinct genetic clusters for the Biaka, Mbuti, and San; however, the study lacked the power to differentiate the Mandenka, Yoruba, and Bantu groups [27]. Similarly, another study examining two ethnic groups from Ghana (Akan and Gaa-Adangbe) and two from Nigeria (Yoruba, Igbo), based on 372 autosomal microsatellite markers in 493 individuals, did not differentiate these groups by genetic cluster analysis and found only modest genetic differences between them [28]. In contrast, greater resolution of African ethnic groups, particularly for the Mandenka and Yoruba, was possible in our analysis, based on more than 450,000 SNPs. We note that, in a recent study of malaria, PCA distinguished the HapMap YRI individuals from the Mandenka individuals in the Gambian sample on the basis of 100,715 SNPs; however, admixture analysis with a few selected markers did not reveal clear clusters that correspond to self-reported ancestry [29].
It is of interest to compare our African admixture estimates with descriptions of the proportional representation of various African groups in the Middle Passage and slave trade occurring in post-Columbian America. A highly detailed census based on historic records has been documented by several authors [10-12]. Africans were deported from numerous locations along the broad western coast of Africa, ranging from Senegal in the far west all the way down to Angola in the southwest. In addition, a smaller number of slaves were taken from the southeast of Africa. In terms of numbers, the largest group, approximately 50% to 60%, derived from Central and Southern West Africa and the Bight of Biafra; approximately 10% from Western Africa; 25% to 35% from the West Coast in between (Windward Coast, Gold Coast, and Bight of Benin); and the remaining 5% from Southeast Africa [7]. These estimates show considerable consistency with our results, which also indicated the largest ancestral component of African Americans to be from Central West Africa, followed by West Africa and Southwest Africa. However, because we did not have groups representative of Southeastern and other parts of Southern Africa, we may have underestimated their ancestral representation among African Americans.
It is important to note that considerable migration has occurred among African ethnic groups over the past three millennia or more. For example, the two Bantu groups included in our analysis originated from a more central African location (Nigeria-Cameroon) several millennia ago, making precise geographic localization of African ancestry difficult [30]. This difficulty is also reflected in the close genetic relationships among the various West, West Central, and South West African groups, who also show considerable overlap in terms of mtDNA haplotypes.
Our results are based on examination of the entire autosomal genome and, therefore, provide a more robust picture of the admixed African ancestry of individual African Americans compared with prior analyses, which focused on only a single locus (mtDNA or Y chromosome). We found all African Americans in our sample to be admixed, with representation from various geographic regions of Western Africa. The amount of variation in the African components of ancestry among the African Americans was quite modest, suggesting considerable similarity in African genetic profiles among African Americans. Thus, African ancestry testing based on a single locus, such as the mtDNA or Y chromosome, as is commonly done by ancestry-testing companies, provides only a very limited and, in many cases, misleading picture of an individual's African ancestry [31].
An important limitation in our analysis is the modest number of African subjects and groups represented. However, we were clearly able to exclude certain African ethnic groups as contributing substantially to African Americans, such as the two Pygmy and San groups. Furthermore, the close genetic similarity observed among West, Central West, and Southwest African ethnic groups (such as the Mandenka, Yoruba, and Bantu), found by us and others [28], suggests that precise identification of ancestry for African Americans may be difficult, even with the inclusion of additional ethnic groups.
Very recently, the limited range of African groups included in population genetic studies of Africans was addressed in a landmark study of 113 geographically diverse African ethnic groups by Tishkoff and co-workers [4]. These authors used 848 microsatellite, 476 indel, and four SNP markers to examine genetic structure among these groups, as well as among 98 African Americans from four US recruitment sites. In a genetic cluster analysis, they found only modest differentiation among West Africans, similar to the findings from other studies of a subset of these groups, based on a comparable number of markers. They also estimated proportionate African ancestry among their African Americans in a structured analysis including African ethnic subgroups, allowing the African Americans to be admixed. Comparable to our results, within the African Americans, they also found the majority African ancestry to be West, Central West, and Southwest African, including Bantu and non-Bantu speakers, with somewhat greater representation of the Bantu speakers (about 50% of the African total component) than the Western non-Bantu speakers (for example, Mandenka, about 30% of the African total component). Larger collections of indigenous African populations, such as those described earlier [4], when assayed with dense genotyping arrays, as done in this study (to allow finer genetic differentiation), will likely add further clarification of the African ancestral origins of African Americans.
The results of our analysis also strongly point to random mating among African Americans with respect to the African components of their ancestry. This is reflected both by the modest variances we observed in the African IA components, and also by the lack of structure in the PC analysis of African Americans with non-African genotypes removed. This conclusion is consistent with the idea that, for most African Americans, specific African origins are mixed or unknown or both and do not affect social characteristics that influence the choice of mate. It is also consistent with the notion that the African slaves brought to North America were mixed with regard to their geographic and ethnic ancestry and language [32]. By contrast, considerably greater variation in the proportion of European ancestry was found within the African Americans in our study. This high level of variation in European ancestry may reflect recent admixture or nonrandom mating (for example, as seen in Latino populations [33]), or both; these questions require additional study.
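The link drawn above between mating pattern and the variance of individual ancestry can be illustrated with a toy simulation. The sketch below is not the analysis used in this study; it assumes a constant population of even size, discrete generations, and children whose ancestry proportion equals the mean of their two parents (segregation variance is ignored). The function name, parameters, and starting fraction are hypothetical choices made for illustration only.

```python
import random
import statistics


def ancestry_variance_by_generation(n=2000, generations=6, source_fraction=0.2,
                                    assortative=False, seed=0):
    """Return the variance of individual ancestry after each generation.

    Toy assumptions (not taken from the study): constant, even population
    size, discrete generations, and each child's ancestry proportion equal
    to the mean of its two parents (segregation variance is ignored).
    """
    rng = random.Random(seed)
    n_source = round(n * source_fraction)
    # Founders carry either all (1.0) or none (0.0) of the ancestry component.
    population = [1.0] * n_source + [0.0] * (n - n_source)
    variances = [statistics.pvariance(population)]
    for _ in range(generations):
        if assortative:
            # Extreme nonrandom mating: partners have the most similar ancestry.
            population.sort()
        else:
            # Random mating: partners are paired without regard to ancestry.
            rng.shuffle(population)
        children = []
        for i in range(0, n - 1, 2):
            midparent = (population[i] + population[i + 1]) / 2
            children.extend([midparent, midparent])  # two children per couple
        population = children
        variances.append(statistics.pvariance(population))
    return variances


random_mating = ancestry_variance_by_generation(assortative=False)
assortative_mating = ancestry_variance_by_generation(assortative=True)
```

In this simplified model, random mating shrinks the variance in individual ancestry by roughly half each generation, whereas strongly assortative mating preserves it. This mirrors the qualitative argument that a modest variance is compatible with random mating over several generations, while a large variance may indicate recent admixture, nonrandom mating, or both.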