Starch consumption is a prominent characteristic of agricultural societies and hunter-gatherers in arid environments. In contrast, rainforest and circum-arctic hunter-gatherers and some pastoralists consume much less starch1-3. This behavioral variation raises the possibility that different selective pressures have acted on amylase, the enzyme responsible for starch hydrolysis4. We found that salivary amylase gene (AMY1) copy number is correlated positively with salivary amylase protein levels, and that individuals from populations with high-starch diets have on average more AMY1 copies than those with traditionally low-starch diets. Comparisons with other loci in a subset of these populations suggest that the level of AMY1 copy number differentiation is unusual. This example of positive selection on a copy number variable gene is one of the first in the human genome. Higher AMY1 copy numbers and protein levels likely improve the digestion of starchy foods, and may buffer against the fitness-reducing effects of intestinal disease.
Hominin evolution is characterized by significant dietary shifts, facilitated in part by the development of stone tool technology, the control of fire, and most recently the domestication of plants and animals5-7. Starch, for instance, has become an increasingly prominent component of the human diet, particularly among agricultural societies8. It stands to reason, therefore, that studies of the evolution of amylase in humans and our close primate relatives may provide insight into our ecological history. Because the human salivary amylase gene (AMY1) shows extensive variation in copy number9,10, we first assess whether a functional relationship exists between AMY1 copy number and the level of amylase protein expression in saliva. We then determine if AMY1 copy number differs among modern human populations with contrasting levels of dietary starch.
We estimated diploid AMY1 gene copy number for 50 European-Americans using an AMY1-specific real-time quantitative polymerase chain reaction (qPCR) assay. We observed extensive variation in AMY1 copy number in this population sample (Fig. 1a and Supplementary Table 1 online), consistent with previous studies10,11. Next, we performed western blot experiments with saliva samples from the same individuals in order to estimate salivary amylase protein levels (Fig. 1b). These experiments revealed a significant positive correlation between salivary amylase gene copy number and protein expression level (P < 0.001; Fig. 1c).
While there is a considerable range of variation in dietary starch intake among human populations, a distinction can be made between “high-starch” populations for which starchy food resources comprise a substantial portion of the diet, and the small fraction of “low-starch” populations with traditional diets that incorporate relatively few starchy foods. Such diets instead emphasize proteinaceous resources (e.g., meats and blood) and simple saccharides (e.g., from fruit, honey and milk). To determine if AMY1 copy number differs among populations with high- and low-starch diets, we estimated AMY1 copy number in three high-starch and four low-starch population samples. Our high-starch sample included two agricultural populations, European-Americans (n = 50) and Japanese (n = 45), and Hadza hunter-gatherers who rely extensively on starch-rich roots and tubers (n = 38)12. Low-starch populations included Biaka (n = 36) and Mbuti (n = 15) rainforest hunter-gatherers, Datog pastoralists (n = 17), and the Yakut, a pastoralist/fishing society (n = 25). Additional details on the diets of these populations are provided in Supplementary Table 2 online. We found that mean diploid AMY1 copy number is greater in high-starch populations (Fig. 2 and Supplementary Fig. 1 online). Strikingly, the proportion of individuals from the combined high-starch sample with at least 6 AMY1 copies (70%) is nearly 2 times greater than that for low-starch populations (37%). To visualize the allele-specific number and orientation of AMY1 gene copies, we performed high-resolution fluorescence in situ hybridization on stretched DNA fibers (fiber FISH); these results were consistent with diploid AMY1 copy number estimates from our qPCR experiments (Fig. 3a,b).
The among-population patterns of AMY1 copy number variation do not fit expectations under a simple regional-based model of genetic drift: our high- and low-starch samples include both African and Asian populations, suggesting that diet more strongly predicts AMY1 copy number than geographic proximity. Based on this observation, we hypothesized that natural selection may have influenced AMY1 copy number in certain human populations. However, we cannot rigorously test such a hypothesis on the basis of our qPCR results alone, in part because we lack comparative data from other loci. Therefore, we next performed array-based comparative genomic hybridization (aCGH) on the Yakut population sample with a Whole Genome TilePath (WGTP) array platform that was previously used by Redon and colleagues11 to describe genome-wide patterns of copy number variation in 270 individuals (the HapMap collection), including the same Japanese population sample as in our study. For the Yakut aCGH experiments, we used the same reference DNA sample (NA10851) as in the previous study11, facilitating comparisons of Japanese and Yakut relative intensity log2 ratios for the 26,574 bacterial artificial chromosome (BAC) clones on the array, including two clones mapped to the AMY1 locus.
Results from the two AMY1-mapped clones on the WGTP array supported our original observations: the log2 ratios were strongly correlated with the qPCR estimates of AMY1 diploid copy number (Supplementary Fig. 1 online), and the population mean log2 ratios for both clones were greater for the Japanese sample (Fig. 4a and Supplementary Fig. 1 online). More importantly, with the WGTP data we were able to compare the level of population differentiation at the AMY1 locus to other loci in the genome for the two Asian population samples in our study. We would expect the magnitude and direction of the Japanese-Yakut mean log2 ratio difference for the AMY1-mapped clones to be similar to those for other copy number variable clones, if these CNVs have experienced similar evolutionary pressures. However, the two AMY1-mapped clones are significant outliers in this distribution (Fig. 4b and Supplementary Fig. 2 online), leading us to reject this null hypothesis. In addition, we considered a database of genotypes for 783 genome-wide microsatellites for the same Yakut individuals and a different Japanese population sample13, because microsatellite loci are usually multi-allelic (as is the AMY1 locus). We found that the level of Japanese-Yakut differentiation at the AMY1 locus exceeds that for >97% of the microsatellite loci (Supplementary Fig. 3 online). Although this result should be interpreted with caution because we do not know whether AMY1 copy number and microsatellite mutation rates and patterns are similar, this finding is consistent with our results from the genome-wide WGTP comparison.
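The outlier comparison described above can be sketched as an empirical ranking: compute the between-population difference in mean log2 ratio for a focal clone, then ask what fraction of background clones show a difference at least as large in magnitude. This is only an illustrative sketch; the function names and all numbers below are hypothetical and are not values from the study, and the exact significance procedure used for Fig. 4b is not specified in this text.

```python
from statistics import mean

def mean_log2_difference(pop_a, pop_b):
    """Difference in mean log2 ratio between two population samples for one clone."""
    return mean(pop_a) - mean(pop_b)

def empirical_tail_fraction(focal, background):
    """Fraction of background differences at least as extreme (in absolute value) as the focal one."""
    return sum(1 for v in background if abs(v) >= abs(focal)) / len(background)

# Hypothetical toy log2 ratios for a single focal clone (NOT data from the study).
japanese = [0.30, 0.25, 0.40, 0.35]
yakut = [0.05, 0.10, 0.00, 0.05]
focal = mean_log2_difference(japanese, yakut)

# Hypothetical mean differences for ten other copy number variable clones.
background_differences = [-0.10, -0.05, -0.02, 0.00, 0.01, 0.03, 0.04, 0.06, 0.08, 0.30]
tail = empirical_tail_fraction(focal, background_differences)
print(f"focal difference = {focal:.3f}, empirical tail fraction = {tail:.2f}")
```

A small tail fraction indicates that the focal clone lies in the extreme of the genome-wide distribution; with real data the background would comprise all copy number variable clones on the array.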
These observations suggest that natural selection has shaped AMY1 copy number variation in either the Japanese or the Yakut, or in both populations. We cannot fully test the null hypothesis for the other high- and low-starch populations in our study, but the patterns of copy number variation we observed in these populations are similar to those for the Japanese and Yakut and therefore may also reflect non-neutral evolution. We favor a model in which AMY1 copy number has been subject to positive or directional selection in at least some high-starch populations but has evolved neutrally (i.e., through genetic drift) in low-starch populations. Although it is possible that lower AMY1 gene copy numbers have been favored by selection in low-starch populations, such an interpretation is less plausible for the simple reason that excessive amylase production is unlikely to have a significant negative effect on fitness. Furthermore, several lines of evidence offer mechanisms by which higher salivary amylase protein levels may confer a fitness advantage for individuals with a high-starch diet. First, a significant amount of starch digestion occurs in the mouth during mastication14. For example, blood glucose levels have been shown to be significantly higher when high-starch foods such as corn, rice, and potatoes (but not apples) are first chewed and then swallowed, rather than swallowed directly15. In addition, it has been suggested that oral digestion of starch is critically important for energy absorption during episodes of diarrhea4. Diarrheal diseases can have a significant effect on fitness; for example, such diseases caused 15% of worldwide deaths among children younger than 5 years as recently as 2001 (ref. 16). Lastly, salivary amylase persists in the stomach and intestines after swallowing17, thereby augmenting the enzymatic activity of pancreatic amylase in the small intestine.
Higher AMY1 copy number and a concomitant increase in salivary amylase protein level are therefore likely to improve the efficiency with which high-starch foods are digested in the mouth, stomach, and intestines, and may also buffer against the potential fitness-reducing effects of intestinal disease.
To understand better the evolutionary context of human AMY1 copy number variation, we analyzed patterns of AMY1 copy number variation in chimpanzees (Pan troglodytes) and bonobos (Pan paniscus). In contrast to the extensive copy number variation we observed in humans, each of 15 wild-born western chimpanzees (P. t. verus) showed evidence of only 2 diploid AMY1 copies (Fig. 3c and Supplementary Fig. 4 online), which is consistent with previous findings18-21. Although we observed evidence of a gain in AMY1 copy number in bonobos relative to chimpanzees (Supplementary Fig. 4 online), our sequence-based analyses suggest that each of these AMY1 copies has a disrupted coding sequence and may be non-functional (Supplementary Fig. 5 online). Therefore, the average human has ~3 times more AMY1 copies than chimpanzees, and bonobos may not have salivary amylase at all. Outgroup comparisons with other great apes suggest that AMY1 copy number was most likely gained in the human lineage, rather than lost in chimpanzees21,22. Given that AMY1 copy number is positively correlated with salivary amylase protein level in humans, it stands to reason that the human-specific increase in copy number may explain, at least in part, why salivary amylase protein levels are ~6-8 times higher in humans than in chimpanzees23. These patterns are consistent with the general dietary characteristics of Pan and Homo; chimpanzees and bonobos are predominantly frugivorous and ingest little starch relative to most human populations24. Considering other primates, while New World monkeys do not produce salivary amylase and tend to consume little starch, cercopithecines (a subfamily of Old World monkeys including macaques and mangabeys) have relatively high salivary amylase expression, even compared to humans23. 
Although the genetic mechanisms are unknown, this expression pattern may have evolved to facilitate the digestion of starchy foods (such as the seeds of unripe fruits) stowed in the cheek pouch, a trait that among primates is unique to cercopithecines25.
The initial human-specific increase in AMY1 copy number may have been coincident with a dietary shift early in hominin evolutionary history. For example, it is hypothesized that starch-rich plant underground storage organs (USOs) were a critical food resource for early hominins26,27. Changes in USO consumption may even have facilitated the initial emergence and spread of Homo erectus out of Africa5,28. Yet such arguments are difficult to test, mainly because direct evidence for the use of USOs is difficult to obtain, particularly for more remote time periods. USOs themselves are perishable, as are many of the tools used to collect and process them. Therefore, understanding the timing and nature of the initial human-lineage AMY1 duplications may provide insight into our ecological and evolutionary history. The low level of nucleotide sequence divergence among the three AMY1 gene copies found in the human genome reference sequence (hg18; d = 0.00011 to 0.00056) implies a relatively recent origin that may be within the timeframe of modern human origins (i.e., within the last ~200,000 years; based on human-chimpanzee AMY1 d = 0.027 and a 6 MYA estimate for divergence of the human and chimpanzee lineages). However, given the possibility for gene conversion, we do not necessarily consider this estimate to be reliable. The generation of AMY1 sequences from multiple human individuals may ultimately help to shed light on this issue.
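The timeframe quoted above can be reproduced with a simple proportional scaling, assuming a constant substitution rate so that divergence between AMY1 copies scales to time in the same way as human-chimpanzee divergence. The sketch below uses only the values given in the text; the function and constant names are illustrative, and the calculation ignores gene conversion, which is the caveat the authors themselves raise.

```python
# Values quoted in the text.
HUMAN_CHIMP_D = 0.027            # human-chimpanzee AMY1 sequence divergence
HUMAN_CHIMP_SPLIT_YEARS = 6e6    # 6 MYA estimate for the lineage split

def duplication_age_years(d_between_copies: float) -> float:
    """Scale divergence between gene copies to years, assuming a constant rate."""
    return d_between_copies / HUMAN_CHIMP_D * HUMAN_CHIMP_SPLIT_YEARS

low = duplication_age_years(0.00011)   # least divergent pair of AMY1 copies
high = duplication_age_years(0.00056)  # most divergent pair of AMY1 copies
print(f"approximately {low:,.0f} to {high:,.0f} years")
```

Under these assumptions both bounds fall below 200,000 years, which matches the statement that the duplications may lie within the timeframe of modern human origins.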
In summary, we have shown that the pattern of variation in copy number of the human AMY1 gene is consistent with a history of diet-related selection pressures, demonstrating the importance of starchy foods in human evolution. While the amylase locus is one of the most variable in the human genome with regard to copy number10, it is by no means unique; a recent genome-wide survey identified 1,447 copy number variable regions among 270 phenotypically normal human individuals11, and many more such regions will likely be discovered with advances in copy number variation detection technology. It is reasonable to speculate that copy number variants other than AMY1 are or have been subject to strong pressures of natural selection, particularly given their potential influence on transcriptional and translational levels (e.g., ref. 29). The characterization of copy number variation among humans and between humans and other primates promises to offer considerable insight into our evolutionary history.
Buccal swabs and saliva were collected under informed consent from 50 European-Americans aged 18-30 (Arizona State University IRB protocol no. 0503002355). Saliva was collected for 3 min from under the tongue. Buccal swabs were collected from the Hadza (n = 38) and Datog (n = 17) from Tanzania (Stanford University IRB protocol no. 9798-414). Genomic DNAs from Biaka (Central African Republic; n = 32), Mbuti (Democratic Republic of Congo; n = 15) and Yakut (Siberia; n = 25) are from the HGDP-CEPH Human Genome Diversity Cell Line Panel. Lymphoblastoid cell lines from 45 Japanese, 4 additional Biaka, and the donor for the chimpanzee genome sequence (Clint) were obtained from the Coriell Institute for Medical Research. Whole blood samples were collected during routine veterinary examinations from chimpanzees and bonobos housed at various zoological and research facilities. Two additional bonobo samples were obtained from the Integrated Primate Biomaterials and Information Resource. DNA was isolated using standard methods.
Primers for qPCR (Supplementary Table 3 online) were designed to be specific to AMY1 (i.e., with sequence mismatches to AMY2A and AMY2B) based on the human and chimpanzee reference genome sequences. A previous study reported a single (haploid) copy of AMY1 for one chimpanzee18, and a recent analysis by Cheng et al.19 found no evidence of recent AMY1 duplication for Clint. We used fiber FISH to confirm that Clint has two diploid copies of AMY1 (Fig. 3c). Therefore, we were able to estimate diploid copy number based on relative AMY1 quantity for human DNAs compared to a standard curve constructed from the DNA of Clint. A fragment from the TP53 gene was also amplified to adjust for variation in DNA quantity among sample dilutions. Samples were run in triplicate and standards in duplicate. Experiments were performed and analyzed as described20.
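The logic of the copy number estimate can be sketched as follows: fit a standard curve (threshold cycle against log10 of input quantity) from dilutions of the two-copy reference DNA for each amplicon, convert each sample's threshold cycles to reference-equivalent quantities, normalize AMY1 by TP53, and multiply by the reference copy number. This is a minimal sketch of a generic standard-curve calculation, not the authors' analysis (which follows ref. 20); all function names and threshold-cycle values are hypothetical, and replicates are simply averaged.

```python
import math

def fit_standard_curve(quantities, cts):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def quantity_from_ct(ct, curve):
    """Invert the standard curve to get a quantity in reference-DNA equivalents."""
    slope, intercept = curve
    return 10 ** ((ct - intercept) / slope)

def diploid_copies(amy1_cts, tp53_cts, amy1_curve, tp53_curve, reference_copies=2):
    """Estimate diploid AMY1 copies relative to a reference with `reference_copies` copies."""
    amy1_q = quantity_from_ct(sum(amy1_cts) / len(amy1_cts), amy1_curve)
    tp53_q = quantity_from_ct(sum(tp53_cts) / len(tp53_cts), tp53_curve)
    return reference_copies * amy1_q / tp53_q

# Hypothetical standard-curve data (relative dilutions of reference DNA; NOT study data).
dilutions = [1.0, 0.5, 0.25, 0.125]
amy1_curve = fit_standard_curve(dilutions, [24.0, 25.0, 26.0, 27.0])
tp53_curve = fit_standard_curve(dilutions, [25.0, 26.0, 27.0, 28.0])

# Hypothetical triplicate threshold cycles for one human sample.
copies = diploid_copies([23.40, 23.42, 23.43], [26.0, 26.0, 26.0], amy1_curve, tp53_curve)
print(f"estimated diploid AMY1 copies: {copies:.1f}")
```

Dividing by the TP53 quantity is what corrects for differences in the amount of input DNA; TP53 is assumed here to be present at two copies in both the sample and the reference.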
Protein samples were prepared by solubilizing saliva samples in 2% sodium dodecyl sulfate (SDS) and heating at 100°C for 5 min. These samples were analyzed on mini SDS-polyacrylamide gels and transferred to polyvinylidene difluoride (PVDF) membranes (Immobilon-Millipore). For quantification purposes, a human salivary amylase protein sample of known quantity (Sigma) was run on each gel, with 5 μL of saliva for each sample. After transfer, the membranes were incubated for 1.5 hours with primary antibodies raised against human salivary amylase (Sigma). The membranes were washed and goat anti-rabbit alkaline phosphatase conjugated IgG secondary antibodies (Pierce) were added for 1 hour. The membranes were exposed to ECF substrate (Amersham Biosciences) for 5 min and analyzed using a phosphorimager. Quantification of protein bands was performed using ImageQuant software (Molecular Dynamics).
DNA fibers were prepared by gently lysing cultured lymphoblast cells with 300 μl Cell Lysis Buffer (Gentra Systems) per 5 million cells. A 10 μl aliquot of lysate was placed on a poly-L-lysine coated slide (LabScientific) and mechanically stretched with the edge of a coverslip. After 30 sec, 300 μl of 100% methanol was applied to fix the fibers. Slides were dried at 37°C for 5 min and then stored at room temperature (RT).
PCR product probes were made from (i) the entire AMY1 gene itself (~10 kb; red in images), and (ii) the retrotransposon found directly upstream of all AMY1 copies but not pancreatic amylase genes or amylase pseudogenes (~8 kb; green in images); while the gene probe may not be specific to AMY1 under all hybridization conditions (AMY1 sequence divergence with AMY2A and AMY2B = 7.5% and 7.1%, respectively), the upstream probe is. We used long-range followed by nested PCR for each region (primers and conditions are provided in Supplementary Table 3 online). PCR products were purified with DNA Clean and Concentrator columns (Zymo).
For each nested PCR product, 750 ng was combined with 20 μl 2.5x random primer (BioPrime aCGH Labeling Module, Invitrogen) in 39 μl total volume, heated at 100°C for 5 min, and then placed on ice for 5 min. Next, 5 μl 10x dUTP and 1 μl Exo-Klenow Fragment (BioPrime Module), and either 5 μl (5 nmol) Biotin-16-dUTP (Roche; gene probe) or 5 μl (5 nmol) Digoxigenin-11-dUTP (Roche; upstream probe) were added, and the reaction was incubated at 37°C for 5 hours. Labeled products were purified with Microcon Centrifugal Filter Devices (Millipore) using 3 washes of 300 μl 0.1x SSC and eluted with 50 μl H2O. For each 1 μg of labeled DNA, we added 10 μg human Cot-1 DNA (Invitrogen).
For each experiment, 500 ng of labeled DNA from each of the nested PCR reactions were combined, lyophilized, reconstituted in 10 μl hybridization buffer (50% formamide, 20% dextran sulfate, 2x SSC), and added to the slide (18×18 mm cover glass; Fisher). Fibers and probes were co-denatured (95°C for 3 min) and hybridized in a humidified chamber (37°C for 40 hours). The slide was washed in 0.5x SSC at 75°C for 5 min followed by 3 washes in 1x PBS at RT for 2 min each. Next, fibers were incubated with 200 μl CAS Block (Zymed) and 10% Normal Goat Serum (Zymed) for 20 min at RT under a HybriSlip (Invitrogen). We used a 3-step detection/amplification procedure (with reagents in 200 μl CAS Block/10% Normal Goat Serum). Each step was 30 min at RT under a HybriSlip followed by 3 washes in 1x PBS for 2 min each at RT: (i) 1:500 Anti-digoxigenin-fluorescein, Fab fragments (Roche) and 1:500 Streptavidin, Alexa Fluor 594 conjugate (Invitrogen); (ii) 1:250 Rabbit anti-FITC antibody (Zymed) and 1:500 Biotinylated anti-streptavidin (Vector Laboratories); (iii) 1:100 Goat anti-rabbit IgG-FITC (Zymed) and 1:500 Streptavidin, Alexa Fluor 594 conjugate. Images were captured on an Olympus BX51 fluorescence microscope with an Applied Imaging camera and analyzed with Applied Imaging’s Genus software.
For aCGH experiments we used a large-insert clone DNA microarray covering the human genome in tiling path resolution30. Test (Yakut individuals) and reference (NA10851) genomic DNA samples were labeled with Cy3-dCTP and Cy5-dCTP, respectively (NEN Life Science Products) and co-hybridized to the array. For each sample, a duplicate experiment was performed in dye-swap to reduce false-positive error rates. Labeling, hybridization, washes, and analyses were performed as described11,30.
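One common way to combine a dye-swap pair is to average the test/reference log2 ratios from the two hybridizations, which cancels a dye-specific intensity bias that enters with opposite sign in each experiment. The sketch below illustrates that idea only; it is not the published analysis pipeline (refs. 11 and 30), and the function names and intensity values are hypothetical.

```python
import math

def log2_ratio(test_intensity, reference_intensity):
    """log2 of test over reference signal for one clone."""
    return math.log2(test_intensity / reference_intensity)

def dye_swap_log2_ratio(forward, swapped):
    """Average test/reference log2 ratio across a dye-swap pair.

    forward: (cy3, cy5) intensities with the test sample labeled in Cy3.
    swapped: (cy3, cy5) intensities with the test sample labeled in Cy5.
    """
    fwd = log2_ratio(forward[0], forward[1])   # test = Cy3, reference = Cy5
    rev = log2_ratio(swapped[1], swapped[0])   # test = Cy5, reference = Cy3
    return (fwd + rev) / 2

# Hypothetical clone: true test/reference ratio of 1.5, with Cy3 reading 1.2x too bright.
forward = (1800.0, 1000.0)   # Cy3 = 1500 * 1.2 (test), Cy5 = 1000 (reference)
swapped = (1200.0, 1500.0)   # Cy3 = 1000 * 1.2 (reference), Cy5 = 1500 (test)
combined = dye_swap_log2_ratio(forward, swapped)
print(f"combined log2 ratio: {combined:.3f}")
```

In this toy case each single hybridization is biased (one too high, one too low), while the average recovers log2(1.5), which is why duplicate dye-swap experiments help reduce false-positive calls.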
We are grateful to all our study participants. We thank Drs. H. Cann and C. de Toma of the Fondation Jean Dausset (CEPH), the Cincinnati Zoo, Lincoln Park Zoo, New Iberia Research Center, Primate Foundation of Arizona, Southwest Foundation for Biomedical Research, Coriell Institute for Medical Research, and the Integrated Primate Biomaterials and Information Resource for samples. C. Tyler-Smith and Y. Gilad provided helpful comments on a previous version of the manuscript. We would also like to thank the Wellcome Trust Sanger Institute Microarray Facility for printing the arrays and T. Fitzgerald and D. Rajan for technical support. This study was funded by grants from the L.S.B. Leakey Foundation and Wenner-Gren Foundation (to N.J.D.), the Department of Pathology, Brigham & Women’s Hospital (to C.L.), the National Institutes of Health (to the University of Louisiana at Lafayette New Iberia Research Center; nos. RR015087, RR014491 and RR016483), and the Wellcome Trust (H.F., R.R., and N.P.C.).