|Home | About | Journals | Submit | Contact Us | Français|
Changes in gene expression during development play an important role in shaping morphological and behavioral differences, including between humans and nonhuman primates. Although many of the most striking developmental changes occur during early development, reproductive maturation represents another critical window in primate life history. However, this process is difficult to study at the molecular level in natural primate populations. Here, we took advantage of ovarian samples made available through an unusual episode of human–wildlife conflict to identify genes that are important in this process. Specifically, we used RNA sequencing (RNA-Seq) to compare genome-wide gene expression patterns in the ovarian tissue of juvenile and adult female baboons from Amboseli National Park, Kenya. We combined this information with prior evidence of selection occurring on two primate lineages (human and chimpanzee). We found that in cases in which genes were both differentially expressed over the course of ovarian maturation and also linked to lineage-specific selection this selective signature was much more likely to occur in regulatory regions than in coding regions. These results suggest that adaptive change in the development of the primate ovary may be largely driven at the mechanistic level by selection on gene regulation, potentially in relationship to the physiology or timing of female reproductive maturation.
Nonhuman primates are valuable sources of insight into human evolution. Until recently, however, such insight was limited by the dearth of genetic resources for most primate species. In addition, studies of primates in their natural habitats, while rich in behavioral and ecological detail, have rarely included extensive genetic or genomic components. This situation is changing now that genomic resources are increasingly available, and gene regulatory studies of captive primates have set the stage (reviewed in Tung et al. 2010). However, we still know relatively little about variation in gene expression in wild primates.
Collecting functional genomic data on such systems could provide important context for the evolution of gene regulation in humans. Specifically, studying changes in gene expression during maturational milestones in nonhuman primates may provide insight into the loci that contributed to shifts in developmental timing and physiology during human evolution (Uddin et al. 2008; Somel et al. 2009; Gunz et al. 2010). Some examples of these shifts include relatively late menarche in human hunter-gatherers compared with nonhuman primates (reviewed in Blurton Jones et al. 1999); a skeletal growth spurt that accompanies female maturation in humans that appears to be absent in nonhuman primates (Bogin and Smith 1996); and short interbirth intervals in humans relative to body size (reviewed in Mace 2000). Circumstantial evidence suggests a role for gene regulation in these changes. Indeed, sequence-based analyses have revealed that the regulatory regions of many development-related genes have undergone positive selection within primates (Haygood et al. 2010) and that rapidly evolving regulatory regions near duplicated genes in humans are enriched for genes related to pregnancy and reproduction (Kostka et al. 2010).
Yellow baboons (Papio cynocephalus) are close human relatives (~94% sequence similarity: see Silva and Kondrashov 2002) that, like humans, are large-bodied terrestrial savanna primates with long life histories and nonseasonal reproduction. They also inhabit African savanna environments similar to those relevant for early humans (Potts 1998; Behrensmeyer 2006). Yellow baboons have been the subjects of extensive study in the wild (Altmann SA and Altmann J 1970; Jolly 1993; Rhine et al. 2000; Buchan et al. 2003; Wasser et al. 2004; Alberts et al. 2006), including in the Amboseli basin of Kenya where individually recognized baboons have been monitored since 1971 (Altmann SA and Altmann J 1970; Buchan et al. 2003; Alberts et al. 2006). This system therefore presents an exceptional opportunity to integrate functional genomic data sets with detailed life history information about the same animals.
Here, we take advantage of life history and behavioral data from the Amboseli baboon population, combined with an unusual circumstance in which we were able to collect fresh tissue from seven known females (four premenarcheal juveniles and three multiparous adults). Six of these seven females died in an episode of conflict with the local human population (the Maasai community in Amboseli) that perceived the baboons as a threat to their livestock; the seventh died of natural causes a few days later. The bodies of all seven females were collected within a few hours of their death, with the help of the Maasai community. We used these data and samples to investigate gene expression changes related to the onset of sexual maturity in females and to examine differential expression in maturity-related genes among genes inferred to have evolved under lineage-specific selection in primates. We focused specifically on expression differences in the ovary, an organ that plays a central role in reproductive maturation. We present a genome-wide analysis of ovarian gene expression changes in these seven female baboons from this natural population using RNA-Seq.
RNA-Seq libraries were made using ovarian RNA from three adult and four juvenile females (supplementary fig. S1 and table S1, Supplementary Material online). We obtained ~15 million reads per individual (supplementary table S1, Supplementary Material online), and we measured the expression of a total of 9,770 genes in the baboon ovary. Ninety-seven genes (~1% of genes in the data set) were differentially expressed between the juveniles and the adults (False discovery rate [FDR]–adjusted P value < 0.05) (fig. 1). This result is consistent with the expectation that intraspecific differential expression, particularly within a population and within sex, is likely to be less common than interspecific differential expression between different primate species (Babbitt et al. 2010; Blekhman et al. 2010; Xu et al. 2010). Of the differentially expressed genes, 79 were upregulated in the adults and 18 were upregulated in the juveniles. This imbalance in upregulated expression toward the adult females was expected, as the adult ovary is much more metabolically active than the premenarcheal ovary (reviewed in McGee and Hsueh 2000).
To evaluate the global effect of maturation stage on gene expression variation, we performed a principal components (PCs) analysis. The first three PCs in this analysis explained ~67% of variation in the gene expression data (supplementary fig. S3, Supplementary Material online). None of these PCs clearly differentiated adult and juvenile tissues, although PC2 exhibited the strongest (albeit nonsignificant) relationship with life history stage (Mann–Whitney test, W = 11, P value = 0.1143). In contrast, when we examined only those genes that were significantly differentially expressed (n = 97), PC1 alone explained 70% of the variation in the gene expression data. PC1 also exhibited a trend toward higher values for juveniles than for adults (Mann–Whitney test, W = 0, P value = 0.05714 and supplementary fig. S3, Supplementary Material online).
Little is known about ovarian gene expression in human or mouse models during either the premenarcheal stage or in noncycling adult tissue, as most studies concerning ovarian gene expression have focused on embryonic sex specification (Nef et al. 2005; Small et al. 2005), fertility disorders (reviewed in Matzuk and Lamb 2008), or cancer states (e.g., Wang et al. 1999; Welsh et al. 2001; Haviv and Campbell 2002; Adib et al. 2004). To explore patterns in expression differences between these life history stages, we performed categorical enrichment analyses using the GO (Gene Ontology Consortium 2000) and PANTHER (Mi et al. 2005) ontology databases. The enrichments were performed in two ways: first, using the absolute rankings of gene expression differences between adults and juveniles, regardless of the direction of the difference; and second, using only genes that were more highly expressed in the more metabolically active adult tissue (table 1, supplementary tables S2 and S3, Supplementary Material online).
Several patterns emerged from these analyses. First, we identified a number of ontology categories generally associated with blood, including “immunity and defense” and “angiogenesis” (table 1). The cortex of the ovary becomes highly vascularized after the onset of maturity (Redmer and Reynolds 1996; Abulafia and Sherer 2000), a maturational process that could account for some of the observed enrichments. In addition, follicular development in the mature ovary is correlated with increased inflammation (reviewed in Bukovsky and Caudle 2008). In keeping with this change, we identified cytokine, chemokine, and macrophage-related immune activities among the significant categories of genes that show increased expression in the adults (supplementary table S3, Supplementary Material online). Second, and perhaps unsurprisingly, genes involved in developmental processes (i.e., developmental processes and mesoderm development) tended to be enriched for differential expression (table 1 and supplementary table S2, Supplementary Material online). These enrichments emphasize that the physiological distinctions between the mainly quiescent juvenile ovary and the mature ovary are likely related, at least in part, to differences in gene regulation.
At the level of individual genes, we found a significant upregulation in the adult ovary of genes essential for ovarian function and folliculogenesis (table 2), including genes such as VGF (VGF nerve growth factor inducible), MMP19 (matrix metalloproteinase-19), and ADAMTS1 (a disintegrin and metalloproteinase motif 1) (fig. 2). MMP19 and ADAMTS1 function to remodel the extracellular matrix as follicles develop (Jo and Curry 2004; Brown et al. 2010). The role of VGF is less clear, but its essential role has been demonstrated in VGF−/− mice, which produce many primary follicles but few mature follicles (Hahm et al. 1999; Jethwa and Ebling 2008). Fewer genes are upregulated in the juveniles; however, one intriguing example is RSPO1 (R-spondin1), which is known to be critical for early human ovary development and specification (Tomaselli et al. 2011). Our data indicate that it continues to be expressed until the stages right before puberty (fig. 2).
Changes in gene regulation could reflect differences in alternative splicing and exon usage between juveniles and adults in addition to changes in transcript abundance (e.g., Barberan-Soler and Zahler 2008; Revil et al. 2010). To investigate this possibility, we looked for differential exon expression (FDR-adjusted P value < 0.05)—a proxy for alternative splicing in a transcriptome without alternative splicing gene models—in genes with more than one exon. Specifically, we identified cases in which at least one exon, but not all exons, were differentially expressed (supplementary table S4, Supplementary Material online). To avoid false positives due to limited power, if one exon was differentially expressed, we relaxed the FDR-adjusted P value for differential expression to 0.15. Thus, evidence for exon-specific differential expression required relatively strong evidence for differential expression in at least one exon and a relative absence of evidence for differential expression in at least one other exon. Twenty-four genes exhibited this pattern, including STC (stanniocalcin) and GCLC (gamma-glutamylcysteine synthetase, catalytic subunit), both of which are thought to be important in ovarian development and function (Paciga et al. 2002; Luderer et al. 2003; Luo et al. 2004; Hoang et al. 2009).
Many of the genes expressed in juvenile and adult baboon ovaries are also likely to be expressed in juvenile and adult ovaries of other primates, including humans. Thus, genes that we identified as differentially expressed across life history stages in baboons might be informative for identifying genes important in female life history evolution in humans or in primates more generally. To gain insight into the patterns of natural selection that may have acted on such genes, we therefore integrated the novel functional data from this study with evidence for selection in primates from previous studies.
We obtained estimates of positive selection on the lineage leading to humans for protein-coding regions from Nielsen et al. (2005) and for putative regulatory regions 5 kb upstream of genes from Haygood et al. (2007). Both studies took a similar approach to identify selective targets: Specifically, they compared the rate of nucleotide evolution in the focal region (protein-coding regions in Nielsen et al. 2005 and upstream regulatory regions in Haygood et al. 2007) with the rate of nucleotide evolution in a nearby region thought to be evolving neutrally (the general approach is reviewed in Yang and Bielawski 2000). An elevated rate of nucleotide evolution in the focal region relative to the nearby neutral region was interpreted as a signature of adaptive change. Likelihood ratio tests were then used to identify cases in which these rates differed across different branches of a species tree; we identified possible targets of lineage-specific selection by locating elevated rates of evolution in protein-coding or regulatory regions that occurred only on specific branches of the tree.
Combining our data with results from these studies (Nielsen et al. 2005; Haygood et al. 2007), we identified 225 genes that were both included in Haygood et al. (2007) and were differentially expressed in this study (P < 0.05 for differential expression; we relaxed this threshold to increase the sensitivity of this analysis). Of these 225 genes, we found 19 differentially expressed genes that were associated with signatures of selection in noncoding regions on the human lineage (P < 0.05 for the test for selection). In contrast, we found that none of our differentially expressed genes overlapped with signatures of positive selection in coding regions (of a total of 35 genes that were differentially expressed in this study and were included in Nielsen et al. (2005)). We did not observe a significant enrichment of ovarian differentially expressed genes among genes with a history of positive selection on the human branch. However, the target of selection in genes that were both differentially expressed between reproductively mature and immature ovarian tissue and also exhibited evidence for selection in the lineage leading to humans, was much more likely to have been a gene regulatory region than a coding region (Fisher's exact test, P = 2.367 × 10−08). If historical selection pressures on these loci were related to female maturation, changes in gene regulation may therefore have played an important role in the evolution of these traits in humans.
Genes expressed in reproductive tissue tend to be rapidly evolving exhibiting signatures of selection in multiple lineages (reviewed in Swanson and Vacquier 2002). We therefore examined whether differentially expressed genes were likely to be members of this rapidly evolving class or if they were specific to selection on the human branch. We asked whether noncoding regions that appear to have been positively selected on the chimpanzee (Pan troglodytes) lineage (Haygood et al. 2007) were similarly enriched for differential expression. We observed a similar number of differentially expressed genes by life history stage that correspond to positively selected regulatory regions in chimpanzees (21 in chimpanzees vs. the 19 seen in humans). Interestingly, 10 of these regions are shared between the two species, significantly more than expected by chance (hypergeometric test, P = 7.595 × 10−18; table 3). These results suggest that positive selection on the specific aspects of ovarian maturation controlled by these genes may be a general characteristic of primate evolution. Indeed, genes involved in reproductive and immune pathways that evolved under selection in humans are often also under selection in other primates (reviewed in Vallender and Lahn 2004) and in mammals more generally (Kosiol et al. 2008). Our data suggest that this shared pattern of positive selection may apply to regulatory regions of reproductively important genes as well.
The timing of female sexual maturity is one of many life-history traits that have shifted during primate and human evolution, probably in response to selection. Our results suggest that there has been repeated selection on the cis-regulatory regions of some sexual maturity-related genes in multiple primate lineages. These loci are therefore of special interest in relationship to phenotypic evolution during reproductive maturation. Thus, examining the overlap of signatures of selection and differential gene expression from samples obtained from natural populations may serve as a useful filter for identifying loci of particular evolutionary or phenotypic interest. Although such opportunities will be uncommon, they promise to enrich our ability to interpret the phenotypic relevance of sequence-based signatures of selection.
Samples used in this study were obtained from seven healthy females from the Amboseli baboon population in Kenya (supplementary fig. S1 and table S1, Supplementary Material online) and retrieved within 5–8 h of death. Tissue was stored in RNAlater (Ambion, Austin, TX) and transported to −20 °C storage in Nairobi within 24 h. Upon transport to the United States, samples were stored at −80 °C. To minimize the effects of cell type heterogeneity in the ovary, we sampled from the lateral ovarian cortex.
Four micrograms of total RNA were isolated for each sample using an RNeasy kit (Qiagen, Valencia, CA) (supplementary table S1, Supplementary Material online) and used as input for the mRNA-Seq 8-Sample Prep Kit (Illumina, San Diego, CA). Libraries were sequenced on an Illumina GAIIx (one lane per sample) at the Yale University Keck Sequencing Core Facility. Approximately 15 million of 75 bp sequences resulted from each lane of sequencing.
The current publicly available baboon genome assembly (Pham_1.0, 20 November 2008) contains 387,373 linear scaffolds with approximately 5.3× coverage of the genome but has not yet been assembled into chromosomes (http://www.hgsc.bcm.tmc.edu/project-species-p-Papio%20hamadryas.hgsc). We mapped the RNA-Seq reads to the subset of these scaffolds (134,448 scaffolds with mean length of 20,246 bp) that mapped unambiguously to the macaque genome (Mmul_051212, rhemac2) using lastz (Harris 2007). Overall, the subset covered 94.9% of the current rhesus macaque assembly. Gene models were obtained by mapping human RefSeq exons to the baboon genome with lastz in Galaxy (Taylor et al. 2007) with a 90% similarity cutoff based on previous estimates of human–baboon sequence conservation (Silva and Kondrashov 2002).
The RNA-Seq reads were mapped to the baboon scaffolds using “bowtie” (Langmead et al. 2009). Reads were defined as being within exon models using HTSeq (http://www.huber.embl.de/users/anders/HTSeq/doc/overview.html). Gene counts are the sum of the exon expression counts. The overall distributions of read counts were similar across all individuals and, more importantly, were not different between juveniles and adults, our primary axis of comparison (supplementary fig. S2, Supplementary Material online). Both exon counts and gene counts were normalized using the edgeR package (Robinson et al. 2010) in R (R Development Core Team 2008).
To evaluate the effect of maturation stage on specific genes, we used a generalized linear model with a negative binomial error structure to model variation in gene expression for each gene. Gene expression counts represented the response variable, and life history stage was modeled as a binary explanatory variable (juvenile or adult). We eliminated seven genes from this analysis that exhibited a significant relationship between gene expression and admixture-related genetic background as well as a relationship with life history stage (admixture between P. cynocephalus and a sister taxon, P. anubis, has previously been documented in this population and presented a possible confounder: Alberts and Altmann 2001; Tung et al. 2008). FDR corrections for multiple comparisons were performed using the Benjamini–Hochberg method (Benjamini and Hochberg 1995) at an FDR of 5% (fig. 1).
To determine functional categorical enrichment for the differentially expressed genes, we employed the PANTHER (HMM Library Version 6.0) (Mi et al. 2005) and GO (Gene Ontology Consortium 2000) databases and computed enrichment scores using Wilcoxon rank tests. Our background set of genes was composed only of genes measured in this study.
Raphael Mututua, Serah Sayialel, Moonyoi ole Parsetau, Lenkai ole Rikoyan, and Kinyua Warutere contributed skillfully and in difficult circumstances to collecting the samples and the long-term data used in this analysis. We thank the Office of the President, Republic of Kenya, the Kenya Wildlife Service, its Amboseli staff and Wardens, the National Museums of Kenya, and the members of the Amboseli-Longido pastoralist communities for permission to work in Amboseli. We thank Jeanne Altmann for providing access to long-term data and contributing samples, Baylor College of Medicine (Human Genome Sequencing Center) for the preliminary draft assembly of the baboon genome, Ralph Wiedmann and Olivier Fedrigo for aid in data analysis, Adam Pfefferle for assistance in data generation, and David Garfield for comments on the manuscript. This work was supported by the Duke Primate Genomics Initiative; National Science Foundation (NSF) BCS-0323553, NSF DEB 0919200, National Institute on Aging (NIA) R01AG034513-01, and NIA P01-AG031719 to S.C.A.; NSF-BCS-08-27552 (HOMINID) to G.A.W.; and NSF DEB-0846286 to S.C.A. and G.A.W.