The evolution of cis regulatory elements (enhancers) of developmentally regulated genes plays a large role in the evolution of animal morphology. However, the mutational path of enhancer evolution—the number, origin, effect, and order of mutations that alter enhancer function—has not been elucidated. Here, we localized a suite of substitutions in a modular enhancer of the ebony locus responsible for adaptive melanism in a Ugandan Drosophila population. We show that at least five mutations with varied effects arose recently from a combination of standing variation and new mutations and combined to create an allele of large phenotypic effect. We underscore how enhancers are distinct macromolecular entities, subject to fundamentally different, and generally more relaxed, functional constraints relative to protein sequences.
Three major challenges for understanding the genetic and molecular bases of morphological evolution are to identify loci underlying trait divergence, to pinpoint functional changes within these loci, and to trace the origin of functional variation in populations. The evolution of animal morphological diversity is generally associated with changes in the spatial expression of genes that govern development (1, 2). The divergence of particular morphological traits has been linked to changes in specific enhancers of individual loci (3–9). Mutations in individual, modular enhancers are thought to circumvent the potentially pleiotropic effects of mutations in coding sequences of genes that participate in many developmental processes (10–12).
Nonetheless, there is relatively little detailed knowledge of how enhancer sequences evolve, that is, of the genetic path of enhancer evolution. In most instances, functional mutations have not been identified, so their individual effects and origins have not been traced. In contrast, the evolutionary paths of several proteins have been traced, revealing that many trajectories, including reversals, are not allowed because of structural constraints (13–15). To decipher the mode and tempo of regulatory sequence evolution, we must determine the following: How many mutations are involved in enhancer divergence? What effects do individual mutations have? And what is the relative contribution of standing variation and new mutations to enhancer evolution?
To identify enhancers that have recently evolved, we have traced the recent evolution of adaptive pigmentation within African populations of Drosophila melanogaster. We elucidate a specific set of regulatory mutations that underlie changes in gene expression and pigmentation and reconstruct the path of enhancer evolution.
Across Africa, a strong correlation exists between elevation and the degree of abdominal pigmentation in D. melanogaster populations (16). This correlation is not explained by population structure, indicating that dark pigmentation is a derived adaptation to high altitude or to a correlated selective pressure. A previous study of a dark population from Uganda (16) uncovered a partial selective sweep at the ebony locus, where the darkest third chromosome lines (Fig. 1) share a 14-kilobase haplotype block of nearly identical sequence extending over the noncoding region of the ebony locus (fig. S1). ebony encodes a pleiotropic, multifunctional enzyme in the biogenic amine synthesis pathway (17) that functions in a variety of processes. In the adult cuticle, expression of ebony is required in regions that will generate a yellow shade (18), and its absence causes a dark, melanic cuticular phenotype.
The partial sweep at the ebony locus and its association with dark pigmentation is evidence that genetic variation at ebony contributes to the melanic phenotype (16). To test this association directly, we undertook a series of transgenic complementation experiments with use of ebony transgenes from light (U62) and dark (U76) extraction lines. In an ebony null mutant background, we found that the pigmentation phenotypes of animals bearing the light (U62) and dark (U76) transgenes differed by about 10 pigmentation units (figs. S2A and S3). This is similar to the magnitude of pigmentation difference between the U76 and U62 extraction lines (fig. S2A). Furthermore, in the genetic background of the dark U76 line, we found that a single copy of the light (U62) transgene was sufficient to fully complement the melanic abdominal phenotype (fig. S2B). These results suggest that variation at ebony can account for much of the phenotypic variation between extraction lines.
In addition, on the basis of the identification below of haplotypes containing causative mutations, we used a standard analysis of variance approach to estimate the contribution of these haplotypes to phenotypic differences. We found that variation at ebony accounts for up to 83% of the total phenotypic variation [supporting online material (SOM) text]. These results confirm that ebony is the major locus responsible for the dark phenotype of the Ugandan extraction lines.
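The variance partitioning described above can be illustrated with a minimal sketch. It assumes a one-way design in which lines are grouped by ebony haplotype and the share of variance explained is the between-group sum of squares divided by the total sum of squares. The function name, the group labels, and the pigmentation scores are all hypothetical and chosen only for illustration; the actual analysis is described in the SOM text and may differ.

```python
def variance_explained(groups):
    """Fraction of total variance attributable to group membership.

    groups: dict mapping a group label to a list of phenotype scores.
    Returns SS_between / SS_total (eta squared) for a one-way layout.
    """
    values = [v for scores in groups.values() for v in scores]
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    ss_between = sum(
        len(scores) * (sum(scores) / len(scores) - grand_mean) ** 2
        for scores in groups.values()
    )
    # With no variation at all, report zero rather than divide by zero.
    return ss_between / ss_total if ss_total else 0.0


# Hypothetical pigmentation scores (arbitrary units), NOT data from the study.
hypothetical_scores = {
    "dark_haplotype": [38.0, 40.0, 42.0],
    "light_haplotype": [28.0, 30.0, 32.0],
}

fraction = variance_explained(hypothetical_scores)
print(f"Fraction of variance explained by haplotype: {fraction:.2f}")
```

A value near 1 would mean that haplotype identity accounts for nearly all of the phenotypic variation among the lines, whereas a value near 0 would mean it accounts for almost none.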
The association between variation at ebony and melanic pigmentation could be due to divergence in the regulation of ebony expression and/or protein function. However, among the light and dark transgenes tested, the dark allele contained no derived coding differences relative to the species consensus (fig. S4), and only one derived difference existed in the light U62 line (P46T), suggesting that causative changes lie outside the coding region. To test whether a transcriptional regulatory difference may be responsible for the dark phenotype, ebony mRNA expression in newly eclosed adults was visualized by in situ hybridization (Fig. 2). There was a marked reduction (58 to 83%) in ebony mRNA expression in darker lines (Fig. 2 and table S1). The association of the dark phenotype and haplotype with decreased ebony mRNA suggests that cis regulatory sequence mutations have accumulated that reduce ebony expression.
To localize regions of the ebony locus responsible for the dark phenotype, we tested the activity of chimeric ebony transgenes in which the upstream regulatory region of each allele was fused to the downstream first exon and coding region of the other allele. The light/dark construct performed nearly as well as the light construct in complementing the abdominal phenotype of an ebony null mutant (Fig. 3, C and G), whereas the reciprocal dark/light transgene yielded a phenotype similar to that of the complete dark allele construct (Fig. 3, D and G). The phenotypes of the chimeric transgenes indicate that the functional differences between the light and dark alleles largely reside in the 5′ noncoding region of the locus, presumably within enhancers.
To identify enhancers within the ebony regulatory region (figs. S5 and S6), we fused fragments of noncoding DNA to a green fluorescent protein (GFP) reporter gene and monitored reporter expression in adult tissues. We identified an array of modular enhancers with activities in many tissues that exhibit ebony mutant phenotypes or that express the gene (Fig. 4A and fig. S5). One enhancer active in the developing abdomen (and thorax) was localized to a 0.7-kb fragment located 3.6 kb upstream of the ebony promoter (“abd” in Fig. 4A). The abdominal element drove reporter expression in a broader domain than that of the native ebony expression pattern, including the posterior regions of each tergite (fig. S6H) and the male A5 and A6 segments (fig. S6, E and F). However, the extension of reporter constructs to include promoter-proximal and intronic sequences resulted in a precise recapitulation of the endogenous ebony expression pattern (fig. S6, N and T).
Regulatory mutations are suggested to minimize pleiotropic effects relative to coding mutations because of the modular organization of cis regulatory regions (10–12). However, the modularity of enhancers has not yet been tested with naturally occurring mutations in a comprehensively defined regulatory region. To examine whether regulatory mutations in one module impact the function of adjacent modules, we generated GFP reporter constructs bearing the upstream region fused to the first introns from both light (U53) and dark (U76) alleles and measured reporter protein activity in various tissues (Fig. 4 and fig. S7). In the developing head (Fig. 4, B and C), legs (Fig. 4, D and E), larval brain (Fig. 4, F and G), wing (Fig. 4, H and I), and haltere (Fig. 4, J and K), both light and dark regulatory regions displayed similar amounts of activity (fig. S7V). However, in the abdomen, we observed a pronounced 83% reduction in activity of the dark allele regulatory region relative to that of the light allele (17 ± 3%) (Fig. 4, L and M, and fig. S7V). This decrease is very similar in magnitude to the reduction in ebony mRNA expression in the dark lines (Fig. 2). Thus, mutations in the dark line regulatory region affect gene expression with a high degree of spatial specificity and provide direct evidence that the modular architecture of cis regulatory regions minimizes the pleiotropic effects of functional mutations.
To identify the position, number, kind, and size of effects of functionally relevant mutations within the ebony abdominal enhancer, we compared dark U76 and light U62 alleles because these represent the two extremes of ebony expression. Between the U76 and the U62 alleles, there are ~120 nucleotide differences scattered over the 2.4-kb abdominal enhancer [44 point mutations and 76 base pairs (bp) differing because of 10 insertions or deletions (indels)] that could potentially contribute to the observed difference in activity. To localize functional differences, we first created chimeric reporter constructs with groups of mutations and then narrowed these to individual changes that contribute to the activity of chimeric constructs. Our analysis below suggests that a minimum of five mutations differentiate the activities of dark and light lines, two of which are specific to the dark haplotype.
We focused on a 2.4-kb region that contained the 0.7-kb core abdominal element (“abd” in Fig. 4A) and recapitulated the difference in RNA expression between dark (U76) and light (U62) lines, such that the dark allele construct expressed 22% of the reporter activity of the light allele construct (fig. S8, B and C). We subdivided the 2.4-kb region into three subregions (X, Y, and Z, Fig. 5A) and systematically substituted individual fragments from the light allele into the dark allele construct. Of the three subregions tested, the Z fragment showed the strongest effect (fig. S8D), increasing reporter activity from 22% to 67% of the activity of the light allele. Moreover, in the reciprocal construct, swapping the dark allele Z fragment into the light allele was sufficient to decrease activity from 100% to 46% (fig. S8E).
Several previously identified candidate mutations were only observed on the dark haplotype, the majority of which (five out of eight) map to the 2.4-kb regulatory region (fig. S8A, red bars labeled “Dark Specific Substitutions”). Replacement of all five substitutions in the dark allele construct with the nucleotides present in the light allele increased reporter activity to 70% of that of the light allele construct, demonstrating that they include functionally important mutations (fig. S8F). The Z fragment contains four of the five dark-specific mutations within the 2.4-kb element (fig. S8A), so we reverted the individual dark-specific substitutions of the dark allele construct. Dark-specific substitutions 2 and 3 showed no effect on the level of reporter expression (table S2), whereas substitution 4 showed a small effect on expression, raising activity from 22% to 35% (fig. S8G). Substitution 5, however, caused a dramatic increase to 64% of the light allele activity (fig. S8H). Therefore, at least two novel substitutions in the Z fragment have contributed to the divergence of the dark and light haplotypes, with substitution 5 providing the largest effect.
To account for the remaining ~50% difference in activity, we turned to the X and Y fragments. No contribution of the X fragment (which contained dark-specific substitution 1) was observed when the light allele X fragment was swapped into the dark allele construct (fig. S8I). However, the chimeric construct bearing the Y fragment from the light line increased activity from 22% to 47% (fig. S8J), demonstrating that one or more functional mutations exist in the Y fragment. The Y fragment encompasses the core abdominal activity, the smallest span of DNA sufficient to drive strong reporter expression in the abdomen (figs. S6 and S8A). Comparison of the isolated Y fragment activities of the light and dark alleles also revealed much weaker activity of the dark allele Y fragment construct (25% relative to light) (fig. S9, B and C).
To pinpoint causative mutations within the Y fragment, we assayed a series of Y fragment GFP reporter constructs. Twenty-five point mutations and four indels (encompassing 42 bp) exist between the Y fragment sequences of the light U62 and dark U76 alleles. The major contribution to expression differences (81%) mapped to the 5′ half of the Y fragment (fig. S9, D and E), which allowed us to narrow the 67 candidate nucleotides down to the eight point mutations that differ in this region of the Y fragment (Fig. 5B). Of these eight candidates, three were eliminated because they were found in other strongly expressing Y fragments. We reverted each of the five remaining candidate substitutions individually from the dark allele to that present in the light allele. Mutations at positions 27 and 32 increased dark allele Y fragment activity from a 25% baseline to 54% and 50%, respectively, of the light allele Y fragment activity (fig. S9, F and G). The third substitution at position 137 had a more dramatic effect on the Y fragment activity, raising expression to 80% of light Y fragment expression (fig. S9H) (note that the sum of effects exceeds 100%, so individual effects are not strictly additive). These results suggest that at least three substitutions within the Y fragment contributed to the overall reduction of abdominal enhancer activity.
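The non-additivity noted in the paragraph above can be checked with simple arithmetic. The sketch below uses only the activities quoted in the text (a 25% baseline and single-site reversions reaching 54%, 50%, and 80% of light Y fragment activity); the variable and function names are illustrative, and treating each effect as "reverted activity minus baseline" is an assumption about how the effects are summed.

```python
# Y fragment reporter activities, in percent of the light-allele Y fragment,
# as quoted in the text.
DARK_BASELINE = 25.0
LIGHT_ACTIVITY = 100.0
reverted_activity = {27: 54.0, 32: 50.0, 137: 80.0}  # keyed by position


def reversion_gains(baseline, activities):
    """Gain in activity (percentage points) for each single-site reversion."""
    return {pos: act - baseline for pos, act in activities.items()}


gains = reversion_gains(DARK_BASELINE, reverted_activity)
summed_gain = sum(gains.values())
gap_to_light = LIGHT_ACTIVITY - DARK_BASELINE

print("Gain per position:", gains)
print("Sum of individual gains:", summed_gain)
print("Gap between dark and light activity:", gap_to_light)
print("Strictly additive?", summed_gain <= gap_to_light)
```

Because the summed single-site gains are larger than the gap that separates the dark and light Y fragments, the three substitutions cannot each contribute independently; some overlap or interaction among their effects is implied.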
The five mutations that functionally differentiate the dark and the light haplotypes cause a decrease in the activity of the dark allele enhancer. The mutation with the greatest effect arose at a considerable distance (270 bp) from the core element. If this mutation were in an activator binding site, we would expect the site to lie within the core element. Alternatively, the mutation may lie in a repressor binding site. Indeed, when we deleted this site and the five adjacent nucleotides 5′ and 3′ to it, the enhancer drove a dramatic increase in reporter expression, from 22% to 106% of the light haplotype activity (fig. S8L). The greater effect of deleting these sites relative to reverting the nucleotide raised the possibility that these sites serve a function in the light allele. When we engineered the identical deletion into the light haplotype, reporter activity also increased (fig. S8M, 170% of light haplotype), indicating that this sequence is required to repress enhancer activity and that the substitution in the dark haplotype further repressed ebony expression.
Together, these data show that multiple mutations (at least five), with varying effects (accounting for 8% to 40% of the overall difference in activity) and representing different kinds of functional change (reduced activation strength, increased repression), underlie the evolution of the ebony abdominal enhancer.
The observation that dark-specific substitutions accounted for a subset of the causative mutations raised the possibility that the path of ebony enhancer evolution involved both new mutations (the shared dark-specific substitutions) and standing variation (Fig. 5B). To assess the potential origins of the five substitutions, we examined the enhancer sequences of lines obtained from various regions in Africa (Fig. 5C).
The three causative substitutions located in the Y fragment occurred at high frequency in both light lines of the Ugandan population and in the light Kenyan population sample (Fig. 5B) and are also found in very distant African populations (Fig. 5C). Two of the substitutions were observed in all five populations sampled. The third Y fragment substitution was found in four of the five populations. These results demonstrate widespread standing genetic variation at the relevant sites in the Y fragment of the ebony abdominal enhancer.
The dark-specific substitutions were not observed in any other lines from Kenya or Uganda. Among 67 endemic fly lines from five African regions examined in our survey, the only other location where dark-specific mutations (numbers 4 and 5) occurred was in nearby Rwanda and was associated with the dark haplotype (Fig. 5C). The absence of these substitutions in isolation across the ancestral range of D. melanogaster indicates that they either arose de novo or were rare variants present in the population when the dark haplotype was selected.
The existence of both common polymorphisms and rare substitutions contributing functional changes to ebony expression raised the potential scenario that the relevant haplotype of standing variation in the Y fragment was assembled before dark-specific substitutions appeared and adaptive selection resulted in their high frequency. To test this scenario and to place these functional changes on a relative time scale, we took note of a Ugandan line (U65) that exhibited intermediate pigmentation (Fig. 1), ebony expression (Fig. 2, K and L), and abdominal enhancer activity (fig. S8K). Of all the nonmelanic lines examined, the Y fragment of U65 was most similar to the dark haplotype Y fragment, harboring all three functionally relevant Y fragment mutations (Fig. 5B) and sharing a ~1-kb region of sequence similarity with the dark line haplotype.
Within the 0.9-kb tract of polymorphisms shared between U65 and the dark strain haplotype, four mutations have arisen that differentiate the two, allowing us to estimate that they shared a common ancestor about 395,000 generations ago (SOM text). In contrast, the dark haplotype has accumulated just three substitutions across 14 kb among four lines, suggesting that these four alleles last shared a common ancestor only about 9000 generations ago. The 95% confidence intervals for these estimates do not overlap, which allows us to infer that the Y fragment haplotype existed long before the dark-specific substitutions arose (fig. S10). Hence, we have resolved two steps in the evolution of this adaptation: the assembly of functional standing variation, followed by the recent addition of beneficial dark-specific substitutions that resulted in the full decrease of ebony expression, caused pronounced abdominal melanism, and were swept to high frequency (fig. S11).
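The logic behind such age estimates can be sketched with a simple molecular-clock calculation. This is a minimal illustration and not the method of the SOM text, which is not reproduced here. It assumes a star-like genealogy in which the expected number of mutations equals lineages × sites × mutation rate × time, and it assumes a per-site mutation rate of 5.8e-9 per generation, a value chosen here for illustration only. The function and parameter names are hypothetical.

```python
def generations_to_common_ancestor(n_mutations, n_lineages, n_sites, mu_per_site):
    """Point estimate of time (in generations) to a shared ancestor.

    Assumes a star-like genealogy, so the expected mutation count is
    n_lineages * n_sites * mu_per_site * t; solving for t gives the estimate.
    """
    return n_mutations / (n_lineages * n_sites * mu_per_site)


# ASSUMPTION: illustrative per-site, per-generation mutation rate.
ASSUMED_MU = 5.8e-9

# U65 versus the dark haplotype: four mutations, two lineages, 0.9-kb tract.
t_shared_tract = generations_to_common_ancestor(4, 2, 900, ASSUMED_MU)

# Dark haplotype: three substitutions, four lines, 14-kb block.
t_dark_haplotype = generations_to_common_ancestor(3, 4, 14_000, ASSUMED_MU)

print(f"Shared Y fragment tract: about {t_shared_tract:,.0f} generations")
print(f"Dark haplotype: about {t_dark_haplotype:,.0f} generations")
```

Under these assumptions the two estimates differ by well over an order of magnitude, in line with the ordering reported in the text. The sketch gives point estimates only and does not reproduce the confidence intervals.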
We have shown that the adaptive evolution of melanism in a Ugandan population of D. melanogaster occurred through multiple, stepwise substitutions in one enhancer of the ebony locus. We suggest that this genetic path of enhancer evolution with multiple substitutions of varying effect sizes, which originate from both standing variation and new mutations and combine to create an allele of large effect, may be a general feature of enhancer evolution in populations. This view is consistent with studies that have demonstrated that substitutions at multiple sites within enhancers are responsible for evolutionary changes in gene expression (6, 7, 19–22).
The pattern of multiple substitutions in enhancers also makes sense in light of their functional organization. Enhancers contain numerous transcription factor binding sites that are broadly distributed across a few hundred base pairs or more, all of which contribute to overall transcriptional output. Variation in enhancer output can and does arise from modifications at any of a large number of sites, and functional standing variation in enhancers is abundant in populations (23, 24).
Enhancers and proteins are very distinct macromolecular entities, and it is useful to consider the potentially different constraints operating on enhancers and proteins that might affect their evolutionary trajectories. At least five constraints limit variation within proteins and restrict the path of protein evolution. The first constraint is pleiotropy. Coding mutations in pleiotropic genes will generally affect all functions, which will most likely be deleterious. The second constraint is the triplet genetic code, which cannot accommodate most indels. Third, proteins must fold properly, and most random amino acid replacements are destabilizing and deleterious (25, 26). Fourth, because of demands on protein structure, the order in which specific amino acid replacements may occur is typically constrained (13, 14), thus reducing the number of genetic paths adaptation may take. And lastly, many proteins have a single active site or one or a few binding domains, such that changes at only a very limited number of positions may directly alter properties of these sites.
In contrast, more relaxed constraints appear to operate on evolving regulatory elements. The evolution of individual, modular enhancers circumvents the pleiotropic effects of coding mutations, and our results illustrate precisely why this is the case. Obviously, there is no triplet code, so a greater range of mutational events can be accommodated. Furthermore, enhancers are not constrained by three-dimensional structure; consequently, the order in which substitutions may occur would appear to be much less constrained. Indeed, we found many combinations of functionally relevant polymorphisms in our survey of ebony haplotypes. In addition, chimeric enhancers that placed more recent mutations in an ancestral context exhibited intermediate levels of function, as would be expected if multiple alternate genetic paths are viable. And lastly, because enhancers generally contain numerous binding sites for transcription factors distributed throughout their sequence, there may be more potential sites where substitutions may modify function. Here, we identified substitutions that both decreased activation and increased repression. Thus, during their respective paths of adaptation, enhancers may present a larger mutational target for functional modification and may have a greater number of possible genetic paths open to them relative to typical protein-coding sequences.
Materials and Methods
Figs. S1 to S11
Tables S1 to S4