Using a systematic genome-scale approach to inferring lineage-specific selection
acting on cis-regulation, we found that over 100 genes belonging to
several gene sets have undergone lineage-specific selection in mouse, which may have
impacted diverse morphological and behavioral phenotypes. This work reports the
first cases of adaptive cis-regulatory evolution in M. musculus, and expands the
classes of traits (in any species) known to be affected by gene expression
adaptation, which previously did not include any behavioral phenotypes.
Methodologically, we augment previous work [19] by showing
that adding information from an outgroup can suggest the likely action of positive
selection (as opposed to relaxed negative selection) when that selection was for
cis-acting upregulation. Two interesting questions for future
work are how much of this selection occurred since the introduction of these strains
to the lab, and, for selection that occurred on the wild B6 ancestors, how much
occurred in Mus musculus domesticus (the primary ancestor of B6 [26]) as opposed to
Mus musculus musculus. Interestingly, wild M. m. domesticus tend to be larger than
wild M. m. castaneus when reared in a common laboratory environment (C. Pfeifle,
personal communication), suggesting that this adaptation was likely to have occurred
in the wild. Another question raised by these findings is which “units of selection”
[44] are relevant for these polygenic adaptations; regardless of the answer, our
conclusions regarding the extent of selection on cis-regulation will not be affected.
Because the RNA-seq version of this approach can be applied rapidly and inexpensively
to hybrids between any two diverged lineages (including outbred lineages), we expect
it will find use in a wide range of taxa. In fact, it can be applied to any ASE data
from a hybrid between diverged lineages. Published ASE data sets from a variety of
species (e.g. [45], [46]) can now be similarly re-analyzed for cis-regulatory
selection. This
approach can also be applied to any of the numerous published eQTL data sets
involving crosses between diverged parental lines.
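As a rough illustration of how a published ASE data set might be re-analyzed along these lines, the sketch below asks whether the genes in one gene set are biased toward higher expression from one parental allele more often than expected from the genome-wide proportion. It is a minimal sketch and not the procedure used in this work: the function names, the toy log-ratios, the gene identifiers, and the choice of a one-sided exact binomial test against the genome-wide background are all assumptions made for illustration.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def directional_bias_test(ase_log_ratios, gene_set):
    """Test a gene set for an excess of genes favoring allele A.

    ase_log_ratios: dict mapping gene -> log2(allele A / allele B) in the
    hybrid (hypothetical input format). Genes with a ratio of exactly 0
    carry no directional information and are ignored.
    Returns (k, n, background, p_value), where k of the n informative genes
    in the set favor allele A and background is the genome-wide proportion.
    """
    informative = {g: r for g, r in ase_log_ratios.items() if r != 0}
    # Genome-wide proportion of genes favoring allele A (assumed null rate).
    background = sum(r > 0 for r in informative.values()) / len(informative)
    in_set = [informative[g] for g in gene_set if g in informative]
    n = len(in_set)
    k = sum(r > 0 for r in in_set)
    return k, n, background, binomial_upper_tail(k, n, background)

# Toy data: eight hypothetical genes, four of which form the gene set.
ase = {"g1": 0.8, "g2": 0.5, "g3": 0.9, "g4": 0.4,
       "g5": -0.6, "g6": -0.7, "g7": -0.3, "g8": -0.2}
gene_set = ["g1", "g2", "g3", "g4"]

k, n, background, p_value = directional_bias_test(ase, gene_set)
print(k, n, background, p_value)
```

A real analysis would also need to handle measurement noise in the ASE ratios, multiple testing across gene sets, and non-independence among neighboring genes, none of which this sketch addresses.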
Our approach is quite different from all previous studies of metazoan
cis-regulatory adaptation [1]–[4], which have identified single genes with extremely
strong effects on phenotypes such as pigmentation (e.g. [21], [47], [48]) or skeletal
structure (e.g. [49]). Our results reveal several important insights that
could not have been found at this single-gene level. For example, the only
previously known case of pathway-level gene expression adaptation was from our work
on the ergosterol biosynthesis pathway in S. cerevisiae, where six genes clustered
in the pathway have undergone selection for down-regulation [18]. Our present
results extend this considerably, demonstrating that polygenic cis-regulatory
adaptation can operate in parallel on dozens of genes within a single functional
group or pathway, and that this has occurred in
multiple gene sets during recent mouse evolution. Although each gene under such
coordinate selection may be expected to have a less extreme phenotypic effect than
those previously reported [1], [2], [21], [47]–[49], the sum of their effects could
be quite strong. One important question that can now start to be addressed is how
often cis-regulatory adaptation proceeds via dramatic changes in single genes, as
opposed to more subtle changes distributed across an entire gene set [3]. Much of
the answer may ultimately depend on factors such as the strength/duration of
selection (with intense/short-term selection pressure likely favoring extreme
single-locus changes) and the genetic architecture of the trait in question.
A second open question is how often cis-regulatory adaptation occurs
by upregulation versus downregulation of genes; our results suggest that the
majority of the adaptation we discovered was due to upregulation, in contrast to
most previous (single-locus) studies, which have predominantly identified cases of
trait loss via downregulation [2]. Interestingly, we previously observed a
preponderance of upregulation in a genome-wide study of gene expression adaptation
in S. cerevisiae [18], suggesting that this pattern may be widespread. Again, which
of these is more common in a
particular species may depend on the nature of the selective pressure and the
underlying genetic architecture.
Third, it has been proposed that gene expression adaptation may be responsible for
most morphological adaptations in part because it offers a solution to the issue of
pleiotropy. For a gene expressed in many tissues or stages of development, an amino
acid change (in a constitutive exon) will affect the protein produced in all of
these different contexts. Even if this change is adaptive in one or two of them, it
has been argued that it would be highly unlikely to be advantageous in all of them
[1]. In contrast, the modular nature of cis-regulation allows for a change
in expression in just one tissue or stage, without affecting any other; thus
pleiotropic constraints should not be as severe, and adaptation should be able to
proceed [1]. Predictions from this are that genes expressed more broadly will be
more likely to adapt via cis-regulation, and that these adaptations will only
affect a small part of the genes' expression patterns. Two recent studies
attempted to test this idea. In one of these [50], genes near noncoding elements
with accelerated evolution in the human lineage were proposed to have undergone
human-specific selection on cis-regulation (though the authors
acknowledged that such acceleration need not indicate positive selection); however,
no enrichment was found for these genes to be expressed in more tissues than
average. In the other [51], genes were classified as either
“morphogenes” or “physiogenes” based on their mouse knockout
phenotypes; morphogenes (which tend to be expressed in fewer tissues) had higher
dN/dS (an indicator of selection on protein-coding regions), while physiogenes had a
higher magnitude of expression change between human and mouse, consistent with the
prediction of greater adaptive expression change in broadly expressed genes. However,
this study did not distinguish between adaptive versus non-adaptive change,
cis versus trans regulation, or tissue-specific versus non-specific expression
changes, so its relevance to theories of tissue-specific adaptive cis-regulatory
evolution is not clear. Our results suggest that although most of the genes in our most
significant gene sets are broadly expressed (not shown), their expression in all
three tissues was affected by the recent selection on
cis-regulation we detected (Table S2; all gene sets from were significant in all
three tissues, except for the JAK/STAT pathway); thus these adaptations were not
tissue-specific, and so do not support pleiotropy-based arguments for the expected
prevalence of tissue-specific gene expression adaptation (we note that while the
adaptations did not result in tissue-specific expression changes, the selection may
have acted to change expression in just one tissue, with the rest changing as a
side-effect). Of course, since we have only examined three tissues in two mouse
strains, much more work is required to determine how general this conclusion is.
Finally, because of its genome-scale perspective, our approach may eventually help to
address many other fundamental questions that cannot be addressed by single-locus
studies [3], such as what fraction of gene expression divergence is adaptive, and
what fraction of evolutionary adaptation occurs at the level of cis-regulation.