Here, we have used yeast gene expression as a model system to describe gene–environment interaction at strain, locus, and gene levels. We showed that 2,037 transcripts were jointly dependent on strain and condition in two parental strains. Then, we performed linkage analysis on the difference in transcript levels across conditions with 109 segregants, and identified 1,555 gxeQTL. The high number of gxeQTL that we detected has allowed us to make some general observations. We have shown that local and distant linkages differ dramatically in how they act across multiple conditions. Local linkages appear to be more stable: they are less likely to be dependent on the environment, and even when they are, they are more likely to have an effect in both conditions, with the direction of effect often being the same. Distant linkages, on the other hand, are more volatile: they are more likely to be dependent on condition and to show an effect in only one condition. Entire distant peaks can change across conditions, and when they do have an effect in multiple conditions, distant loci are more likely to act in different directions. Finally, we characterized the gene responsible for influencing the largest gene–environment interaction distant peak, IRA2. We showed that the RM allele of IRA2 is a stronger inhibitor of Ras/PKA signaling than the BY allele in the conditions that we tested, and that this locus has experienced a change in selective pressure in RM.
Previous studies have reported that local linkages are more consistent than distant linkages across conditions and experiments, including worms in different temperatures [6], different tissues in mice [15], and in the reproducibility of transcript linkages in human studies [10], indicating that this pattern is likely to extend beyond yeast. Since local and distant linkages are likely to differ in how they influence traits on a molecular level, we can speculate as to how they show differences in condition dependence. Although we do not know most of the causative polymorphisms involved for each type of linkage, local linkages show increased rates of polymorphism in 5′ and 3′ noncoding regions and high rates of allele-specific expression [30]. Thus, we feel comfortable treating the two groups as distinct entities: local linkages are likely to be enriched for variants that directly influence transcript levels via changes in cis-regulatory sites, whereas distant linkages typically influence levels via a protein intermediate (trans factors). In a cis-regulatory site that interacts with a binding protein to directly increase or decrease transcript levels, mutations can either disrupt or enhance binding, but are unlikely to be able to do both in a condition-specific manner. If the binding protein is an activator, a loss of binding mutation will result in lower transcript levels in all conditions where the activator is present, and will show no change when the activator is absent, resulting in either a condition-specific pattern or in a pattern in which the locus has the same effect or the same direction of effect in both conditions (Figure S5A). Other than the case where a single polymorphism destroys a site for one binding factor while creating a site for another, it is less clear how one cis-regulatory mutation could be associated with effects in opposite directions, although one might imagine more complicated scenarios with either multiple linked cis-regulatory polymorphisms or transcription factors that are able to act as both activators and inhibitors. On the other hand, distant variants that influence protein intermediates have the potential to interact with many proteins, depending on the milieu present in the cell in a given condition (Figure S5B). A single variant may be able to activate transcription in one condition and repress it in another, resulting in a change in direction of effect.
The frequent occurrence of locus effects in opposite directions in the two conditions is surprising. One possible explanation is multiple linked polymorphisms. Distant loci are much larger targets for variation than cis-regulatory regions [45], but this could occur in both contexts. Multiple compensatory mutations could accumulate at loci and, depending on the condition, could compensate differentially. The mean phenotype would be stable over conditions, yet the direction of the effect within a condition could vary (Figure S5C). One example is suggested by the gxe11 peak on chromosome 14, where multiple polymorphisms that influence sporulation [46] and high-temperature growth [48] have been characterized, and some act in the opposite direction of the overall locus effect. At least one of these alleles (MKT1 D30G) is at least partially responsible for the gxeQTL in this region (Figure S6). Further characterization of the polymorphisms involved at these loci should help elucidate the underlying mechanisms behind this phenomenon.
A practical implication of the observation that distant loci often act in opposite directions in the two conditions is that such loci may be inherently difficult to detect in experiments where condition is not controlled, as when different tissues of multicellular organisms are mixed or when nonexperimental organisms (including humans) that experience different unmeasured environments are studied. When conditions are not controlled, effects in opposite directions will cancel each other out, resulting in no overall association between the trait and the locus. This also has implications for selection, as these loci may be hidden from selection in organisms that experience fluctuating environmental conditions. The detection of loci that act in opposite directions is additionally complicated by our observation that the majority of these loci did not reach genome-wide significance in either condition alone, emphasizing the importance of using methods that test directly for gene–environment interaction without prior reliance on linkage in a single environment.
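The cancellation argument above can be illustrated with a small simulation. The sketch below is not the analysis used in this study; it is a toy model with assumed values (a single locus, an effect size of 1.0, Gaussian noise with standard deviation 0.5) and hypothetical function names (`simulate`, `allele_effect`, `summarize`). It assigns each segregant a BY or RM allele, gives the locus effects of equal size but opposite sign in glucose and ethanol, and then compares the RM minus BY difference in means for four traits: each condition alone, a "mixed" trait in which each segregant is observed in one unrecorded condition, and the between-condition difference in transcript level.

```python
import random
import statistics


def simulate(n_segregants=109, effect=1.0, noise_sd=0.5, seed=0):
    """Simulate one transcript whose locus effect flips sign between conditions.

    All parameter values are illustrative assumptions, not estimates from data.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_segregants):
        genotype = rng.choice(["BY", "RM"])
        sign = 1.0 if genotype == "RM" else -1.0
        # RM raises the transcript in glucose and lowers it in ethanol.
        glucose = sign * effect / 2 + rng.gauss(0.0, noise_sd)
        ethanol = -sign * effect / 2 + rng.gauss(0.0, noise_sd)
        # Uncontrolled design: the segregant is seen in one unrecorded condition.
        mixed = rng.choice([glucose, ethanol])
        rows.append((genotype, glucose, ethanol, mixed))
    return rows


def allele_effect(rows, trait):
    """Difference in mean trait value between RM and BY segregants."""
    rm = [trait(r) for r in rows if r[0] == "RM"]
    by = [trait(r) for r in rows if r[0] == "BY"]
    return statistics.mean(rm) - statistics.mean(by)


def summarize(rows):
    """Allele effect for each condition, the mixed trait, and the difference trait."""
    return {
        "glucose": allele_effect(rows, lambda r: r[1]),
        "ethanol": allele_effect(rows, lambda r: r[2]),
        "mixed": allele_effect(rows, lambda r: r[3]),
        "difference": allele_effect(rows, lambda r: r[1] - r[2]),
    }


if __name__ == "__main__":
    for name, value in summarize(simulate()).items():
        print(f"{name:>10}: {value:+.2f}")
```

Under these assumptions, the allele effect is positive in glucose and negative in ethanol, close to zero for the mixed trait, and largest for the difference trait. This is the reason for mapping the between-condition difference directly rather than relying on linkage in either condition alone.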
One general implication of our results for studies in other species, including humans, is that many genetic effects on most traits are likely to be detected without testing for gene–environment interactions, provided that the relevant environmental factors are known and controlled either experimentally or statistically. However, analyses that ignore gene–environment interactions introduce strong biases with regard to the types of loci that are detected. Moreover, gene–environment interactions play a dominant role for a minority of traits. We have studied the prevalence and importance of gene–environment interactions in a single-cell organism grown in two very different and precisely controlled environmental conditions. Our focus on transcript levels as quantitative traits allowed us to study a very large number of traits simultaneously and to delineate general patterns, as well as to provide detailed molecular examples of loci that show gene–environment interactions. The quantitative details would undoubtedly differ if different species, environments, and phenotypes were studied. It is possible that many environmental differences experienced by humans may be less drastic than those between growth on glucose and ethanol are for yeast. However, some environmental differences (for example, exposure to pathogens or shift from traditional to modern diets) can have a dramatic effect on health. Our detailed studies in a model organism provide examples of the types of effects that may be expected in humans, and thereby inform practical study design. Understanding the subset of human genetic variants whose phenotypic effects are modifiable by the environment will be key in making full use of personal genomic variation.