Complex traits, including human diseases, often involve epistatic interactions [1
], a property defined as a non-additive/non-independent effect of two loci on a quantitative trait [3
]. The detection of multi-locus linkage in general, and of epistatic interactions in particular, is very challenging because the large number of traits creates a massive multiple-testing burden and demands considerable computational resources. Exhaustive search methods for detecting interacting loci are therefore usually cumbersome [6
]. To deal with such constraints, current methods often pre-filter locus pairs and focus on a candidate set that is expected to be enriched for interacting pairs. For example, Storey et al. proposed a step-wise strategy for multiple loci linkage analysis [7
], requiring that one locus had a significant marginal effect on the underlying trait. In a recent study, Hannum et al. designed a bi-clustering procedure for the identification of interacting locus pairs and applied their approach to uncover functional links between protein complexes in yeast [8
]. Litvin et al. developed a three-stage method, which additionally utilized gene expression clustering to detect interactions in yeast [9
]. While these methods differ in the type of interactions they detect, they all rely on the step-wise identification of primary and secondary loci. However, interactions between gene products do not necessarily depend on an outstanding influence of a single factor. In fact, transcriptional regulation of most genes often involves multiple regulators that have small individual effects [10
]. Such interactions are difficult to detect with step-wise methods unless the set of progenies is extremely large, which prompts the need for approaches that do not depend on strong primary locus effects.
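To make the multiple-testing burden of an exhaustive pairwise search concrete, the following minimal sketch counts the tests and the resulting Bonferroni threshold. The function name and the locus and trait counts in the usage note are hypothetical illustrations, not values taken from any dataset discussed here, and Bonferroni correction is used only as the simplest reference point.

```python
from math import comb


def pairwise_test_burden(n_loci, n_traits, alpha=0.05):
    """Count tests in an exhaustive two-locus scan and the Bonferroni threshold.

    n_loci and n_traits are hypothetical inputs; every unordered locus pair
    is assumed to be tested once against every trait.
    """
    if n_loci < 2 or n_traits < 1:
        raise ValueError("need at least two loci and one trait")
    n_pairs = comb(n_loci, 2)           # unordered locus pairs
    n_tests = n_pairs * n_traits        # one test per pair and trait
    threshold = alpha / n_tests         # Bonferroni-corrected per-test level
    return n_pairs, n_tests, threshold
```

For example, `pairwise_test_burden(1000, 5000)` gives 499,500 locus pairs and roughly 2.5 billion tests, which illustrates why pre-filtering of candidate pairs is attractive.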
Generally, methods to detect epistatic interactions share two main characteristics: (i) To mitigate the multiple-testing problem, methods limit the number of statistical tests, for example by identifying strong primary loci [7
] or leveraging modularity [8
]. (ii) Methods employ different statistical models to detect particular types of interactions. For example, Hannum et al. [8] detected pairs of interacting loci without focusing on epistasis. Storey et al. [7] and Zhang et al
] first uncovered loci that jointly controlled a trait without demanding that their impact was non-additive/non-independent [9
]. Subsequently, epistatic locus pairs were identified by testing the significance of the interaction term.
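As an illustration of what testing the interaction term can look like, the sketch below computes the interaction contrast and its t-statistic for a single trait and locus pair. It assumes haploid progeny with genotypes coded 0/1 by parental allele and a balanced-enough design in which all four two-locus genotype classes are observed. The function name `interaction_t` is hypothetical, and this is a textbook two-way contrast rather than the specific model of any cited method. For this 2 × 2 design the squared t-statistic equals the F-statistic for the interaction term in the linear model with two main effects and their product.

```python
from math import sqrt
from statistics import mean


def interaction_t(genotype_a, genotype_b, trait):
    """Interaction contrast and t-statistic for two biallelic haploid loci.

    genotype_a, genotype_b: 0/1 parental allele per progeny (assumed coding).
    trait: quantitative trait value per progeny (e.g. one gene's expression).
    Returns (contrast, t, degrees_of_freedom).
    """
    cells = {(a, b): [] for a in (0, 1) for b in (0, 1)}
    for a, b, y in zip(genotype_a, genotype_b, trait):
        cells[(a, b)].append(y)
    if any(len(values) == 0 for values in cells.values()):
        raise ValueError("each two-locus genotype class needs a progeny")

    means = {key: mean(values) for key, values in cells.items()}
    # Deviation from additivity: zero when the two loci act independently.
    contrast = means[1, 1] - means[1, 0] - means[0, 1] + means[0, 0]

    n = sum(len(values) for values in cells.values())
    df = n - 4                          # four cell means are estimated
    if df <= 0:
        raise ValueError("need more progeny than genotype classes")
    rss = sum((y - means[key]) ** 2
              for key, values in cells.items() for y in values)
    se = sqrt(rss / df * sum(1 / len(values) for values in cells.values()))
    if se == 0:
        raise ValueError("no within-class variance; t is undefined")
    return contrast, contrast / se, df
```

The returned t-statistic can be compared against a t-distribution with the returned degrees of freedom. With few progenies the degrees of freedom are small, which is one reason the statistical power of such tests is limited.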
Complementing existing methods, we designed a novel method, SEE (Symmetric Epistasis Estimation), which allows us to detect epistatic interactions in relatively small sets of progenies without the identification of primary loci. Our approach relies on a graph-based filtering step specifically designed to retain a large fraction of symmetric candidate epistatic interactions. In contrast to previous methods, we also adopted the classical, more rigid definition of epistasis, allowing us to rigorously check the significance of candidate interactions.
We applied our approach to uncover epistatic interactions in the malaria parasite P. falciparum. In this organism, genetic crosses have yielded few recombinant clones, which limits statistical power. While the first eQTL study of P. falciparum by Gonzales et al
] suggested a potential role of epistatic regulation of gene expression [13
], a broader understanding of transcriptional regulatory mechanisms in P. falciparum remained elusive. Applying our approach to an HB3 × Dd2 parasite cross [12
], we found more than 1,500 putative epistatic interactions between locus pairs on different chromosomes and identified several epistatic interaction hotspots of biological significance. Interestingly, we found that the level of linkage disequilibrium (LD) between locus pairs was correlated with the number of genes whose expression was influenced by the corresponding epistatic interaction. Such disequilibrium provides an additional level of interaction between the loci.
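For readers unfamiliar with how LD between two loci on different chromosomes can be quantified, the following minimal sketch computes the disequilibrium coefficient D and the squared correlation r² from progeny genotypes. It assumes haploid genotypes coded 0/1 by parental allele. The choice of r² as the LD measure and the function name `ld_r2` are assumptions made for illustration and may differ from the measure used in the analysis.

```python
def ld_r2(genotype_a, genotype_b):
    """Disequilibrium coefficient D and r^2 for two biallelic haploid loci.

    genotype_a, genotype_b: 0/1 parental allele per progeny (assumed coding).
    Returns (D, r_squared).
    """
    n = len(genotype_a)
    if n == 0 or n != len(genotype_b):
        raise ValueError("genotype vectors must be non-empty and equal length")

    p_a = sum(genotype_a) / n           # frequency of allele 1 at locus A
    p_b = sum(genotype_b) / n           # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in zip(genotype_a, genotype_b)
               if a == 1 and b == 1) / n

    d = p_ab - p_a * p_b                # zero under independent assortment
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("a monomorphic locus has undefined r^2")
    return d, d * d / denom
```

Unlinked loci are expected to assort independently in a cross, so values of r² well above zero for locus pairs on different chromosomes are noteworthy, although with few progenies some deviation from zero arises by chance alone.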
We also applied our method to an eQTL dataset of S. cerevisiae
]. Surprisingly, we found far fewer epistatic interactions and no epistatic interaction hotspots. After ruling out that our results were statistical artefacts caused by the small number of P. falciparum progenies, we hypothesized that selection pressure acting on P. falciparum contributed to the observed epistatic interactions and elevated LD, potentially reflecting host-pathogen interactions or drug-induced selection.