GAIA allows researchers to apply two different tests. The first, the "interaction only" test, considers the significance of the interaction terms on their own (over and above the main effects). The second, the "overall" test, considers the joint significance of the main (or marginal) effects and the interaction effects together (i.e. a model with the terms *a*_{1}, *d*_{1}, *a*_{2}, *d*_{2}, *i*_{aa}, *i*_{ad}, *i*_{da}, *i*_{dd} compared with a model without them). The two tests are useful in different situations. The "interaction only" test will be most useful in candidate gene studies; for example, in the schizophrenia data described here, there was evidence for statistical interaction between two biologically related genes. The "overall" test will be useful as a replacement for testing large numbers of loci for association one at a time. The "overall" test was discussed in this context by Marchini et al. [6]; they show that models with interaction terms can be more powerful than simpler models which ignore interaction. Power improvements were shown both for a brute-force approach which tested all possible interactions and for an approach which first screened loci for nominal significance [6]. In many realistic scenarios they show that the improved power outweighs the cost of the multiple-testing correction. Essentially, the increase in significance when fitting the "correct" model scales better with sample size than the magnitude of the multiple-testing correction [3]. Models similar to those described by Marchini et al. were considered recently by Millstein et al. [15]. Millstein et al. apply a slightly different set of sequential tests, in which tests are done by selectively conditioning on previous results from single-locus tests [15]. Another approach which addressed some of the same issues (but not in a human genetics context) is that of Carlborg and Andersson [12].

In GAIA, sequential testing is implemented through the screening approach described in the previous section. Consider the 600 SNP example outlined earlier. With 600 SNPs, one would expect approximately 30 SNPs to be significant on their own (at a nominal 5% significance level and assuming only a small proportion of loci actually influence disease risk). A useful screen (similar to strategy III of Marchini et al.) with the "overall" test would therefore be to compute the 600 × 30 - 465 = 17535 possible overall tests (assuming we test the 30 SNPs against all 600; the 465 subtracted tests are the 435 pairs among the 30 screened SNPs that would otherwise be counted twice, plus the 30 pairings of a SNP with itself). To maintain an appropriate type I error rate, one needs to correct for the multiple tests done. In this example, if we take the best p-value from any test done, we would (Bonferroni) correct for 17535 + 600 = 18135 tests. A detailed comparison of the different approaches described for large-scale association analysis with interaction [6, 12, 15] would be an interesting area for further study.
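
The test counts in the 600 SNP example can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of GAIA; the function name `count_overall_tests` is hypothetical. It counts unordered pairs of distinct SNPs in which at least one member passed the single-locus screen, confirms the count by brute-force enumeration, and derives a Bonferroni threshold assuming a nominal 5% level.

```python
from itertools import combinations
from math import comb


def count_overall_tests(n_snps: int, n_screened: int) -> int:
    """Unordered pairs of distinct SNPs with at least one screened member.

    Hypothetical helper for illustration; not a GAIA function.
    """
    # All pairs, minus the pairs in which neither SNP passed the screen.
    return comb(n_snps, 2) - comb(n_snps - n_screened, 2)


n_snps, n_screened = 600, 30
n_pairs = count_overall_tests(n_snps, n_screened)

# Brute-force check: label SNPs 0..29 as the screened set (arbitrary labels).
screened = set(range(n_screened))
brute = sum(
    1
    for i, j in combinations(range(n_snps), 2)
    if i in screened or j in screened
)

n_total = n_pairs + n_snps        # two-locus tests plus the single-locus tests
alpha_corrected = 0.05 / n_total  # Bonferroni-corrected significance threshold

print(n_pairs, brute, n_total, alpha_corrected)
```

Both counts give 17535 two-locus tests, so the best p-value would need to fall below roughly 2.76 × 10^-6 (0.05 / 18135) to remain significant after correction.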

Logistic regression based interaction analysis has been utilised by various authors [6, 9, 15]. Although this method can be applied using standard statistical packages, GAIA facilitates simple application of the method, with the added advantages of permutation analysis and simple screening for inclusion of SNPs in the interaction test. A non-parametric alternative to parametric analyses such as logistic regression is Multifactor Dimensionality Reduction (MDR) [16]. The MDR approach avoids specifying a particular model for the interactions and instead bases its inferences on sets of "high" and "low" risk multilocus genotypes. This approach can be powerful for certain models of interaction with little or no main effects. However, for many realistic models of interaction, the MDR approach has been shown to be less powerful than approaches based on logistic regression [15].
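
To make the two logistic regression tests concrete, the following sketch fits three nested models to simulated case-control data and forms likelihood-ratio statistics for the "interaction only" test (4 degrees of freedom) and the "overall" test (8 degrees of freedom). It is a minimal illustration under stated assumptions, not GAIA's implementation: the additive coding (-1, 0, 1), the heterozygote-indicator dominance coding, the simulated effect sizes, and all function names are assumptions made here, and asymptotic chi-square p-values stand in for the permutation analysis that GAIA offers.

```python
import math

import numpy as np


def fit_logistic(X, y, n_iter=50, tol=1e-10):
    """Fit logistic regression by Newton-Raphson; return the maximised log-likelihood."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    return float(np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))


def chi2_sf_even(x, df):
    """Chi-square upper-tail probability; closed form valid for even df only."""
    if df % 2:
        raise ValueError("closed form requires an even number of degrees of freedom")
    half = max(x, 0.0) / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))


def design(g1, g2):
    """Build null, main-effects and full design matrices from genotypes coded 0/1/2."""
    a1, a2 = g1 - 1.0, g2 - 1.0                         # assumed additive coding
    d1, d2 = (g1 == 1) * 1.0, (g2 == 1) * 1.0           # assumed dominance coding
    one = np.ones_like(a1)
    main = np.column_stack([one, a1, d1, a2, d2])
    inter = np.column_stack([a1 * a2, a1 * d2, d1 * a2, d1 * d2])  # i_aa, i_ad, i_da, i_dd
    return one[:, None], main, np.hstack([main, inter])


# Simulated data (hypothetical): two SNPs with an additive-by-additive interaction.
rng = np.random.default_rng(1)
n = 2000
g1 = rng.binomial(2, 0.5, n)
g2 = rng.binomial(2, 0.5, n)
log_odds = -0.2 + 0.5 * (g1 - 1) * (g2 - 1)
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-log_odds))).astype(float)

null, main, full = design(g1, g2)
ll_null, ll_main, ll_full = (fit_logistic(X, y) for X in (null, main, full))

lr_interaction = 2.0 * (ll_full - ll_main)  # full vs main effects only, 4 df
lr_overall = 2.0 * (ll_full - ll_null)      # full vs intercept only, 8 df
p_interaction = chi2_sf_even(lr_interaction, 4)
p_overall = chi2_sf_even(lr_overall, 8)

print(f"interaction only: LR={lr_interaction:.2f}, p={p_interaction:.3g}")
print(f"overall:          LR={lr_overall:.2f}, p={p_overall:.3g}")
```

Because the three models are nested, the "overall" statistic is never smaller than the "interaction only" statistic; whether its p-value is smaller depends on how much signal the main effects carry relative to the four extra degrees of freedom.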

GAIA does not currently accommodate data from family-based association designs. Tests analogous to the family-based Transmission Disequilibrium Test (TDT [17] and refinements) can be conducted through the use of conditional logistic regression [18], and this accommodates the linear modeling of interactions. However, such methods are most powerful when there are informative transmissions from heterozygous parents, and the use of highly polymorphic markers (with high heterozygosity) undesirably leads to large numbers of degrees of freedom in the tests for interactions described above. This, combined with the larger sample sizes usually available, means that the case-control design is likely to be most suitable for interaction analysis.

It is important to differentiate between biological epistasis (e.g. where two or more genes are involved in the same biological pathway and are jointly responsible for the end phenotype) and statistical epistasis (i.e. the deviation of the terms *i*_{aa}, *i*_{ad}, *i*_{da}, *i*_{dd} from zero in the linear model stated above). Biological epistasis occurs at the individual level, whereas statistical epistasis is necessarily based upon populations. There is no direct relationship between these two definitions of epistasis, and the existence of a number of possible parameterizations of the penetrances (parameters that define the genotype-phenotype relationship for binary traits) means that the significance of the interaction terms may be scale dependent [9]. In GAIA we utilise the log odds of the penetrance; this function is widely used in epidemiological studies and yields results comparable to those obtained from standard contingency tables when applied to single SNPs. For further discussion of the biological/statistical epistasis issue see [9, 14].
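
The scale dependence of statistical epistasis can be illustrated with a small numerical sketch. For simplicity it assumes each locus is reduced to a binary carrier status (rather than the three genotypes used in GAIA's model), and the penetrance values and function names are hypothetical. The penetrances are constructed to be exactly additive on the log-odds scale, yet the same table shows a non-zero interaction contrast on the untransformed penetrance scale.

```python
import math


def logit(p):
    """Log odds of a probability."""
    return math.log(p / (1.0 - p))


def expit(x):
    """Inverse of the logit function."""
    return 1.0 / (1.0 + math.exp(-x))


def interaction_contrast(f, transform=lambda p: p):
    """Two-locus interaction contrast on a chosen scale.

    f[i][j] is the penetrance for carrier status i at locus 1 and j at locus 2.
    """
    t = [[transform(p) for p in row] for row in f]
    return t[1][1] - t[1][0] - t[0][1] + t[0][0]


# Hypothetical penetrances, additive on the log-odds scale by construction.
mu, b1, b2 = -3.0, 1.0, 1.5
f = [[expit(mu + b1 * i + b2 * j) for j in (0, 1)] for i in (0, 1)]

on_log_odds = interaction_contrast(f, logit)  # zero up to rounding error
on_penetrance = interaction_contrast(f)       # non-zero on the penetrance scale

print(f"log-odds scale:   {on_log_odds:.3g}")
print(f"penetrance scale: {on_penetrance:.3g}")
```

A test on the log-odds scale would therefore report no statistical interaction for these penetrances, whereas a test on the penetrance scale would, even though the underlying genotype-phenotype relationship is the same.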