Our goal in this study was to develop and evaluate a computationally efficient method of hypothesis testing that is comparable to permutation testing for assessing the statistical significance of MDR models. We conclude that, depending on the shape parameter, either the Gumbel EVD or the GEV distribution estimated from the MDR testing accuracies generated from 20 permutations is a reasonable alternative to 1000-fold permutation testing. Further, we have demonstrated that the EVD method and 1000-fold permutation testing generate similar results in a previously analyzed bladder cancer susceptibility study [Andrew et al., 2006]. Hypothesis testing using the Gumbel EVD or the GEV is a viable alternative to large-scale permutation testing because it preserves both the power and size of MDR. In addition, a statistical test based on 20 permutations is 50 times faster than a 1000-fold permutation test and 500 times faster than a 10,000-fold permutation test. This means that a 1000-fold permutation test that might take 50 days to run will now run in a single day.
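As a rough illustration of the idea summarized above, the following Python sketch fits a Gumbel distribution to a small set of permuted testing accuracies and reads a p-value from its upper tail. It is a minimal sketch under stated assumptions, not the procedure used in the study: the parameters are estimated by the method of moments (chosen here only because it needs no external library), the 20 null accuracies and the observed accuracy are simulated placeholders, and the function names are hypothetical. It covers only the Gumbel case and does not check the shape parameter or fit a GEV.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(null_accuracies):
    """Method-of-moments fit of a Gumbel (maximum) distribution.

    Uses mean = loc + gamma * scale and variance = (pi * scale)^2 / 6.
    This estimator is an assumption made for the sketch.
    """
    mean = statistics.fmean(null_accuracies)
    sd = statistics.stdev(null_accuracies)
    scale = sd * math.sqrt(6) / math.pi
    loc = mean - EULER_GAMMA * scale
    return loc, scale


def gumbel_p_value(observed, loc, scale):
    """Upper-tail probability P(X >= observed) under Gumbel(loc, scale)."""
    z = (observed - loc) / scale
    if z < -700:  # exp(-z) would overflow; the tail probability is 1 here
        return 1.0
    return -math.expm1(-math.exp(-z))  # 1 - exp(-exp(-z))


def permutation_speedup(n_full, n_reduced):
    """Ratio of permutation counts, e.g. 1000 / 20 = 50."""
    return n_full / n_reduced


# Hypothetical data: each permuted data set yields the best testing accuracy
# over many candidate models, simulated here as a maximum of random draws.
rng = random.Random(0)
null_accuracies = [
    max(rng.gauss(0.5, 0.02) for _ in range(50)) for _ in range(20)
]
observed_accuracy = 0.60  # placeholder for the accuracy on the real data

loc, scale = fit_gumbel_moments(null_accuracies)
p_value = gumbel_p_value(observed_accuracy, loc, scale)
print(f"loc={loc:.4f} scale={scale:.4f} p={p_value:.3g}")
print("speedup vs 1000-fold:", permutation_speedup(1000, 20))
```

Because the p-value comes from the fitted tail rather than from counting permutations, it is not limited to multiples of 1/20, which is what allows so few permutations to be used.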
The rapid growth and availability of high-dimensional datasets from genome-wide studies make it computationally expensive and impractical to routinely carry out large-scale permutation testing to assess the statistical significance of data mining methods such as MDR. To illustrate the intensity of the analysis alone, consider that the report from the International HapMap Consortium [Altshuler et al., 2005] suggests that approximately 300,000 carefully selected SNPs may suffice to represent most of the relevant genetic variation across the human Caucasian genome. If this is regarded as the lower limit for a genome-wide association study, then approximately 4.5 × 10^10 pairwise combinations (300,000 choose 2) and 4.5 × 10^15 three-way combinations (300,000 choose 3) would need to be exhaustively analyzed to detect low-order epistasis using MDR. If 10^6 MDR evaluations can be computed each second, then evaluating all 300,000 SNPs individually would require less than one second of computer time. However, computing all two-way and three-way MDR models would require more than 52,000 days of computer time. Access to a 1,000-processor supercomputer might reduce this to 52 days, which is within the realm of possibility. Even then, running a 1000-fold permutation test would not be feasible, because it would multiply that time by 1,000. This is only one of many challenges for detecting epistasis on a genome-wide scale [Ritchie and Moore, 2004].
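The arithmetic in the preceding paragraph can be checked with a few lines of Python. The sketch below uses only the figures quoted in the text (300,000 SNPs, 10^6 MDR evaluations per second, 1,000 processors) and assumes perfect parallel scaling. The function and constant names are introduced here for illustration.

```python
import math

N_SNPS = 300_000
EVALS_PER_SECOND = 10**6
SECONDS_PER_DAY = 86_400


def n_models(n_snps, order):
    """Number of distinct SNP combinations of the given order."""
    return math.comb(n_snps, order)


def compute_days(n_evaluations, processors=1, permutations=1):
    """Days needed for the evaluations, assuming ideal parallel scaling."""
    seconds = n_evaluations * permutations / (EVALS_PER_SECOND * processors)
    return seconds / SECONDS_PER_DAY


pairs = n_models(N_SNPS, 2)      # about 4.5 x 10^10
triples = n_models(N_SNPS, 3)    # about 4.5 x 10^15
total = pairs + triples

single_snp_seconds = N_SNPS / EVALS_PER_SECOND
days_one_cpu = compute_days(total)
days_cluster = compute_days(total, processors=1000)
days_cluster_1000_perm = compute_days(total, processors=1000, permutations=1000)

print(f"pairs={pairs:.3e} triples={triples:.3e}")
print(f"single-SNP pass: {single_snp_seconds} s")
print(f"one processor: {days_one_cpu:,.0f} days")
print(f"1,000 processors: {days_cluster:,.1f} days")
print(f"1,000 processors, 1000 permutations: {days_cluster_1000_perm:,.0f} days")
```

The three-way term dominates the total, so the cost of an exhaustive scan is set almost entirely by the highest interaction order considered.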
We are not the first to suggest using the EVD to reduce the number of permutations necessary to determine statistical significance in genetic and genomic studies. For example, Dudbridge and Koeleman noted that it is becoming more common and feasible to conduct large-scale screens for disease associations, genome-wide linkage disequilibrium scans, and array-expression experiments. They recognized that these studies encounter issues with correlated data, which are addressed by permutation testing but, as we have discussed, at a computational cost that can be impractical. As we do here, they proposed fitting analytic distributions, such as an EVD, to permutation distributions. Using genome-wide SNP data released by the International HapMap Consortium, they compared the efficiency and accuracy of their method with permutation testing and found that it achieved both adequate accuracy and a 40% reduction in computation. Our results support their conclusions.
A challenging goal in human genetics is to determine which of the many thousands of SNPs are useful for predicting who is at risk for common diseases. It was nearly a decade ago that Risch and Merikangas first seriously proposed testing all known SNPs in the human genome for disease association, either directly or through linkage disequilibrium with other SNPs [Risch and Merikangas, 1996]. Today it is possible to measure more than one million SNPs with widely available human SNP arrays. Unfortunately, there is a lack of powerful methodology to summarize and interpret this quantity of information within a biological context. Thus, our ability to measure genetic information, and biological information in general, is far outpacing our ability to interpret it [Moore and Williams, 2002]. In the current study, we primarily address the computational efficiency of large-scale genetic analyses of epistasis. However, another important concern with conducting these analyses with a method such as MDR is that important information may be lost by limiting results to one best model. An interesting future direction would be to develop hypothesis testing methods that are able to identify a best set of statistically significant MDR models rather than a single best model. The EVD could be used to investigate the significance of a second-best model, a third-best model, and so on. As an additional goal, it would be desirable to move away from permutation testing entirely. For example, it might be useful to develop a hypothesis testing approach based on the cross-validation results. These types of computationally efficient hypothesis testing methods are critical for the analysis of epistasis in genome-wide association studies.