A major controversy exists in determining significance levels for candidate gene or genome-wide association scans using single-nucleotide polymorphism (SNP) data. Regardless of whether each SNP is analyzed one at a time or as part of a haplotype, the number of individual tests can become very large and can lead to an inflated type I error rate. Bonferroni correction is not an appropriate solution because the tests are correlated in most SNP settings. Instead, permutation testing has been the gold standard for determining the significance level for SNP genome scans and candidate gene studies; however, it is computationally intensive and time-consuming. Recently, a simpler method to determine the significance level for SNP association studies has been proposed that relies on the linkage disequilibrium (LD) structure of the genome to determine the number of independent tests [1]. This method applies principal components (PC) analysis to pair-wise LD measures to determine the number of independent tests and uses this number as the denominator in a Bonferroni correction to the unadjusted p-values. However, using the pair-wise LD measures between SNPs does not explicitly take into account the haplotype block structure of the human genome. We propose to use the number of LD blocks defined for a set of SNPs plus the number of singleton (not block-related) SNPs as the appropriate number for multiple test correction. However, the choice of block definition is also an issue of considerable controversy. A recent paper compared 3 measures of haplotype blocks, including an LD-based method described by Gabriel et al. [2], a recombination-based method developed by Hudson and Kaplan [3], and a diversity-based method proposed by Patil et al. [4], and found low levels of agreement between them; the number of haplotype blocks and the haplotype block boundaries differed greatly across methods [5]. Therefore, the number of independent tests determined across these algorithms may differ widely from one another and from the number determined by PC analysis.
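The proposed correction can be illustrated with a minimal Python sketch. It assumes the block count and singleton count are already known for a set of SNPs; the function names and the example counts are hypothetical and are not taken from the study.

```python
def effective_tests(n_blocks, n_singletons):
    """Number of tests used for correction: LD blocks plus singleton SNPs."""
    return n_blocks + n_singletons


def adjusted_alpha(alpha, n_tests):
    """Bonferroni-type per-test significance threshold."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def adjust_p_values(p_values, n_tests):
    """Equivalent adjustment applied to the p-values, capped at 1."""
    return [min(1.0, p * n_tests) for p in p_values]
```

For instance, a hypothetical set of 100 SNPs forming 12 blocks with 20 singletons would give 32 tests and a per-test threshold of 0.05/32, compared with 0.05/100 under the traditional Bonferroni correction.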
We proposed to obtain the number of independent tests for a Bonferroni-type correction in 4 ways: from 3 haplotype blocking algorithms, namely the method described by Gabriel et al. [2] (Gabriel), the 4-gamete test [3] (4GT), and the solid spine of LD measure (SSLD), each as implemented in the program HAPLOVIEW [6], and from the number of components derived from PC analysis [1]. We also considered the traditional Bonferroni correction (assuming all tests are independent) and the type I error rate of the unadjusted p-values.
The blocking method of Gabriel et al. [2] describes an LD block as a contiguous set of SNPs in which 95% of pairwise D' confidence interval (CI) values are considered to be in strong LD (CI minima for upper CI bound = 0.98; CI minima for lower CI bound = 0.70). The 4-gamete rule of Hudson and Kaplan [3] relies on historical recombination events to determine haplotype blocks. At each pair-wise contiguous set, the frequency of observed 2-SNP haplotypes is assessed; if at least 1 haplotype is observed with a frequency of less than 1%, then that SNP is added to the block. A block is terminated when a recombination event is assumed to have taken place; that is, when all 4 possible 2-SNP haplotypes are observed with a frequency of greater than 1%. The SSLD method creates blocks of SNPs that have contiguous pairwise D' values of greater than 0.8.
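The 4-gamete rule described above can be sketched in Python as follows. This is a simplified illustration, not the HAPLOVIEW implementation: it assumes phased haplotypes of biallelic SNPs supplied as equal-length strings, and it assumes that a candidate SNP is compared with every SNP already in the current block. The function names are hypothetical.

```python
from collections import Counter


def four_gametes_present(haplotypes, i, j, min_freq=0.01):
    """True if all 4 two-SNP haplotypes at SNPs i and j exceed min_freq."""
    counts = Counter((h[i], h[j]) for h in haplotypes)
    n = len(haplotypes)
    common = [pair for pair, c in counts.items() if c / n > min_freq]
    # With biallelic SNPs there are at most 4 distinct two-SNP haplotypes.
    return len(common) == 4


def four_gamete_blocks(haplotypes, min_freq=0.01):
    """Partition SNP indices into contiguous blocks using the 4-gamete rule."""
    n_snps = len(haplotypes[0])
    blocks = []
    current = [0]
    for snp in range(1, n_snps):
        # Assumption: a recombination event is inferred if the new SNP shows
        # all 4 gametes with any SNP already in the current block.
        if any(four_gametes_present(haplotypes, prev, snp, min_freq)
               for prev in current):
            blocks.append(current)
            current = [snp]
        else:
            current.append(snp)
    blocks.append(current)
    return blocks


def count_independent_tests(blocks):
    """Each multi-SNP block and each singleton SNP counts as one test."""
    return len(blocks)
```

Because a singleton SNP appears here as a block of length 1, the length of the returned list equals the number of blocks plus the number of singletons, which is the quantity used for the Bonferroni-type correction.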