With the knowledge of the answers for the simulated dataset, we decided to isolate the disease locus D2 with standard methods, i.e., linkage analysis followed by association analysis. A complementary analysis presented by our group showed a significant linkage peak in the telomeric region of chromosome 3 [1]. This peak fell within the region expected from the physical location of locus D2. Our aim was to determine whether association studies could be used to fine map this disease locus of interest. Analysis of a 12-cM region near the telomere of chromosome 3 revealed a statistically significant association between genotypes at 4 markers (B03T3056, B03T3057, B03T3058, and C03R0281) and the clinical characteristics e, f, h, and k. These loci remained significant after correction for multiple tests. One limitation of our study is that our smaller samples (n = 50) were slightly underpowered given the large number of loci tested. However, we were most interested in determining whether a linkage peak could be narrowed down using the association tests, so we focused on the pattern of results across the region rather than the actual significance level.
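As an illustration of what a single-marker association test with correction for multiple tests involves, the following is a minimal sketch and not the analysis pipeline used in this study. It assumes an allelic 2x2 chi-square test (1 degree of freedom) and a Bonferroni correction; the marker names, the allele counts, and the function names `allelic_chi_square` and `bonferroni` are all hypothetical.

```python
import math


def allelic_chi_square(case_alleles, control_alleles):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    Each argument is a pair (count of allele 1, count of allele 2).
    Returns (chi-square statistic, p-value).
    """
    a, b = case_alleles
    c, d = control_alleles
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        # Degenerate table: no evidence either way.
        return 0.0, 1.0
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 df, the chi-square upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return {name: min(1.0, p * m) for name, p in p_values.items()}


# Hypothetical allele counts for 50 cases and 50 controls
# (100 alleles per group); these are NOT the study's data.
counts = {
    "marker_1": ((70, 30), (50, 50)),
    "marker_2": ((55, 45), (50, 50)),
    "marker_3": ((50, 50), (50, 50)),
}

raw_p = {name: allelic_chi_square(ca, co)[1] for name, (ca, co) in counts.items()}
adjusted_p = bonferroni(raw_p)

for name in counts:
    print(f"{name}: raw p = {raw_p[name]:.4f}, adjusted p = {adjusted_p[name]:.4f}")
```

Bonferroni is conservative when markers are correlated, so with many loci tested in a small sample it is one plausible reason a study of this kind would be underpowered.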
Without prior knowledge of the disease locus, one would consider this study highly successful. Linkage analysis identified a minimal disease region of approximately 3 cM linked to the disease trait [1]. We showed significant association between the disease trait and a narrow region (<1 cM) of chromosome 3, thus fine mapping a region associated with an increased risk of disease. However, the simulated disease model clearly stated that the disease locus D2 was linked to the last marker on the chromosome. This marker was nearly 3 cM from the peak of significant association. This finding was replicated using the original affection status, indicating that the result was not due to redefining the affection status.
Several groups using various analytical methods at Genetic Analysis Workshop 14 observed the same association peak. The association peak 3 cM from the actual disease locus most likely resulted from the methods used for the data simulation. For this reason, our results and conclusions may not be directly applicable to real datasets. With this in mind, we can still draw some general conclusions. Case selection can greatly influence both power and locus detection. In the analysis in which MERLIN was used to select the 250 cases from linked families, power increased dramatically over that observed with 250 randomly selected cases. Power may also be increased by refining the phenotype definition, although this was not necessary for locus detection in this particular example. As expected, an increase in the number of selected cases not only increased the significance of association but also decreased the variability in observed significance. However, even the slightly underpowered replicates were able to detect the correct location of association. We were able to show that a case-control approach could detect an association in a linked region, but because there was very little LD present in the simulated dataset, we could not determine how well this approach could fine map an actual disease locus.
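The relationship between the number of cases and power can be illustrated with a small Monte Carlo sketch. This is not the simulation model of the workshop data: the allele frequencies (0.6 in cases, 0.5 in controls), the equal number of controls, the independent sampling of alleles, the replicate count, and the function names are all assumptions chosen only for illustration.

```python
import math
import random


def allelic_p_value(case_alt, control_alt, n_alleles):
    """P-value of a 1-df chi-square test on a 2x2 allele-count table.

    case_alt and control_alt are counts of the risk allele among
    n_alleles alleles sampled in each group.
    """
    a, b = case_alt, n_alleles - case_alt
    c, d = control_alt, n_alleles - control_alt
    col1, col2 = a + c, b + d
    if col1 == 0 or col2 == 0:
        return 1.0
    n = 2 * n_alleles
    chi2 = n * (a * d - b * c) ** 2 / (n_alleles * n_alleles * col1 * col2)
    return math.erfc(math.sqrt(chi2 / 2.0))


def draw_allele_count(rng, n_alleles, freq):
    """Binomial draw: number of risk alleles among n_alleles."""
    return sum(rng.random() < freq for _ in range(n_alleles))


def estimate_power(n_per_group, case_freq, control_freq,
                   alpha=0.05, replicates=200, seed=1):
    """Fraction of simulated replicates with p < alpha."""
    rng = random.Random(seed)
    n_alleles = 2 * n_per_group  # two alleles per individual
    hits = 0
    for _ in range(replicates):
        case_alt = draw_allele_count(rng, n_alleles, case_freq)
        control_alt = draw_allele_count(rng, n_alleles, control_freq)
        if allelic_p_value(case_alt, control_alt, n_alleles) < alpha:
            hits += 1
    return hits / replicates


# Assumed risk-allele frequencies; hypothetical, not taken from the study.
power_50 = estimate_power(50, case_freq=0.6, control_freq=0.5)
power_250 = estimate_power(250, case_freq=0.6, control_freq=0.5)
print(f"estimated power, n = 50:  {power_50:.2f}")
print(f"estimated power, n = 250: {power_250:.2f}")
```

Under these assumptions the larger sample detects the association in a clearly higher fraction of replicates. Enriching cases from linked families would act in the same direction by raising the case allele frequency, which could be explored by increasing `case_freq`.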