Fig. 1 shows how imputation accuracy depends on a weighted average of LD at the default confidence threshold (qc=0.9); that is, an imputed genotype is accepted only if its posterior probability is at least 0.90.
Fig. 1 Imputation accuracy rate as a function of a weighted average of linkage disequilibrium (LD) ((a) MACH and (b) IMPUTE) and minor allele frequency (MAF) ((c) MACH and (d) IMPUTE) at the default confidence threshold value (qc=0.9) using HapMap 3 CEU population.
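As an illustration of the confidence threshold, the following minimal sketch applies a cutoff qc to per-genotype posterior probabilities and then computes an accuracy rate and a missing rate. It is not the code used in this study: the function names, the array layout (one row per individual, one column per genotype 0/1/2), and the definition of accuracy as concordance with the true genotype among accepted calls are all assumptions made for the example, and the numbers are toy values.

```python
import numpy as np

MISSING = -1  # hypothetical code for a genotype rejected by the threshold


def call_genotypes(posteriors, qc=0.9):
    """Return the best-guess genotype per row, or MISSING if max posterior < qc.

    posteriors: array of shape (n_individuals, 3), rows sum to 1 (assumed layout).
    """
    posteriors = np.asarray(posteriors, dtype=float)
    best = posteriors.argmax(axis=1)
    confident = posteriors.max(axis=1) >= qc
    return np.where(confident, best, MISSING)


def accuracy_and_missing(calls, truth):
    """Accuracy = concordance among accepted calls (assumed definition)."""
    calls = np.asarray(calls)
    truth = np.asarray(truth)
    called = calls != MISSING
    missing_rate = 1.0 - called.mean()
    if not called.any():
        return float("nan"), missing_rate
    accuracy = (calls[called] == truth[called]).mean()
    return float(accuracy), float(missing_rate)


# Toy data: the second genotype is imputed wrongly, with low confidence.
toy_posteriors = [
    [0.95, 0.04, 0.01],
    [0.10, 0.85, 0.05],
    [0.02, 0.03, 0.95],
    [0.05, 0.92, 0.03],
]
toy_truth = [0, 0, 2, 1]

strict = accuracy_and_missing(call_genotypes(toy_posteriors, qc=0.9), toy_truth)
loose = accuracy_and_missing(call_genotypes(toy_posteriors, qc=0.8), toy_truth)
```

In this toy case the stricter threshold rejects the one wrong call, so accuracy rises while one genotype in four becomes missing.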
Using HM3CEU as the reference panel, MACH and IMPUTE were used to impute SNPs that were not genotyped in the sample but were genotyped in the reference panel. As expected, SNPs in strong LD with genotyped markers are much more likely to be imputed correctly: imputation accuracy depends strongly on the weighted average of LD.
In addition, Fig. 1 (panels c and d) shows the accuracy of imputed genotypes as a function of the SNPs' minor allele frequency (MAF). Imputation of SNPs with a lower MAF appears to be more accurate than imputation of SNPs with a higher MAF. Overall, IMPUTE and MACH performed similarly.
The following figure shows the proportion of SNPs with imputation accuracy rates of at least 90% as a function of the number of SNPs with less than 10% missing values, at different threshold values of the posterior probability. In general, raising the posterior probability threshold for accepting genotypes increases imputation accuracy at the expense of more missing data. Of the two reference panels compared for MACH (HM3CEU and G1KCEU), HM3CEU produced better imputation accuracy across threshold values. Of the three reference panels compared for IMPUTE (HM3CEU, G1KCEU, and HM3CEU+G1KCEU), the combined panel (HM3CEU+G1KCEU) gave the highest imputation accuracy rates. This result suggests that the large increase in the number of both SNPs and samples in the reference panel allows more accurate imputation of most ungenotyped SNPs.
Proportion of SNPs that have imputation accuracy rates equal to or exceeding 90% as a function of the number of SNPs with missing values ≤ 10% at different threshold values: (a) MACH and (b) IMPUTE.
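The summary plotted in this figure can be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the function name and inputs are hypothetical, per-SNP accuracy and missing rates are taken as already computed at a given threshold, and the missing-rate cutoff is applied as a strict inequality (the text says "less than 10%" while the caption says "≤ 10%").

```python
import numpy as np


def summarize_snps(accuracy, missing_rate, min_accuracy=0.90, max_missing=0.10):
    """Return (n_kept, proportion) for one posterior-probability threshold.

    n_kept: number of SNPs whose missing rate is below max_missing.
    proportion: share of those SNPs with accuracy >= min_accuracy.
    Both inputs are hypothetical per-SNP arrays of equal length.
    """
    accuracy = np.asarray(accuracy, dtype=float)
    missing_rate = np.asarray(missing_rate, dtype=float)
    kept = missing_rate < max_missing  # strict cutoff: an assumption
    n_kept = int(kept.sum())
    if n_kept == 0:
        return 0, float("nan")
    proportion = float((accuracy[kept] >= min_accuracy).mean())
    return n_kept, proportion


# Toy per-SNP values; the third SNP is dropped for having 20% missing genotypes.
toy_accuracy = [0.95, 0.88, 0.99, 0.91]
toy_missing = [0.02, 0.05, 0.20, 0.09]
n_kept, proportion = summarize_snps(toy_accuracy, toy_missing)
```

Repeating this calculation for each threshold value gives one point per threshold, with the number of retained SNPs on the horizontal axis and the proportion of accurately imputed SNPs on the vertical axis.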
We also compared the imputation accuracy of the two imputation methods (MACH and IMPUTE) using the same reference panel. The results are shown in the figure below. For HM3CEU, MACH consistently yields higher imputation accuracy rates than IMPUTE. By contrast, for G1KCEU, MACH has slightly lower imputation accuracy rates than IMPUTE.
Proportion of SNPs that have imputation accuracy rates equal to or exceeding 90% as a function of the number of SNPs with missing values less than 10% at different threshold values: (a) HM3CEU and (b) G1KCEU.