Admixture mapping is based on the assumption that some susceptibility variants will be associated with continental ancestry and that this association can be discerned in admixed populations by examining linkage to ancestry. Several different algorithms and computational programs have been developed to facilitate admixture mapping [3]. Each of these relies on a hidden Markov model (HMM) to infer ancestral states along the chromosome on the basis of the typing results of markers that are informative for ancestry. The model formulation relies on the prior probability that any given locus in the current generation is derived from one of the founder populations, and on the transitions between ancestral states along a chromosome that result from recombination events in the generations since admixture. This HMM approach is designed to infer the unobserved local ancestries for each individual and uses multipoint information from linked markers. Thus, the transition probabilities in the HMM, modeled as Poisson arrivals of ancestry switch points, provide an approximation to the correlations in ancestry between linked markers.
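The HMM idea described above can be illustrated with a minimal sketch. It is not the implementation used by any of the programs discussed here. It assumes a single haploid chromosome, biallelic markers with known allele frequencies in each founder population, and ancestry switch points that arrive as a Poisson process at a rate of `tau` per Morgan, where `tau` stands in for the number of generations since admixture. All function and parameter names are hypothetical.

```python
import math


def transition_matrix(d_morgans, tau, pi):
    """Ancestry transition matrix between two markers d_morgans apart.

    Assumption: switch points arrive as a Poisson process with rate tau
    per Morgan; after a switch, the new ancestry is drawn from the
    admixture proportions pi (so it may equal the old ancestry).
    """
    stay = math.exp(-tau * d_morgans)  # probability of no switch point
    k = len(pi)
    return [[(stay if i == j else 0.0) + (1.0 - stay) * pi[j]
             for j in range(k)] for i in range(k)]


def posterior_ancestry(alleles, positions, freqs, tau, pi):
    """Forward-backward posterior of local ancestry at every marker.

    alleles   : 0/1 allele observed at each marker
    positions : marker positions in Morgans, ascending
    freqs     : freqs[m][a] = frequency of allele 1 at marker m
                in founder population a (assumed known)
    """
    n, k = len(alleles), len(pi)
    # Emission probability of the observed allele under each ancestry.
    emit = [[freqs[m][a] if alleles[m] == 1 else 1.0 - freqs[m][a]
             for a in range(k)] for m in range(n)]

    def normalise(v):
        s = sum(v)
        return [x / s for x in v]

    # Forward pass, rescaled at each marker to avoid underflow.
    fwd = [normalise([pi[a] * emit[0][a] for a in range(k)])]
    for m in range(1, n):
        t = transition_matrix(positions[m] - positions[m - 1], tau, pi)
        fwd.append(normalise(
            [emit[m][j] * sum(fwd[m - 1][i] * t[i][j] for i in range(k))
             for j in range(k)]))

    # Backward pass, also rescaled.
    bwd = [[1.0] * k for _ in range(n)]
    for m in range(n - 2, -1, -1):
        t = transition_matrix(positions[m + 1] - positions[m], tau, pi)
        bwd[m] = normalise(
            [sum(t[i][j] * emit[m + 1][j] * bwd[m + 1][j] for j in range(k))
             for i in range(k)])

    # Combine both passes: multipoint information from all linked markers.
    return [normalise([fwd[m][a] * bwd[m][a] for a in range(k)])
            for m in range(n)]
```

Calling `posterior_ancestry([1, 1, 1], [0.0, 0.01, 0.02], [[0.9, 0.1]] * 3, 6.0, [0.5, 0.5])` returns, for each marker, the posterior probability of each founder ancestry. Because the markers are tightly linked, each posterior draws on the alleles at all three positions rather than on the local marker alone.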
Although the actual underlying model is unknown, simulation studies have shown the ability of these methods to discern ancestry linkage in various admixture models and conditions. These approaches use either case–control analyses or both case–control and case-only analyses. For case-only algorithms, the detection of linkage to ancestry is based on the difference in the distribution of ancestry of chromosomal segments at loci associated with disease as compared with loci at which there is no association. We note that the appropriate genome-wide α level (i.e. the significance threshold) should be lower when both case-only and case–control algorithms are used to analyze the same set of probands; however, extensive simulations will be necessary to establish this relationship.
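As a rough illustration of the case-only idea, the sketch below computes a per-locus Z-score for the deviation of local ancestry in cases from each case's genome-wide average. This is a simplified statistic chosen for illustration, not the test implemented by any of the programs discussed here. It assumes that local ancestry estimates (for example, HMM posterior means) and genome-wide ancestry proportions are already available. The function and argument names are hypothetical.

```python
import math


def case_only_z(local_ancestry, genome_ancestry):
    """Per-locus Z-scores for excess ancestry among cases.

    local_ancestry[i][m] : estimated proportion (0..1) of ancestry from
                           one founder population for case i at locus m
    genome_ancestry[i]   : genome-wide proportion of that ancestry for case i

    Assumption: under no linkage, local ancestry at a locus is expected
    to equal the individual's genome-wide ancestry on average.
    """
    n = len(local_ancestry)
    n_loci = len(local_ancestry[0])
    z_scores = []
    for m in range(n_loci):
        # Deviation of local from genome-wide ancestry in each case.
        diffs = [local_ancestry[i][m] - genome_ancestry[i] for i in range(n)]
        mean = sum(diffs) / n
        var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
        if var == 0.0:
            # No variation across cases: the statistic is undefined.
            z_scores.append(float("nan"))
        else:
            z_scores.append(mean / math.sqrt(var / n))
    return z_scores
```

A large positive or negative score at a locus flags a region where cases carry more or less of the given ancestry than expected. The threshold for declaring significance is deliberately left out, because the appropriate genome-wide α level is the open question noted above.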
Computational programs that are readily available include AncestryMap [5], Structure/MALDsoft [9] and AdmixMap [4]. In our application of these programs to simulated data based on real genotypes, my colleagues and I [8••] have found that, although each algorithm can yield appropriate results in many models, the AdmixMap program performs the best when the admixture model is more complex (more generations and different gene flow schemes). Unlike the other methods, AdmixMap estimates both the admixture proportion and the number of generations for each gamete in each individual. We have also noted that the potential issue of LD within parental populations does not seem to be a problem with our current AIM sets using this algorithm [8••].
A new Markov-hidden Markov model (MHMM) algorithm has recently been developed that explicitly accounts for LD in parental populations [27•]. Its power in real data sets, or in simulated data sets based on real genotyping data, has not, however, been robustly examined; thus, the efficacy of this algorithm in admixture mapping is not yet clear. The same study also suggested that the MHMM algorithm will enable admixture mapping without the use of AIMs. In practice, this approach might be problematic because some or many SNPs might not have the appropriate characteristics, namely, little variation in allele frequencies among the multiple parental populations that could have contributed to one continental founder population (see ‘Ancestry informative markers’ above).