For humans and model organisms, such as worms and flies, the availability of high-density sequence polymorphism maps greatly facilitates the rapid mapping and cloning of genes [1]. Key advantages of most molecular polymorphisms are that they are codominant and, in general, phenotypically neutral. The vast majority of sequence polymorphisms are single-nucleotide polymorphisms (SNPs).
The most direct approach for SNP detection is sequencing of a PCR product spanning the polymorphism, but this is too costly and labor-intensive for high-throughput genotyping. For this reason, several strategies and methods have been developed to detect SNPs more efficiently. In general, assays can be grouped into those in which the nature of the SNP is determined by directly analyzing the primary PCR product and those that require a secondary assay performed on the primary amplification product [4]. An important strategy of the first group is the 5' nuclease assay, in which allele-specific, dual-labeled fluorescent TaqMan probes guarantee specificity [7]. However, the need for two dual-labeled fluorescent probes, expensive specialized chemistry and specialized machinery significantly increases the cost per assay of this approach. Similarly, denaturing high-performance liquid chromatography (DHPLC) also analyzes the primary amplification product [8]. This approach is based on melting differences of homo- versus heteroduplex DNA fragments under increasingly denaturing conditions and requires no specific labeling of the PCR fragments. However, conditions have to be optimized for every assay, throughput is limited and specialized equipment is required. DHPLC has been used in small-scale genotyping projects in Drosophila melanogaster.
Of the methods that detect the SNP in a secondary assay, restriction fragment length polymorphism (RFLP) analysis is very popular [10]. For this purpose, only those SNPs that alter a restriction site are analyzed. A great advantage of RFLP analysis is that it requires no specialized equipment and can be carried out in every laboratory. RFLP maps recently established for Caenorhabditis elegans are used regularly in genotyping projects [2]. However, RFLP analysis requires significant manual input. Moreover, the use of different restriction enzymes with different reaction requirements adds another level of complexity that makes this method difficult to automate. Primer-extension-based technologies have also gained some prominence [12]. Here, a primer that anneals right next to the polymorphism is extended by one polymorphism-specific terminator nucleotide. Extension products are analyzed by size or, alternatively, by differences in the behavior of incorporated versus non-incorporated terminator nucleotides under polarized fluorescent light [13]. Swan and colleagues [14] have developed a set of fluorescence polarization-template directed incorporation (FP-TDI) assays for C. elegans. However, this approach is labor-intensive and requires specialized chemistry and equipment. Using DNA microarrays, large numbers of SNPs can be analyzed in parallel, but the number of individuals that can be analyzed is low because of the high cost per chip [15].
Besides SNPs, short tandem repeats (STRs), or microsatellites, represent another class of sequence polymorphisms used for genotyping [17]. STRs result in fragment length differences that are detected on gel-based or capillary sequencers or on high-resolution hydrogels (Elchrom Scientific Inc.). One advantage of STRs over SNPs is that they are highly polymorphic and are thus ideal for measuring the degree of variability in natural populations. STRs are, however, present at much lower density than SNPs and are therefore not suitable for high-resolution mapping of genes.
Interestingly, a significant proportion of the currently available polymorphisms are caused by small insertions or deletions (InDels). Weber et al. identified a genome-wide set of about 2,000 human InDel polymorphisms and estimated that InDels comprise at least 8% and up to 20% of all human polymorphisms. This is in line with the findings of Berger and co-workers [2], who found that 16.2% of polymorphisms in Drosophila are of the InDel type. Also, two independent studies in C. elegans found that InDels constitute between 25% and 28% of all polymorphisms [3]. In addition, those studies found that the vast majority of InDels are due to 1-2 base-pair (bp) differences (65% in Drosophila, 84% in C. elegans).
To take full advantage of this class of small InDel polymorphisms, we have developed a strategy that allows us to detect most, if not all, InDels by analyzing the lengths of primary PCR products on a capillary sequencer at single base-pair resolution. We call these assays fragment length polymorphism (FLP) assays. Importantly, this approach can easily be automated on standard robotic pipetting platforms as it involves a simple PCR reaction setup. Furthermore, allele calling is performed automatically using the Applied Biosystems GeneMapper software commonly used for genotyping STRs (Materials and methods).
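The principle of calling alleles from primary PCR product lengths can be illustrated with a minimal Python sketch. It is not the GeneMapper algorithm and not the procedure used in this work: the class FlpAssay, the function call_genotype, the 0.4 bp tolerance and the example fragment sizes are all hypothetical and chosen only for illustration. The sketch assumes that each assay has one expected product length per parental strain and that, because the markers are codominant, a heterozygous animal yields both products.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class FlpAssay:
    """Hypothetical FLP assay: expected PCR product lengths for two strains."""
    name: str
    length_a: int  # expected product length (bp) in parental strain A
    length_b: int  # expected product length (bp) in parental strain B


def call_genotype(assay: FlpAssay, peak_sizes: Iterable[float],
                  tolerance: float = 0.4) -> str:
    """Call a genotype from the fragment sizes (bp) observed in one sample.

    A peak is assigned to an allele if it lies within `tolerance` bp of the
    expected length. The tolerance is an assumed value kept below 0.5 bp so
    that alleles differing by a single base pair cannot both claim one peak.
    """
    peaks = list(peak_sizes)
    has_a = any(abs(p - assay.length_a) <= tolerance for p in peaks)
    has_b = any(abs(p - assay.length_b) <= tolerance for p in peaks)
    if has_a and has_b:
        return "A/B"      # both products present: heterozygous
    if has_a:
        return "A/A"
    if has_b:
        return "B/B"
    return "no call"      # no peak near either expected length


# Made-up example: a 1 bp InDel gives 152 bp versus 153 bp products.
demo = FlpAssay(name="demo_flp", length_a=152, length_b=153)
for observed in ([152.1], [153.2], [151.9, 153.1], [160.0]):
    print(observed, call_genotype(demo, observed))
```

The sketch makes clear why single base-pair sizing resolution matters: with a 1 bp InDel, the two expected lengths are adjacent, so any sizing error approaching 0.5 bp would make the alleles indistinguishable.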
To demonstrate the feasibility of this strategy, we have validated 112 evenly spaced FLP assays at 3 centimorgan (cM) resolution in C. elegans (one every 0.9 megabase-pair (Mbp)) and 54 FLP assays at 4 cM resolution for the Drosophila autosomes. This set of FLP assays allows us to rapidly map mutations to small chromosomal subregions with a minimum of manual input. Furthermore, we provide a list of predicted InDels for which additional assays can be readily designed in the chromosomal subregion of interest. Those non-validated FLPs enhance the resolution of the map by factors of 5.6 in C. elegans and 17.9 in Drosophila.
We show the usefulness of this approach by identifying novel alleles of previously characterized genes. In summary, we have taken advantage of a publicly available dataset to adapt a technology widely used for STR analysis to genetic mapping. Thanks to the complete automation of genotyping, this approach is considerably faster, more reliable and cheaper than previously used mapping strategies in C. elegans or Drosophila.