Single nucleotide polymorphisms (SNPs) and insertion/deletion events (Indels) represent the most frequent polymorphisms found in eukaryotic genomes. For example, in humans the frequency of SNPs is one per kilobase and, given the large size of the human genome, the total number of SNPs has been estimated to be over 3.1 million [
1,
2]. Similarly, high SNP frequencies have been reported in plant genomes, especially in out-crossing species, but the discovery process has been slower despite the small genomes of some species. Examples include grapevine (
Vitis vinifera L.), an out-crossing species, where one SNP occurs every 78 bp [
3] or maize (
Zea mays L.) where the average frequency of SNPs was one every 43 bases in 1,088 maize gene sequences and where Indels were also common [
4]. In a self-pollinated species such as soybean (
Glycine max (L.) Merr), the SNP frequency was reported as one SNP every 191 bp in non-coding regions and one SNP every two kilobases in coding regions based on 15 genotypes and 35 genomic or gene fragments [
5]. Rice (
Oryza sativa L.), another inbreeding species, had one SNP every 300 bp in coding regions and one SNP every 37 bp in transposable elements when comparing
indica and
japonica subspecies [
6] and recently 159,478 high-quality, non-redundant SNPs were found across the entire rice genome [
7]. In this study, our interest was to develop SNP- and Indel-based markers for common bean (
Phaseolus vulgaris L.), an important legume in terms of food security but one that has been less well studied as it is found mostly in developing countries.
Expressed sequence tag (EST) libraries offer important information for species that have not been sequenced and are a central source of gene-based markers and SNP or Indel polymorphisms. Discovery of these polymorphisms usually involves alignment of sequences obtained from the sequencing of EST libraries from different genotypes of the same species [
8] or from re-sequencing of PCR fragments [
9]. EST-based markers are valuable because they represent sequences that are transcribed and therefore can potentially be associated with phenotypic differences. Furthermore, EST-based markers are often highly conserved between species, allowing the construction of transcript maps and synteny comparisons between genomes.
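As a minimal sketch of the alignment-based discovery step described above, the following Python example scans two already-aligned sequences from different genotypes and separates single-base mismatches (candidate SNPs) from runs of gaps (candidate Indels). The function name `find_polymorphisms`, the gap symbol and the two example sequences are hypothetical and chosen only for illustration; a real pipeline would start from assembled EST contigs and would filter calls by base quality and read depth.

```python
def find_polymorphisms(seq_a, seq_b, gap="-"):
    """Classify differences between two pre-aligned sequences of equal length.

    Returns (snps, indels):
      snps   -- list of (column, base_in_a, base_in_b)
      indels -- list of (start_column, length, sequence_carrying_the_gap)
    Columns are 0-based positions in the alignment.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    snps, indels = [], []
    col, n = 0, len(seq_a)

    while col < n:
        a, b = seq_a[col], seq_b[col]
        if a == gap or b == gap:
            # Treat a contiguous run of gaps in one sequence as a single Indel.
            gapped_seq = seq_a if a == gap else seq_b
            start = col
            while col < n and gapped_seq[col] == gap:
                col += 1
            indels.append((start, col - start, "a" if a == gap else "b"))
            continue
        # Ignore ambiguous bases (N) rather than calling them as SNPs.
        if a != b and "N" not in (a, b):
            snps.append((col, a, b))
        col += 1

    return snps, indels


# Hypothetical aligned fragments from two genotypes (illustration only).
EXAMPLE_A = "ATGCCGTA--GCTA"
EXAMPLE_B = "ATGACGTATTGCTG"

snps, indels = find_polymorphisms(EXAMPLE_A, EXAMPLE_B)
print("SNPs:", snps)
print("Indels:", indels)
```

Grouping consecutive gap columns into one event reflects the usual convention that a multi-base insertion or deletion is scored as a single Indel rather than as several independent polymorphisms.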
EST analysis in common bean shows that SNP frequency appears to be similar to or higher than that in other self-pollinating species, although relatively few studies have analyzed the relative abundance of SNPs across different regions of the genome or across the wide diversity of common bean accessions. In a pioneering study for the crop, Ramirez et al. [
10] compared EST sequences from two genotypes of common bean (the Andean G19833 versus the Mesoamerican Negro Jamapa) and found 529 SNPs in 214 kb of SNP-containing contigs, with a frequency of one SNP every 387 bp in this inter-genepool comparison. Recently, Gaitán-Solís et al. [
11] reported 239 SNPs and 133 Indels in 45 gene-coding and non-coding fragments analyzed in 10 cultivated and wild bean genotypes belonging to the Mesoamerican and Andean gene pools, finding an average frequency of one SNP every 88 bp and one Indel every 157 bp. The high frequency of SNPs and the overall genotype diversity in common bean make this species amenable to SNP marker development.
The conversion of ESTs to SNP-based molecular markers and their use in saturation or comparative mapping has been an important recent area of research, and several techniques for SNP analysis have been reported [
12]. For example, in common bean three methods have been used to convert ESTs into markers based on SNPs. In the first, cleaved amplified polymorphic sequence techniques (CAPS and dCAPS) were used to convert EST-based polymorphisms into genetic markers [
13]. The second involved a high-throughput system named Luminex-100, which was used to confirm SNP calls in DNA from 10 common bean genotypes, finding that 2.5% of SNPs were miscalled and 1% had no signal as compared with direct sequencing [
11]. In an effort to simplify SNP analysis, Galeano et al. [
14] used CEL I mismatch digestions to analyze and map SNP-based, EST-derived markers, finding that the method worked well with SNPs located in the middle of amplification fragments and that digestion products could be visualized on agarose gels.
Some of these techniques require specialized equipment or reagents, which some molecular marker laboratories may not have. Therefore, a recent goal in our laboratory has been to identify a gel-based alternative that does not require restriction enzyme digestion. In this regard, we have found single-strand conformational polymorphism (SSCP) analysis to be a good alternative. The SSCP technique is based on conformational differences of single-stranded DNA fragments that can be detected as mobility shifts in non-denaturing polyacrylamide gel electrophoresis [
15]. This technique is easy and inexpensive to implement, as we show in this study, and has been used to analyze gene- or EST-derived SNP markers in various plant species such as wheat (
Triticum aestivum L.)[
16,
17], barley (
Hordeum vulgare L.)[
18], grapevine (
Vitis vinifera L.)[
3], cassava (
Manihot esculenta Crantz)[
19], pearl millet [
Pennisetum glaucum (L.) R.Br.][
20] and
Pinus species [
21].
In this study, our objective was to develop and map SSCP markers on an integrated genetic map for common bean using EST or gene-based markers from various sources. The molecular mapping of genic SNPs and Indels through this technique also provided the basis for analyzing synteny of homologous loci across the legumes. To this end, the genetic map information and the marker sequences were used for an analysis of macro-synteny between the genome of common bean and the genomes of Glycine max, Lotus japonicus (Lotus) and Medicago truncatula (Medicago).