BAC end sequences have been shown to be a powerful tool for developing molecular markers. BAC-derived markers can be used to integrate physical maps with genetic maps [36], and also facilitate map-based cloning [44]. About 32.8 Mbp of peanut genomic sequence obtained from BAC end sequences was used for mining SSR markers in this study. We also report a detailed analysis of these BAC-derived SSRs. Surprisingly, a large proportion of BES show similarity to gene sequences (44.9% of the wild A. duranensis BES and 38.7% of the cultivated A. hypogaea BES), though we note that a significant fraction of these gene-containing BES may derive from retroelements. SSR frequency was 44.2 and 36.6 SSR/Mbp in wild and cultivated BAC clones, respectively. These differences may reflect differences between the cultivated tetraploid genome and the wild diploid genome, but perhaps more likely reflect differences in the nature of the BAC clones (i.e., enriched for disease resistance genes in the case of A. hypogaea and randomly selected clones in the case of A. duranensis) or differences in the construction methods of the two BAC libraries (random-shear and partial HindIII cleavage, respectively).
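To make the SSR/Mbp figures concrete, the following is a minimal sketch of how perfect di- and trinucleotide SSRs might be located in a sequence and converted to a density per Mbp. It is not the mining software used in this study: the function names, the minimum repeat thresholds, and the example sequence and counts are all assumptions chosen for illustration.

```python
import re

# Hypothetical minimum repeat counts per motif length (an assumption,
# not the thresholds applied in the study).
MIN_REPEATS = {2: 5, 3: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one motif; \1{n,} demands n more tandem copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # skip mononucleotide runs such as AAAAAAAAAA
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits


def ssr_density_per_mbp(n_ssrs, total_bp):
    """SSR frequency expressed as SSRs per million base pairs."""
    return n_ssrs / (total_bp / 1_000_000)


# Toy sequence: an (AT)6 repeat and an (AAG)4 repeat with short flanks.
example = "GGC" + "AT" * 6 + "CCT" + "AAG" * 4 + "TTC"
print(find_ssrs(example))
# Hypothetical count: 442 SSRs in 10 Mbp would correspond to 44.2 SSR/Mbp.
print(ssr_density_per_mbp(442, 10_000_000))
```

Because the regular expression reports the first repeat phase it encounters, a motif may be reported as a rotation of its usual name (for example GAA rather than AAG); a production pipeline would normally canonicalize motifs before counting them.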
Allelic diversity estimated for 148 polymorphic BES-derived SSR markers averaged 3.2 alleles per locus, and ranged from 2 to 8 alleles per locus across the eight genotypes tested. This level of allelic diversity is lower than that reported in previous studies, including 3 to 19 alleles (mean 6.9) for 48 Valencia genotypes, 2 to 27 (mean 8.4) for 60 Brazilian genotypes [20], and 2 to 20 (mean 10.1) among 141 genotypes from the US mini core collection and wild species. However, Cuc et al. (2008) reported allele numbers ranging from 2 to 5 with a mean of 2.44 in 32 genotypes [23]. Although allelic diversity can be used as an indicator of genetic variation, such values are relative and depend on the number of polymorphic loci and the relatedness of the genotypes analyzed. In this study, only 8 genotypes were used, all representing cultivated materials; both factors limit the number of polymorphic loci that could be identified.
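The summary statistics quoted above (number of polymorphic loci, mean and range of alleles per locus) can be derived from a table of allele calls. The sketch below assumes one allele call per genotype per locus, a simplification that ignores loci amplifying from both subgenomes. The function name, locus names, and allele sizes are hypothetical and are not data from this study.

```python
from statistics import mean


def allele_summary(calls):
    """Summarize allelic diversity.

    calls maps a locus name to a list of allele calls (for example fragment
    sizes in bp), one per genotype; None marks a missing call.
    """
    # Count distinct non-missing alleles at each locus.
    counts = {
        locus: len({a for a in alleles if a is not None})
        for locus, alleles in calls.items()
    }
    # A locus is polymorphic if at least two alleles were observed.
    polymorphic = [n for n in counts.values() if n >= 2]
    if not polymorphic:
        raise ValueError("no polymorphic loci in input")
    return {
        "n_polymorphic": len(polymorphic),
        "mean_alleles": mean(polymorphic),
        "min_alleles": min(polymorphic),
        "max_alleles": max(polymorphic),
    }


# Hypothetical calls for eight genotypes at three loci.
calls = {
    "locus_A": [200, 200, 204, 204, 200, 204, 200, 200],  # 2 alleles
    "locus_B": [150, 152, 154, 156, 150, 152, None, 150],  # 4 alleles
    "locus_C": [310, 310, 310, 310, 310, 310, 310, 310],  # monomorphic
}
print(allele_summary(calls))
```

Under this one-call-per-genotype assumption, a panel of eight genotypes can show at most eight alleles at any locus, which illustrates why small panels constrain the observable allele range.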
Much publicly available SSR data has been derived from AG repeat motif sequences using enrichment methods involving hybridization to SSR probes [17]. The use of AT sequences in such procedures is generally avoided because of the potential for the probe to form a hairpin structure, and thus to function inefficiently. Interestingly, the current analysis of BES-derived SSRs found that SSRs with AT motifs were the most frequent. The randomly selected BAC clones used for developing SSR markers are most likely a good representation of the diploid peanut genome, and thus the distributions and frequencies of the SSRs identified in this study are likely to be a good reflection of their genome-wide frequencies. Comparison of polymorphism rates between AT SSRs and AG SSRs shows that the former have a somewhat higher polymorphism rate than the latter (Table ). For trinucleotide SSRs, the polymorphism rate of AAT SSRs is 3.2-fold higher than that of AAG SSRs (Table ). This result suggests that AT-rich SSR loci may have relatively high variability in peanut. Several studies have reported that SSRs with larger numbers of repeats have correspondingly higher rates of polymorphism [21]. Temnykh et al. [41] have suggested that SSRs can be divided into two classes: Class I, comprising long, hypervariable markers, and Class II, comprising short and typically less variable markers. The mutation rate of SSRs increases with repeat number, but long SSRs in eukaryotic genomes have a mutation bias toward becoming shorter [47]. Our data also showed that both dinucleotide and trinucleotide SSRs in Class I detected more polymorphism than those in Class II. This finding suggests that it is worth developing markers based on long AT-rich SSRs to provide new informative SSR markers in peanut.
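The Class I versus Class II comparison can be sketched as a simple length-based classification followed by a per-class polymorphism rate. The 20 bp cutoff used as the default below is an assumption for illustration and should be replaced by the definition given in [41]. The function names and marker records are likewise hypothetical.

```python
def ssr_class(motif, repeats, class1_min_bp=20):
    """Assign an SSR to Class I (long) or Class II (short) by total length.

    class1_min_bp is an assumed cutoff, not a value taken from this study.
    """
    return "I" if len(motif) * repeats >= class1_min_bp else "II"


def polymorphism_rate_by_class(markers, class1_min_bp=20):
    """markers: iterable of (motif, repeat_count, is_polymorphic) tuples."""
    tally = {"I": [0, 0], "II": [0, 0]}  # class -> [tested, polymorphic]
    for motif, repeats, is_polymorphic in markers:
        cls = ssr_class(motif, repeats, class1_min_bp)
        tally[cls][0] += 1
        tally[cls][1] += 1 if is_polymorphic else 0
    # Report None for a class with no tested markers.
    return {cls: (poly / n if n else None) for cls, (n, poly) in tally.items()}


# Hypothetical screening results.
markers = [
    ("AT", 12, True),
    ("AT", 15, True),
    ("AAT", 8, False),
    ("AG", 6, False),
    ("AAG", 5, True),
    ("AG", 7, False),
]
print(polymorphism_rate_by_class(markers))
```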
In peanut, several efforts have been made to construct genetic linkage maps and to meet the prerequisites for marker-assisted selection in breeding and map-based cloning of desirable genes. The first genetic linkage map in peanut was constructed by Halward et al. [48] in an F2 population derived from a cross between two diploid wild species, A. stenosperma and A. cardenasii, using RFLP markers. Another RFLP-based map was developed from a BC1 tetraploid population of a synthetic amphidiploid, A. batizocoi x (A. cardenasii x A. diogoi), crossed with cv. Florunner [49]. However, insufficient variability detected by RFLPs or RAPDs within A. hypogaea germplasm has hindered the construction of a genetic linkage map directly in cultivated peanut [48]. As more SSRs have been developed during the past decade, several SSR-based maps have been constructed, including AA genome and BB genome maps in wild x wild species populations [21], and some genetic maps in cultivated x cultivated populations [29]. These cultivated maps contain only ~200 SSR loci and thus require additional markers if they are to have utility for peanut breeding. The genetic map constructed by Hong et al. (2010) consisted of 175 SSR loci with a total coverage of 885.4 cM [50]; this map was a consensus map constructed from three cultivated x cultivated mapping populations. In this study, a large number of BES-SSRs were incorporated into a cultivated genetic map, increasing the number of mapped markers to 318 in a single cultivated mapping population, enlarging the coverage to 1,674 cM, and reducing the average distance between two adjacent markers to 5.3 cM. However, seven groups contained three or fewer loci and 49 loci could not be mapped, indicating that the linkage map remains incomplete.
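As a quick check on the map statistics, the average marker spacing can be approximated as total map length divided by the number of mapped loci. This is one common convention and an assumption here; dividing by the number of marker intervals (loci minus linkage groups) would give a slightly larger value. The function name is illustrative.

```python
def mean_marker_spacing(total_cm, n_loci):
    """Approximate average spacing as total map length / number of loci."""
    return total_cm / n_loci


# Present map: 318 loci spanning 1,674 cM.
present = mean_marker_spacing(1674, 318)
# Consensus map of Hong et al. (2010): 175 loci spanning 885.4 cM.
hong = mean_marker_spacing(885.4, 175)

print(round(present, 1))  # 5.3
print(round(hong, 1))     # 5.1
```

By this measure the two maps have similar marker spacing, so the main gain of the present map lies in roughly doubling genome coverage within a single population rather than in marker density.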
The use of common markers between the present map and previous maps allows a comparison of recombination frequency and marker order among mapping populations. Five linkage groups in the present map were chosen for comparison with the first SSR-based peanut map [30], because several markers were in common between the two maps (Figure ). Comparison of these two maps reveals both conservation and rearrangement of marker order between the two populations. For instance, LG7 and LG10 in the present map had the same marker order as LG_AhIII and LG_AhVI in the previous map. Three other linkage groups, however, revealed rearrangements in marker order between the two mapping populations (Figure ).
Comparison of marker colinearity between the present and previous linkage maps.
Ten percent of polymorphic SSRs detected more than one genetic locus. This situation, which has been noted previously by other authors [30], is presumed to derive from the high similarity of the two subgenomes of tetraploid A. hypogaea. In this study, most (82%) of the SSRs that amplified more than one locus were long SSRs (> 30 bp) or compound SSRs. Strachan and Read (1999) reported that SSRs with high repeat numbers are unstable during mitosis and meiosis in humans [51], and as a consequence are highly variable. By analogy, long SSRs are more likely to reveal distinct polymorphisms in the two subgenomes of allotetraploid peanut.
In parallel with the analysis of BAC-derived SSR markers, we also used the TRAP marker technique to assess its potential application in peanut. Only seven (~2%) of 400 TRAP primer pairs were polymorphic and were placed on the linkage map. This indicates that TRAP markers, the generation of which involves the use of arbitrary primers, have only a low chance of detecting polymorphism within the narrow genetic base of peanut.
One advantage of using BAC-derived markers in genetic linkage map construction is that physical contigs can be anchored on the genetic map by mapping BAC-derived markers [37]. The integrated map would be useful in marker-assisted selection to introgress genes of interest into elite cultivars when BAC-derived markers reside in these genes, and would facilitate map-based cloning of genes and QTLs. In this study, 105 BAC-derived SSRs related to RGH-based physical contigs and singletons were developed. Only three of these SSRs were polymorphic between the two parents; two of these were mapped, thus anchoring one contig and one singleton to the current genetic map. The third marker, related to an RGH contig consisting of three clones, was unmapped. Genetic linkage of candidate gene BAC contigs with phenotypes should enhance the opportunity for targeted marker development. In particular, the sequences of the BAC clones can provide the substrate for new marker development. Although we have mapped only a small number of RGH-containing BAC clones in this work, we have identified BAC contigs that contain the vast majority of ~580 peanut NBS-LRR RGH sequences (Rosen, He and Cook, unpublished data). Targeted marker development from these BAC clones holds potential to enhance the current peanut SSR framework with a large number of high-value disease resistance candidate genes, and thus to define the landscape of R-genes in the peanut genome.