Expressed Sequence Tags (ESTs) are a source of simple sequence repeats (SSRs) that can be used to develop molecular markers for genetic studies. The availability of ESTs for Quercus robur and Quercus petraea provided a unique opportunity to develop microsatellite markers to accelerate research aimed at studying adaptation of these long-lived species to their environment. As a first step toward the construction of an SSR-based linkage map of oak for quantitative trait locus (QTL) mapping, we describe the mining and survey of EST-SSRs as well as a fast and cost-effective approach (bin mapping) to assign these markers to an approximate map position. We also compared the level of polymorphism between genomic and EST-derived SSRs and addressed the transferability of EST-SSRs to Castanea sativa (chestnut).
A catalogue of 103,000 Sanger ESTs was assembled into 28,024 unigenes, of which 18.6% presented one or more SSR motifs. More than 42% of these SSRs corresponded to trinucleotides. Primer pairs were designed for 748 putative unigenes. Overall 37.7% (283) were found to amplify a single polymorphic locus in a reference full-sib pedigree of Quercus robur. The usefulness of these loci for establishing a genetic map was assessed using a bin mapping approach. Bin maps were constructed for the male and female parental trees, for which framework linkage maps based on AFLP markers were available. The bin set consisted of 14 highly informative offspring selected based on the number and position of crossover sites. The female and male maps comprised 44 and 37 bins, with an average bin length of 16.5 cM and 20.99 cM, respectively. A total of 256 EST-SSRs were assigned to bins and their map position was further validated by linkage mapping. EST-SSRs were found to be less polymorphic than genomic SSRs, but their transferability rate to chestnut, a species phylogenetically related to oak, was higher.
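The bin-assignment idea described above can be illustrated with a minimal sketch. It assumes that each bin is defined by a distinct joint genotype across the bin-set offspring (coded here as 0/1 for which parental allele was inherited, with "-" for missing data) and that a new marker is placed in whichever bin pattern it matches. The function name, the data layout and the toy genotypes are hypothetical and are not taken from the study.

```python
def assign_to_bin(marker_genotypes, bin_patterns, max_mismatches=0):
    """Return the names of bins whose pattern is compatible with a marker.

    marker_genotypes: string of '0', '1' or '-' (missing), one character
                      per bin-set offspring.
    bin_patterns:     dict mapping a bin name to its '0'/'1' pattern.
    """
    hits = []
    for name, pattern in bin_patterns.items():
        # Count offspring where the marker contradicts the bin pattern;
        # missing genotypes are ignored rather than counted as errors.
        mismatches = sum(
            1 for m, b in zip(marker_genotypes, pattern)
            if m != "-" and m != b
        )
        if mismatches <= max_mismatches:
            hits.append(name)
    return hits


# Toy bin set with six offspring (hypothetical data).
bins = {
    "LG1_bin1": "000111",
    "LG1_bin2": "001111",
    "LG1_bin3": "011110",
}

print(assign_to_bin("001111", bins))  # complete data: one compatible bin
print(assign_to_bin("00-111", bins))  # missing data: two compatible bins
```

The second call shows why a small, highly informative bin set is sensitive to missing genotypes: a single missing call can leave a marker compatible with two adjacent bins.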
We have generated a bin map for oak comprising 256 EST-SSRs. This resource constitutes a first step toward the establishment of a gene-based map for this genus that will facilitate the dissection of QTLs affecting complex traits of ecological importance.
The Fagaceae family comprises about 1,000 woody species worldwide. About half belong to the genus Quercus. These oaks are often a source of raw material for biomass wood and fiber. Pedunculate and sessile oaks are among the most important deciduous forest tree species in Europe. Despite their ecological and economic importance, very few genomic resources have yet been generated for these species. Here, we describe the development of an EST catalogue that will support ecosystem genomics studies, where geneticists, ecophysiologists, molecular biologists and ecologists join their efforts for understanding, monitoring and predicting functional genetic diversity.
We generated 145,827 sequence reads from 20 cDNA libraries using the Sanger method. Unexploitable chromatograms and quality checking led us to eliminate 19,941 sequences. Finally, a total of 125,925 ESTs were retained from 111,361 cDNA clones. Pyrosequencing was also conducted for 14 libraries, generating 1,948,579 reads, from which 370,566 sequences (19.0%) were eliminated, resulting in 1,578,192 sequences. Following clustering and assembly using the TGICL pipeline, 1,704,117 EST sequences collapsed into 69,154 tentative contigs and 153,517 singletons, providing 222,671 non-redundant sequences (including alternative transcripts). We also assembled the sequences using MIRA and PartiGene software and compared the three unigene sets. Gene ontology annotation was then assigned to 29,303 unigene elements. Blast search against the SWISS-PROT database revealed putative homologs for 32,810 (14.7%) unigene elements, but a more extensive search with Pfam, Refseq_protein, Refseq_RNA and eight gene indices revealed homology for 67.4% of them. The EST catalogue was examined for putative homologs of candidate genes involved in bud phenology, cuticle formation, phenylpropanoids biosynthesis and cell wall formation. Our results suggest a good coverage of genes involved in these traits. Comparative orthologous sequences (COS) with other plant gene models were identified and allowed us to unravel the oak paleo-history. Simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs) were searched, resulting in 52,834 SSRs and 36,411 SNPs. All of these are available through the Oak Contig Browser http://genotoul-contigbrowser.toulouse.inra.fr:9092/Quercus_robur/index.html.
This genomic resource provides a unique tool to discover genes of interest, study the oak transcriptome, and develop new markers to investigate functional diversity in natural populations.
Genetic markers and linkage mapping are basic prerequisites for marker-assisted selection and map-based cloning. In the case of the key grassland species Lolium spp., numerous mapping populations have been developed and characterised for various traits. Although some genetic linkage maps of these populations have been aligned with each other using publicly available DNA markers, the number of common markers among genetic maps is still low, limiting the ability to compare candidate gene and QTL locations across germplasm.
A set of 204 expressed sequence tag (EST)-derived simple sequence repeat (SSR) markers has been assigned to map positions using eight different ryegrass mapping populations. Marker properties of a subset of 64 EST-SSRs were assessed in six to eight individuals of each mapping population and revealed 83% of the markers to be polymorphic in at least one population and an average number of alleles of 4.88. EST-SSR markers polymorphic in multiple populations served as anchor markers and allowed the construction of the first comprehensive consensus map for ryegrass. The integrated map was complemented with 97 SSRs from previously published linkage maps and finally contained 284 EST-derived and genomic SSR markers. The total map length was 742 centiMorgan (cM), ranging for individual chromosomes from 70 cM of linkage group (LG) 6 to 171 cM of LG 2.
The consensus linkage map for ryegrass, based on eight mapping populations and constructed using a large set of publicly available Lolium EST-SSRs mapped for the first time together with previously mapped SSR markers, will allow existing mapping and QTL information in ryegrass to be consolidated. The map and markers presented here will be an asset both for molecular breeding of ryegrass and for comparative genetics and genomics within grass species.
One of the key goals of oak genomics research is to identify genes of adaptive significance. This information may help to improve the conservation of adaptive genetic variation and the management of forests to increase their health and productivity. Deep-coverage large-insert genomic libraries are a crucial tool for attaining this objective. We report herein the construction of a BAC library for Quercus robur, its characterization and an analysis of BAC end sequences.
The EcoRI library generated consisted of 92,160 clones, 7% of which had no insert. Levels of chloroplast and mitochondrial contamination were below 3% and 1%, respectively. Mean clone insert size was estimated at 135 kb. The library represents 12 haploid genome equivalents, and the likelihood of finding a particular oak sequence of interest is greater than 99%. Genome coverage was confirmed by PCR screening of the library with 60 unique genetic loci sampled from the genetic linkage map. In total, about 20,000 high-quality BAC end sequences (BESs) were generated by sequencing 15,000 clones. Roughly 5.88% of the combined BAC end sequence length corresponded to known retroelements, while ab initio repeat detection methods identified 41 additional repeats. Collectively, characterized and novel repeats account for roughly 8.94% of the genome. Further analysis of the BESs revealed 1,823 putative genes, suggesting at least 29,340 genes in the oak genome. BESs were aligned with the genome sequences of Arabidopsis thaliana, Vitis vinifera and Populus trichocarpa. One putative collinear microsyntenic region encoding an alcohol acyl transferase protein was observed between oak and chromosome 2 of V. vinifera.
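The link between 12 haploid genome equivalents and a greater than 99% chance of recovering a given sequence can be checked with a short sketch. It assumes the standard Clarke-Carbon model, in which clones are sampled at random and uniformly from the genome; the abstract does not state how its probability was computed, so the formula and the function names are assumptions made for illustration.

```python
import math


def prob_locus_in_library(coverage):
    """Approximate probability that a locus is present in at least one clone.

    Uses P = 1 - exp(-c), the large-library limit of the Clarke-Carbon
    formula, where c is the number of haploid genome equivalents.
    """
    return 1.0 - math.exp(-coverage)


def clones_needed(prob, insert_fraction):
    """Clarke-Carbon estimate N = ln(1 - P) / ln(1 - f).

    insert_fraction (f) is mean insert size divided by haploid genome
    size; the genome size is not given in the text, so f must be supplied
    by the user.
    """
    return math.ceil(math.log(1.0 - prob) / math.log(1.0 - insert_fraction))


p = prob_locus_in_library(12)
print(f"P(locus represented) at 12x coverage: {p:.6f}")
```

Under this idealised model the probability at 12x coverage is well above 0.99, which is consistent with the figure reported above. Real libraries deviate from uniform sampling (for example because of uneven restriction site distribution), so the value is an upper-bound style estimate.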
This BAC library provides a new resource for genomic studies, including SSR marker development, physical mapping, comparative genomics and genome sequencing. BES analysis provided insight into the structure of the oak genome. These sequences will be used in the assembly of a future genome sequence for oak.
The construction of genetic linkage maps for cultivated peanut (Arachis hypogaea L.) has been and continues to be an important research goal to facilitate quantitative trait locus (QTL) analysis and gene tagging for use in marker-assisted selection in breeding. Even though a few maps have been developed, they were constructed using diploid or interspecific tetraploid populations. The most recently published intra-specific map was constructed from a cross of cultivated peanuts, in which only 135 simple sequence repeat (SSR) markers were sparsely distributed over 22 linkage groups. A more detailed linkage map with sufficient markers is necessary for QTL identification and marker-assisted selection to be feasible. The objective of this study was to construct a genetic linkage map of cultivated peanut using simple sequence repeat (SSR) markers derived primarily from peanut genomic sequences, expressed sequence tags (ESTs), and by "data mining" sequences released in GenBank.
Three recombinant inbred line (RIL) populations were constructed from three crosses with one common female parental line, Yueyou 13, a high yielding Spanish market type. The four parents were screened with 1044 primer pairs designed to amplify SSRs, and 901 primer pairs produced clear PCR products. Of the 901 primer pairs, 146, 124 and 64 primer pairs (markers) were polymorphic in these populations, respectively, and were used in genotyping these RIL populations. Individual linkage maps were constructed from each of the three populations and a composite map based on 93 common loci was created using JoinMap. The composite linkage map consists of 22 composite linkage groups (LG) with 175 SSR markers (including 47 SSRs on the published AA genome maps), representing the 20 chromosomes of A. hypogaea. The total composite map length is 885.4 cM, with an average marker density of 5.8 cM. Segregation distortion in the 3 populations was 23.0%, 13.5% and 7.8% of the markers, respectively. These distorted loci tended to cluster on LG1, LG3, LG4 and LG5. Only 15 EST-SSR markers were mapped, owing to low polymorphism. Comparison among the maps and with the AA genome maps showed potential synteny, collinear order of some markers and conservation of collinear linkage groups, although conservation was not complete.
A composite linkage map was constructed from three individual mapping populations with 175 SSR markers in 22 composite linkage groups. This composite genetic linkage map is among the first "true" tetraploid peanut maps produced. This map also includes 47 SSRs that have been used in the published AA genome maps and could be used in comparative mapping studies. The primers described in this study are PCR-based markers, which are easy to share for genetic mapping in peanuts. All 1044 primer pairs are provided as additional files and the three RIL populations will be made available to the public upon request for quantitative trait loci (QTL) analysis and linkage map improvement.
Cultivated peanut or groundnut (Arachis hypogaea L.) is an important oilseed crop with an allotetraploid genome (AABB, 2n = 4x = 40). Both the low level of genetic variation within the cultivated gene pool and its polyploid nature limit the utilization of molecular markers to explore genome structure and facilitate genetic improvement. Nevertheless, a wealth of genetic diversity exists in diploid Arachis species (2n = 2x = 20), which represent a valuable gene pool for cultivated peanut improvement. Interspecific populations have been used widely for genetic mapping in diploid species of Arachis. However, an intraspecific mapping strategy was essential to detect chromosomal rearrangements among species that could be obscured by mapping in interspecific populations. To develop intraspecific reference linkage maps and gain insights into karyotypic evolution within the genus, we comparatively mapped the A- and B-genome diploid species using intraspecific F2 populations. Exploring genome organization among diploid peanut species by comparative mapping will enhance our understanding of the cultivated tetraploid peanut genome. Moreover, new sources of molecular markers that are highly transferable between species and developed from expressed genes will be required to construct saturated genetic maps for peanut.
A total of 2,138 EST-SSR (expressed sequence tag-simple sequence repeat) markers were developed by mining a tetraploid peanut EST assembly including 101,132 unigenes (37,916 contigs and 63,216 singletons) derived from 70,771 long-read (Sanger) and 270,957 short-read (454) sequences. A set of 97 SSR markers was also developed by mining 9,517 genomic survey sequences of Arachis. An SSR-based intraspecific linkage map was constructed using an F2 population derived from a cross between K 9484 (PI 298639) and GKBSPSc 30081 (PI 468327) in the B-genome species A. batizocoi. A high degree of macrosynteny was observed when comparing the homoeologous linkage groups between A (A. duranensis) and B (A. batizocoi) genomes. Comparison of the A- and B-genome genetic linkage maps also showed a total of five inversions and one major reciprocal translocation between two pairs of chromosomes under our current mapping resolution.
Our findings will contribute to understanding tetraploid peanut genome origin and evolution and eventually promote its genetic improvement. The newly developed EST-SSR markers will enrich current molecular marker resources in peanut.
Peanut (Arachis hypogaea); SSR; Genetic linkage map; Intraspecific cross; EST
Cotton, with a large genome, is an important crop throughout the world. A high-density genetic linkage map is the prerequisite for cotton genetics and breeding. A genetic map based on simple polymerase chain reaction markers will be efficient for marker-assisted breeding in cotton, and markers from transcribed sequences have more chance to target genes related to traits. To construct a genome-wide, functional marker-based genetic linkage map in cotton, we isolated and mapped expressed sequence tag-simple sequence repeats (EST-SSRs) from cotton ESTs derived from the A1, D5, (AD)1, and (AD)2 genome.
A total of 3177 new EST-SSRs developed in our laboratory and other newly released SSRs were used to enrich our interspecific BC1 genetic linkage map. A total of 547 loci and 911 loci were obtained from our EST-SSRs and the newly released SSRs, respectively. The 1458 loci together with our previously published data were used to construct an updated genetic linkage map. The final map included 2316 loci on the 26 cotton chromosomes, with a total length of 4418.9 cM and an average distance of 1.91 cM between adjacent markers. To our knowledge, this map is one of the three densest linkage maps in cotton. Twenty-one segregation distortion regions (SDRs) were found in this map; three segregation distorted chromosomes, Chr02, Chr16, and Chr18, were identified with 99.9% of distorted markers segregating toward the heterozygous allele. Functional analysis of SSR sequences showed that 1633 loci of this map (70.6%) were transcribed loci and 1332 loci (57.5%) were translated loci.
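Segregation distortion at a single marker is commonly assessed with a chi-square goodness-of-fit test against the expected Mendelian ratio. The sketch below assumes a 1:1 expectation, as in a backcross (BC1) population, and a conventional 0.05 threshold; the abstract does not state which test or threshold was used, and the genotype counts are hypothetical.

```python
import math


def chi_square_1to1(n_a, n_b):
    """Chi-square test of two genotype classes against a 1:1 ratio.

    Returns (chi2, p_value). With one degree of freedom the chi-square
    survival function equals erfc(sqrt(chi2 / 2)), so no external
    statistics library is required.
    """
    expected = (n_a + n_b) / 2.0
    chi2 = ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts of the two genotype classes at two markers.
for counts in [(80, 60), (90, 50)]:
    chi2, p = chi_square_1to1(*counts)
    flag = "distorted" if p < 0.05 else "not distorted"
    print(counts, f"chi2={chi2:.2f}", f"p={p:.4f}", flag)
```

When thousands of loci are tested, a per-marker threshold of 0.05 produces many false positives, which is one reason distortion is often reported as regions of clustered distorted markers rather than as isolated loci.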
This map lays groundwork for further genetic analyses of important quantitative traits, marker-assisted selection, and genome organization architecture in cotton as well as for comparative genomics between cotton and other species. The segregation distorted chromosomes can be a guide to identify segregation distortion loci in cotton. The annotation of SSR sequences identified frequent and rare gene ontology items on each chromosome, which is helpful to discover functions of cotton chromosomes.
Alfalfa (Medicago sativa) is a major forage crop. Genetic progress is slow in this legume species because of its autotetraploidy and allogamy. The genetic structure of this species makes the construction of genetic maps difficult. To build such maps, and to be able to detect QTLs in segregating populations, we used the available codominant microsatellite markers (SSRs), most of them identified in the model legume Medicago truncatula from an EST database. A genetic map was constructed with AFLP and SSR markers using specific mapping procedures for autotetraploids. The tetrasomic inheritance was analysed in an alfalfa mapping population.
We have demonstrated that 80% of primer pairs defined on each side of SSR motifs in the M. truncatula EST database amplify alfalfa DNA. Using an F1 mapping population of 168 individuals produced from the cross of 2 heterozygous parental plants from the Magali and Mercedes cultivars, we obtained 599 AFLP markers and 107 SSR loci. All but 3 SSR loci showed a clear tetrasomic inheritance. For most of the SSR loci, the double-reduction was not significant. For the other loci no specific genotypes were produced, so the significant double-reduction could arise from segregation distortion. For each parent, the genetic map contained 8 groups of four homologous chromosomes. The lengths of the maps were 2649 and 3045 cM, with an average distance of 7.6 and 9.0 cM between markers, for Magali and Mercedes parents, respectively. Using only the SSR markers, we built a composite map covering 709 cM.
Compared to diploid alfalfa genetic maps, our maps cover about 88–100% of the genome and are close to saturation. The inheritance of the codominant markers (SSR) and the pattern of linkage repulsions between markers within each homology group are consistent with the hypothesis of a tetrasomic meiosis in alfalfa. Except for 2 out of 107 SSR markers, we found a similar order of markers on the chromosomes between the tetraploid alfalfa and M. truncatula genomes indicating a high level of colinearity between these two species. These maps will be a valuable tool for alfalfa breeding and are being used to locate QTLs.
Earlier comparative maps between the genomes of rice (Oryza sativa L.), barley (Hordeum vulgare L.) and wheat (Triticum aestivum L.) were linkage maps based on cDNA-RFLP markers. The low number of polymorphic RFLP markers has limited the development of dense genetic maps in wheat and the number of available anchor points in comparative maps. Higher density comparative maps using PCR-based anchor markers are necessary to better estimate the conservation of colinearity among cereal genomes. The purposes of this study were to characterize the proportion of transcribed DNA sequences containing simple sequence repeats (SSR or microsatellites) by length and motif for wheat, barley and rice and to determine in-silico rice genome locations for primer sets developed for wheat and barley Expressed Sequence Tags.
The proportions of SSR types (di-, tri-, tetra-, and penta-nucleotide repeats) and motifs varied with the length of the SSRs within and among the three species, with trinucleotide SSRs being the most frequent. Distributions of genomic microsatellites (gSSRs), EST-derived microsatellites (EST-SSRs), and transcribed regions in the contiguous sequence of rice chromosome 1 were highly correlated. More than 13,000 primer pairs were developed for use by the cereal research community as potential markers in wheat, barley and rice.
Trinucleotide SSRs were the most common type in each of the species; however, the relative proportions of SSR types and motifs differed among rice, wheat, and barley. Genomic microsatellites were found to be primarily located in gene-rich regions of the rice genome. Microsatellite markers derived from the use of non-redundant EST-SSRs are an economic and efficient alternative to RFLP for comparative mapping in cereals.
Simple sequence repeat (SSR) markers are highly informative and widely used for genetic and breeding studies in several plant species. They are used for cultivar identification, variety protection, as anchor markers in genetic mapping, and in marker-assisted breeding. Currently, a limited number of SSR markers are publicly available for perennial ryegrass (Lolium perenne). We report on the exploitation of a comprehensive EST collection in L. perenne for SSR identification. The objectives of this study were 1) to analyse the frequency, type, and distribution of SSR motifs in ESTs derived from three genotypes of L. perenne, 2) to perform a comparative analysis of SSR motif polymorphisms between allelic sequences, 3) to conduct a comparative analysis of SSR motif polymorphisms between orthologous sequences of L. perenne, Festuca arundinacea, Brachypodium distachyon, and O. sativa, and 4) to identify functionally associated EST-SSR markers for application in comparative genomics and breeding.
From 25,744 ESTs, representing 8.53 megabases of nucleotide information from three genotypes of L. perenne, 1,458 ESTs (5.7%) contained one or more SSRs. Of these SSRs, 955 (3.7%) were non-redundant. Tri-nucleotide repeats were the most abundant type of repeats, followed by di- and tetra-nucleotide repeats. The EST-SSRs from the three genotypes were analysed for allelic and/or genotypic SSR motif polymorphisms. Most of the SSR motifs (97.7%) showed no polymorphisms, whereas 22 EST-SSRs showed allelic and/or genotypic polymorphisms. All polymorphisms identified were changes in the number of repeat units. Comparative analysis of the L. perenne EST-SSRs with sequences of Festuca arundinacea, Brachypodium distachyon, and Oryza sativa identified 19 clusters of orthologous sequences between these four species. Analysis of the clusters showed that the SSR motif is generally conserved in the closely related species F. arundinacea, but often differs in the length of the SSR motif. In contrast, SSR motifs are often lost in the more distantly related species B. distachyon and O. sativa.
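Counting SSRs by repeat type, as summarised above, amounts to scanning each sequence for perfect tandem repeats of short motifs. The following is a minimal regex-based sketch of that idea. The minimum repeat counts (6 for di-, 5 for tri-, 4 for tetra-nucleotides) are assumed for illustration only, since the thresholds used in the study are not given here, and dedicated SSR search tools handle imperfect and compound repeats that this sketch ignores.

```python
import re

# Assumed minimum number of repeat units per motif length (illustrative).
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, n_min in sorted(min_repeats.items()):
        # A motif followed by at least (n_min - 1) further copies of itself.
        pattern = re.compile(
            r"(([ACGT]{%d})\2{%d,})" % (motif_len, n_min - 1)
        )
        for match in pattern.finditer(seq):
            motif = match.group(2)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. 'AAA' (homopolymer) or 'ATAT' (really a dinucleotide).
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(1)) // motif_len))
    return hits


# Toy EST fragment containing one dinucleotide and one trinucleotide SSR.
est = "GGC" + "AG" * 7 + "TTC" + "CAG" * 5 + "TA"
for start, motif, count in find_ssrs(est):
    print(start, motif, count)
```

Tallying the motif lengths returned across a whole EST set gives the di-, tri- and tetra-nucleotide proportions of the kind reported in the paragraph above.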
The results indicate that the L. perenne EST-SSR markers are a valuable resource for genetic mapping, as well as evaluation of co-location between QTLs and functionally associated markers.
Expressed sequence tag (EST) databases represent a valuable resource for the identification of genes in organisms with uncharacterized genomes and for development of molecular markers. One class of markers derived from EST sequences are simple sequence repeat (SSR) markers, also known as EST-SSRs. These are useful in plant genetic and evolutionary studies because they are located in transcribed genes and a putative function can often be inferred from homology searches. Another important feature of EST-SSR markers is their expected high level of transferability to related species that makes them very promising for comparative mapping. In the present study we constructed a normalized EST library from floral tissue of Silene latifolia with the aim to identify expressed genes and to develop polymorphic molecular markers.
We obtained a total of 3662 high quality sequences from a normalized Silene cDNA library. These represent 3105 unigenes, with 73% of unigenes matching genes in other species. We found 255 sequences containing one or more SSR motifs. More than 60% of these SSRs were trinucleotides. A total of 30 microsatellite loci were identified from 106 ESTs having sufficient flanking sequences for primer design. The inheritance of these loci was tested via segregation analyses and their usefulness for linkage mapping was assessed in an interspecific cross. Tests for cross-amplification of the EST-SSR loci in other Silene species established their applicability to related species.
The newly characterized genes and gene-derived markers from our Silene EST library represent a valuable genetic resource for future studies on Silene latifolia and related species. The polymorphism and transferability of EST-SSR markers facilitate comparative linkage mapping and analyses of genetic diversity in the genus Silene.
Pearl millet [Pennisetum glaucum (L.) R. Br.] is a widely cultivated drought- and high-temperature tolerant C4 cereal grown under dryland, rainfed and irrigated conditions in drought-prone regions of the tropics and sub-tropics of Africa, South Asia and the Americas. It is considered an orphan crop with relatively few genomic and genetic resources. This study was undertaken to increase the EST-based microsatellite marker and genetic resources for this crop to facilitate marker-assisted breeding.
Newly developed EST-SSR markers (99), along with previously mapped EST-SSR (17), genomic SSR (53) and STS (2) markers, were used to construct linkage maps of four F7 recombinant inbred populations (RIP) based on crosses ICMB 841-P3 × 863B-P2 (RIP A), H 77/833-2 × PRLT 2/89-33 (RIP B), 81B-P6 × ICMP 451-P8 (RIP C) and PT 732B-P2 × P1449-2-P1 (RIP D). Mapped loci numbers were greatest for RIP A (104), followed by RIP B (78), RIP C (64) and RIP D (59). Total map lengths (Haldane) were 615 cM, 690 cM, 428 cM and 276 cM, respectively. A total of 176 loci detected by 171 primer pairs were mapped among the four crosses. A consensus map of 174 loci (899 cM) detected by 169 primer pairs was constructed using MergeMap to integrate the individual linkage maps. Locus order in the consensus map was well conserved for nearly all linkage groups. Eighty-nine EST-SSR marker loci from this consensus map had significant BLAST hits (top hits with e-value ≤ 1E-10) on the genome sequences of rice, foxtail millet, sorghum, maize and Brachypodium with 35, 88, 58, 48 and 38 loci, respectively.
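Map lengths quoted in Haldane centimorgans come from converting pairwise recombination fractions into additive map distances. The sketch below shows the Haldane conversion next to the Kosambi conversion, another widely used mapping function, so that the two scales can be compared. The recombination fractions are hypothetical and the function names are chosen for illustration.

```python
import math


def haldane_cm(r):
    """Haldane map distance in cM: d = -50 * ln(1 - 2r).

    Assumes crossovers occur independently (no interference).
    """
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r):
    """Kosambi map distance in cM: d = 25 * ln((1 + 2r) / (1 - 2r)).

    Allows for crossover interference, giving shorter distances than
    Haldane for the same recombination fraction.
    """
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


for r in (0.05, 0.10, 0.20):
    print(f"r={r:.2f}  Haldane={haldane_cm(r):.2f} cM  Kosambi={kosambi_cm(r):.2f} cM")
```

Because the two functions diverge as the recombination fraction grows, total map lengths computed with different mapping functions are not directly comparable between studies.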
The consensus map developed in the present study contains the largest set of mapped SSRs reported to date for pearl millet, and represents a major consolidation of existing pearl millet genetic mapping information. This study increased numbers of mapped pearl millet SSR markers by >50%, filling important gaps in previously published SSR-based linkage maps for this species and will greatly facilitate SSR-based QTL mapping and applied marker-assisted selection programs.
EST-SSR markers; EST; Linkage map; Consensus map; Drought stress; Pearl millet; Synteny
Previous loblolly pine (Pinus taeda L.) genetic linkage maps have been based on a variety of DNA polymorphisms, such as AFLPs, RAPDs, RFLPs, and ESTPs, but only a few SSRs (simple sequence repeats), also known as simple tandem repeats or microsatellites, have been mapped in P. taeda. The objective of this study was to integrate a large set of SSR markers from a variety of sources and published cDNA markers into a composite P. taeda genetic map constructed from two reference mapping pedigrees. A dense genetic map that incorporates SSR loci will benefit complete pine genome sequencing, pine population genetics studies, and pine breeding programs. Careful marker annotation using a variety of references further enhances the utility of the integrated SSR map.
The updated P. taeda genetic map, with an estimated genome coverage of 1,515 cM (Kosambi) across 12 linkage groups, incorporated 170 new SSR markers and 290 previously reported SSR, RFLP, and ESTP markers. The average marker interval was 3.1 cM. Of 233 mapped SSR loci, 84 were from cDNA-derived sequences (EST-SSRs) and 149 were from non-transcribed genomic sequences (genomic-SSRs). Of all 311 mapped cDNA-derived markers, 77% were associated with NCBI Pta UniGene clusters, 67% with RefSeq proteins, and 62% with functional Gene Ontology (GO) terms. Duplicate (i.e., redundant accessory) and paralogous markers were tentatively identified by evaluating marker sequences by their UniGene cluster IDs, clone IDs, and relative map positions. The average gene diversity, He, among polymorphic SSR loci, including those that were not mapped, was 0.43 for 94 EST-SSRs and 0.72 for 83 genomic-SSRs. The genetic map can be viewed and queried at http://www.conifergdb.org/pinemap.
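Gene diversity (He) at a locus is computed from allele frequencies as one minus the sum of squared frequencies. The sketch below implements that basic form; whether the study applied a small-sample correction is not stated, so the uncorrected estimator and the allele counts shown are assumptions made for illustration.

```python
def expected_heterozygosity(allele_counts):
    """Gene diversity He = 1 - sum(p_i ** 2) for one locus.

    allele_counts: number of observed copies of each allele.
    """
    total = sum(allele_counts)
    if total == 0:
        raise ValueError("no alleles observed")
    return 1.0 - sum((count / total) ** 2 for count in allele_counts)


# Hypothetical loci: few, unevenly distributed alleles versus many
# evenly distributed alleles.
print(expected_heterozygosity([30, 10]))
print(expected_heterozygosity([8, 8, 8, 8, 8]))
```

A locus with more alleles at more even frequencies yields a higher He, which is the pattern behind the lower average reported above for EST-SSRs relative to genomic SSRs.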
Many polymorphic and genetically mapped SSR markers are now available for use in P. taeda population genetics, studies of adaptive traits, and various germplasm management applications. Annotating mapped genes with UniGene clusters and GO terms allowed assessment of redundant and paralogous EST markers and further improved the quality and utility of the genetic map for P. taeda.
Analysis of interspecific gene flow is crucial for the understanding of speciation processes and maintenance of species integrity. Oaks (genus Quercus, Fagaceae) are among the model species for the study of hybridization. Natural co-occurrence of four closely related oak species is a very rare case in the temperate forests of Europe. We used both morphological characters and genetic markers to characterize hybridization in a natural community situated in west-central Romania that consists of Quercus robur, Q. petraea, Q. pubescens, and Q. frainetto.
On the basis of pubescence and leaf morphological characters ~94% of the sampled individuals were assigned to pure species. Only 16 (~6%) individual trees exhibited intermediate morphologies or a combination of characters of different species. Four chloroplast DNA haplotypes were identified in the study area. The distribution of haplotypes within the white oak complex showed substantial differences among species. However, the most common haplotypes were present in all four species. Furthermore, based on a set of 7 isozyme and 6 microsatellite markers and using a Bayesian admixture analysis without any a priori information on morphology we found that four genetic clusters best fit the data. There was a very good correspondence of each species with one of the inferred genetic clusters. The estimated introgression level varied markedly between pairs of species ranging from 1.7% between Q. robur and Q. frainetto to 16.2% between Q. pubescens and Q. frainetto. Only nine individuals (3.4%) appeared to be first-generation hybrids.
Our data indicate that natural hybridization has occurred at relatively low rates. The different levels of gene flow among species might be explained by differences in flowering time and spatial position within the stand. In addition, a partial congruence between phenotypically and genetically intermediate individuals was found, suggesting that intermediate appearance does not necessarily mean hybridization. However, it appears that natural hybridization did not seriously affect the species identity in this area of sympatry.
The Apiaceae family includes several vegetable and spice crop species among which carrot is the most economically important member, with ~21 million tons produced yearly worldwide. Despite its importance, molecular resources in this species are relatively underdeveloped. The availability of informative, polymorphic, and robust PCR-based markers, such as microsatellites (or SSRs), will facilitate genetics and breeding of carrot and other Apiaceae, including integration of linkage maps, tagging of phenotypic traits and assisting positional gene cloning. Thus, with the purpose of isolating carrot microsatellites, two different strategies were used; a hybridization-based library enrichment for SSRs, and bioinformatic mining of SSRs in BAC-end sequence and EST sequence databases. This work reports on the development of 300 carrot SSR markers and their characterization at various levels.
Evaluation of microsatellites isolated from both DNA sources in subsets of 7 carrot F2 mapping populations revealed that SSRs from the hybridization-based method were longer, had more repeat units and were more polymorphic than SSRs isolated by sequence search. Overall, 196 SSRs (65.1%) were polymorphic in at least one mapping population, and the percentage of polymorphic SSRs across F2 populations ranged from 17.8 to 24.7. Polymorphic markers in one family were evaluated in the entire F2, allowing the genetic mapping of 55 SSRs (38 codominant) onto the carrot reference map. The SSR loci were distributed throughout all 9 carrot linkage groups (LGs), with 2 to 9 SSRs/LG. In addition, SSR evaluations in carrot-related taxa indicated that a significant fraction of the carrot SSRs transfer successfully across Apiaceae, with heterologous amplification success rate decreasing with the target-species evolutionary distance from carrot. SSR diversity evaluated in a collection of 65 D. carota accessions revealed a high level of polymorphism for these selected loci, with an average of 19 alleles/locus and 0.84 expected heterozygosity.
The addition of 55 SSRs to the carrot map, together with marker characterizations in six other mapping populations, will facilitate future comparative mapping studies and integration of carrot maps. The markers developed herein will be a valuable resource for assisting breeding, genetic, diversity, and genomic studies of carrot and other Apiaceae.
The oomycete pathogen Phytophthora ramorum is responsible for sudden oak death (SOD) in California coastal forests. P. ramorum is a generalist pathogen with over 100 known host species. Three or four closely related genotypes of P. ramorum (from a single lineage) were originally introduced in California forests and the pathogen reproduces clonally. Because of this, the genetic diversity of P. ramorum is extremely low in Californian forests. However, P. ramorum shows diverse phenotypic variation in colony morphology, colony senescence, and virulence. In this study, we show that phenotypic variation among isolates is associated with the host species from which the microbe was originally cultured. Microarray global mRNA profiling detected derepression of transposable elements (TEs) and down-regulation of crinkler effector homologs (CRNs) in the majority of isolates originating from coast live oak (Quercus agrifolia), but this expression pattern was not observed in isolates from California bay laurel (Umbellularia californica). In some instances, oak and bay laurel isolates originating from the same geographic location had identical genotypes based on multilocus simple sequence repeat (SSR) marker analysis but had different phenotypes. Expression levels of the two marker genes analyzed by quantitative reverse transcription PCR were correlated with originating host species, but not with multilocus genotypes. Because oak is a nontransmissive dead-end host for P. ramorum, our observations are congruent with an epi-transposon hypothesis; that is, physiological stress is triggered on P. ramorum while colonizing oak stems and disrupts epigenetic silencing of TEs. This then results in TE reactivation and possibly genome diversification without significant epidemiological consequences. We propose the P. ramorum-oak host system in California forests as an ad hoc model for epi-transposon mediated diversification.
A number of molecular marker linkage maps have been developed for melon (Cucumis melo L.) over the last two decades. However, these maps were constructed using different marker sets, thus, making comparative analysis among maps difficult. In order to solve this problem, a consensus genetic map in melon was constructed using primarily highly transferable anchor markers that have broad potential use for mapping, synteny, and comparative quantitative trait loci (QTL) analysis, increasing breeding effectiveness and efficiency via marker-assisted selection (MAS).
Under the framework of the International Cucurbit Genomics Initiative (ICuGI, http://www.icugi.org), an integrated genetic map has been constructed by merging data from eight independent mapping experiments using a genetically diverse array of parental lines. The consensus map spans 1150 cM across the 12 melon linkage groups and is composed of 1592 markers (640 SSRs, 330 SNPs, 252 AFLPs, 239 RFLPs, 89 RAPDs, 15 IMAs, 16 indels and 11 morphological traits) with a mean marker density of 0.72 cM/marker. One hundred and ninety-six of these markers (157 SSRs, 32 SNPs, 6 indels and 1 RAPD) were newly developed, mapped or provided by industry representatives as released markers, including 27 SNPs and 5 indels from genes involved in organic acid metabolism and transport, and 58 EST-SSRs. Additionally, 85 of 822 SSR markers contributed by Syngenta Seeds were included in the integrated map. Furthermore, 370 QTL controlling 62 traits from 18 previously reported mapping experiments using genetically diverse parental genotypes were integrated into the consensus map. Some QTL associated with economically important traits detected in separate studies mapped to similar genomic positions. For example, independently identified QTL controlling fruit shape were mapped on similar genomic positions, suggesting that such QTL are possibly responsible for the phenotypic variability observed for this trait in a broad array of melon germplasm.
Even though relatively unsaturated genetic maps in a diverse set of melon market types have been published, the integrated saturated map presented herein should be considered the initial reference map for melon. Most of the mapped markers contained in the reference map are polymorphic in a diverse collection of germplasm, and thus are potentially transferable to a broad array of genetic experimentation (e.g., integration of physical and genetic maps, colinearity analysis, map-based gene cloning, epistasis dissection, and marker-assisted selection).
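The marker totals and mean density quoted for the consensus map can be checked with simple arithmetic. The sketch below uses only the figures given in the text; the variable names are illustrative.

```python
# Marker counts by type, as listed for the melon consensus map.
marker_counts = {
    "SSR": 640, "SNP": 330, "AFLP": 252, "RFLP": 239,
    "RAPD": 89, "IMA": 15, "indel": 16, "morphological": 11,
}
map_length_cm = 1150  # total map length in cM, as stated in the text

total_markers = sum(marker_counts.values())
mean_density = map_length_cm / total_markers  # cM per marker

# Newly developed or newly released markers: SSRs, SNPs, indels, RAPD.
new_markers = 157 + 32 + 6 + 1

print(total_markers, round(mean_density, 2), new_markers)
```

The per-type counts sum to the stated 1592 markers, the density rounds to the stated 0.72 cM/marker, and the newly developed markers sum to 196.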
There has been increased consumption of blueberries in recent years fueled in part because of their many recognized health benefits. Blueberry fruit is very high in anthocyanins, which have been linked to improved night vision, prevention of macular degeneration, anti-cancer activity, and reduced risk of heart disease. Very few genomic resources have been available for blueberry, however. Further development of genomic resources like expressed sequence tags (ESTs), molecular markers, and genetic linkage maps could lead to more rapid genetic improvement. Marker-assisted selection could be used to combine traits for climatic adaptation with fruit and nutritional quality traits.
Efforts to sequence the transcriptome of the commercial highbush blueberry (Vaccinium corymbosum) cultivar Bluecrop and use the sequences to identify genes associated with cold acclimation and fruit development and develop SSR markers for mapping studies are presented here. Transcriptome sequences were generated from blueberry fruit at different stages of development, flower buds at different stages of cold acclimation, and leaves by next-generation Roche 454 sequencing. Over 600,000 reads were assembled into approximately 15,000 contigs and 124,000 singletons. The assembled sequences were annotated and functionally mapped to Gene Ontology (GO) terms. Frequency of the most abundant sequences in each of the libraries was compared across all libraries to identify genes that are potentially differentially expressed during cold acclimation and fruit development. Real-time PCR was performed to confirm their differential expression patterns. Overall, 14 out of 17 of the genes examined had differential expression patterns similar to what was predicted from their reads alone. The assembled sequences were also mined for SSRs. From these sequences, 15,886 blueberry EST-SSR loci were identified. Primers were designed from 7,705 of the SSR-containing sequences with adequate flanking sequence. One hundred primer pairs were tested for amplification and polymorphism among parents of two blueberry populations currently being used for genetic linkage map construction. The tetraploid mapping population was based on a cross between the highbush cultivars Draper and Jewel (V. darrowii is also in the background of 'Jewel'). The diploid mapping population was based on a cross between an F1 hybrid of V. darrowii and diploid V. corymbosum and another diploid V. corymbosum. The overall amplification rate of the SSR primers was 68% and the polymorphism rate was 43%.
These results indicate that this large collection of 454 ESTs will be a valuable resource for identifying genes that are potentially differentially expressed and play important roles in flower bud development, cold acclimation, chilling unit accumulation, and fruit development in blueberry and related species. In addition, the ESTs have already proved useful for the development of SSR and EST-PCR markers, and are currently being used for construction of genetic linkage maps in blueberry.
Genic microsatellite markers, also known as functional markers, are preferred over anonymous markers as they reveal the variation in transcribed genes among individuals. In this study, we developed a total of 707 expressed sequence tag-derived simple sequence repeat markers (EST-SSRs) and used them to develop a high-density integrated map from four individual mapping populations of B. rapa. This map contains a total of 1426 markers, consisting of 306 EST-SSRs, 153 intron polymorphic markers, 395 bacterial artificial chromosome-derived SSRs (BAC-SSRs), and 572 public SSRs and other markers covering a total distance of 1245.9 cM of the B. rapa genome. Analysis of allelic diversity in 24 B. rapa germplasm using 234 mapped EST-SSR markers showed that the majority of EST-SSRs amplified 2 alleles, although the number of alleles ranged from 2 to 8. Transferability analysis of 167 EST-SSRs in 35 species belonging to cultivated and wild brassica relatives showed 42.51% (Sysimprium leteum) to 100% (B. carinata, B. juncea, and B. napus) amplification. Our newly developed EST-SSRs and high-density linkage map based on highly transferable genic markers would facilitate the molecular mapping of quantitative trait loci and the positional cloning of specific genes, in addition to marker-assisted selection and comparative genomic studies of B. rapa with other related species.
Brassica rapa; expressed sequence-derived SSRs; integrated map; polymorphism information content; transferability
Tetraploid cotton contains two sets of homologous chromosomes, the At- and Dt-subgenomes. Consequently, many markers in cotton were mapped to multiple positions during linkage genetic map construction, posing a challenge to anchoring linkage groups and mapping economically-important genes to particular chromosomes. Chromosome-specific markers could solve this problem. Recently, the genomes of two diploid species, whose progenitors were the putative contributors of the At- and Dt-subgenomes to tetraploid cotton, were sequenced. These sequences provide a powerful tool for developing chromosome-specific markers given the high level of synteny among tetraploid and diploid cotton genomes. In this study, simple sequence repeats (SSRs) on each chromosome in the two diploid genomes were characterized. Chromosome-specific SSRs were developed by comparative analysis and shown to distinguish chromosomes.
A total of 200,744 and 142,409 SSRs were detected on the 13 chromosomes of Gossypium arboreum L. and Gossypium raimondii Ulbrich, respectively. Chromosome-specific SSRs were obtained by comparing SSR flanking sequences from each chromosome with those from the other 25 chromosomes. On average, 7,996 chromosome-specific SSRs were obtained per chromosome. To confirm their chromosome specificity, these SSRs were used to distinguish two homologous chromosomes in tetraploid cotton through linkage group construction. The chromosome-specific SSRs and previously-reported chromosome markers were grouped together, and no marker mapped to another homologous chromosome, proving that the chromosome-specific SSRs were unique and could distinguish homologous chromosomes in tetraploid cotton. Because longer dinucleotide AT-rich repeats were the most polymorphic in previous reports, the SSRs on each chromosome were sorted by motif type and repeat length for convenient selection. The primer sequences of all chromosome-specific SSRs were also made publicly available.
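The flanking-sequence comparison described above can be illustrated with a toy filter that keeps an SSR only when its flanking sequence is found on a single chromosome. This is a sketch under strong assumptions: it uses exact string matching, whereas the abstract does not say which comparison tool or similarity threshold was used, and the function name, chromosome labels and sequences are hypothetical.

```python
from collections import defaultdict


def chromosome_specific_ssrs(ssrs):
    """Return IDs of SSRs whose flanking sequence occurs on only one chromosome.

    ssrs: list of (ssr_id, chromosome, flanking_sequence) tuples.
    Exact matching is an illustrative simplification of a real similarity search.
    """
    chromosomes_by_flank = defaultdict(set)
    for _ssr_id, chromosome, flank in ssrs:
        chromosomes_by_flank[flank].add(chromosome)
    return [
        ssr_id
        for ssr_id, _chromosome, flank in ssrs
        if len(chromosomes_by_flank[flank]) == 1
    ]


# Hypothetical records; identifiers and sequences are invented for illustration.
toy_ssrs = [
    ("ssr1", "A01", "ACGTTGCA"),
    ("ssr2", "A02", "ACGTTGCA"),  # same flank as ssr1 on another chromosome
    ("ssr3", "A01", "GGATCCTA"),
    ("ssr4", "D05", "TTAGGCAT"),
]
specific = chromosome_specific_ssrs(toy_ssrs)
print(specific)
```

In this toy input, ssr1 and ssr2 share a flank across two chromosomes and are discarded, while ssr3 and ssr4 are retained.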
Chromosome-specific SSRs are efficient tools for chromosome identification by anchoring linkage groups to particular chromosomes during genetic mapping and are especially useful in mapping of qualitative-trait genes or quantitative trait loci with just a few markers. The SSRs reported here will facilitate a number of genetic and genomic studies in cotton, including construction of high-density genetic maps, positional gene cloning, fingerprinting, and genetic diversity and comparative evolutionary analyses among Gossypium species.
Chromosome-specific; SSR; Tetraploid cotton; Genome-wide
Genetic linkage maps are important tools for many genetic applications including mapping of quantitative trait loci (QTLs), identifying DNA markers for fingerprinting, and map-based gene cloning. Carnation (Dianthus caryophyllus L.) is an important ornamental flower worldwide. We previously reported a random amplified polymorphic DNA (RAPD)-based genetic linkage map derived from Dianthus capitatus ssp. andrzejowskianus and a simple sequence repeat (SSR)-based genetic linkage map constructed using data from intraspecific F2 populations; however, the number of markers was insufficient, and so the number of linkage groups (LGs) did not coincide with the number of chromosomes (x = 15). Therefore, we aimed to produce a high-density genetic map to improve its usefulness for breeding purposes and genetic research.
We improved the SSR-based genetic linkage map using SSR markers derived from a genomic library, expressed sequence tags, and RNA-seq data. Linkage analysis revealed that 412 SSR loci (including 234 newly developed SSR loci) could be mapped to 17 linkage groups (LGs) covering 969.6 cM. Comparison of five minor LGs covering less than 50 cM with LGs in our previous RAPD-based genetic map suggested that four LGs could be integrated into two LGs by anchoring common SSR loci. Consequently, the number of LGs corresponded to the number of chromosomes (x = 15). We added 192 new SSRs, eight RAPD, and two sequence-tagged site loci to refine the RAPD-based genetic linkage map, which comprised 15 LGs consisting of 348 loci covering 978.3 cM. The two maps had 125 SSR loci in common, and most of the positions of markers were conserved between them. We identified 635 loci in carnation using the two linkage maps. We also mapped QTLs for two traits (bacterial wilt resistance and anthocyanin pigmentation in the flower) and a phenotypic locus for flower-type by analyzing previously reported genotype and phenotype data.
The improved genetic linkage maps and SSR markers developed in this study will serve as reference genetic linkage maps for members of the genus Dianthus, including carnation, and will be useful for mapping QTLs associated with various traits, and for improving carnation breeding programs.
Carnation; Dianthus caryophyllus L; EST; Linkage map; Next-generation sequencing technology (NGS); RAPD; STS; SSR
The cultivated strawberry (Fragaria × ananassa) is an octoploid (2n = 8x = 56) of the Rosaceae family whose genomic architecture is still controversial. Several recent studies support the AAA′A′BBB′B′ model, but its complexity has hindered genetic and genomic analysis of this important crop. To overcome this difficulty and to assist genome-wide analysis of F. × ananassa, we constructed an integrated linkage map by organizing a total of 4474 simple sequence repeat (SSR) markers collected from published Fragaria sequences, including 3746 SSR markers [Fragaria vesca expressed sequence tag (EST)-derived SSR markers] derived from F. vesca ESTs, 603 markers (F. × ananassa EST-derived SSR markers) from F. × ananassa ESTs, and 125 markers (F. × ananassa transcriptome-derived SSR markers) from F. × ananassa transcripts. Along with the previously published SSR markers, these markers were mapped onto five parent-specific linkage maps derived from three mapping populations, which were then assembled into an integrated linkage map. The constructed map consists of 1856 loci in 28 linkage groups (LGs) that total 2364.1 cM in length. Macrosynteny at the chromosome level was observed between the LGs of F. × ananassa and the genome of F. vesca. Variety distinction on 129 F. × ananassa lines was demonstrated using 45 selected SSR markers.
Fragaria × ananassa; SSR marker; integrated linkage map; comparative mapping
Prunus fruit development, growth, ripening, and senescence include major biochemical and sensory changes in texture, color, and flavor. The genetic dissection of these complex processes has important applications in crop improvement, to facilitate maximizing and maintaining stone fruit quality from production and processing through to marketing and consumption. Here we present an integrated fruit quality gene map of Prunus containing 133 genes putatively involved in the determination of fruit texture, pigmentation, flavor, and chilling injury resistance.
A genetic linkage map of 211 markers was constructed for an intraspecific peach (Prunus persica) progeny population, Pop-DG, derived from a canning peach cultivar 'Dr. Davis' and a fresh market cultivar 'Georgia Belle'. The Pop-DG map covered 818 cM of the peach genome and included three morphological markers, 11 ripening candidate genes, 13 cold-responsive genes, 21 novel EST-SSRs from the ChillPeach database, 58 previously reported SSRs, 40 RAFs, 23 SRAPs, 14 IMAs, and 28 accessory markers from candidate gene amplification. The Pop-DG map was co-linear with the Prunus reference T × E map, with 39 SSR markers in common to align the maps. A further 158 markers were bin-mapped to the reference map: 59 ripening candidate genes, 50 cold-responsive genes, and 50 novel EST-SSRs from ChillPeach, with deduced locations in Pop-DG via comparative mapping. Several candidate genes and EST-SSRs co-located with previously reported major trait loci and quantitative trait loci for chilling injury symptoms in Pop-DG.
The candidate gene approach combined with bin-mapping and availability of a community-recognized reference genetic map provides an efficient means of locating genes of interest in a target genome. We highlight the co-localization of fruit quality candidate genes with previously reported fruit quality QTLs. The fruit quality gene map developed here is a valuable tool for dissecting the genetic architecture of fruit quality traits in Prunus crops.
There are few genomic tools available in melon (Cucumis melo L.), a member of the Cucurbitaceae, despite its importance as a crop. Among these tools, genetic maps have been constructed mainly using marker types such as simple sequence repeats (SSR), restriction fragment length polymorphisms (RFLP) and amplified fragment length polymorphisms (AFLP) in different mapping populations. There is a growing need for saturating the genetic map with single nucleotide polymorphisms (SNP), more amenable for high throughput analysis, especially if these markers are located in gene coding regions, to provide functional markers. Expressed sequence tags (ESTs) from melon are available in public databases, and resequencing ESTs or validating SNPs detected in silico are excellent ways to discover SNPs.
EST-based SNPs were discovered after resequencing ESTs between the parental lines of the PI 161375 (SC) × 'Piel de sapo' (PS) genetic map or using in silico SNP information from EST databases. In total 200 EST-based SNPs were mapped in the melon genetic map using a bin-mapping strategy, increasing the map density to 2.35 cM/marker. A subset of 45 SNPs was used to study variation in a panel of 48 melon accessions covering a wide range of the genetic diversity of the species. SNP analysis correctly reflected the genetic relationships compared with other marker systems, being able to distinguish all the accessions and cultivars.
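In bin mapping, a new marker is genotyped only in a small set of selected progeny and then placed in the map bin whose genotype pattern across those progeny matches the marker's pattern. The sketch below illustrates that matching step under simplifying assumptions: the function name, bin labels, genotype codes and missing-data symbol are all hypothetical, and no genotyping error is modelled.

```python
def assign_to_bins(marker_calls, bin_signatures, missing="-"):
    """Return names of bins whose signature agrees with the marker's calls.

    marker_calls: genotype codes for the marker, one per bin-set individual.
    bin_signatures: dict mapping bin name to its genotype codes, same order.
    Positions where the marker call equals `missing` are ignored.
    """
    return [
        name
        for name, signature in bin_signatures.items()
        if all(c == s for c, s in zip(marker_calls, signature) if c != missing)
    ]


# Hypothetical bin signatures over five bin-set individuals.
toy_bins = {
    "LG1_bin1": "AABBA",
    "LG1_bin2": "ABBBA",
    "LG2_bin1": "BBAAB",
}
unique_hit = assign_to_bins("ABBBA", toy_bins)
ambiguous_hit = assign_to_bins("A-BBA", toy_bins)
print(unique_hit, ambiguous_hit)
```

A complete genotype vector resolves to a single bin, whereas a missing call at the individual that separates two adjacent bins leaves the marker compatible with both.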
This is the first example of a genetic map in a cucurbit species that includes a major set of SNP markers discovered using ESTs. The PI 161375 × 'Piel de sapo' melon genetic map has around 700 markers, of which more than 500 are gene-based markers (SNP, RFLP and SSR). This genetic map will be a central tool for the construction of the melon physical map, the step prior to sequencing the complete genome. Using the set of SNP markers, it was possible to define the genetic relationships within a collection of forty-eight melon accessions as efficiently as with SSR markers, and these markers may also be useful for cultivar identification in Occidental melon varieties.
Apple is an economically important fruit crop worldwide. Developing a genetic linkage map is a critical step towards mapping and cloning of genes responsible for important horticultural traits in apple. To facilitate linkage map construction, we surveyed and characterized the distribution and frequency of perfect microsatellites in assembled contig sequences of the apple genome.
A total of 28,538 SSRs have been identified in the apple genome, with an overall density of 40.8 SSRs per Mb. Di-nucleotide repeats are the most frequent microsatellites in the apple genome, accounting for 71.9% of all microsatellites. AT/TA repeats are the most frequent in genomic regions, accounting for 38.3% of all the G-SSRs, while AG/GA dimers prevail in transcribed sequences, and account for 59.4% of all EST-SSRs. A set of 310 SSRs is selected for amplification in eight apple genotypes. Of these, 245 (79.0%) are found to be polymorphic among the cultivars and wild species tested. AG/GA motifs in genomic regions detect more alleles and show higher PIC values than AT/TA or AC/CA motifs. Moreover, AG/GA repeats are more variable than any other dimers in apple, and should be preferentially selected for studies such as genetic diversity analysis and linkage map construction. A total of 54 newly developed apple SSRs have been genetically mapped. Interestingly, clustering of markers with distorted segregation is observed on linkage groups 1, 2, 10, 15, and 16. A QTL responsible for malic acid content of apple fruits is detected on linkage group 8, and accounts for ~13.5% of the observed phenotypic variation.
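A survey of perfect microsatellites of this kind can be sketched with a regular expression that finds uninterrupted tandem repeats of a fixed-length motif. This is an illustration only: the abstract does not name the search tool or the minimum repeat thresholds used, so the function name, the default of six repeats and the test sequence are assumptions.

```python
import re


def find_perfect_ssrs(sequence, motif_len=2, min_repeats=6):
    """Return (start, motif, repeat_count) for each perfect SSR found.

    A perfect SSR here is a motif of `motif_len` bases repeated in tandem at
    least `min_repeats` times without interruption. Homopolymer runs are skipped.
    """
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:  # e.g. "AA" repeated is a mononucleotide run
            continue
        hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits


# Hypothetical sequence containing an (AG)8 and an (AT)6 repeat.
toy_sequence = "GGC" + "AG" * 8 + "TTC" + "AT" * 6 + "GCA"
ssr_hits = find_perfect_ssrs(toy_sequence)
print(ssr_hits)
```

Tallying the motifs returned over all contigs, and dividing the number of hits by the total sequence length in Mb, would yield motif frequencies and an SSR density of the kind reported here.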
This study demonstrates that di-nucleotide repeats are prevalent in the apple genome and that AT/TA and AG/GA repeats are the most frequent in genomic and transcribed sequences of apple, respectively. All SSR motifs identified in this study as well as those newly mapped SSRs will serve as valuable resources for pursuing apple genetic studies, aiding the apple breeding community in marker-assisted breeding, and for performing comparative genomic studies in Rosaceae.