Due to the lack of large genomic sequences for peach or other Prunus species, the degree of synteny conservation between Prunus species and Arabidopsis has not been systematically assessed. Using the recently available peach EST sequences that are anchored to Prunus genetic maps and to the peach physical map, we analyzed the extent of conserved synteny between the Prunus and Arabidopsis genomes. The reconstructed pseudo-ancestral Arabidopsis genome, which existed prior to the proposed recent polyploidy event, was also used in our analysis to further elucidate the evolutionary relationship.
We analyzed the synteny conservation between the Prunus and Arabidopsis genomes by comparing 475 peach ESTs that are anchored to Prunus genetic maps with their Arabidopsis homologs detected by sequence similarity. Microsyntenic regions were detected between all five Arabidopsis chromosomes and seven of the eight linkage groups of the Prunus reference map. An additional 1097 peach ESTs that are anchored to 431 BAC contigs of the peach physical map and their Arabidopsis homologs were also analyzed. Microsyntenic regions were detected in 77 BAC contigs. The syntenic regions from both data sets were short and contained only a couple of conserved gene pairs. The synteny between peach and Arabidopsis was fragmentary; all the Prunus linkage groups containing syntenic regions matched more than two different Arabidopsis chromosomes, and most BAC contigs with multiple conserved syntenic regions corresponded to multiple Arabidopsis chromosomes. Using the same peach EST datasets and their Arabidopsis homologs, we also detected conserved syntenic regions in the pseudo-ancestral Arabidopsis genome. In many cases, the gene order and content of peach regions were more conserved in the ancestral genome than in the present Arabidopsis region. The statistical significance of each syntenic group was calculated using a simulated Arabidopsis genome.
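A simulation-based significance test of this kind can be sketched roughly as follows. This is only an illustrative sketch, not the procedure used in the study: the function names `max_cluster` and `empirical_p`, the window size, the chromosome sizes (in gene positions) and the uniform random placement of homologs are all assumptions introduced here. The idea is to compare the tightest cluster of Arabidopsis homologs observed for one peach region with the clusters obtained when the same number of homologs is placed at random.

```python
import random

def max_cluster(hits, window):
    """Largest number of hits on a single chromosome whose positions
    span at most `window` gene positions. `hits` is a list of
    (chromosome, position) tuples."""
    by_chrom = {}
    for chrom, pos in hits:
        by_chrom.setdefault(chrom, []).append(pos)
    best = 0
    for positions in by_chrom.values():
        positions.sort()
        left = 0
        for right in range(len(positions)):
            # Shrink the window from the left until it spans <= window.
            while positions[right] - positions[left] > window:
                left += 1
            best = max(best, right - left + 1)
    return best

def empirical_p(observed_hits, chrom_sizes, window, n_sim=1000, seed=0):
    """Empirical p-value: fraction of simulated genomes in which randomly
    placed homologs cluster at least as tightly as the observed ones.
    `chrom_sizes` maps chromosome name -> number of gene positions
    (hypothetical values, supplied by the caller)."""
    rng = random.Random(seed)
    observed = max_cluster(observed_hits, window)
    chroms = list(chrom_sizes)
    weights = [chrom_sizes[c] for c in chroms]
    at_least = 0
    for _ in range(n_sim):
        sim = []
        for _ in observed_hits:
            c = rng.choices(chroms, weights=weights)[0]
            sim.append((c, rng.randrange(chrom_sizes[c])))
        if max_cluster(sim, window) >= observed:
            at_least += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least + 1) / (n_sim + 1)

# Hypothetical example: five homologs, three of them close together.
hits = [(1, 10), (1, 12), (1, 15), (2, 11), (1, 500)]
sizes = {1: 5000, 2: 5000, 3: 5000, 4: 5000, 5: 5000}
p_value = empirical_p(hits, sizes, window=10)
```

A small `p_value` here would suggest that the observed clustering is unlikely under random placement; a real analysis would also need to account for gene order, tandem duplicates and multiple testing.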
We report here the result of the first extensive analysis of the conserved microsynteny using DNA sequences across the Prunus genome and their Arabidopsis homologs. Our study also illustrates that both the ancestral and present Arabidopsis genomes can provide a useful resource for marker saturation and candidate gene search, as well as elucidating evolutionary relationships between species.
Fragaria belongs to the Rosaceae, an economically important family that includes a number of important fruit producing genera such as Malus and Prunus. Using genomic sequences from 50 Fragaria fosmids, we have examined the microsynteny between Fragaria and other plant models.
In more than half of the strawberry fosmids, we found syntenic regions that are conserved in Populus, Vitis, Medicago and/or Arabidopsis with Populus containing the greatest number of syntenic regions with Fragaria. The longest syntenic region was between LG VIII of the poplar genome and the strawberry fosmid 72E18, where seven out of twelve predicted genes were collinear. We also observed an unexpectedly high level of conserved synteny between Fragaria (rosid I) and Vitis (basal rosid). One of the strawberry fosmids, 34E24, contained a cluster of R gene analogs (RGAs) with NBS and LRR domains. We detected clusters of RGAs with high sequence similarity to those in 34E24 in all the genomes compared. In the phylogenetic tree we have generated, all the NBS-LRR genes grouped together with Arabidopsis CNL-A type NBS-LRR genes. The Fragaria RGA grouped together with those of Vitis and Populus in the phylogenetic tree.
Our analysis shows considerable microsynteny between Fragaria and other plant genomes such as Populus, Medicago, Vitis and, to a lesser degree, Arabidopsis. We also detected a cluster of NBS-LRR type genes that are conserved in all the genomes compared.
Detailed comparative genome analyses within the economically important Rosaceae family have not been conducted. This is largely due to the lack of conserved gene-based molecular markers that are transferable among the important crop genera within the family [e.g. Malus (apple), Fragaria (strawberry), and Prunus (peach, cherry, apricot and almond)]. The lack of molecular markers and comparative whole genome sequence analysis for this family severely hampers crop improvement efforts as well as QTL confirmation and validation studies.
We identified a set of 3,818 rosaceous unigenes composed of two or more ESTs that correspond to single copy Arabidopsis genes. From this Rosaceae Conserved Orthologous Set (RosCOS), 1,039 were selected, of which 857 were used for the development of intron-flanking primers and allele amplification. This led to successful amplification and subsequent mapping of 613 RosCOS onto the Prunus TxE reference map, resulting in a genome-wide coverage of 0.67 to 1.06 gene-based markers per cM per linkage group. Furthermore, the RosCOS primers showed amplification success rates from 23 to 100% across the family, indicating that a substantial part of the RosCOS primers can be directly employed in other less studied rosaceous crops. Comparisons of the genetic map positions of the RosCOS with the physical locations of the orthologs in the Populus trichocarpa genome identified regions of colinearity between the genomes of Prunus-Rosaceae and Populus-Salicaceae.
Conserved orthologous genes are extremely useful for the analysis of genome evolution among closely and distantly related species. The results presented in this study demonstrate the considerable potential of the mapped Prunus RosCOS for genome-wide marker employment and comparative whole genome studies within the Rosaceae family. Moreover, these markers will also function as useful anchor points for the genome sequencing efforts currently ongoing in this family as well as for comparative QTL analyses.
Despite a high genetic similarity to peach, almonds (Prunus dulcis) have a fleshless fruit and edible kernel, produced as a crop for human consumption. While the release of peach genome v1.0 provides an excellent opportunity for almond genetic and genomic studies, well-assessed segregating populations and the respective saturated genetic linkage maps lay the foundation for such studies to be completed in almond.
Using an almond intraspecific cross between 'Nonpareil' and 'Lauranne' (N × L), we constructed a moderately saturated map with SSRs, SNPs, ISSRs and RAPDs. The N × L map covered 591.4 cM of the genome with 157 loci. The average marker distance of the map was 4.0 cM. The map displayed high synteny and colinearity with the Prunus T × E reference map in all eight linkage groups (G1-G8). The positions of 14 mapped gene-anchored SNPs corresponded approximately with the positions of homologous sequences in the peach genome v1.0. Analysis of Mendelian segregation ratios showed that 17.9% of markers had significantly skewed genotype ratios at the level of P < 0.05. Due to the large number of skewed markers in linkage group 7, the potential existence of deleterious gene(s) in that group was assessed. Integrated maps produced by two different mapping methods using JoinMap® 3 were compared, and their high degree of similarity was evident despite the positional inconsistency of a few markers.
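A test of Mendelian segregation ratios is commonly done with a chi-square goodness-of-fit test; the sketch below illustrates that idea only and is not taken from the study. The function name `chi_square_segregation`, the example genotype counts and the expected ratios are hypothetical. To stay dependency-free, the p-value uses the closed forms of the chi-square survival function for one and two degrees of freedom.

```python
import math

def chi_square_segregation(observed, expected_ratio):
    """Chi-square goodness-of-fit test of genotype counts against an
    expected Mendelian ratio, e.g. (1, 1) or (1, 2, 1).
    Returns (chi2, p_value). Only 2 or 3 genotype classes are supported."""
    if len(observed) != len(expected_ratio):
        raise ValueError("observed and expected_ratio must have equal length")
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    chi2 = 0.0
    for obs, ratio in zip(observed, expected_ratio):
        expected = total * ratio / ratio_sum
        chi2 += (obs - expected) ** 2 / expected
    df = len(observed) - 1
    if df == 1:
        # Survival function of chi-square with 1 degree of freedom.
        p_value = math.erfc(math.sqrt(chi2 / 2))
    elif df == 2:
        # Survival function of chi-square with 2 degrees of freedom.
        p_value = math.exp(-chi2 / 2)
    else:
        raise ValueError("only 2 or 3 genotype classes are supported")
    return chi2, p_value

# Hypothetical marker expected to segregate 1:1 but observed as 65:35.
chi2, p_value = chi_square_segregation((65, 35), (1, 1))
skewed = p_value < 0.05
```

With these made-up counts the statistic is 9.0 and the marker would be flagged as skewed at P < 0.05; applying the same test to every marker gives the proportion of distorted loci.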
We present a moderately saturated Australian almond map, which is highly syntenic and collinear with the Prunus reference map and peach genome v1.0. The well-assessed almond population reported here can therefore be used to investigate traits of interest under Australian growing conditions, and provides more information on the almond genome for the international community.
The development of genetic markers is complex and costly in species with little pre-existing genomic information. Faba bean possesses one of the largest and least studied genomes among cultivated crop plants and no gene-based genetic maps exist. Gene-based orthologous markers allow chromosomal regions and levels of synteny to be characterised between species, reveal phylogenetic relationships and chromosomal evolution, and enable targeted identification of markers for crop breeding. In this study orthologous codominant cross-species markers have been deployed to produce the first exclusively gene-based genetic linkage map of faba bean (Vicia faba), using an F6 population developed from a cross between the lines Vf6 (equina type) and Vf27 (paucijuga type).
Of 796 intron-targeted amplified polymorphic (ITAP) markers screened, 151 markers could be used to construct a comparative genetic map. Linkage analysis revealed seven major and five small linkage groups (LGs), one pair and 12 unlinked markers. Each LG comprised three to 30 markers and varied in length from 23.6 cM to 324.8 cM. The map spanned a total length of 1685.8 cM. A simple and direct macrosyntenic relationship between faba bean and Medicago truncatula was evident, while faba bean and lentil shared a common rearrangement relative to M. truncatula. One hundred and four of the 127 mapped markers in the 12 LGs, which were previously assigned to M. truncatula genetic and physical maps, were found in regions syntenic between the faba bean and M. truncatula genomes. However, chromosomal rearrangements were observed that could explain the difference in chromosome numbers between these three legume species. These rearrangements suggested high conservation of M. truncatula chromosomes 1, 5 and 8; moderate conservation of chromosomes 2, 3, 4 and 7; and no conservation with M. truncatula chromosome 6. Multiple PCR amplicons and comparative mapping were suggestive of small-scale duplication events in faba bean. This study also provides a preliminary indication of finer scale macrosynteny between M. truncatula, lentil and faba bean. Markers originally designed from genes on the same M. truncatula BACs were found to be grouped together in corresponding syntenic areas in lentil and faba bean.
Despite the large size of the faba bean genome, comparative mapping did not reveal evidence for polyploidisation, segmental duplication, or significant rearrangements compared to M. truncatula, although a bias in the use of single locus markers may have limited the detection of duplications. Non-coding repetitive DNA or transposable element content provides a possible explanation for the difference in genome sizes. Similar patterns of rearrangements in faba bean and lentil compared to M. truncatula support phylogenetic studies dividing these species into the tribes Vicieae and Trifolieae. However, substantial macrosynteny was apparent between faba bean and M. truncatula, with the exception of chromosome 6 where no orthologous markers were found, confirming previous investigations suggesting chromosome 6 is atypical. The composite map, anchored with orthologous markers mapped in M. truncatula, provides a central reference map for future use of genomic and genetic information in faba bean genetic analysis and breeding.
Nuclear DNA is conventionally used to assess diversity and relatedness among species, but variation at the organelle DNA level has also been used to study relationships among organisms. In most species, mitochondrial and chloroplast genomes are inherited maternally; therefore it is anticipated that organelle DNA remains completely associated. Many research studies have been conducted simultaneously on the organelle genomes. The objective of this study was to analyze the genetic relationship between chloroplast and mitochondrial DNA in three Chinese Prunus genotypes, viz. Prunus persica, Prunus domestica, and Prunus avium.
We investigated the genetic diversity of Prunus genotypes using simple sequence repeat (SSR) markers relevant to the chloroplast and mitochondria. Most of the genotypes were genetically similar as revealed by phylogenetic analysis. The Y2 Wu Xing (Cherry) and L2 Hong Xin Li (Plum) genotypes had a high similarity index (0.89), followed by Zi Ye Li (0.85), whereas L1 Tai Yang Li (Plum) had the lowest genetic similarity (0.35). In the case of cpSSR, the Hong Tao (Peach) and L1 Tai Yang Li (Plum) genotypes showed a similarity index of 0.85, and Huang Tao had the lowest similarity index of 0.50. The mtSSR nucleotide sequence analysis revealed that, with the CCB256 primer, each genotype had a similar amplicon length (509 bp) except M5Y1 (505 bp), while with the NAD6 primer amplicon sizes varied among genotypes. MEHO (Peach), MEY1 (Cherry), MEL2 (Plum) and MEL1 (Plum) had 586 bp, while MEY2 (Cherry), MEZI (Plum) and MEHU (Peach) had 585, 584 and 566 bp, respectively. The CCB256 primer showed highly conserved sequences and minute single polymorphic nucleotides with no deletion or mutation. The cpSSR (ARCP511) microsatellites showed consistent amplicon lengths. CZI (Plum), CHO (Peach) and CL1 (Plum) showed 182 bp, while CHU (Peach), CY2 (Cherry), CL2 (Plum) and CY1 (Cherry) showed 181 bp amplicon lengths.
These results demonstrate high conservation of the chloroplast and mitochondrial genomes among Prunus species during the evolutionary process. These findings are valuable for studying organelle DNA diversity in different species and genotypes of Prunus and provide in-depth insight into the mitochondrial and chloroplast genomes.
Organelle DNA sequences; Prunus; SSR markers; Genetic diversity; Prunus persica; Prunus domestica; Prunus avium
The Rosaceae encompass a large number of economically-important diploid and polyploid fruit and ornamental species in many different genera. The basic chromosome numbers of these genera are x = 7, 8 and 9 and all have compact and relatively similar genome sizes. Comparative mapping between distantly-related genera has been performed to a limited extent in the Rosaceae including a comparison between Malus (subfamily Maloideae) and Prunus (subfamily Prunoideae); however no data has been published to date comparing Malus or Prunus to a member of the subfamily Rosoideae. In this paper we compare the genome of Fragaria, a member of the Rosoideae, to Prunus, a member of the Prunoideae.
The diploid genomes of Prunus (2n = 2x = 16) and Fragaria (2n = 2x = 14) were compared through the mapping of 71 anchor markers – 40 restriction fragment length polymorphisms (RFLPs), 29 indels or single nucleotide polymorphisms (SNPs) derived from expressed sequence tags (ESTs) and two simple-sequence repeats (SSRs) – on the reference maps of both genera. These markers provided good coverage of the Prunus (78%) and Fragaria (78%) genomes, with maximum gaps and average densities of 22 cM and 7.3 cM/marker in Prunus and 32 cM and 8.0 cM/marker in Fragaria.
Our results indicate a clear pattern of synteny, with most markers of each chromosome of one of these species mapping to one or two chromosomes of the other. A large number of rearrangements (36) could also be inferred, most of which were produced by inversions (27) and the rest (9) by translocations or fission/fusion events. We have provided the first framework for the comparison of the position of genes or DNA sequences of these two economically valuable and yet distantly-related genera of the Rosaceae.
The passion fruit (Passiflora edulis) is a tropical crop of economic importance both for juice production and consumption as fresh fruit. The juice is also used in concentrate blends that are consumed worldwide. However, very little is known about the genome of the species. Therefore, improving our understanding of passion fruit genomics is essential and to some degree a pre-requisite if its genetic resources are to be used more efficiently. In this study, we have constructed a large-insert BAC library and provided the first view on the structure and content of the passion fruit genome, using BAC-end sequence (BES) data as a major resource.
The library consisted of 82,944 clones and its levels of organellar DNA were very low. The library represents six haploid genome equivalents, and the average insert size was 108 kb. To check its utility for gene isolation, successful macroarray screening experiments were carried out with probes complementary to eight Passiflora gene sequences available in public databases. BACs harbouring those genes were used in fluorescent in situ hybridizations and unique signals were detected for four BACs in three chromosomes (n = 9). Then, we explored 10,000 BES and identified reads likely to contain repetitive mobile elements (19.6% of all BES), simple sequence repeats and putative proteins, and estimated the GC content (~42%) of the reads. Around 9.6% of all BES were found to have high levels of similarity to plant genes, and ontological terms were assigned to more than half of the sequences analysed (940). The vast majority of the top-hits made by our sequences were to Populus trichocarpa (24.8% of the total occurrences), Theobroma cacao (21.6%), Ricinus communis (14.3%), Vitis vinifera (6.5%) and Prunus persica (3.8%).
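As an illustration of how a GC content figure can be estimated from a set of reads, a minimal sketch follows. It is not the pipeline used in the study; the function name `gc_content` and the example reads are hypothetical, and ambiguous bases such as N are simply ignored, which is one of several possible conventions.

```python
def gc_content(reads):
    """Pooled GC fraction over a collection of DNA reads.
    Only A, C, G and T are counted; ambiguous bases (e.g. N) are ignored."""
    gc = 0
    at = 0
    for read in reads:
        seq = read.upper()
        gc += seq.count("G") + seq.count("C")
        at += seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("no unambiguous bases found")
    return gc / (gc + at)

# Hypothetical reads, not real BAC-end sequences.
example_reads = ["ATGC", "GGNN", "atat"]
fraction = gc_content(example_reads)
```

Pooling counts over all reads weights each base equally; averaging per-read fractions instead would weight each read equally and can give a slightly different value.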
We generated the first large-insert library for a member of Passifloraceae. This BAC library provides a new resource for genetic and genomic studies and represents a valuable tool for future whole genome studies. Remarkably, a number of BAC-end pair sequences could be mapped to intervals of the sequenced Arabidopsis thaliana, V. vinifera and P. trichocarpa chromosomes, and putative collinear microsyntenic regions were identified.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-816) contains supplementary material, which is available to authorized users.
Passiflora; Passion fruit; Genomics; BAC-end sequencing; Repetitive elements; Gene content; Microsynteny; Fluorescent in situ hybridization
Comparative genome mapping studies in Rosaceae have been conducted until now by aligning genetic maps within the same genus, or closely related genera and using a limited number of common markers. The growing body of genomics resources and sequence data for both Prunus and Fragaria permits detailed comparisons between these genera and the recently released Malus × domestica genome sequence.
We generated a comparative analysis using 806 molecular markers that are anchored genetically to the Prunus and/or Fragaria reference maps, and physically to the Malus genome sequence. Malus and Prunus had 784 markers in common, and Malus and Fragaria had 148. The correspondence between marker positions was high and conserved syntenic blocks were identified among the three genera in the Rosaceae. We reconstructed a proposed ancestral genome for the Rosaceae.
A genome containing nine chromosomes is the most likely candidate for the ancestral Rosaceae progenitor. The number of chromosomal translocations observed between the three genera investigated was low. However, the number of inversions identified among Malus and Prunus was much higher than any reported genome comparisons in plants, suggesting that small inversions have played an important role in the evolution of these two genera or of the Rosaceae.
Despite the agronomical importance and high synteny with other Prunus species, breeding improvements for cherry have been slow compared to other temperate fruits, such as apple or peach. However, the recent release of the peach genome v1.0 by the International Peach Genome Initiative and the sequencing of cherry accessions to identify Single Nucleotide Polymorphisms (SNPs) provide an excellent basis for the advancement of cherry genetic and genomic studies. The availability of dense genetic linkage maps in phenotyped segregating progenies would be a valuable tool for breeders and geneticists. Using two sweet cherry (Prunus avium L.) intra-specific progenies derived from crosses between ‘Black Tartarian’ × ‘Kordia’ (BT×K) and ‘Regina’ × ‘Lapins’ (R×L), high-density genetic maps of the four parental lines and the two segregating populations were constructed. For BT×K and R×L, 89 and 121 F1 plants were used for linkage mapping, respectively. A total of 5,696 SNP markers were tested in each progeny. As a result of these analyses, 723 and 687 markers were mapped into eight linkage groups (LGs) in BT×K and R×L, respectively. The resulting maps spanned 752.9 and 639.9 cM with an average distance of 1.1 and 0.9 cM between adjacent markers in BT×K and R×L, respectively. The maps displayed high synteny and co-linearity between each other, with the Prunus bin map, and with the peach genome v1.0 for all eight LGs (LG1–LG8). These maps provide a useful tool for investigating traits of interest in sweet cherry and represent a qualitative advance in the understanding of the cherry genome and its synteny with other members of the Rosaceae family.
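The reported average distances between adjacent markers can be reproduced with simple arithmetic, under the assumption (made here, not stated in the text) that the number of intervals equals the number of markers minus the number of linkage groups. The function name `mean_adjacent_distance` is hypothetical.

```python
def mean_adjacent_distance(map_length_cm, n_markers, n_linkage_groups):
    """Average distance between adjacent markers, in cM.
    Assumes each linkage group with k markers contributes k - 1 intervals,
    so the total number of intervals is n_markers - n_linkage_groups."""
    intervals = n_markers - n_linkage_groups
    if intervals <= 0:
        raise ValueError("need more markers than linkage groups")
    return map_length_cm / intervals

# Map lengths and marker counts as given in the text above.
btk = mean_adjacent_distance(752.9, 723, 8)  # BT x K map
rl = mean_adjacent_distance(639.9, 687, 8)   # R x L map
```

Under this assumption the two values round to 1.1 and 0.9 cM. Dividing map length by the raw marker count instead gives about 1.0 cM for BT×K, so the interval-based definition appears to match the reported figures more closely.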
Coffee trees (Rubiaceae) and tomato (Solanaceae) belong to the Asterid clade, while grapevine (Vitaceae) belongs to the Rosid clade. Coffee and tomato separated from grapevine 125 million years ago, while coffee and tomato diverged 83-89 million years ago. These long periods of divergent evolution should have permitted the genomes to reorganize significantly. So far, very few comparative mappings have been performed between very distantly related species belonging to different clades. We report the first multiple comparison between species from Asterid and Rosid clades, to examine both macro-and microsynteny relationships.
Thanks to a set of 867 COSII markers, macrosynteny was detected between coffee, tomato and grapevine. While coffee and tomato genomes share 318 orthologous markers and 27 conserved syntenic segments (CSSs), coffee and grapevine also share a similar number of syntenic markers and CSSs: 299 and 29 respectively. Despite large genome macrostructure reorganization, several large chromosome segments showed outstanding macrosynteny shedding new insights into chromosome evolution between Asterids and Rosids. We also analyzed a sequence of 174 kb containing the ovate gene, conserved in a syntenic block between coffee, tomato and grapevine that showed a high-level of microstructure conservation. A higher level of conservation was observed between coffee and grapevine, both woody and long life-cycle plants, than between coffee and tomato. Out of 16 coffee genes of this syntenic segment, 7 and 14 showed complete synteny between coffee and tomato or grapevine, respectively.
These results show that significant conservation is found between distantly related species from the Asterid (Coffea canephora and Solanum sp.) and Rosid (Vitis vinifera) clades, at the genome macrostructure and microstructure levels. At the ovate locus, conservation did not decline in relation to increasing phylogenetic distance, suggesting that the time factor alone does not explain divergences. Our results should be useful for syntenic studies between supposedly remote species and for the isolation of agronomically important genes.
Comparative genomics; Synteny; Genome evolution; Coffea; Vitis; Solanum
Recent phylogenetic analyses have identified Amborella trichopoda, an understory tree species endemic to the forests of New Caledonia, as sister to a clade including all other known flowering plant species. The Amborella genome is a unique reference for understanding the evolution of angiosperm genomes because it can serve as an outgroup to root comparative analyses. A physical map, BAC end sequences and sample shotgun sequences provide a first view of the 870 Mbp Amborella genome.
Analysis of Amborella BAC ends sequenced from each contig suggests that the density of long terminal repeat retrotransposons is negatively correlated with that of protein coding genes. Syntenic, presumably ancestral, gene blocks were identified in comparisons of the Amborella BAC contigs and the sequenced Arabidopsis thaliana, Populus trichocarpa, Vitis vinifera and Oryza sativa genomes. Parsimony mapping of the loss of synteny corroborates previous analyses suggesting that the rate of structural change has been more rapid on lineages leading to Arabidopsis and Oryza compared with lineages leading to Populus and Vitis. The gamma paleohexaploidy event identified in the Arabidopsis, Populus and Vitis genomes is shown to have occurred after the divergence of all other known angiosperms from the lineage leading to Amborella.
When placed in the context of a physical map, BAC end sequences representing just 5.4% of the Amborella genome have facilitated reconstruction of gene blocks that existed in the last common ancestor of all flowering plants. The Amborella genome is an invaluable reference for inferences concerning the ancestral angiosperm and subsequent genome evolution.
Through the diversity of cytokinin regulated processes, this phytohormone has a profound impact on plant growth and development. Cytokinin signaling is involved in the control of apical and lateral meristem activity, branching pattern of the shoot, and leaf senescence. These processes influence several traits, including the stem diameter, shoot architecture, and perennial life cycle, which define the development of woody plants. To facilitate research about the role of cytokinin in regulation of woody plant development, we have identified genes associated with cytokinin signaling and homeostasis pathways from two hardwood tree species.
Taking advantage of the sequenced black cottonwood (Populus trichocarpa) and peach (Prunus persica) genomes, we have compiled a comprehensive list of genes involved in these pathways. We identified genes belonging to the six families of cytokinin oxidases (CKXs), isopentenyl transferases (IPTs), LONELY GUY genes (LOGs), two-component receptors, histidine containing phosphotransmitters (HPts), and response regulators (RRs). All together 85 Populus and 45 Prunus genes were identified, and compared to their Arabidopsis orthologs through phylogenetic analyses.
In general, when compared to Arabidopsis, differences in gene family structure were often seen in only one of the two tree species. However, one class of genes associated with cytokinin signal transduction, the CKI1-like family of two-component histidine kinases, was larger in both Populus and Prunus than in Arabidopsis.
Cytokinin signaling; Cytokinin homeostasis; Populus trichocarpa; Black cottonwood; Prunus persica; Peach
Fruits from several species of the Rosaceae family are reported to cause allergic reactions in certain populations. The allergens identified belong mainly to four protein families: pathogenesis related 10 proteins, thaumatin-like proteins, lipid transfer proteins and profilins. These families of putative allergen genes in apple (Mal d 1 to 4) have been mapped on linkage maps, and subsequent genetic studies on allelic diversity and hypoallergenic traits have been carried out recently. In peach (Prunus persica), these allergen gene families are denoted as Pru p 1 to 4 and in almond (Prunus dulcis) as Pru du 1 to 4. Genetic analysis using current molecular tools may be helpful to establish the cause of allergenicity differences observed among different peach cultivars. This study aimed to characterize putative peach allergen genes for their genomic sequences and linkage map positions, and to compare them with previously characterized homologous genes in apple (Malus domestica).
Eight Pru p/du 1 genes were identified, four of which were new. All the Pru p/du 1 genes were mapped in a single bin on the top of linkage group 1 (G1). Five Pru p/du 2 genes were mapped on four different linkage groups: two very similar Pru p/du 2.01 genes (A and B) were on G3, Pru p/du 2.02 on G7, Pru p/du 2.03 on G8 and Pru p/du 2.04 on G1. There were differences in the intron and exon structure in these Pru p/du 2 genes and in their amino acid composition. Three Pru p/du 3 genes (3.01–3.03) containing an intron and a mini exon of 10 nt were mapped in a cluster on G6. Two Pru p/du 4 genes (Pru p/du 4.01 and 4.02) were located on G1 and G7, respectively. The Pru p/du 1 cluster on G1 aligned to the Mal d 1 clusters on LG16; Pru p/du 2.01A and B on G3 to Mal d 2.01A and B on LG9; the Pru p/du 3 cluster on G6 to Mal d 3.01 on LG12; Pru p/du 4.01 on G1 to Mal d 4.03 on LG2; and Pru p/du 4.02 on G7 to Mal d 4.02 on LG2.
A total of 18 putative peach/almond allergen genes have been mapped on five linkage groups. Their positions confirm the high macro-synteny between peach/almond and apple. The insight gained will help to identify key genes causing differences in allergenicity among different cultivars of peach and other Prunus species.
Expressed sequence tags (ESTs) are an important source of gene-based markers such as those based on insertion-deletions (Indels) or single-nucleotide polymorphisms (SNPs). Several gel based methods have been reported for the detection of sequence variants, however they have not been widely exploited in common bean, an important legume crop of the developing world. The objectives of this project were to develop and map EST based markers using analysis of single strand conformation polymorphisms (SSCPs), to create a transcript map for common bean and to compare synteny of the common bean map with sequenced chromosomes of other legumes.
A set of 418 EST based amplicons were evaluated for parental polymorphisms using the SSCP technique and 26% of these presented a clear conformational or size polymorphism between Andean and Mesoamerican genotypes. The amplicon based markers were then used for genetic mapping with segregation analysis performed in the DOR364 × G19833 recombinant inbred line (RIL) population. A total of 118 new marker loci were placed into an integrated molecular map for common bean consisting of 288 markers. Of these, 218 were used for synteny analysis and 186 presented homology with segments of the soybean genome with an e-value lower than 7 × 10⁻¹². The synteny analysis with soybean showed a mosaic pattern of syntenic blocks with most segments of any one common bean linkage group associated with two soybean chromosomes. The analysis with Medicago truncatula and Lotus japonicus presented fewer syntenic regions, consistent with the more distant phylogenetic relationship between the galegoid and phaseoloid legumes.
The SSCP technique is a useful and inexpensive alternative to other SNP or Indel detection techniques for saturating the common bean genetic map with functional markers that may be useful in marker assisted selection. In addition, the genetic markers based on ESTs allowed the construction of a transcript map and given their high conservation between species allowed synteny comparisons to be made to sequenced genomes. This synteny analysis may support positional cloning of target genes in common bean through the use of genomic information from these other legumes.
Loss of pollen-S function in Prunus self-compatible cultivars has been mostly associated with deletions or insertions in the S-haplotype-specific F-box (SFB) genes. However, self-compatible pollen-part mutants defective for non-S-locus factors have also been found, for instance, in the apricot (Prunus armeniaca) cv. ‘Canino’. In the present study, we report the genetic and molecular analysis of another self-compatible apricot cv. termed ‘Katy’. S-genotype of ‘Katy’ was determined as S1S2 and S-RNase PCR-typing of selfing and outcrossing populations from ‘Katy’ showed that pollen gametes bearing either the S1- or the S2-haplotype were able to overcome self-incompatibility (SI) barriers. Sequence analyses showed no SNP or indel affecting the SFB1 and SFB2 alleles from ‘Katy’ and, moreover, no evidence of pollen-S duplication was found. As a whole, the obtained results are compatible with the hypothesis that the loss-of-function of a S-locus unlinked factor gametophytically expressed in pollen (M’-locus) leads to SI breakdown in ‘Katy’. A mapping strategy based on segregation distortion loci mapped the M’-locus within an interval of 9.4 cM at the distal end of chr.3 corresponding to ∼1.29 Mb in the peach (Prunus persica) genome. Interestingly, pollen-part mutations (PPMs) causing self-compatibility (SC) in the apricot cvs. ‘Canino’ and ‘Katy’ are located within an overlapping region of ∼273 Kb in chr.3. No evidence is yet available to discern if they affect the same gene or not, but molecular markers seem to indicate that both cultivars are genetically unrelated suggesting that every PPM may have arisen independently. Further research will be necessary to reveal the precise nature of ‘Katy’ PPM, but fine-mapping already enables SC marker-assisted selection and paves the way for future positional cloning of the underlying gene.
The woodland strawberry, Fragaria vesca (2n = 2x = 14), is a versatile experimental plant system. This diminutive herbaceous perennial has a small genome (240 Mb), is amenable to genetic transformation and shares substantial sequence identity with the cultivated strawberry (Fragaria × ananassa) and other economically important rosaceous plants. Here we report the draft F. vesca genome, which was sequenced to 39× coverage using second-generation technology, assembled de novo and then anchored to the genetic linkage map into seven pseudochromosomes. This diploid strawberry sequence lacks the large genome duplications seen in other rosids. Gene prediction modeling identified 34,809 genes, most of which are supported by transcriptome mapping. Genes critical to valuable horticultural traits including flavor, nutritional value and flowering time were identified. Macrosyntenic relationships between Fragaria and Prunus predict a hypothetical ancestral Rosaceae genome that had nine chromosomes. New phylogenetic analysis of 154 protein-coding genes suggests that assignment of Populus to Malvidae, rather than Fabidae, is warranted.
Rosaceae include numerous economically important and morphologically diverse species. Comparative mapping between the member species in Rosaceae has indicated some level of synteny. Recently, the whole genomes of three crop species, peach, apple and strawberry, which belong to different genera of the Rosaceae family, have been sequenced, allowing in-depth comparison of these genomes.
Our analysis using the whole genome sequences of peach, apple and strawberry identified 1399 orthologous regions between the three genomes, with a mean length of around 100 kb. Each peach chromosome showed major orthology mostly to one strawberry chromosome, but to more than two apple chromosomes, suggesting that the apple genome went through more chromosomal fissions in addition to the whole genome duplication after the divergence of the three genera. However, the distribution of contiguous ancestral regions, identified using the multiple genome rearrangements and ancestors (MGRA) algorithm, suggested that the Fragaria genome went through a greater number of small-scale rearrangements than the other genomes since they diverged from a common ancestor. Using the contiguous ancestral regions, we reconstructed a hypothetical ancestral genome for the Rosaceae composed of nine chromosomes and propose the evolutionary steps from the ancestral genome to the extant Fragaria, Prunus and Malus genomes.
Our analysis shows that different modes of evolution may have played major roles in different subfamilies of Rosaceae. The hypothetical ancestral genome of Rosaceae and the evolutionary steps that led to three different lineages of Rosaceae will facilitate our understanding of plant genome evolution and have a practical impact on knowledge transfer among member species of Rosaceae.
Rosaceae; Comparative genomics; Evolution
Plastids are actively involved in numerous plant processes critical to growth, development and adaptation. They play a primary role in photosynthesis, pigment and monoterpene synthesis, gravity sensing, starch and fatty acid synthesis, as well as oil and protein storage. We applied two complementary methods to analyze the recently published apple genome (Malus × domestica) to identify putative plastid-targeted proteins, the first using TargetP and the second using a custom workflow built on a set of predictive programs. Apple shares roughly 40% of its 10,492 putative plastid-targeted proteins with the Arabidopsis (Arabidopsis thaliana) plastid-targeted proteome as identified by the Chloroplast 2010 project, and ∼57% of its entire proteome with Arabidopsis. This suggests that the plastid-targeted proteomes of apple and Arabidopsis differ, and points to differential targeting of homologs between the two species. Co-expression analysis of 2,224 genes encoding putative plastid-targeted apple proteins suggests that they play a role in plant developmental and intermediary metabolism. Further, an inter-specific comparison of Arabidopsis, Prunus persica (Peach), Malus × domestica (Apple), Populus trichocarpa (Black cottonwood), Fragaria vesca (Woodland Strawberry), Solanum lycopersicum (Tomato) and Vitis vinifera (Grapevine) also identified a large number of novel species-specific plastid-targeted proteins. This analysis also revealed the presence of alternatively targeted homologs across species. Two separate analyses revealed that a small subset of proteins, one representing 289 protein clusters and the other 737 unique protein sequences, is conserved among the seven angiosperm plastid-targeted proteomes. The majority of the novel proteins were annotated to play roles in stress response, transport, catabolic processes, and cellular component organization.
Our results suggest that the current state of knowledge regarding plastid biology, which is based largely on model systems, is deficient. New plant genomes are expected to enable the identification of potentially new plastid-targeted proteins that will aid in studying novel roles of plastids.
The narrow-leafed lupin, Lupinus angustifolius L., is a grain legume species with a relatively compact genome. The species has 2n = 40 chromosomes and its genome size is 960 Mbp/1C. During the last decade, L. angustifolius genomic studies have achieved several milestones, such as molecular-marker development, linkage maps, and bacterial artificial chromosome (BAC) libraries. Here, these resources were integratively used to identify and sequence two gene-rich regions (GRRs) of the genome.
The genome was screened with a probe representing the sequence of a microsatellite fragment length polymorphism (MFLP) marker linked to Phomopsis stem blight resistance. BAC clones selected by hybridization were subjected to restriction fingerprinting and contig assembly, and 232 BAC-ends were sequenced and annotated. BAC fluorescence in situ hybridization (BAC-FISH) identified eight single-locus clones. Based on physical mapping, cytogenetic localization, and BAC-end annotation, five clones were chosen for sequencing. Within the sequences of clones that hybridized in FISH to a single locus, two large GRRs were identified. The GRRs showed strong and conserved synteny to duplicated regions of the Glycine max genome, illustrated by both identical gene order and parallel orientation. In contrast, in the clones with dispersed FISH signals, more than one-third of sequences were transposable elements. Sequenced, single-locus clones were used to develop 12 genetic markers, increasing the number of L. angustifolius chromosomes linked to appropriate linkage groups by five pairs.
In general, probes originating from MFLP sequences can assist genome screening and gene discovery. However, such probes are not useful for positional cloning, because they tend to hybridize to numerous loci. GRRs identified in L. angustifolius contained a low number of interspersed repeats and had a high level of synteny to the genome of the model legume G. max. Our results showed that not only was the gene nucleotide sequence conserved between soybean and lupin GRRs, but the order and orientation of particular genes in syntenic blocks was homologous, as well. These findings will be valuable to the forthcoming sequencing of the lupin genome.
Narrow-leafed lupin; Glycine max; MFLP; Genome mapping; contigs; DNA sequencing; Synteny; BAC-FISH
Mei (Prunus mume Sieb. et Zucc.) is a famous ornamental plant and fruit crop grown in East Asian countries. Limited genetic resources, especially molecular markers, have hindered the progress of mei breeding projects. Here, we performed low-depth whole-genome sequencing of Prunus mume ‘Fenban’ and Prunus mume ‘Kouzi Yudie’ to identify high-quality polymorphic markers between the two cultivars on a large scale.
A total of 1464.1 Mb and 1422.1 Mb of sequencing data from ‘Fenban’ and ‘Kouzi Yudie’, respectively, were uniquely mapped to the mei reference genome with about 6-fold coverage. We detected a large number of putative polymorphic markers from the 196.9 Mb of sequencing data shared by the two cultivars, which together contained 200,627 SNPs, 4,900 InDels, and 7,063 SSRs. Among these markers, 38,773 SNPs, 174 InDels, and 418 SSRs were distributed in the 22.4 Mb CDS region, and 63.0% of these marker-containing CDS sequences were assigned to GO terms. Subsequently, 670 selected SNPs were validated using Agilent’s SureSelect solution-phase hybridization assay. A subset of 599 SNPs was used to assess the genetic similarity of a panel of mei germplasm samples and a plum (P. salicina) cultivar, producing a set of informative diversity data. We also analyzed the frequency and distribution of detected InDels and SSRs in the mei genome and validated their usefulness as DNA markers. These markers were successfully amplified in the cultivars and in their segregating progeny.
A large set of high-quality polymorphic SNPs, InDels, and SSRs were identified in parallel between ‘Fenban’ and ‘Kouzi Yudie’ using low-depth whole-genome sequencing. The study presents extensive data on these polymorphic markers, which can be useful for constructing high-resolution genetic maps, performing genome-wide association studies, and designing genomic selection strategies in mei.
Low-depth genome sequencing; SNPs; InDels; SSRs; SNP array
It is difficult to accurately interpret chromosomal correspondences such as true orthology and paralogy due to significant divergence of genomes from a common ancestor. Analyses are particularly problematic among lineages that have repeatedly experienced whole genome duplication (WGD) events. To compare multiple "subgenomes" derived from genome duplications, we need to relax the traditional requirements of "one-to-one" syntenic matchings of genomic regions in order to reflect "one-to-many" or more generally "many-to-many" matchings. However, this relaxation may result in the identification of synteny blocks that are derived from ancient shared WGDs that are not of interest. For many downstream analyses, we need to eliminate weak, low-scoring alignments from pairwise genome comparisons. Our goal is to objectively select a subset of synteny blocks whose total score is maximized while respecting the duplication history of the genomes being compared. We call this "quota-based" screening of synteny blocks, because it fills a quota of syntenic relationships within one genome or between two genomes that have had WGD events.
We have formulated the synteny block screening as an optimization problem known as "Binary Integer Programming" (BIP), which is solved using existing linear programming solvers. The computer program QUOTA-ALIGN performs this task by creating a clear objective function that maximizes the compatible set of synteny blocks under given constraints on overlaps and depths (corresponding to the duplication history in respective genomes). Such a procedure is useful for any pairwise synteny alignments, but is most useful in lineages affected by multiple WGDs, like plants or fish lineages. For example, there should be a 1:2 ploidy relationship between genome A and B if genome B had an independent WGD subsequent to the divergence of the two genomes. We show through simulations and real examples using plant genomes in the rosid superorder that the quota-based screening can eliminate ambiguous synteny blocks and focus on specific genomic evolutionary events, like the divergence of lineages (in cross-species comparisons) and the most recent WGD (in self comparisons).
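The screening problem described above can be illustrated on a toy instance. The following is a minimal sketch, not the QUOTA-ALIGN implementation: it assumes each synteny block is an interval on each genome with a score, it interprets the quota as the maximum number of selected blocks allowed to cover any position on a genome axis, and it replaces the linear programming solver with exhaustive enumeration of all binary selections, which is only feasible for a handful of blocks. The names `Block`, `max_depth`, `quota_screen` and the example coordinates are hypothetical.

```python
from typing import NamedTuple


class Block(NamedTuple):
    name: str
    a_start: int  # interval covered on genome A
    a_end: int
    b_start: int  # interval covered on genome B
    b_end: int
    score: int


def max_depth(intervals):
    """Largest number of closed intervals covering any single point.

    The maximum is always reached at some interval start, so only
    start points need to be checked.
    """
    return max(
        (sum(1 for s, e in intervals if s <= point <= e) for point, _ in intervals),
        default=0,
    )


def quota_screen(blocks, quota_a, quota_b):
    """Pick the subset of blocks with maximal total score such that the
    coverage depth never exceeds quota_a on genome A or quota_b on genome B.

    Each bit of `mask` plays the role of one binary decision variable.
    """
    best_names, best_score = (), 0
    for mask in range(1 << len(blocks)):
        chosen = [b for i, b in enumerate(blocks) if mask >> i & 1]
        if max_depth([(b.a_start, b.a_end) for b in chosen]) > quota_a:
            continue
        if max_depth([(b.b_start, b.b_end) for b in chosen]) > quota_b:
            continue
        score = sum(b.score for b in chosen)
        if score > best_score:
            best_names, best_score = tuple(b.name for b in chosen), score
    return best_names, best_score


# Toy data: genome B had a WGD after divergence, so a region of A may match
# up to two regions of B (depth 2 on the A axis, depth 1 on the B axis).
BLOCKS = [
    Block("b1", 0, 100, 0, 100, 50),
    Block("b2", 0, 100, 500, 600, 40),   # second copy from the recent WGD
    Block("b3", 0, 100, 900, 1000, 10),  # weak block from an older event
    Block("b4", 200, 300, 50, 150, 20),  # overlaps b1 on genome B
]

print(quota_screen(BLOCKS, quota_a=2, quota_b=1))
```

With these toy blocks, the 1:2 quota keeps `b1` and `b2` and drops the weak block `b3`, which would push the depth on genome A to three, as well as `b4`, which competes with `b1` for the same region of genome B. For realistic numbers of blocks the enumeration is far too slow, which is why a dedicated integer programming solver is the practical choice.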
The QUOTA-ALIGN algorithm screens a set of synteny blocks to retain only those compatible with a user-specified ploidy relationship between two genomes. These blocks, in turn, may be used for additional downstream analyses such as identifying true orthologous regions in interspecific comparisons. There are two major contributions of QUOTA-ALIGN: 1) reducing the block screening task to a BIP problem, which is novel; 2) providing an efficient software pipeline starting from all-against-all BLAST to the screened synteny blocks with dot plot visualizations. Python code and full documentation are publicly available at http://github.com/tanghaibao/quota-alignment. The QUOTA-ALIGN program is also integrated as a major component in SynMap (http://genomevolution.com/CoGe/SynMap.pl), offering easier access to thousands of genomes for non-programmers.
Although a large number of single nucleotide polymorphism (SNP) markers covering the entire genome are needed to enable molecular breeding efforts such as genome wide association studies, fine mapping, genomic selection and marker-assisted selection in peach [Prunus persica (L.) Batsch] and related Prunus species, only a limited number of genetic markers, including simple sequence repeats (SSRs), have been available to date. To address this need, an international consortium (The International Peach SNP Consortium; IPSC) has pursued a coordinated effort to perform genome-scale SNP discovery in peach using next generation sequencing platforms to develop and characterize a high-throughput Illumina Infinium® SNP genotyping array platform. We performed whole genome re-sequencing of 56 peach breeding accessions using the Illumina and Roche/454 sequencing technologies. Polymorphism detection algorithms identified a total of 1,022,354 SNPs. Validation with the Illumina GoldenGate® assay was performed on a subset of the predicted SNPs, verifying ∼75% of genic (exonic and intronic) SNPs, whereas only about a third of intergenic SNPs were verified. Conservative filtering was applied to arrive at a set of 8,144 SNPs that were included on the IPSC peach SNP array v1, distributed over all eight peach chromosomes with an average spacing of 26.7 kb between SNPs. Use of this platform to screen a total of 709 accessions of peach in two separate evaluation panels identified a total of 6,869 (84.3%) polymorphic SNPs.
The almost 7,000 SNPs verified as polymorphic through extensive empirical evaluation represent an excellent source of markers for future studies in genetic relatedness, genetic mapping, and dissecting the genetic architecture of complex agricultural traits. The IPSC peach SNP array v1 is commercially available and we expect that it will be used worldwide for genetic studies in peach and related stone fruit and nut species.
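The array figures reported above can be cross-checked with a few lines of arithmetic. This sketch uses only numbers stated in the abstract. The "implied span" is simply the number of SNPs multiplied by the reported mean spacing, a rough consistency check rather than a genome size estimate, and the variable names are illustrative.

```python
ARRAY_SNPS = 8_144        # SNPs on the IPSC peach SNP array v1
POLYMORPHIC_SNPS = 6_869  # SNPs found polymorphic across 709 accessions
MEAN_SPACING_KB = 26.7    # reported average spacing between SNPs

# Share of array SNPs that were polymorphic, in percent.
polymorphic_pct = round(100 * POLYMORPHIC_SNPS / ARRAY_SNPS, 1)

# Total span implied by the SNP count and the mean spacing, in Mb.
implied_span_mb = ARRAY_SNPS * MEAN_SPACING_KB / 1000

print(f"polymorphic: {polymorphic_pct}%")
print(f"implied span: {implied_span_mb:.1f} Mb")
```

The first value reproduces the 84.3% given in the text.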
In view of the immense value of Brassica rapa in the fields of agriculture and molecular biology, the multinational Brassica rapa Genome Sequencing Project (BrGSP) was launched in 2003 by five countries. The developing BrGSP has valuable resources for the community, including a reference genetic map and seed BAC sequences. Although the initial B. rapa linkage map served as a reference for the BrGSP, there was ambiguity in reconciling the linkage groups with the ten chromosomes of B. rapa. Consequently, the BrGSP assigned each of the linkage groups to the project members as chromosome substitutes for sequencing.
We identified simple sequence repeat (SSR) motifs in the B. rapa genome with the sequences of seed BACs used for the BrGSP. By testing 749 amplicons containing SSR motifs, we identified polymorphisms that enabled the anchoring of 188 BACs onto the B. rapa reference linkage map consisting of 719 loci in the 10 linkage groups with an average distance of 1.6 cM between adjacent loci. The anchored BAC sequences enabled the identification of 30 blocks of conserved synteny, totaling 534.9 cM in length, between the genomes of B. rapa and Arabidopsis thaliana. Most of these were consistent with previously reported duplication and rearrangement events that differentiate these genomes. However, we were able to identify the collinear regions for seven additional previously uncharacterized sections of the A genome. Integration of the linkage map with the B. rapa cytogenetic map was accomplished by FISH with probes representing 20 BAC clones, along with probes for rDNA and centromeric repeat sequences. This integration enabled unambiguous alignment and orientation of the maps representing the 10 B. rapa chromosomes.
We developed a second generation reference linkage map for B. rapa, which was aligned unambiguously to the B. rapa cytogenetic map. Furthermore, using our data, we confirmed and extended the comparative genome analysis between B. rapa and A. thaliana. This work will serve as a basis for integrating the genetic, physical, and chromosome maps of the BrGSP, as well as for studies on polyploidization, speciation, and genome duplication in the genus Brassica.
The resistance of plants to pathogens relies on two lines of defense: a basal defense response and a pathogen-specific system, in which resistance (R) genes induce defense reactions after detection of pathogen-associated molecular patterns (PAMPs). In the specific system, a so-called arms race has developed in which the emergence of new races of a pathogen leads to the diversification of plant resistance genes to counteract the pathogens’ effect. The mechanism of resistance gene diversification has been elucidated well for short-lived annual species, but data are mostly lacking for long-lived perennial and clonally propagated plants, such as roses. We analyzed the rose black spot resistance gene, Rdr1, in five members of the Rosaceae: Rosa multiflora, Rosa rugosa, Fragaria vesca (strawberry), Malus × domestica (apple) and Prunus persica (peach), and we present the deduced possible mechanism of R-gene diversification.
We sequenced a 340.4-kb region from R. rugosa orthologous to the Rdr1 locus in R. multiflora. Apart from some deletions and rearrangements, the two loci display a high degree of synteny. Additionally, less pronounced synteny is found with an orthologous locus in strawberry but is absent in peach and apple, where genes from the Rdr1 locus are distributed on two different chromosomes. An analysis of 20 TIR-NBS-LRR (TNL) genes obtained from R. rugosa and R. multiflora revealed illegitimate recombination, gene conversion, unequal crossing over, indels, point mutations and transposable elements as mechanisms of diversification.
A phylogenetic analysis of 53 complete TNL genes from the five Rosaceae species revealed that with the exception of some genes from apple and peach, most of the genes occur in species-specific clusters, indicating that recent TNL gene diversification began prior to the split of Rosa from Fragaria in the Rosoideae and peach from apple in the Spiraeoideae and continued after the split in individual species. Sequence similarity of up to 99% is obtained between two R. multiflora TNL paralogs, indicating a very recent duplication.
The mechanisms by which TNL genes from perennial Rosaceae diversify are mainly similar to those from annual plant species. However, most TNL genes appear to be of recent origin, likely due to recent duplications, supporting the hypothesis that TNL genes in woody perennials are generally younger than those from annuals. This recent origin might facilitate the development of new resistance specificities, compensating for longer generation times in woody perennials.