High-density SNP arrays for genome-wide assessment of allelic variation have made high resolution genetic characterization of crop germplasm feasible. A medium density array for apple, the IRSC 8K SNP array, has been successfully developed and used for screens of bi-parental populations. However, the number of robust and well-distributed markers contained on this array was not sufficient to perform genome-wide association analyses in wider germplasm sets, or Pedigree-Based Analysis at high precision, because of rapid decay of linkage disequilibrium. We describe the development of an Illumina Infinium array targeting 20K SNPs. The SNPs were predicted from re-sequencing data derived from the genomes of 13 Malus × domestica apple cultivars and one accession belonging to a crab apple species (M. micromalus). A pipeline for SNP selection was devised that avoided the pitfalls associated with the inclusion of paralogous sequence variants, supported the construction of robust multi-allelic SNP haploblocks and selected up to 11 entries within narrow genomic regions of ±5 kb, termed focal points (FPs). Broad genome coverage was attained by placing FPs at 1 cM intervals on a consensus genetic map, complementing them with FPs to enrich the ends of each of the chromosomes, and by bridging physical intervals greater than 400 kb. The selection also included ∼3.7K validated SNPs from the IRSC 8K array. The array has already been used in other studies where ∼15.8K SNP markers were mapped with an average of ∼6.8K SNPs per full-sib family. The newly developed array with its high density of polymorphic validated SNPs is expected to be of great utility for Pedigree-Based Analysis and Genomic Selection. It will also be a valuable tool to help dissect the genetic mechanisms controlling important fruit quality traits, and to aid the identification of marker-trait associations suitable for the application of Marker Assisted Selection in apple breeding programs.
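The focal-point placement rule described above (one FP per 1 cM, plus extra FPs bridging physical gaps greater than 400 kb) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name place_focal_points, the input layout and the nearest-marker choice are assumptions made for illustration, and the selection of up to 11 SNPs within ±5 kb of each FP is not modelled.

```python
from typing import List, Tuple


def place_focal_points(
    markers: List[Tuple[float, int]],
    step_cm: float = 1.0,
    max_gap_bp: int = 400_000,
) -> List[int]:
    """Return sorted physical positions (bp) of focal points on one chromosome.

    markers: (genetic position in cM, physical position in bp) pairs taken
    from a consensus map. Hypothetical input layout, assumed for this sketch.
    """
    if not markers:
        return []

    # Step 1: walk along the genetic map in 1 cM steps and keep the
    # physical position of the marker closest to each target.
    first_cm = min(m[0] for m in markers)
    last_cm = max(m[0] for m in markers)
    chosen = set()
    target = first_cm
    while target <= last_cm + 1e-9:
        nearest = min(markers, key=lambda m: abs(m[0] - target))
        chosen.add(nearest[1])
        target += step_cm

    # Step 2: bridge physical intervals larger than max_gap_bp by adding
    # evenly stepped focal points inside the gap.
    ordered = sorted(chosen)
    bridged = [ordered[0]]
    for pos in ordered[1:]:
        prev = bridged[-1]
        while pos - prev > max_gap_bp:
            prev += max_gap_bp
            bridged.append(prev)
        bridged.append(pos)
    return bridged


# Toy chromosome: three map markers, with a 900 kb physical gap
# between the second and third.
toy_map = [(0.0, 100_000), (1.0, 300_000), (2.0, 1_200_000)]
focal_points = place_focal_points(toy_map)
# focal_points == [100000, 300000, 700000, 1100000, 1200000]
```

In the toy example the two bridging FPs at 700 kb and 1.1 Mb are added purely because of the physical gap; a real design would also enrich the chromosome ends, as the abstract notes.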
Fruit quality features resulting from ripening processes need to be preserved throughout storage for economic reasons. However, during this period several physiological disorders can occur, of which superficial scald is one of the most important, due to the development of large brown areas on the fruit skin surface.
This study examined the variation in polyphenolic content with the progress of superficial scald in apple, also with respect to 1-MCP, an ethylene competitor interacting with the hormone receptors and known to interfere with this etiology. The change in the accumulation of these metabolites was further correlated with the gene set involved in this pathway, together with two specific VOCs (Volatile Organic Compounds), α-farnesene and its oxidative form, 6-methyl-5-hepten-2-one. Metabolite profiling and qRT-PCR assay showed these volatiles are more heavily involved in the signalling system, while the browning coloration would seem to be due more to a specific accumulation of chlorogenic acid (as a consequence of the activation of MdPAL and MdC3H), and its further oxidation carried out by a polyphenol oxidase gene (MdPPO). In this physiological scenario, new evidence regarding the involvement of an anti-apoptotic regulatory mechanism for the compartmentation of this phenomenon in the skin alone was also hypothesized, as suggested by the expression profile of the MdDAD1, MdDND1 and MdLSD1 genes.
The results presented in this work represent a step forward in understanding the physiological mechanisms of superficial scald in apple, shedding light on the regulation of the specific physiological cascade.
Malus domestica; Cold storage; Postharvest; Superficial scald; 1-MCP; Polyphenol oxidase; Polyphenols; α-farnesene; Programmed cell death
Next-generation DNA sequencing (NGS) produces vast amounts of DNA sequence data, but it is not specifically designed to generate data suitable for genetic mapping. Recently developed DNA library preparation methods for NGS have helped solve this problem, however, by combining the use of reduced representation libraries with DNA sample barcoding to generate genome-wide genotype data from a common set of genetic markers across a large number of samples. Here we use such a method, called genotyping-by-sequencing (GBS), to produce a data set for genetic mapping in an F1 population of apples (Malus × domestica) segregating for skin color. We show that GBS produces a relatively large, but extremely sparse, genotype matrix: over 270,000 SNPs were discovered but most SNPs have too much missing data across samples to be useful for genetic mapping. After filtering for genotype quality and missing data, only 6% of the 85 million DNA sequence reads contributed to useful genotype calls. Despite this limitation, using existing software and a set of simple heuristics, we generated a final genotype matrix containing 3967 SNPs from 89 DNA samples from a single lane of Illumina HiSeq and used it to create a saturated genetic linkage map and to identify a known QTL underlying apple skin color. We therefore demonstrate that GBS is a cost-effective method for generating genome-wide SNP data suitable for genetic mapping in a highly diverse and heterozygous agricultural species. We anticipate future improvements to the GBS analysis pipeline presented here that will enhance the utility of next-generation DNA sequence data for the purposes of genetic mapping across diverse species.
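The missing-data filtering step mentioned above can be sketched as follows. This is a minimal illustration, not the authors' heuristics: the function names, the dictionary layout (SNP identifier mapped to one call per sample, with None marking a missing call) and the 20% default threshold are all assumptions.

```python
from typing import Dict, List, Optional

Call = Optional[int]  # e.g. 0, 1, 2 alternate-allele copies; None = missing


def missing_fraction(calls: List[Call]) -> float:
    """Fraction of samples without a genotype call for one SNP."""
    if not calls:
        return 1.0
    return sum(c is None for c in calls) / len(calls)


def filter_snps(genotypes: Dict[str, List[Call]], max_missing: float = 0.2) -> List[str]:
    """Keep SNPs whose missing-call fraction does not exceed max_missing.

    The threshold is a hypothetical default chosen for this sketch.
    """
    return [
        snp for snp, calls in genotypes.items()
        if missing_fraction(calls) <= max_missing
    ]


# Toy genotype matrix: three SNPs scored in four samples.
toy = {
    "snp1": [0, 1, None, 1],           # 25% missing
    "snp2": [None, None, None, 1],     # 75% missing
    "snp3": [0, 0, 1, 2],              # complete
}
kept = filter_snps(toy, max_missing=0.25)
# kept == ["snp1", "snp3"]
```

With a sparse GBS matrix, a filter of this kind discards most discovered SNPs, which is consistent with the reduction from over 270,000 SNPs to 3967 reported above.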
next-generation DNA sequencing; genotyping-by-sequencing; apple; Malus; QTL; SNP
We present a draft assembly of the genome of European pear (Pyrus communis) ‘Bartlett’. Our assembly was developed employing second generation sequencing technology (Roche 454), from single-end, 2 kb, and 7 kb insert paired-end reads using Newbler (version 2.7). It contains 142,083 scaffolds greater than 499 bases (maximum scaffold length of 1.2 Mb) and covers a total of 577.3 Mb, representing most of the expected 600 Mb Pyrus genome. A total of 829,823 putative single nucleotide polymorphisms (SNPs) were detected using re-sequencing of ‘Louise Bonne de Jersey’ and ‘Old Home’. A total of 2,279 genetically mapped SNP markers anchor 171 Mb of the assembled genome. Ab initio gene prediction combined with prediction based on homology searching detected 43,419 putative gene models. Of these, 1219 proteins (556 clusters) are unique to European pear compared to 12 other sequenced plant genomes. Analysis of the expansin gene family provided an example of the quality of the gene prediction and an insight into the relationships among one class of cell wall related genes that control fruit softening in both European pear and apple (Malus×domestica). The ‘Bartlett’ genome assembly v1.0 (http://www.rosaceae.org/species/pyrus/pyrus_communis/genome_v1.0) is an invaluable tool for identifying the genetic control of key horticultural traits in pear and will enable the wide application of marker-assisted and genomic selection that will enhance the speed and efficiency of pear cultivar development.
Analyzing genome structure in different species provides insight into the evolution of plant genome size. Olive (Olea europaea L.) has a medium-sized haploid genome of 1.4 Gb, whose structure is largely uncharacterized, despite the growing importance of this tree as an oil crop. Next-generation sequencing technologies and different computational procedures have been used to study the composition of the olive genome and its repetitive fraction. A total of 2.03 and 2.3 genome equivalents of Illumina and 454 reads from genomic DNA, respectively, were assembled following different procedures, which produced more than 200,000 differently redundant contigs, with a mean length greater than 1,000 nt. Mapping Illumina reads onto the assembled sequences was used to estimate their redundancy. The genome data set was subdivided into highly and medium redundant and nonredundant contigs. By combining identification and mapping of repeated sequences, it was established that tandem repeats represent a very large portion of the olive genome (∼31% of the whole genome), consisting of six main families of different length, two of which were first discovered in these experiments. The other large redundant class in the olive genome is represented by transposable elements (especially long terminal repeat-retrotransposons). On the whole, the results of our analyses show the peculiar landscape of the olive genome, related to the massive amplification of tandem repeats, more than that reported for any other sequenced plant genome.
genome landscape; Olea europaea; repetitive DNA; tandem repeats; retrotransposons; assembly of NGS reads
The family of resistance gene analogues (RGAs) with a nucleotide-binding site (NBS) domain accounts for the largest number of disease resistance genes and is one of the largest gene families in plants. We have identified 868 RGAs in the genome of the apple (Malus × domestica Borkh.) cultivar ‘Golden Delicious’. This represents 1.51% of the total number of predicted genes for this cultivar. Several evolutionary features are pronounced in M. domestica, including a high fraction (80%) of RGAs occurring in clusters. This suggests frequent tandem duplication and ectopic translocation events. Of the identified RGAs, 56% are located preferentially on six chromosomes (Chr 2, 7, 8, 10, 11, and 15), and 25% are located on Chr 2. TIR-NBS and non-TIR-NBS classes of RGAs are primarily exclusive of different chromosomes, and 99% of non-TIR-NBS RGAs are located on Chr 11. A phylogenetic reconstruction was conducted to study the evolution of RGAs in the Rosaceae family. More than 1400 RGAs were identified in six species based on their NBS domain, and a neighbor-joining analysis was used to reconstruct the phylogenetic relationships among the protein sequences. Specific phylogenetic clades were found for RGAs of Malus, Fragaria, and Rosa, indicating genus-specific evolution of resistance genes. However, strikingly similar RGAs were shared in Malus, Pyrus, and Prunus, indicating high conservation of specific RGAs and suggesting a monophyletic origin of these three genera.
In terms of the quality of minimally processed fruit, flesh browning is fundamentally important in the development of an aesthetically unpleasant appearance, with consequent off-flavours. The development of browning depends on the enzymatic action of polyphenol oxidase (PPO). In the ‘Golden Delicious’ apple genome ten PPO genes were initially identified and located on three main chromosomes (2, 5 and 10). Of these genes, one element in particular, here called Md-PPO, located on chromosome 10, was further investigated and genetically mapped in two apple progenies (‘Fuji x Pink Lady’ and ‘Golden Delicious x Braeburn’). Both linkage maps, made up of 481 and 608 markers respectively, were then employed to find QTL regions associated with fruit flesh browning, allowing the detection of 25 QTLs related to several browning parameters. These were distributed over six linkage groups, with LOD values ranging from 3.08 to 4.99, and explained from 26.1 to 38.6% of the phenotypic variance. Anchoring of these intervals to the apple genome led to the identification of several genes involved in polyphenol synthesis and cell wall metabolism. Finally, the expression profile of two specific candidate genes, upstream and downstream of the polyphenolic pathway, namely phenylalanine ammonia lyase (PAL) and polyphenol oxidase (PPO), provided insight into flesh browning physiology. Md-PPO was further analyzed and two haplotypes were characterised and associated with fruit flesh browning in apple.
We have used next-generation sequencing (NGS) technologies to identify single nucleotide polymorphism (SNP) markers from three European pear (Pyrus communis L.) cultivars and subsequently developed a subset of 1096 pear SNPs into high throughput markers by combining them with the set of 7692 apple SNPs on the IRSC apple Infinium® II 8K array. We then evaluated this apple and pear Infinium® II 9K SNP array for large-scale genotyping in pear across several species, using both pear and apple SNPs. The populations employed for array validation included a segregating population of European pear (‘Old Home’×‘Louise Bon Jersey’) and four interspecific breeding families derived from Asian (P. pyrifolia Nakai and P. bretschneideri Rehd.) and European pear pedigrees. In total, we mapped 857 polymorphic pear markers to construct the first SNP-based genetic maps for pear, comprising 78% of the total pear SNPs included in the array. In addition, 1031 SNP markers derived from apple (13% of the total apple SNPs included in the array) were polymorphic and were mapped in one or more of the pear populations. These results are the first to demonstrate SNP transferability across the genera Malus and Pyrus. Our construction of high density SNP-based and gene-based genetic maps in pear represents an important step towards the identification of chromosomal regions associated with a range of horticultural characters, such as pest and disease resistance, orchard yield and fruit quality.
Second generation sequencing has permitted detailed sequence characterisation at the whole genome level of a growing number of non-model organisms, but the data produced have short read-lengths and biased genome coverage leading to fragmented genome assemblies. The PacBio RS long-read sequencing platform offers the promise of increased read length and unbiased genome coverage and thus the potential to produce genome sequence data of a finished quality containing fewer gaps and longer contigs. However, these advantages come at a much greater cost per nucleotide and with a perceived increase in error-rate. In this investigation, we evaluated the performance of the PacBio RS sequencing platform through the sequencing and de novo assembly of the Potentilla micrantha chloroplast genome.
Following error-correction, a total of 28,638 PacBio RS reads were recovered with a mean read length of 1,902 bp totalling 54,492,250 nucleotides and representing an average depth of coverage of 320× the chloroplast genome. The dataset covered the entire 154,959 bp of the chloroplast genome in a single contig (100% coverage) compared to seven contigs (90.59% coverage) recovered from the Illumina data, and revealed no bias in coverage of GC rich regions. Post-assembly the data were largely concordant with the Illumina data generated and allowed 187 ambiguities in the Illumina data to be resolved. The additional read length also permitted small differences in the two inverted repeat regions to be assigned unambiguously.
This is the first report to our knowledge of a chloroplast genome assembled de novo using PacBio sequence data. The PacBio RS data generated here were assembled into a single large contig spanning the P. micrantha chloroplast genome, with a higher degree of accuracy than an Illumina dataset generated at a much greater depth of coverage, due to longer read lengths and lower GC bias in the data. The results we present suggest PacBio data will be of immense utility for the development of genome sequence assemblies containing fewer unresolved gaps and ambiguities and a significantly smaller number of contigs than could be produced using short-read sequence data alone.
Third-generation sequencing; NGen; Genomics; Assembly; Annotation; Oxford nanopore; Pacific BioSciences; Roche 454
High throughput arrays for the simultaneous genotyping of thousands of single-nucleotide polymorphisms (SNPs) have made the rapid genetic characterisation of plant genomes and the development of saturated linkage maps a realistic prospect for many plant species of agronomic importance. However, the correct calling of SNP genotypes in divergent polyploid genomes using array technology can be problematic due to paralogy, and to divergence in probe sequences causing changes in probe binding efficiencies. An Illumina Infinium II whole-genome genotyping array was recently developed for the cultivated apple and used to develop a molecular linkage map for an apple rootstock progeny (M432), but a large proportion of segregating SNPs were not mapped in the progeny, due to unexpected genotype clustering patterns. To investigate the causes of this unexpected clustering we performed BLAST analysis of all probe sequences against the ‘Golden Delicious’ genome sequence and discovered evidence for paralogous annealing sites and probe sequence divergence for a high proportion of probes contained on the array. Following visual re-evaluation of the genotyping data generated for 8,788 SNPs for the M432 progeny using the array, we manually re-scored genotypes at 818 loci and mapped a further 797 markers to the M432 linkage map. The newly mapped markers included the majority of those that could not be mapped previously, as well as loci that were previously scored as monomorphic, but which segregated due to divergence leading to heterozygosity in probe annealing sites. An evaluation of the 8,788 probes in a diverse collection of Malus germplasm showed that more than half the probes returned genotype clustering patterns that were difficult or impossible to interpret reliably, highlighting implications for the use of the array in genome-wide association studies.
Apple is a fruit crop widely cultivated for its quality properties and extended storability. Among the several quality factors, texture is the most important and appreciated, and within the apple variety panorama the cortex texture shows a broad range of variability. Anatomically these variations depend on degradation events occurring in both the fruit primary cell wall and the middle lamella. This physiological process is regulated by an enzymatic network generally encoded by large gene families, among which polygalacturonase is devoted to the depolymerization of pectin. In apple, Md-PG1, a key gene belonging to the polygalacturonase gene family, was mapped on chromosome 10 and co-localized within the statistical interval of a major hot spot QTL associated with several fruit texture sub-phenotypes.
In this work, a QTL corresponding to the position of Md-PG1 was validated and new functional alleles associated with fruit texture properties in 77 apple cultivars were discovered. Thirty-eight SNPs genotyped by full-length gene resequencing and two SSR markers specifically targeted to the gene metacontig were employed. Out of this SNP set, eleven were used to define three significant haplotypes statistically associated with several texture components. The impact of Md-PG1 on fruit cell wall disassembly was further confirmed by scanning electron microscopy of the cortex structure in two apple varieties characterized by opposite texture performance, ‘Golden Delicious’ and ‘Granny Smith’.
The results presented here are a step forward in the genetic dissection of fruit texture in apple. This new set of haplotypes and microsatellite alleles can represent a valuable toolbox for more efficient parental selection as well as for the identification of new apple accessions distinguished by superior fruit quality features.
Rapid development of highly saturated genetic maps aids molecular breeding, which can accelerate gain per breeding cycle in woody perennial plants such as Rubus idaeus (red raspberry). Recently, robust genotyping methods based on high-throughput sequencing were developed, which provide high marker density, but result in some genotype errors and a large number of missing genotype values. Imputation can reduce the number of missing values and can correct genotyping errors, but current methods of imputation require a reference genome and thus are not an option for most species.
Genotyping by Sequencing (GBS) was used to produce highly saturated maps for a R. idaeus pseudo-testcross progeny. While low coverage and high variance in sequencing resulted in a large number of missing values for some individuals, a novel method of imputation based on maximum likelihood marker ordering from initial marker segregation overcame the challenge of missing values, and made map construction computationally tractable. The two resulting parental maps contained 4521 and 2391 molecular markers spanning 462.7 and 376.6 cM respectively over seven linkage groups. Detection of precise genomic regions with segregation distortion was possible because of map saturation. Microsatellites (SSRs) linked these results to published maps for cross-validation and map comparison.
GBS together with genome-independent imputation provides a rapid method for genetic map construction in any pseudo-testcross progeny. Our method of imputation estimates the correct genotype call of missing values and corrects genotyping errors that lead to inflated map size and reduced precision in marker placement. Comparison of SSRs to published R. idaeus maps showed that the linkage maps constructed with GBS and our method of imputation were robust, and marker positioning reliable. The high marker density allowed identification of genomic regions with segregation distortion in R. idaeus, which may help to identify deleterious alleles that are the basis of inbreeding depression in the species.
Genotyping by sequencing; GBS; RADseq; Imputation; Raspberry; Rubus idaeus; Pseudotestcross; Linkage map; Segregation distortion
Downy mildew, caused by Plasmopara viticola, is one of the most severe diseases of grapevine and is commonly controlled by fungicide treatments. The beneficial microorganism Trichoderma harzianum T39 (T39) can induce resistance to downy mildew, although the molecular events associated with this process have not yet been elucidated in grapevine. A next generation RNA sequencing (RNA-Seq) approach was used to study global transcriptional changes associated with resistance induced by T39 in Vitis vinifera Pinot Noir leaves. The long-term aim was to develop strategies to optimize the use of this agent for downy mildew control.
More than 14.8 million paired-end reads were obtained for each biological replicate of T39-treated and control leaf samples collected before and 24 h after P. viticola inoculation. RNA-Seq analysis resulted in the identification of 7,024 differentially expressed genes, highlighting the complex transcriptional reprogramming of grapevine leaves during resistance induction and in response to pathogen inoculation. Our data show that T39 has a dual effect: it directly modulates genes related to the microbial recognition machinery, and it enhances the expression of defence-related processes after pathogen inoculation. Whereas several genes were commonly affected by P. viticola in control and T39-treated plants, opposing modulation of genes related to responses to stress and protein metabolism was found. T39-induced resistance partially inhibited some disease-related processes and specifically activated defence responses after P. viticola inoculation, causing a significant reduction of downy mildew symptoms.
The global transcriptional analysis revealed that defence processes known to be implicated in the reaction of resistant genotypes to downy mildew were partially activated by T39-induced resistance in susceptible grapevines. Genes identified in this work are an important source of markers for selecting novel resistance inducers and for the analysis of environmental conditions that might affect induced resistance mechanisms.
Induced resistance; Next generation sequencing; RNA-Seq; Transcriptomics; Gene expression; Vitis vinifera; Plant-pathogen interactions
Somatic mutation is a natural mechanism which allows plant growers to develop new cultivars. As a source of variation within a uniform genetic background, it also represents an ideal tool for studying the genetic make-up of important traits and for establishing gene functions. Layer-specific molecular characterization of the Pinot family of grape cultivars was conducted to provide an evolutionary explanation for the somatic mutations that have affected the locus of berry colour. Through the study of the structural dynamics along chromosome 2, a very large deletion present in a single Pinot gris cell layer was identified and characterized. This mutation reveals that Pinot gris and Pinot blanc arose independently from the ancestral Pinot noir, suggesting a novel parallel evolutionary model. This proposed ‘Pinot-model’ represents a breakthrough towards the full understanding of the mechanisms behind the formation of white, grey, red, and pink grape cultivars, and eventually of their specific enological aptitude.
Berry colour; grapevine; layer; molecular characterization; SSRs and SNPs; Vitis vinifera
Carotenoids are a heterogeneous group of plant isoprenoids primarily involved in photosynthesis. In plants the cleavage of carotenoids leads to the formation of the phytohormones abscisic acid and strigolactone, and of C13-norisoprenoids, which contribute to the characteristic flavour and aroma of flowers and fruits and are of specific importance in the varietal character of grapes and wine. This work extends the previous reports of carotenoid gene expression and photosynthetic pigment analysis by providing an up-to-date pathway analysis and an important framework for the analysis of carotenoid metabolic pathways in grapevine.
Comparative genomics was used to identify 42 genes putatively involved in carotenoid biosynthesis/catabolism in grapevine. The genes are distributed on 16 of the 19 chromosomes and have been localised to the physical map of the heterozygous ENTAV115 grapevine sequence. Nine of the genes occur as single copies whereas the rest of the carotenoid metabolic genes have more than one paralogue. The cDNA copies of eleven corresponding genes from Vitis vinifera L. cv. Pinotage were characterised, and four were shown to be functional. Microarrays provided expression profiles of 39 accessions in the metabolic pathway during three berry developmental stages in Sauvignon blanc, whereas an optimised HPLC analysis provided the concentrations of individual carotenoids. This provides evidence of the functioning of the lutein epoxide cycle and the respective genes in grapevine. Similarly, orthologues of genes leading to the formation of strigolactone involved in shoot branching inhibition were identified: CCD7, CCD8 and MAX1. Moreover, the isoforms typically have different expression patterns, confirming the complex regulation of the pathway. Of particular interest is the expression pattern of the three VvNCEDs: our results support previous findings that VvNCED3 is likely the isoform linked to ABA content in berries.
The carotenoid metabolic pathway is well characterised, and the genes and enzymes have been studied in a number of plants. The study of the 42 carotenoid pathway genes of grapevine showed that they share a high degree of similarity with other eudicots. Expression and pigment profiling of developing berries provided insights into the most complete grapevine carotenoid pathway representation. This study represents an important reference study for further characterisation of carotenoid biosynthesis and catabolism in grapevine.
A whole-genome genotyping array has previously been developed for Malus using SNP data from 28 Malus genotypes. This array offers the prospect of high throughput genotyping and linkage map development for any given Malus progeny. To test the applicability of the array for mapping in diverse Malus genotypes, we applied the array to the construction of a SNP-based linkage map of an apple rootstock progeny.
Of the 7,867 Malus SNP markers on the array, 1,823 (23.2%) were heterozygous in one of the two parents of the progeny, 1,007 (12.8%) were heterozygous in both parental genotypes, whilst just 2.8% of the 921 Pyrus SNPs were heterozygous. A linkage map spanning 1,282.2 cM was produced comprising 2,272 SNP markers, 306 SSR markers and the S-locus. The length of the M432 linkage map was increased by 52.7 cM with the addition of the SNP markers, whilst marker density increased from 3.8 cM/marker to 0.5 cM/marker. Just three regions in excess of 10 cM remain where no markers were mapped. We compared the positions of the mapped SNP markers on the M432 map with their predicted positions on the ‘Golden Delicious’ genome sequence. A total of 311 markers (13.7% of all mapped markers) mapped to positions that conflicted with their predicted positions on the ‘Golden Delicious’ pseudo-chromosomes, indicating the presence of paralogous genomic regions or mis-assignments of genome sequence contigs during the assembly and anchoring of the genome sequence.
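The percentages and the final marker density quoted above follow directly from the reported counts, as the short check below shows. The variable names are illustrative, and counting the S-locus as one additional mapped marker is an assumption (the rounded density is the same with or without it).

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


MALUS_SNPS_ON_ARRAY = 7867
HET_IN_ONE_PARENT = 1823
HET_IN_BOTH_PARENTS = 1007
MAPPED_SNPS = 2272
MAPPED_SSRS = 306
S_LOCUS = 1                      # assumption: counted as a single marker
MAP_LENGTH_CM = 1282.2
CONFLICTING_MARKERS = 311

het_one_pct = pct(HET_IN_ONE_PARENT, MALUS_SNPS_ON_ARRAY)      # 23.2
het_both_pct = pct(HET_IN_BOTH_PARENTS, MALUS_SNPS_ON_ARRAY)   # 12.8
conflict_pct = pct(CONFLICTING_MARKERS, MAPPED_SNPS)           # 13.7

total_markers = MAPPED_SNPS + MAPPED_SSRS + S_LOCUS            # 2579
density_cm_per_marker = round(MAP_LENGTH_CM / total_markers, 1)  # 0.5
```

Note that the 13.7% figure is obtained with the 2,272 mapped SNPs as the denominator, since only SNP markers have predicted positions on the ‘Golden Delicious’ sequence.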
We incorporated data for the 2,272 SNP markers onto the map of the M432 progeny and have presented the most complete and saturated map of the full 17 linkage groups of M. pumila to date. The data were generated rapidly in a high-throughput semi-automated pipeline, permitting significant savings in time and cost over linkage map construction using microsatellites. The application of the array will permit linkage maps to be developed for QTL analyses in a cost-effective manner, and the identification of SNPs that have been assigned erroneous positions on the ‘Golden Delicious’ reference sequence will assist in the continued improvement of the genome sequence assembly for that variety.
Infinium; Golden Gate; Breeding; Selection; Genome sequence; Marker
The woodland strawberry, Fragaria vesca (2n = 2x = 14), is a versatile experimental plant system. This diminutive herbaceous perennial has a small genome (240 Mb), is amenable to genetic transformation and shares substantial sequence identity with the cultivated strawberry (Fragaria × ananassa) and other economically important rosaceous plants. Here we report the draft F. vesca genome, which was sequenced to ×39 coverage using second-generation technology, assembled de novo and then anchored to the genetic linkage map into seven pseudochromosomes. This diploid strawberry sequence lacks the large genome duplications seen in other rosids. Gene prediction modeling identified 34,809 genes, with most being supported by transcriptome mapping. Genes critical to valuable horticultural traits including flavor, nutritional value and flowering time were identified. Macrosyntenic relationships between Fragaria and Prunus predict a hypothetical ancestral Rosaceae genome that had nine chromosomes. New phylogenetic analysis of 154 protein-coding genes suggests that assignment of Populus to Malvidae, rather than Fabidae, is warranted.
Rosaceae include numerous economically important and morphologically diverse species. Comparative mapping between the member species in Rosaceae has indicated some level of synteny. Recently the whole genomes of three crop species, peach, apple and strawberry, which belong to different genera of the Rosaceae family, have been sequenced, allowing in-depth comparison of these genomes.
Our analysis using the whole genome sequences of peach, apple and strawberry identified 1399 orthologous regions between the three genomes, with a mean length of around 100 kb. Each peach chromosome showed major orthology mostly to one strawberry chromosome, but to more than two apple chromosomes, suggesting that the apple genome went through more chromosomal fissions in addition to the whole genome duplication after the divergence of the three genera. However, the distribution of contiguous ancestral regions, identified using the multiple genome rearrangements and ancestors (MGRA) algorithm, suggested that the Fragaria genome went through a greater number of small scale rearrangements compared to the other genomes since they diverged from a common ancestor. Using the contiguous ancestral regions, we reconstructed a hypothetical ancestral genome for the Rosaceae composed of nine chromosomes and propose the evolutionary steps from the ancestral genome to the extant Fragaria, Prunus and Malus genomes.
Our analysis shows that different modes of evolution may have played major roles in different subfamilies of Rosaceae. The hypothetical ancestral genome of Rosaceae and the evolutionary steps that led to three different lineages of Rosaceae will facilitate our understanding of plant genome evolution as well as have a practical impact on knowledge transfer among member species of Rosaceae.
Rosaceae; Comparative genomics; Evolution
Predicting protein function has become increasingly demanding in the era of next generation sequencing technology. Assigning a curator-reviewed function to every single sequence is impracticable. Easy-to-use bioinformatics tools that provide automatic and reliable annotations at a genomic scale are urgently needed. In this scenario, the Gene Ontology has provided the means to standardize the annotation classification with a structured vocabulary which can be easily exploited by computational methods.
Argot2 is a web-based function prediction tool able to annotate nucleic or protein sequences from small datasets up to entire genomes. It accepts as input a list of sequences in FASTA format, which are processed using BLAST and HMMER searches against the UniProtKB and Pfam databases, respectively; these sequences are then annotated with GO terms retrieved from the UniProtKB-GOA database and the terms are weighted using the e-values from BLAST and HMMER. The weighted GO terms are processed according to both their semantic similarity relations described by the Gene Ontology and their associated score. The algorithm is based on the original idea developed in a previous tool called Argot. The entire engine has been completely rewritten to improve both accuracy and computational efficiency, thus allowing for the annotation of complete genomes.
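The e-value weighting step described above can be illustrated with a minimal sketch. It assumes that each hit contributes a weight equal to the negative base-10 logarithm of its e-value and that weights for the same GO term are summed. The function name, the cap applied to e-values of zero, and the sample hits are hypothetical, and the semantic-similarity processing over the GO graph is not reproduced here.

```python
import math
from collections import defaultdict

# Hypothetical cap for hits reported with an e-value of exactly 0.
MAX_WEIGHT = 300.0

def weight_go_terms(hits):
    """Sum -log10(e-value) per GO term over (go_term, e_value) hits.

    Assumption: this mirrors only the e-value weighting idea; it is not
    the Argot2 implementation.
    """
    scores = defaultdict(float)
    for go_term, e_value in hits:
        if e_value <= 0.0:
            weight = MAX_WEIGHT
        else:
            weight = min(-math.log10(e_value), MAX_WEIGHT)
        # Hits with an e-value of 1 or more carry no support, so clamp at zero.
        scores[go_term] += max(weight, 0.0)
    return dict(scores)

# Made-up hits: (GO term, e-value of the BLAST or HMMER hit).
hits = [
    ("GO:0003677", 1e-50),
    ("GO:0003677", 1e-10),
    ("GO:0005634", 1e-5),
]
scores = weight_go_terms(hits)
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy input the term supported by two strong hits accumulates the larger score and is ranked first; a real pipeline would then combine such scores along the ontology structure.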
The revised algorithm has already been employed and successfully tested during in-house genome projects of grape and apple, and has shown high precision and recall in all our benchmark conditions. It has also been successfully compared with Blast2GO, one of the methods most commonly employed for sequence annotation. The server is freely accessible at http://www.medcomp.medicina.unipd.it/Argot2.
As high-throughput genetic marker screening systems are essential for a range of genetics studies and plant breeding applications, the International RosBREED SNP Consortium (IRSC) has utilized the Illumina Infinium® II system to develop a medium- to high-throughput SNP screening tool for genome-wide evaluation of allelic variation in apple (Malus×domestica) breeding germplasm. For genome-wide SNP discovery, 27 apple cultivars were chosen to represent worldwide breeding germplasm and re-sequenced at low coverage with the Illumina Genome Analyzer II. Following alignment of these sequences to the whole genome sequence of ‘Golden Delicious’, SNPs were identified using SoapSNP. A total of 2,113,120 SNPs were detected, corresponding to one SNP to every 288 bp of the genome. The Illumina GoldenGate® assay was then used to validate a subset of 144 SNPs with a range of characteristics, using a set of 160 apple accessions. This validation assay enabled fine-tuning of the final subset of SNPs for the Illumina Infinium® II system. The set of stringent filtering criteria developed allowed choice of a set of SNPs that not only exhibited an even distribution across the apple genome and a range of minor allele frequencies to ensure utility across germplasm, but also were located in putative exonic regions to maximize genotyping success rate. A total of 7867 apple SNPs was established for the IRSC apple 8K SNP array v1, of which 5554 were polymorphic after evaluation in segregating families and a germplasm collection. This publicly available genomics resource will provide an unprecedented resolution of SNP haplotypes, which will enable marker-locus-trait association discovery, description of the genetic architecture of quantitative traits, investigation of genetic variation (neutral and functional), and genomic selection in apple.
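The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below derives the sequence length implied by the reported SNP density and the share of array SNPs that proved polymorphic; the implied length is an inference from the quoted numbers, not a value stated in the abstract.

```python
# Figures quoted in the abstract above.
total_snps = 2_113_120      # SNPs detected by re-sequencing
bp_per_snp = 288            # reported average spacing (bp per SNP)
array_snps = 7867           # SNPs on the IRSC apple 8K SNP array v1
polymorphic_snps = 5554     # SNPs polymorphic after evaluation

# Sequence length implied by the reported density (an inference).
implied_length_mb = total_snps * bp_per_snp / 1e6

# Share of array SNPs that proved polymorphic.
polymorphic_fraction = polymorphic_snps / array_snps

print(f"Implied length: {implied_length_mb:.1f} Mb")
print(f"Polymorphic fraction: {polymorphic_fraction:.1%}")
```

The density implies roughly 609 Mb of covered sequence, and about 71% of the SNPs placed on the array were polymorphic in the evaluated material.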
Apple (Malus×domestica Borkh) is among the main sources of phenolic compounds in the human diet. The genetic basis of the quantitative variations of these potentially beneficial phenolic compounds was investigated. A segregating F1 population was used to map metabolite quantitative trait loci (mQTLs). Untargeted metabolic profiling of peel and flesh tissues of ripe fruits was performed using liquid chromatography–mass spectrometry (LC-MS), resulting in the detection of 418 metabolites in peel and 254 in flesh. In mQTL mapping using MetaNetwork, 669 significant mQTLs were detected: 488 in the peel and 181 in the flesh. Four linkage groups (LGs), LG1, LG8, LG13, and LG16, were found to contain mQTL hotspots, mainly regulating metabolites that belong to the phenylpropanoid pathway. The genetics of annotated metabolites was studied in more detail using MapQTL®. A number of quercetin conjugates had mQTLs on LG1 or LG13. The most important mQTL hotspot with the largest number of metabolites was detected on LG16: mQTLs for 33 peel-related and 17 flesh-related phenolic compounds. Structural genes involved in the phenylpropanoid biosynthetic pathway were located, using the apple genome sequence. The structural gene leucoanthocyanidin reductase (LAR1) was in the mQTL hotspot on LG16, as were seven transcription factor genes. The authors believe that this is the first time that a QTL analysis was performed on such a high number of metabolites in an outbreeding plant species.
Malus×domestica Borkh; genetical metabolomics; LC-MS; MapQTL; MetaNetwork; untargeted and targeted mQTL mapping
Plants have followed a reticulate type of evolution and taxa have frequently merged via allopolyploidization. A polyploid structure of sequenced genomes has often been proposed, but the chromosomes belonging to putative component genomes are difficult to identify. The 19 grapevine chromosomes are evolutionarily stable structures: their homologous triplets have strongly conserved gene order, interrupted by rare translocations. The aim of this study is to examine how the grapevine nucleotide-binding site (NBS)-encoding resistance (NBS-R) genes have evolved in the genomic context and to understand the mechanisms of genome evolution. We show that, in grapevine, i) helitrons have significantly contributed to transposition of NBS-R genes, and ii) NBS-R gene cluster similarity indicates the existence of two groups of chromosomes (named Va and Vc) that may have evolved independently. Chromosome triplets consist of two Va chromosomes and one Vc chromosome, as expected from the tetraploid and diploid conditions of the two component genomes. The hexaploid state could have been derived from either allopolyploidy or the separation of the Va and Vc component genomes in the same nucleus before fusion, as known for Rosaceae species. Time estimation indicates that grapevine component genomes may have fused about 60 mya, having had at least 40–60 mya to evolve independently. Chromosome number variation in the Vitaceae and related families, and the gap between the time of eudicot radiation and the age of Vitaceae fossils, are accounted for by our hypothesis.
Although flowering in mature fruit trees is recurrent, floral induction can be strongly inhibited by concurrent fruiting, leading to a pattern of irregular fruiting across consecutive years referred to as biennial bearing. The genetic determinants of biennial bearing in apple were investigated using the 114 flowering individuals from an F1 population of 122 genotypes, from a ‘Starkrimson’ (strong biennial bearer)×‘Granny Smith’ (regular bearer) cross. The number of inflorescences, and the number and the mass of harvested fruit were recorded over 6 years and used to calculate 26 variables and indices quantifying yield, precocity of production, and biennial bearing. Inflorescence traits exhibited the highest genotypic effect, and three quantitative trait loci (QTLs) on linkage group (LG) 4, LG8, and LG10 explained 50% of the phenotypic variability for biennial bearing. Apple orthologues of flowering and hormone-related genes were retrieved from the whole-genome assembly of ‘Golden Delicious’ and their position was compared with QTLs. Four main genomic regions that contain floral integrator genes, meristem identity genes, and gibberellin oxidase genes co-located with QTLs. The results indicated that flowering genes are less likely to be responsible for biennial bearing than hormone-related genes. New hypotheses for the control of biennial bearing emerged from QTL and candidate gene co-locations and suggest the involvement of different physiological processes such as the regulation of flowering genes by hormones. The correlation between tree architecture and biennial bearing is also discussed.
Auxin; floral induction; gibberellin; irregular production; Malus×domestica; precocity
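The abstract above does not define its 26 variables and indices. As an illustration of how year-to-year alternation in yield can be quantified, the sketch below computes one classic biennial bearing index, the mean of |y_t - y_(t-1)| / (y_t + y_(t-1)) over consecutive years. Whether this exact index was among those used in the study is an assumption, and the yield series are made up.

```python
def biennial_bearing_index(yields):
    """Mean of |y[t] - y[t-1]| / (y[t] + y[t-1]) over consecutive years.

    Returns a value in [0, 1]: 0 for perfectly regular bearing, 1 for
    strict on/off alternation. Year pairs with no crop in both years
    are skipped to avoid dividing by zero.
    """
    ratios = []
    for previous, current in zip(yields, yields[1:]):
        total = previous + current
        if total > 0:
            ratios.append(abs(current - previous) / total)
    if not ratios:
        raise ValueError("need at least one year pair with a non-zero crop")
    return sum(ratios) / len(ratios)

# Made-up fruit counts over 6 years for two hypothetical genotypes.
regular = [40, 42, 41, 43, 42, 44]
alternating = [80, 0, 85, 0, 90, 0]

bbi_regular = biennial_bearing_index(regular)
bbi_alternating = biennial_bearing_index(alternating)
```

A strictly alternating series scores 1, while a steady series scores close to 0, so the index can serve as a quantitative trait for QTL mapping in the same way as yield itself.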
Downy mildew, caused by the oomycete Plasmopara viticola, is a serious disease in Vitis vinifera, the most commonly cultivated grapevine species. Several wild Vitis species have instead been found to be resistant to this pathogen and have been used as a source to introgress resistance into a V. vinifera background. Stilbenoids represent the major phytoalexins in grapevine, and their toxicity is closely related to the specific compound. The aim of this study was to assess the resistance response to P. viticola of the Merzling × Teroldego cross by profiling the stilbenoid content of the leaves of an entire population and the transcriptome of resistant and susceptible individuals following infection.
A three-year analysis of the population's response to artificial inoculation showed that individuals were distributed in nine classes ranging from total resistance to total susceptibility. In addition, quantitative metabolite profiling of stilbenoids in the population, carried out using HPLC-DAD-MS, identified three distinct groups differing according to the concentrations present and the complexity of their profiles. The high producers were characterized by the presence of trans-resveratrol, trans-piceid, trans-pterostilbene and up to thirteen different viniferins, nine of them new in grapevine.
Accumulation of these compounds is consistent with a resistant phenotype and suggests that they may contribute to the resistance response.
A preliminary transcriptional study using cDNA-AFLP selected a set of genes modulated by the oomycete in a resistant genotype. The expression of this set of genes in resistant and susceptible genotypes of the progeny population was then assessed by comparative microarray analysis.
A group of 57 genes was found to be exclusively modulated in the resistant genotype, suggesting that they are involved in the grapevine-P. viticola incompatible interaction. Functional annotation of these transcripts revealed that they belong to the categories defense response, photosynthesis, primary and secondary metabolism, signal transduction and transport.
This study reports the results of a combined metabolic and transcriptional profiling of a grapevine population segregating for resistance to P. viticola. Some resistant individuals were identified and further characterized at the molecular level. These results will be valuable to future grapevine breeding programs.