Despite recent sequencing efforts, local genetic resources remain underexploited, even though they carry alleles that can bring agronomic benefits. Taking advantage of the recent genotyping, with 22,000 single-nucleotide polymorphism markers, of a core collection of 180 Vietnamese rice varieties originating from provinces from North to South Vietnam and from different agrosystems characterized by contrasting water regimes, we performed a genome-wide association study for different root parameters. Roots contribute to water stress avoidance and remain an underexploited target for breeding purposes because they are difficult to observe.
The panel of 180 rice varieties was phenotyped under greenhouse conditions for several root traits in an experimental design with 3 replicates. The phenotyping system consisted of long plastic bags that were filled with sand and supplemented with fertilizer. Root length, root mass in different layers, root thickness, and the number of crown roots, as well as several derived root parameters and shoot traits, were recorded. The results were submitted to association mapping using a mixed model involving structure and kinship to enable the identification of significant associations. The analyses were conducted successively on the whole panel and on its indica (115 accessions) and japonica (64 accessions) subcomponents. The two associations with the highest significance were for root thickness on chromosome 2 and for crown root number on chromosome 11. No common associations were detected between the indica and japonica subpanels, probably because of the distribution of polymorphisms between the subspecies. Based on orthology with Arabidopsis, the possible candidate genes underlying the quantitative trait loci are reviewed.
Some of the major quantitative trait loci we detected through this genome-wide association study contain promising candidate genes encoding regulatory elements of known key regulators of root formation and development.
Electronic supplementary material
The online version of this article (doi:10.1186/s12870-016-0747-y) contains supplementary material, which is available to authorized users.
Rice; Genotyping by sequencing; Root development; Association mapping; Structure
We report the quantitative trait loci (QTL) mapping of reproductive isolation traits between Ostrinia nubilalis (the European corn borer) and its sibling species O. scapulalis (the Adzuki bean borer), focusing on two traits: mating isolation (mi) and pheromone production (Pher). Four genetic maps were generated from two backcross families, with two maps (one chromosomal map and one linkage map) per backcross. We located 165–323 AFLP markers on these four maps, resulting in the identification of 27–31 linkage groups, depending on the map considered. No-choice mating experiments with the offspring of each backcross led to the detection of at least two QTLs for mi in different linkage groups. QTLs underlying Pher were located in a third linkage group. The Z heterochromosome was identified by a specific marker (Tpi) and did not carry any of these QTLs. Finally, we considered the global divergence between the two sibling species, distortions of segregation throughout the genome, and the location and effect of mi and Pher QTLs in light of the known candidate genes for reproductive isolation within the genus Ostrinia and, more broadly, in phytophagous insects.
QTL mapping; European corn borer; Adzuki bean borer; Ostrinia; mating isolation; ecological speciation
Allelic variants of floral repressor genes have been artificially selected to reduce sensitivity to photoperiod of rice varieties cultivated in Europe, allowing cultivation of a tropical species at higher latitudes.
The capacity to discriminate variations in day length allows plants to align flowering with the most favourable season of the year. This capacity has been altered by artificial selection when cultivated varieties became adapted to environments different from those of initial domestication. Rice flowering is promoted by short days when HEADING DATE 1 (Hd1) and EARLY HEADING DATE 1 (Ehd1) induce the expression of florigenic proteins encoded by HEADING DATE 3a (Hd3a) and RICE FLOWERING LOCUS T 1 (RFT1). Repressors of flowering antagonize such induction under long days, maintaining vegetative growth and delaying flowering. To what extent artificial selection of long day repressor loci has contributed to expand rice cultivation to Europe is currently unclear. This study demonstrates that European varieties activate both Hd3a and RFT1 expression regardless of day length and their induction is caused by loss-of-function mutations at major long day floral repressors. However, their contribution to flowering time control varies between locations. Pyramiding of mutations is frequently observed in European germplasm, but single mutations are sufficient to adapt rice to flower at higher latitudes. Expression of Ehd1 is increased in varieties showing reduced or null Hd1 expression under natural long days, as well as in single hd1 mutants in isogenic backgrounds. These data indicate that loss of repressor genes has been a key strategy to expand rice cultivation to Europe, and that Ehd1 is a central node integrating floral repressive signals.
Adaptation; Ehd1; Hd1; Hd3a; heading date; photoperiodic flowering; rice; RFT1.
The development of genome-wide association studies (GWAS) in crops has made it possible to mine interesting alleles hidden in gene bank resources. However, only a small fraction of the rice genetic diversity of any given country has been exploited in the studies with worldwide sampling conducted to date. This study presents the development of a panel of rice varieties from Vietnam for GWAS purposes.
The panel, initially composed of 270 accessions, was characterized for simple agronomic traits (maturity class, grain shape and endosperm type) commonly used to classify rice varieties. We first genotyped the panel using Diversity Array Technology (DArT) markers. We analyzed the panel structure, identified two subpanels corresponding to the indica and japonica sub-species and selected 182 non-redundant accessions. However, the number of usable DArT markers (241 from an initial library of 6444 clones) was too small for GWAS purposes. Therefore, we characterized the panel of 182 accessions with 25,971 markers using genotyping by sequencing. The same indica and japonica subpanels were identified. The indica subpanel was further divided into six populations (I1 to I6) using a model-based approach. The japonica subpanel, which was more highly differentiated, was divided into four populations (J1 to J4), including a temperate type (J2). Passport data and phenotypic traits were used to characterize these populations. Some populations were exclusively composed of glutinous types (I3 and J2). Some of the upland rice varieties appeared to belong to indica populations, which is uncommon in this region of the world. Linkage disequilibrium decayed faster in the indica subpanel (r2 below 0.2 at 101 kb) than in the japonica subpanel (r2 below 0.2 at 425 kb), likely because of the stronger differentiation of the japonica subpanel. A matrix adapted for GWAS was built by eliminating the markers with a minor allele frequency below 5% and imputing the missing data. This matrix contained 21,814 markers. A GWAS was conducted on time to flowering to prove the utility of this panel.
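The marker-filtering step described above can be illustrated with a minimal Python sketch. It assumes genotypes coded as alternate-allele counts (0, 1 or 2) with None for missing calls. The function names, the toy marker names and the most-frequent-genotype imputation rule are illustrative assumptions; the abstract does not state which imputation method was used.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Minor allele frequency of one marker, ignoring missing calls (None)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq)


def impute_most_frequent(genotypes):
    """Replace missing calls with the most frequent observed genotype.

    This simple rule is an assumption made for illustration only.
    """
    called = [g for g in genotypes if g is not None]
    most_common = Counter(called).most_common(1)[0][0]
    return [most_common if g is None else g for g in genotypes]


def build_gwas_matrix(markers, maf_threshold=0.05):
    """Drop markers with MAF below the threshold, then impute the rest."""
    kept = {}
    for name, genotypes in markers.items():
        if minor_allele_frequency(genotypes) >= maf_threshold:
            kept[name] = impute_most_frequent(genotypes)
    return kept


# Toy data (hypothetical): one common marker with a missing call, one rare marker.
toy_markers = {
    "snp_common": [0, 2, 2, 0, None, 2, 0, 2],
    "snp_rare": [0] * 39 + [2],
}
gwas_matrix = build_gwas_matrix(toy_markers)
```

In this toy example the rare marker (MAF of 2.5%) is discarded, while the common marker is kept and its missing call is filled in.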
This publicly available panel constitutes an important resource giving access to original allelic diversity. It will be used for GWAS on root and panicle traits.
Electronic supplementary material
The online version of this article (doi:10.1186/s12870-014-0371-7) contains supplementary material, which is available to authorized users.
DArT markers; SNP; Genetic diversity; Linkage disequilibrium; Rice; Vietnam
We developed the PHIV-RootCell software to quantify anatomical traits from transverse section images of rice roots. Combined with an efficient root sample processing method for image acquisition, this program permits supervised measurements of areas (those of the whole root section, stele, cortex, and central metaxylem vessels), the number of cell layers and the number of cells per cell layer. The PHIV-RootCell toolset runs under ImageJ, which is operating-system independent and license-free. To demonstrate the usefulness of PHIV-RootCell, we conducted a genetic diversity study and an analysis of salt stress responses of root anatomical parameters in rice (Oryza sativa L.). Using 16 cultivars, we showed that we could discriminate between some of the varieties even at the 6-day-old stage, and that tropical japonica varieties had larger root sections due to an increase in cell number. We observed, as described previously, that root sections become enlarged under salt stress. However, our results show an increase in cell number in ground tissues (endodermis and cortex) but a decrease in external (peripheral) tissues (sclerenchyma, exodermis, and epidermis). Thus, the PHIV-RootCell program is a user-friendly tool that will be helpful for future genetic and physiological studies that investigate root anatomical trait variations.
cell number; image analysis software; rice; root; tissue area; transverse histological section; histological phenotype scoring
Rice is a crop prone to drought stress in upland and rainfed lowland ecosystems. A deep root system is recognized as the best drought avoidance mechanism. Genome-wide association mapping offers higher resolution for locating quantitative trait loci (QTLs) than QTL mapping in biparental populations. We performed an association mapping study for root traits using a panel of 167 japonica accessions, mostly of tropical origin. The panel was genotyped at an average density of one marker per 22.5 kb using genotyping by sequencing technology. The linkage disequilibrium in the panel was high (r2>0.6, on average, for 20 kb mean distances between markers). The plants were grown in transparent 50 cm × 20 cm × 2 cm Plexiglas nailboard sandwiches filled with 1.5 mm glass beads through which a nutrient solution was circulated. Root system architecture and biomass traits were measured in 30-day-old plants. The panel showed a moderate to high diversity in the various traits, particularly for deep (below 30 cm depth) root mass and the number of deep roots. Association analyses were conducted using a mixed model involving both population structure and kinship to control for false positives. Nineteen associations were significant at P<1e-05, and 78 were significant at P<1e-04. The greatest numbers of significant associations were detected for deep root mass and the number of deep roots, whereas no significant associations were found for total root biomass or deep root proportion. Because several QTLs for different traits were co-localized, 51 unique loci were detected; several co-localized with meta-QTLs for root traits, but none co-localized with rice genes known to be involved in root growth. Several likely candidate genes were found in close proximity to these loci. Additional work is necessary to assess whether these markers are relevant in other backgrounds and whether the genes identified are robust candidates.
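The r2 statistic used above to summarize linkage disequilibrium can be computed, for a pair of biallelic markers in inbred (homozygous) accessions, as the squared allelic correlation. The following Python sketch assumes alleles coded 0/1 with None for missing calls. The function name and the toy data are hypothetical, and this is not presented as the software used in the study.

```python
def ld_r2(marker_a, marker_b):
    """Squared allelic correlation (r2) between two biallelic markers.

    Alleles are coded 0/1 per homozygous accession; accessions with a
    missing call (None) at either marker are skipped.
    """
    pairs = [(a, b) for a, b in zip(marker_a, marker_b)
             if a is not None and b is not None]
    n = len(pairs)
    if n == 0:
        return float("nan")
    p_a = sum(a for a, _ in pairs) / n          # frequency of allele 1 at marker A
    p_b = sum(b for _, b in pairs) / n          # frequency of allele 1 at marker B
    p_ab = sum(1 for a, b in pairs if a == 1 and b == 1) / n  # haplotype 1-1
    d = p_ab - p_a * p_b                        # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return float("nan")                     # a monomorphic marker has no r2
    return d * d / denom


# Toy data (hypothetical): complete LD versus no LD.
r2_complete = ld_r2([0, 1, 0, 1], [0, 1, 0, 1])
r2_none = ld_r2([0, 0, 1, 1], [0, 1, 0, 1])
```

Averaging such pairwise values within distance classes gives the kind of summary quoted above (r2 as a function of the distance between markers).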
Chromosome segment substitution lines (CSSLs) are powerful QTL mapping populations that have been used to elucidate the molecular basis of interesting traits of wild species. Cultivated peanut is an allotetraploid with limited genetic diversity. Capturing the genetic diversity from peanut wild relatives is an important objective in many peanut breeding programs. In this study, we used a marker-assisted backcrossing strategy to produce a population of 122 CSSLs from the cross between the wild synthetic allotetraploid (A. ipaënsis×A. duranensis)4x and the cultivated Fleur11 variety. The 122 CSSLs offered a broad coverage of the peanut genome, with target wild chromosome segments averaging 39.2 cM in length. As a demonstration of the utility of these lines, four traits were evaluated in a subset of 80 CSSLs. A total of 28 lines showed significant differences from Fleur11. The line×trait significant associations were assigned to 42 QTLs: 14 for plant growth habit, 15 for height of the main stem, 12 for plant spread and one for flower color. Among the 42 QTLs, 37 were assigned to genomic regions and three QTL positions were considered putative. One important finding arising from this QTL analysis is that peanut growth habit is a complex trait that is governed by several QTLs with different effects. The CSSL population developed in this study has proved efficient for deciphering the molecular basis of trait variations and will be useful to the peanut scientific community for future QTL mapping studies.
Ecuador’s economic history has been closely linked to Theobroma cacao L. cultivation, and specifically to the native fine flavour Nacional cocoa variety. The original Nacional cocoa trees are presently in danger of extinction due to foreign germplasm introductions. In a previous work, a few non-introgressed Nacional types were identified as potential founders of the modern Ecuadorian cocoa population, but so far their origin could not be formally identified. In order to determine the putative centre of origin of Nacional and trace its domestication history, we used 80 simple sequence repeat (SSR) markers to analyse the relationships between these potential Nacional founders and 169 wild and cultivated cocoa accessions from South and Central America. The highest genetic similarity was observed between the Nacional pool and some wild genotypes from the southern Amazonian region of Ecuador, sampled along the Yacuambi, Nangaritza and Zamora rivers in Zamora Chinchipe province. This result was confirmed by a parentage analysis. Based on our results and on data about pre-Columbian civilization and Spanish colonization history of Ecuador, we determined, for the first time, the possible centre of origin and migration events of the Nacional variety from the Amazonian area until its arrival in the coastal provinces. As large unexplored forest areas still exist in the southern part of the Ecuadorian Amazonian region, our findings could provide clues as to where precious new genetic resources could be collected, and subsequently used to improve the flavour and disease resistance of modern Ecuadorian cocoa varieties.
Models of indirect (genetic) benefits sexual selection predict linkage disequilibria between genes that influence male traits and female preferences, owing to non-random mate choice or physical linkage. Such linkage disequilibria can accelerate the evolution of traits and preferences to exaggerated levels. Both theory and recent empirical findings on species recognition suggest that such linkage disequilibria may result from physical linkage or pleiotropy, but very little work has addressed this possibility within the context of sexual selection. We studied the genetic architecture of sexually selected traits by analyzing signals and preferences in an acoustic moth, Achroia grisella, in which males attract females with a train of ultrasound pulses and females prefer loud songs and a fast pulse rhythm. Both male signal characters and female preferences are repeatable and heritable traits. Moreover, female choice is based largely on male song, while males do not appear to provide direct benefits at mating. Thus, some genetic correlation between song and preference traits is expected. We employed a standard crossing design between inbred lines and used AFLP markers to build a linkage map for this species and locate quantitative trait loci (QTL) that influence male song and female preference. Our analyses mostly revealed QTLs of moderate strength that influence various male signal and female receiver traits, but one QTL was found that exerts a major influence on the pulse-pair rate of male song, a critical trait in female attraction. However, we found no evidence of specific co-localization of QTLs influencing male signal and female receiver traits on the same linkage groups. This finding suggests that the sexual selection process would proceed at a modest rate in A. grisella and that evolution toward exaggerated character states may be tempered. We suggest that this equilibrium state may be more the norm than the exception among animal species.
Polyploidy can result in genetic bottlenecks, especially for species of monophyletic origin. Cultivated peanut is an allotetraploid harbouring limited genetic diversity, likely resulting from the combined effects of its single origin and domestication. Peanut wild relatives represent an important source of novel alleles that could be used to broaden the genetic basis of the cultigen. Using an advanced backcross population developed with a synthetic amphidiploid as donor of wild alleles, under two water regimes, we conducted a detailed QTL study for several traits involved in peanut productivity and adaptation as well as domestication.
A total of 95 QTLs were mapped in the two water treatments. About half of the QTL positive effects were associated with alleles of the wild parent and several QTLs involved in yield components were specific to the water-limited treatment. QTLs detected for the same trait mapped to non-homeologous genomic regions, suggesting differential control in subgenomes as a consequence of polyploidization. The noteworthy clustering of QTLs for traits involved in seed and pod size and in plant and pod morphology suggests, as in many crops, that a small number of loci have contributed to peanut domestication.
In our study, we have identified QTLs that differentiated cultivated peanut from its wild relatives as well as wild alleles that contributed positive variation to several traits involved in peanut productivity and adaptation. These findings offer novel opportunities for peanut improvement using wild relatives.
Theobroma cacao is an economically important tree of several tropical countries. Its genetic improvement is essential to provide protection against major diseases and improve chocolate quality. We discovered and mapped new expressed sequence tag-single nucleotide polymorphism (EST-SNP) and simple sequence repeat (SSR) markers and constructed a high-density genetic map. By screening 149 650 ESTs, 5246 SNPs were detected in silico, of which 1536 corresponded to genes with a putative function, while 851 had a clear polymorphic pattern across a collection of genetic resources. In addition, 409 new SSR markers were detected on the Criollo genome. Lastly, 681 new EST-SNPs and 163 new SSRs were added to the pre-existing 418 co-dominant markers to construct a large consensus genetic map. This high-density map and the set of new genetic markers identified in this study are a milestone in cocoa genomics and for marker-assisted breeding. The data are available at http://tropgenedb.cirad.fr.
Theobroma cacao; genetic map; SNP; molecular marker
The genus Musa is a large species complex which includes cultivars at diploid and triploid levels. These sterile and vegetatively propagated cultivars are based on the A genome from Musa acuminata, exclusively for sweet bananas such as Cavendish, or associated with the B genome (Musa balbisiana) in cooking bananas such as Plantain varieties. In M. acuminata cultivars, structural heterozygosity is thought to be one of the main causes of sterility, which is essential for obtaining seedless fruits but hampers breeding. Only partial genetic maps are presently available due to chromosomal rearrangements within the parents of the mapping populations. This causes large segregation distortions inducing pseudo-linkages and difficulties in ordering markers in the linkage groups. The present study aims at producing a saturated linkage map of M. acuminata, taking into account hypotheses on the structural heterozygosity of the parents.
An F1 progeny of 180 individuals was obtained from a cross between two genetically distant accessions of M. acuminata, 'Borneo' and 'Pisang Lilin' (P. Lilin). Based on the gametic recombination of each parent, two parental maps composed of SSR and DArT markers were established. A significant proportion of the markers (21.7%) deviated (p < 0.05) from the expected Mendelian ratios. These skewed markers were distributed in different linkage groups for each parent. To resolve some complex marker orders on linkage groups, we combined tools such as tree-like graphic representations, recombination frequency statistics and cytogenetic studies to identify structural rearrangements and build a parsimonious linkage group order. An illustration of such an approach is given for the P. Lilin parent.
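The test for deviation from Mendelian ratios mentioned above can be sketched as a chi-square goodness-of-fit test with one degree of freedom. The Python example below assumes a 1:1 expected segregation, as for a marker heterozygous in one parent only. The function names and the genotype counts are invented for illustration; the abstract does not specify the exact test used.

```python
import math


def segregation_chi2(count_a, count_b, expected_ratio=(1, 1)):
    """Chi-square goodness-of-fit test for a two-class marker segregation.

    Returns (chi2, p_value). With two classes there is one degree of
    freedom, for which the p-value equals erfc(sqrt(chi2 / 2)).
    """
    total = count_a + count_b
    exp_a = total * expected_ratio[0] / sum(expected_ratio)
    exp_b = total - exp_a
    chi2 = (count_a - exp_a) ** 2 / exp_a + (count_b - exp_b) ** 2 / exp_b
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def is_distorted(count_a, count_b, alpha=0.05):
    """Flag a marker as skewed when the p-value falls below alpha."""
    _, p_value = segregation_chi2(count_a, count_b)
    return p_value < alpha


# Hypothetical counts in a progeny of 180 individuals.
balanced = segregation_chi2(90, 90)     # perfect 1:1 segregation
skewed = is_distorted(110, 70)          # strong excess of one class
mild = is_distorted(100, 80)            # deviation within sampling noise
```

Markers flagged in this way are the ones that can induce pseudo-linkages, so tracking which linkage groups they fall in is a natural first step before interpreting structural rearrangements.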
We propose a synthetic map with 11 linkage groups containing 489 markers (167 SSRs and 322 DArTs) covering 1197 cM. This first saturated map is proposed as a "reference Musa map" for further analyses. We also propose two complete parental maps with interpretations of structural rearrangements localized on the linkage groups. The structural heterozygosity in P. Lilin is hypothesized to result from a duplication likely accompanied by an inversion on another chromosome. This paper also illustrates a methodological approach, transferable to other species, to investigate the mapping of structural rearrangements and determine their consequences on marker segregation.
Peanut (Arachis hypogaea L.) is widely used as a food and cash crop around the world. It is considered to be an allotetraploid (2n = 4x = 40) that originated from a single hybridization event between two wild diploids. The most probable hypothesis gives A. duranensis as the wild donor of the A genome and A. ipaënsis as the wild donor of the B genome. A low level of molecular polymorphism is found in cultivated germplasm, and to date few genetic linkage maps have been published. The utilization of wild germplasm in breeding programs has received little attention due to the reproductive barriers between wild and cultivated species and to the technical difficulties encountered in making large numbers of crosses. We report here the development of an SSR-based genetic map and the analysis of genome-wide segment introgressions into the background of a cultivated variety through the utilization of a synthetic amphidiploid between A. duranensis and A. ipaënsis.
Two hundred ninety-eight (298) loci were mapped in 21 linkage groups (LGs), spanning a total map distance of 1843.7 cM with an average distance of 6.1 cM between adjacent markers. The level of polymorphism observed between the parents of the amphidiploid and the cultivated variety is consistent with A. duranensis and A. ipaënsis being the most probable donors of the A and B genomes, respectively. The synteny analysis between the A and B genomes revealed an overall good collinearity of the homeologous LGs. The comparison with the diploid and tetraploid maps shed new light on the evolutionary forces that contributed to the divergence of the A and B genome species and raised the question of the classification of the B genome species. Structural modifications such as chromosomal segment inversions and a major translocation event prior to the tetraploidisation of the cultivated species were revealed. Marker-assisted selection of BC1F1 and then BC2F1 lines carrying the desirable donor segment with the best possible return to the background of the cultivated variety provided a set of lines offering an optimal distribution of the wild introgressions.
The genetic map developed allowed the synteny analysis of the A and B genomes, the comparison with diploid and tetraploid maps and the analysis of the introgression segments from the wild synthetic into the background of a cultivated variety. The material we have produced in this study should facilitate the development of advanced backcross and CSSL breeding populations for the improvement of cultivated peanut.
Meta-analysis of QTLs combines the results of several QTL detection studies and provides narrow confidence intervals for meta-QTLs, permitting easier positional candidate gene identification. It is usually applied to multiple mapping populations, but can be applied to one. Here, a meta-analysis of drought-related QTLs in the Bala × Azucena mapping population compiles data from 13 experiments and 25 independent screens providing 1,650 individual QTLs separated into 5 trait categories: drought avoidance, plant height, plant biomass, leaf morphology and root traits. A heat map of the overlapping 1 LOD confidence intervals provides an overview of the distribution of QTLs. The programme BioMercator is then used to conduct a formal meta-analysis at example QTL clusters to illustrate the value of meta-analysis of QTLs in this population.
The heat map graphically illustrates the genetic complexity of drought related traits in rice. QTLs can be linked to their physical position on the rice genome using Additional file 1 provided. Formal meta-analysis on chromosome 1, where clusters of QTLs for all trait categories appear close, established that the sd1 semi-dwarfing gene coincided with a plant height meta-QTL, that the drought avoidance meta-QTL was not likely to be associated with this gene, and that this meta-QTL was not pleiotropic with close meta-QTLs for leaf morphology and root traits. On chromosome 5, evidence suggests that a drought avoidance meta-QTL was pleiotropic with leaf morphology and plant biomass meta-QTLs, but not with meta-QTLs for root traits and plant height 10 cM lower down. A region of dense root QTL activity graphically visible on chromosome 9 was dissected into three meta-QTLs within a space of 35 cM. The confidence intervals for meta-QTLs obtained ranged from 5.1 to 14.5 cM with an average of 9.4 cM, which is approximately 180 genes in rice.
The meta-analysis is valuable in providing improved ability to dissect the complex genetic structure of traits, and distinguish between pleiotropy and close linkage. It also provides relatively small target regions for the identification of positional candidate genes.
Background and Aims
Prediction of phenotypic traits from new genotypes under untested environmental conditions is crucial to build simulations of breeding strategies to improve target traits. Although the plant response to environmental stresses is characterized by both architectural and functional plasticity, recent attempts to integrate biological knowledge into genetics models have mainly concerned specific physiological processes or crop models without architecture, and thus may prove limited when studying genotype × environment interactions. Consequently, this paper presents a simulation study introducing genetics into a functional–structural growth model, which gives access to more fundamental traits for quantitative trait loci (QTL) detection and thus to promising tools for yield optimization.
The GREENLAB model was selected as a reasonable choice to link growth model parameters to QTL. Virtual genes and virtual chromosomes were defined to build a simple genetic model that drove the settings of the species-specific parameters of the model. The QTL Cartographer software was used to study QTL detection of simulated plant traits. A genetic algorithm was implemented to define the ideotype for yield maximization based on the model parameters and the associated allelic combination.
Key Results and Conclusions
By keeping the environmental factors constant and using a virtual population with a large number of individuals generated by a Mendelian genetic model, results for an ideal case could be simulated. Virtual QTL detection was compared in the case of phenotypic traits – such as cob weight – and when traits were model parameters, and was found to be more accurate in the latter case. The practical interest of this approach is illustrated by calculating the parameters (and the corresponding genotype) associated with yield optimization of a GREENLAB maize model. The paper discusses the potentials of GREENLAB to represent environment × genotype interactions, in particular through its main state variable, the ratio of biomass supply over demand.
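The ideotype search described above can be illustrated with a toy genetic algorithm in Python. This is only a sketch under strong assumptions: the additive toy_yield function stands in for the GREENLAB model, which is not reproduced here, and the allele effects, function names and algorithm settings (population size, mutation rate, truncation selection) are all hypothetical.

```python
import random


def toy_yield(genotype, effects):
    """Hypothetical stand-in for a growth model: additive allele effects."""
    return sum(e for allele, e in zip(genotype, effects) if allele == 1)


def evolve(effects, pop_size=20, generations=30, mutation_rate=0.05, seed=0):
    """Search for the allelic combination (0/1 per virtual gene) with the
    highest simulated yield. Requires at least two virtual genes."""
    rng = random.Random(seed)
    n_genes = len(effects)
    population = [[rng.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the better half survives unchanged (elitism).
        population.sort(key=lambda g: toy_yield(g, effects), reverse=True)
        parents = population[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n_genes)          # one-point crossover
            child = mother[:cut] + father[cut:]
            # Flip each allele with a small probability (mutation).
            child = [1 - a if rng.random() < mutation_rate else a
                     for a in child]
            children.append(child)
        population = parents + children
    return max(population, key=lambda g: toy_yield(g, effects))


# Hypothetical effects of six virtual genes on yield.
gene_effects = [0.5, -0.2, 1.0, 0.3, -0.7, 0.8]
ideotype = evolve(gene_effects)
ideotype_yield = toy_yield(ideotype, gene_effects)
```

Because the better half of each generation is carried over unchanged, the best simulated yield never decreases from one generation to the next; with a real growth model the fitness evaluation would be far more costly and would dominate the run time.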
Plant growth model; GREENLAB; genetics; QTL; breeding; yield optimization; genetic algorithm; Zea mays
Theobroma cacao L. is a tree that originated in the tropical rainforest of South America. It is one of the major cash crops for many tropical countries. T. cacao is mainly produced on smallholdings, providing resources for 14 million farmers. Disease resistance and T. cacao quality improvement are two important challenges for all actors of cocoa and chocolate production. T. cacao is seriously affected by pests and fungal diseases, which are responsible for yield losses of more than 40%, and the improvement of nutritional and organoleptic quality is also important for consumers. An international collaboration was formed to develop an EST genomic resource database for cacao.
Fifty-six cDNA libraries were constructed from different organs, different genotypes and different environmental conditions. A total of 149,650 valid EST sequences were generated corresponding to 48,594 unigenes, 12,692 contigs and 35,902 singletons. A total of 29,849 unigenes shared significant homology with public sequences from other species.
Gene Ontology (GO) annotation was applied to distribute the ESTs among the main GO categories.
A specific information system (ESTtik) was constructed to process, store and manage this EST collection allowing the user to query a database.
To check the representativeness of our EST collection, we looked for the genes known to be involved in two different metabolic pathways extensively studied in other plant species and important for T. cacao qualities: the flavonoid and the terpene pathways. Most of the enzymes described in other crops for these two metabolic pathways were found in our EST collection.
A large collection of new genetic markers was provided by this EST collection.
This EST collection displays a good representation of the T. cacao transcriptome, suitable for analysis of biochemical pathways based on oligonucleotide microarrays derived from these ESTs. It will provide numerous genetic markers that will allow the construction of a high density gene map of T. cacao. This EST collection represents a unique and important molecular resource for T. cacao study and improvement, facilitating the discovery of candidate genes for important T. cacao trait variation.
To organize data resulting from the phenotypic characterization of a library of 30 000 T-DNA enhancer trap (ET) insertion lines of rice (Oryza sativa L. cv. Nipponbare), we developed the Oryza Tag Line (OTL) database (http://urgi.versailles.inra.fr/OryzaTagLine/). The OTL structure facilitates forward genetic searches for specific phenotypes, putatively resulting from gene disruption, and/or for GUSA or GFP reporter gene expression patterns, reflecting ET-mediated endogenous gene detection. In the latest version, OTL gathers the detailed morpho-physiological alterations observed during field evaluation and specific screens in a first set of 13 928 lines. Detection of GUS or GFP activity in specific organs/tissues in a subset of the library is also provided. Searches in OTL can be made by trait ontology category, organ and/or developmental stage, keywords, expression of the reporter gene in a specific organ/tissue, as well as line identification number. OTL now contains the description of 9721 mutant phenotypic traits observed in 2636 lines and 1234 GUS or GFP expression patterns. Each insertion line is documented through generic passport data including production records, seed stocks and FST information. Of the 13 928 lines, 8004 and 6101 are characterized by at least one T-DNA and one Tos17 FST, respectively, which OTL links to the rice genome browser OryGenesDB.
Improvement of Citrus, the most economically important fruit crop in the world, is extremely slow and inherently costly because of the long-term nature of tree breeding and an unusual combination of reproductive characteristics. Aside from disease resistance, major commercial traits in Citrus are improved fruit quality, higher yield and tolerance to environmental stresses, especially salinity.
A normalized full-length library and 9 standard cDNA libraries were generated, representing particular treatments and tissues from selected varieties (Citrus clementina and C. sinensis) and rootstocks (C. reshni, and C. sinensis × Poncirus trifoliata) differing in fruit quality, resistance to abscission, and tolerance to salinity. The goal of this work was to provide a large expressed sequence tag (EST) collection enriched with transcripts related to these well-appreciated agronomical traits. Towards this end, more than 54000 ESTs derived from these libraries were analyzed and annotated. Assembly of 52626 useful sequences generated 15664 putative transcription units distributed in 7120 contigs and 8544 singletons. BLAST annotation produced significant hits for more than 80% of the hypothetical transcription units and suggested that 647 of these might be Citrus-specific unigenes. The unigene set, composed of ~13000 putative different transcripts, including more than 5000 novel Citrus genes, was assigned putative functions based on similarity, GO annotations and protein domains.
Comparative genomics with Arabidopsis revealed the presence of putative conserved orthologs and single copy genes in Citrus and also the occurrence of both gene duplication events and increased number of genes for specific pathways. In addition, phylogenetic analysis performed on the ammonium transporter family and glycosyl transferase family 20 suggested the existence of Citrus paralogs. Analysis of the Citrus gene space showed that the most important metabolic pathways known to affect fruit quality were represented in the unigene set. Overall, the similarity analyses indicated that the sequences of the genes belonging to these varieties and rootstocks were essentially identical, suggesting that the differential behaviour of these species cannot be attributed to major sequence divergences. This Citrus EST assembly contributes both crucial information to discover genes of agronomical interest and tools for genetic and genomic analyses, such as the development of new markers and microarrays.
TropGENE-DB is a crop information system created to store genetic, molecular and phenotypic data of the numerous yet poorly documented tropical crop species. The most common data stored in TropGENE-DB are information on genetic resources (agro-morphological data, parentages, allelic diversity), molecular markers, genetic maps, results of quantitative trait loci analyses, data from physical mapping, sequences, genes, as well as the corresponding references. TropGENE-DB is organized on a crop basis with currently three running modules (sugarcane, cocoa and banana), with plans to create additional modules for rice, cotton, oil palm, coconut, rubber tree, pineapple, taro, yam and sorghum. The TropGENE-DB information system is accessible for consultation via the internet at http://tropgenedb.cirad.fr. Specific web consultation interfaces have been designed to allow quick consultations as well as complex queries.