The hybrid pigeonpea (Cajanus cajan) breeding technology based on cytoplasmic male sterility (CMS) is currently unique among legumes and displays major potential for yield increase. CMS is defined as a condition in which a plant is unable to produce functional pollen grains. The novel chimeric open reading frames (ORFs) produced as a result of mitochondrial genome rearrangements are considered to be the main cause of CMS. To identify these CMS-related ORFs in pigeonpea, we sequenced the mitochondrial genomes of three C. cajan lines (the male-sterile line ICPA 2039, the maintainer line ICPB 2039, and the hybrid line ICPH 2433) and of the wild relative (Cajanus cajanifolius ICPW 29). A single, circular-mapping molecule of length 545.7 kb was assembled and annotated for the ICPA 2039 line. Sequence annotation predicted 51 genes, including 34 protein-coding and 17 RNA genes. Comparison of the mitochondrial genomes from different Cajanus genotypes identified 31 ORFs that differ between lines in which CMS is present and lines in which it is absent. Among these chimeric ORFs, 13 were identified by comparison of the related male-sterile and maintainer lines. These ORFs display features that are known to trigger CMS in other plant species and represent the most promising candidates for CMS-related mitochondrial rearrangements in pigeonpea.
mitochondria; pigeonpea; next-generation sequencing; cytoplasmic male sterility; open reading frames
Pearl millet [Pennisetum glaucum (L.) R. Br.] is a widely cultivated drought- and high-temperature tolerant C4 cereal grown under dryland, rainfed and irrigated conditions in drought-prone regions of the tropics and sub-tropics of Africa, South Asia and the Americas. It is considered an orphan crop with relatively few genomic and genetic resources. This study was undertaken to increase the EST-based microsatellite marker and genetic resources for this crop to facilitate marker-assisted breeding.
Newly developed EST-SSR markers (99), along with previously mapped EST-SSR (17), genomic SSR (53) and STS (2) markers, were used to construct linkage maps of four F7 recombinant inbred populations (RIP) based on crosses ICMB 841-P3 × 863B-P2 (RIP A), H 77/833-2 × PRLT 2/89-33 (RIP B), 81B-P6 × ICMP 451-P8 (RIP C) and PT 732B-P2 × P1449-2-P1 (RIP D). Mapped loci numbers were greatest for RIP A (104), followed by RIP B (78), RIP C (64) and RIP D (59). Total map lengths (Haldane) were 615 cM, 690 cM, 428 cM and 276 cM, respectively. A total of 176 loci detected by 171 primer pairs were mapped among the four crosses. A consensus map of 174 loci (899 cM) detected by 169 primer pairs was constructed using MergeMap to integrate the individual linkage maps. Locus order in the consensus map was well conserved for nearly all linkage groups. Eighty-nine EST-SSR marker loci from this consensus map had significant BLAST hits (top hits with e-value ≤ 1E-10) on the genome sequences of rice, foxtail millet, sorghum, maize and Brachypodium with 35, 88, 58, 48 and 38 loci, respectively.
The consensus map developed in the present study contains the largest set of mapped SSRs reported to date for pearl millet, and represents a major consolidation of existing pearl millet genetic mapping information. This study increased the number of mapped pearl millet SSR markers by >50%, filling important gaps in previously published SSR-based linkage maps for this species, and will greatly facilitate SSR-based QTL mapping and applied marker-assisted selection programs.
EST-SSR markers; EST; Linkage map; Consensus map; Drought stress; Pearl millet; Synteny
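The map lengths above are reported in Haldane centimorgans. As a minimal illustration (not the authors' mapping pipeline), the Haldane mapping function converts a recombination fraction r between two loci into an additive map distance, d = -50 ln(1 - 2r) cM, assuming no crossover interference. The function names below are hypothetical and chosen only for this sketch.

```python
import math


def haldane_cm(r: float) -> float:
    """Haldane map distance in centimorgans for a recombination fraction r.

    Assumes no crossover interference; r must lie in [0, 0.5).
    """
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def haldane_r(d_cm: float) -> float:
    """Inverse Haldane function: recombination fraction for a distance in cM."""
    if d_cm < 0.0:
        raise ValueError("map distance must be non-negative")
    return 0.5 * (1.0 - math.exp(-d_cm / 50.0))
```

For instance, a recombination fraction of 0.10 corresponds to roughly 11.16 cM, slightly more than the 10 cM a naive one-to-one conversion would suggest, because the function accounts for undetected double crossovers.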
The complex, tetraploid genome structure of peanut (Arachis hypogaea) has obstructed advances in genetics and genomics in the species. The aim of this study is to understand the genome structure of Arachis by developing a high-density integrated consensus map. Three recombinant inbred line populations derived from crosses between the A genome diploid species, Arachis duranensis and Arachis stenosperma; the B genome diploid species, Arachis ipaënsis and Arachis magna; and between the AB genome tetraploids, A. hypogaea and an artificial amphidiploid (A. ipaënsis × A. duranensis)4×, were used to construct genetic linkage maps: 10 linkage groups (LGs) of 544 cM with 597 loci for the A genome; 10 LGs of 461 cM with 798 loci for the B genome; and 20 LGs of 1442 cM with 1469 loci for the AB genome. The resultant maps plus 13 published maps were integrated into a consensus map covering 2651 cM with 3693 marker loci, which was anchored to 20 consensus LGs corresponding to the A and B genomes. Comparative genomics with the genome sequences of Cajanus cajan, Glycine max, Lotus japonicus, and Medicago truncatula revealed that the Arachis genome has a segmented synteny relationship with the other legumes. The comparative maps in legumes, integrated tetraploid consensus maps, and genome-specific diploid maps will increase the genetic and genomic understanding of Arachis and should facilitate molecular breeding.
Arachis spp.; comparative genomics; genetic linkage map; integrated consensus map; legume genome
Groundnut (Arachis hypogaea L.), a self-pollinated legume, is an important crop cultivated on 24 million ha worldwide for extraction of edible oil and for food uses. The kernels are rich in oil (48–50%) and protein (25–28%), and are a source of several vitamins, minerals, antioxidants, biologically active polyphenols, flavonoids, and isoflavones. Improved varieties of groundnut with high yield potential have been developed and released for cultivation worldwide. The improved varieties belong to different maturity durations and possess resistance to diseases, tolerance to drought, enhanced oil content, and improved quality traits for food uses. Conventional breeding procedures along with the tools for phenotyping were largely used in groundnut improvement programs. Mutations were used to induce variability and wide hybridization was attempted to tap variability from wild species. Low genetic variability has been a bottleneck for groundnut improvement. The vast potential of wild species, a reservoir of new alleles, remains under-utilized. Development of linkage maps of groundnut during the last decade was followed by identification of markers and quantitative trait loci for the target traits. Consequently, the last decade has witnessed the deployment of molecular breeding approaches to complement the ongoing groundnut improvement programs in the USA, China, India, and Japan. The other potential advantages of molecular breeding are the feasibility of targeting multiple traits for improvement and the provision of tools to tap new alleles from wild species. The first groundnut variety developed through marker-assisted back-crossing is a root-knot nematode-resistant variety, NemaTAM, in the USA. The uptake of molecular breeding approaches in groundnut improvement programs by NARS partners in India and many African countries is slow or needs to be initiated, in part due to inadequate infrastructure, high genotyping costs, and limited human capacity.
Availability of draft genome sequences for diploid (AA and BB) and tetraploid (AABB) genome species of Arachis in the coming years is expected to bring low-cost genotyping to the groundnut community, which will facilitate the use of modern genetics and breeding approaches such as genome-wide association studies for trait mapping and genomic selection for crop improvement.
Arachis hypogaea; genetic variability; pedigree; disease resistance; phenotyping; QTLs; molecular breeding; genomic selection
Legumes play an important role as food and forage crops in international agriculture, especially in developing countries. Legumes have a unique biological process called nitrogen fixation (NF) by which they convert atmospheric nitrogen to ammonia. Although legume genomes have undergone polyploidization, duplication and divergence, NF-related genes, because of their essential functional role for legumes, might have remained conserved. To understand the relationship of divergence and evolutionary processes in legumes, this study analyzes orthologs and paralogs for 20 selected NF-related genes by using comparative genomic approaches in six legumes, i.e., Medicago truncatula (Mt), Cicer arietinum, Lotus japonicus, Cajanus cajan (Cc), Phaseolus vulgaris (Pv), and Glycine max (Gm). Subsequently, sequence distances, numbers of synonymous substitutions per synonymous site (Ks) and non-synonymous substitutions per non-synonymous site (Ka) between orthologs and paralogs were calculated and compared across legumes. These analyses suggest the closest relationship between Gm and Cc and the highest distance between Mt and Pv among the six legumes. Ks proportional plots clearly showed an ancient genome duplication in all legumes, a whole genome duplication event in Gm and also the speciation pattern in different legumes. This study also reports some interesting observations, e.g., no peak at Ks 0.4 in Gm-Gm, location of two independent genes next to each other in Mt and low Ks values for outparalogs for three genes as compared to 12 other genes. In summary, this study underlines the importance of NF-related genes and provides important insights into genome organization and evolutionary aspects of the six legume species analyzed.
nitrogen fixation; legume; comparative analysis; Ks; evolution
Single-nucleotide polymorphisms (SNPs, >2000) were discovered by using RNA-seq and allele-specific sequencing approaches in pigeonpea (Cajanus cajan). To make SNP genotyping cost-effective, successful competitive allele-specific polymerase chain reaction (KASPar) assays were developed for 1616 SNPs and referred to as PKAMs (pigeonpea KASPar assay markers). Screening of PKAMs on 24 genotypes [23 from cultivated species and 1 wild species (Cajanus scarabaeoides)] defined a set of 1154 polymorphic markers (77.4%) with a polymorphism information content (PIC) value from 0.04 to 0.38. One thousand and ninety-four PKAMs showed polymorphisms between parental lines of the reference mapping population (C. cajan ICP 28 × C. scarabaeoides ICPW 94). By using high-quality marker genotyping data on 167 F2 lines from the population, a comprehensive genetic map comprising 875 PKAMs with an average inter-marker distance of 1.11 cM was developed. Thirty-five previously mapped simple sequence repeat markers were integrated into the PKAM map and an integrated genetic map of 996.21 cM was constructed. Mapped PKAMs showed a higher degree of synteny with the genome of Glycine max, followed by Medicago truncatula and Lotus japonicus, and the least with Vigna unguiculata. These PKAMs will be useful for genetics research and breeding applications in pigeonpea and for utilizing genome information from other legume species.
pigeonpea; SNP; linkage map; comparative genomics; molecular breeding
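The PIC values quoted above can be computed from allele frequencies at a marker. The sketch below uses the widely cited formula of Botstein et al. (1980), PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2; whether the authors used exactly this estimator is an assumption, and the function name is hypothetical.

```python
from itertools import combinations
from typing import Sequence


def pic(allele_counts: Sequence[float]) -> float:
    """Polymorphism information content for one marker.

    `allele_counts` may hold raw counts or frequencies; they are
    normalised to frequencies that sum to 1 before the calculation.
    """
    total = sum(allele_counts)
    if total <= 0:
        raise ValueError("allele counts must sum to a positive value")
    p = [c / total for c in allele_counts]
    # Expected homozygosity term.
    homozygosity = sum(x * x for x in p)
    # Correction for uninformative matings between identical heterozygotes.
    cross = sum(2.0 * a * a * b * b for a, b in combinations(p, 2))
    return 1.0 - homozygosity - cross
```

Under this formula a biallelic SNP reaches its maximum PIC of 0.375 when both alleles have frequency 0.5, which is consistent with the upper bound of 0.38 reported for the PKAMs.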
Pigeonpea (Cajanus cajan L.) is an important food legume crop of rainfed agriculture. Owing to exposure of the crop to a number of biotic and abiotic stresses, crop productivity has remained stagnant for almost the last five decades at ca. 750 kg/ha. The availability of a cytoplasmic male sterility (CMS) system has facilitated the development and release of hybrids, which are expected to enhance the productivity of pigeonpea. Recent advances in genomics and molecular breeding such as marker-assisted selection (MAS) offer the possibility to accelerate hybrid breeding. Molecular markers and genetic maps are pre-requisites for deploying MAS in breeding. However, in the case of pigeonpea, only one inter- and two intra-specific genetic maps are available so far. Here, four new intra-specific genetic maps comprising 59–140 simple sequence repeat (SSR) loci with map lengths ranging from 586.9 to 881.6 cM have been constructed. Using these four genetic maps together with two recently published intra-specific genetic maps, a consensus map was constructed, comprising 339 SSR loci spanning a distance of 1,059 cM. Furthermore, quantitative trait loci (QTL) analysis for fertility restoration (Rf) conducted in three mapping populations identified four major QTLs explaining phenotypic variances up to 24%. To the best of our knowledge, this is the first report on construction of a consensus genetic map in pigeonpea and on the identification of QTLs for fertility restoration. The developed consensus genetic map should serve as a reference for developing new genetic maps as well as for correlating with the physical map of pigeonpea to be developed in the near future. The availability of more informative markers in the bins harbouring QTLs for sterility mosaic disease (SMD) and Rf will facilitate the selection of the most suitable markers for genetic analysis and molecular breeding applications in pigeonpea.
Electronic supplementary material
The online version of this article (doi:10.1007/s00122-012-1916-5) contains supplementary material, which is available to authorized users.
Drought is one of the most serious production constraints for world agriculture and is projected to worsen with anticipated climate change. Inter-disciplinary scientists have been trying to understand and dissect the mechanisms of plant tolerance to drought stress using a variety of approaches; however, success has been limited. Modern genomics and genetic approaches coupled with advances in precise phenotyping and breeding methodologies are expected to more effectively unravel the genes and metabolic pathways that confer drought tolerance in crops. This article discusses the most recent advances in plant physiology for precision phenotyping of drought response, a vital step before implementing the genetic and molecular-physiological strategies to unravel the complex multilayered drought tolerance mechanism and further exploration using molecular breeding approaches for crop improvement. Emphasis has been given to molecular dissection of drought tolerance by QTL or gene discovery through linkage and association mapping, QTL cloning, candidate gene identification, transcriptomics and functional genomics. Molecular breeding approaches such as marker-assisted backcrossing, marker-assisted recurrent selection and genome-wide selection have been suggested for integration into crop improvement strategies to develop drought-tolerant cultivars that will enhance food security in the context of a changing and more variable climate.
A comprehensive transcriptome assembly for pigeonpea has been developed by analyzing 128.9 million short Illumina GA IIx single end reads, 2.19 million single end FLX/454 reads, and 18 353 Sanger expressed sequence tags from more than 16 genotypes. The resultant transcriptome assembly, referred to as CcTA v2, comprised 21 434 transcript assembly contigs (TACs) with an N50 of 1510 bp, the largest one being ∼8 kb. Of the 21 434 TACs, 16 622 (77.5%) could be mapped on to the soybean genome build 1.0.9 under fairly stringent alignment parameters. Based on knowledge of intron junctions, 10 009 primer pairs were designed from 5033 TACs for amplifying intron spanning regions (ISRs). By using in silico mapping of BAC-end-derived SSR loci of pigeonpea on the soybean genome as a reference, putative mapping positions at the chromosome level were predicted for 6284 ISR markers, covering all 11 pigeonpea chromosomes. A subset of 128 ISR markers was analyzed on a set of eight genotypes. While 116 markers were validated, 70 markers showed one to three alleles, with an average polymorphism information content (PIC) value of 0.16. In summary, the CcTA v2 transcript assembly and ISR markers will serve as a useful resource to accelerate genetic research and breeding applications in pigeonpea.
Cajanus cajan (L.); second-generation sequencing; transcriptome assembly; intron spanning region (ISR) markers
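The N50 reported for CcTA v2 is a length-weighted summary of contig sizes: the contig length at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length. The following is a minimal sketch of that definition, not the assembler's own code; the function name is hypothetical.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig lengths (in bp)."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one contig length is required")
    half_total = sum(ordered) / 2.0
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half the
        # total assembly length defines the N50.
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for positive lengths
```

Because the statistic is length-weighted, a few long contigs can raise the N50 well above the median contig length, so it is best read alongside the contig count and the total assembly size.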
Single feature polymorphisms (SFPs) are microarray-based molecular markers that are detected by hybridization of DNA or cRNA to oligonucleotide probes. With the objective of identifying potential polymorphic markers for drought tolerance in pigeonpea [Cajanus cajan (L.) Millspaugh], an important legume crop for the semi-arid tropics but deficient in genomic resources, Affymetrix Genome Arrays of soybean (Glycine max), a closely related species of pigeonpea, were used on cRNA of six parental genotypes of three mapping populations of pigeonpea segregating for agronomic traits like drought tolerance and pod borer (Helicoverpa armigera) resistance. By using the robustified projection pursuit method on 15 pair-wise comparisons for the six parental genotypes, 5,692 SFPs were identified. The number of SFPs varied from 780 (ICPL 8755 × ICPL 227) to 854 (ICPL 151 × ICPL 87) per parental combination of the mapping populations. A randomly selected set of 179 SFPs was used for validation by Sanger sequencing, and good quality sequence data were obtained for 99 genes, of which 75 genes showed sequence polymorphisms. While associating the sequence polymorphisms with SFPs detected, true positives were observed for 52.6% of the SFPs detected. In terms of parental combinations of the mapping populations, occurrence of true positives was 34.48% for ICPL 151 × ICPL 87, 41.86% for ICPL 8755 × ICPL 227, and 81.58% for ICP 28 × ICPW 94. In addition, a set of 139 candidate genes that may be associated with drought tolerance has been identified based on gene ontology analysis of the homologous pigeonpea genes to the soybean genes that detected SFPs between the parents of the mapping populations segregating for drought tolerance.
Electronic supplementary material
The online version of this article (doi:10.1007/s10142-011-0227-2) contains supplementary material, which is available to authorized users.
Single feature polymorphism; Microarray; Robustified projection pursuit; Molecular markers; Legumes
Chickpea (Cicer arietinum L.) is an important grain-legume crop that is mainly grown in rainfed areas, where terminal drought is a major constraint to its productivity. We generated expressed sequence tags (ESTs) by suppression subtraction hybridization (SSH) to identify differentially expressed genes in drought-tolerant and -susceptible genotypes in chickpea.
EST libraries were generated by SSH from root and shoot tissues of ICC 4958 (drought tolerant) and ICC 1882 (drought susceptible) exposed to terminal drought conditions by the dry down method. SSH libraries were also constructed by using 2 sets of bulks prepared from the RNA of root tissues from selected recombinant inbred lines (RILs) (10 each) for the extreme high and low root biomass phenotype. A total of 3062 unigenes (638 contigs and 2424 singletons), 51.4% of which were novel in chickpea, were derived by cluster assembly and sequence alignment of 5949 ESTs. Only 2185 (71%) unigenes showed significant BLASTX similarity (<1E-06) in the NCBI non-redundant (nr) database. Gene ontology functional classification terms (BLASTX results and GO term) were retrieved for 2006 (92.0%) sequences, and 656 sequences were further annotated with 812 Enzyme Commission (EC) codes and were mapped to 108 different KEGG pathways. In addition, the expression status of 830 unigenes in response to terminal drought stress was evaluated using macro-array (dot blots). The expression of a few selected genes was validated by northern blotting and quantitative real-time PCR assay.
Our study compares genes that are up- and down-regulated not only between a drought-tolerant and a drought-susceptible genotype under terminal drought stress but also between the bulks of the selected RILs exhibiting extreme phenotypes. More than 50% of the genes identified have been shown to be associated with drought stress in chickpea for the first time. This study not only serves as a resource for marker discovery, but can also provide better insight into the selection of candidate genes (both up- and downregulated) associated with drought tolerance. These results can be used to identify suitable targets for manipulating the drought-tolerance trait in chickpea.
A transcript map has been constructed by the development and integration of genic molecular markers (GMMs) including single nucleotide polymorphism (SNP), genic microsatellite or simple sequence repeat (SSR) and intron spanning region (ISR)-based markers, on an inter-specific mapping population of chickpea, the third food legume crop of the world and the first food legume crop of India. For SNP discovery through allele re-sequencing, primer pairs were designed for 688 genes/expressed sequence tags (ESTs) of chickpea and 657 genes/ESTs of closely related species of chickpea. High-quality sequence data obtained for 220 candidate genic regions on 2–20 genotypes representing 9 Cicer species provided 1,893 SNPs with an average frequency of 1/35.83 bp and a PIC (polymorphism information content) value of 0.34. On average, 2.9 haplotypes were present in the 220 candidate genic regions, with an average haplotype diversity of 0.6326. SNP2CAPS analysis of the 220 sequence alignments mentioned above provided a total of 192 CAPS candidates. Experimental analysis of these 192 CAPS candidates together with 87 CAPS candidates identified earlier through in silico mining of ESTs provided scorable amplification in 173 (62.01%) cases, of which predicted assays were validated in 143 (82.66%) cases (CGMM). Alignments of chickpea unigenes with the Medicago truncatula genome were used to develop 121 intron spanning region (CISR) markers, of which 87 yielded scorable products. In addition, optimization of 77 EST-derived SSR (ICCeM) markers provided 51 scorable markers. Screening of 281 easily assayable markers, including 143 CGMMs, 87 CISRs and 51 ICCeMs, on 5 parental genotypes of three mapping populations identified 104 polymorphic markers, including 90 markers on the inter-specific mapping population. Sixty-two of these GMMs together with 218 earlier published markers (including 64 GMM loci) and 20 other unpublished markers could be integrated into this genetic map.
The genetic map developed here, therefore, has a total of 300 loci including 126 GMM loci and spans 766.56 cM, with an average inter-marker distance of 2.55 cM. In summary, this is the first report on the development of large-scale genic markers, including development of easily assayable markers, and a transcript map of chickpea. These resources should be useful not only for genome analysis and genetics and breeding applications of chickpea, but also for comparative legume genomics.
Electronic supplementary material
The online version of this article (doi:10.1007/s00122-011-1556-1) contains supplementary material, which is available to authorized users.
The genus Arachis, which originated in South America, is divided into nine taxonomic sections comprising 80 species. Most of the Arachis species are diploids (2n = 2x = 20) and the tetraploid species (2n = 4x = 40) are found in sections Arachis, Extranervosae and Rhizomatosae. Diploid species have great potential as sources of agronomic traits such as resistance to pests and diseases, drought-related traits and different life cycle spans. Understanding of genetic relationships among wild species and between wild and cultivated species will be useful for enhanced utilization of wild species in improving cultivated germplasm. The present study was undertaken to evaluate genetic relationships among species (96 accessions) belonging to seven sections of Arachis by using simple sequence repeat (SSR) markers developed from an Arachis hypogaea genomic library and gene sequences from related genera of Arachis.
The average transferability rate of the 101 SSR markers tested to section Arachis and six other sections was 81% and 59%, respectively. Five markers (IPAHM 164, IPAHM 165, IPAHM 407a, IPAHM 409, and IPAHM 659) showed 100% transferability. Cluster analysis of allelic data from a subset of 32 SSR markers on 85 wild and 11 cultivated accessions grouped accessions according to their genome composition and the sections and species to which they belong. A total of 109 species-specific alleles were detected in different wild species; Arachis pusilla exhibited the largest number of species-specific alleles (15). Based on genetic distance analysis, the A-genome accession ICG 8200 (A. duranensis) and the B-genome accession ICG 8206 (A. ipaënsis) were found to be most closely related to A. hypogaea.
A set of cross-species and cross-section transferable SSR markers has been identified that will be useful for genetic studies of wild species of Arachis, including comparative genome mapping, germplasm analysis, population genetic structure and phylogenetic inferences among species. The present study provides strong support, based on both genomic and genic markers and probably for the first time, for the relationship between A. monticola and A. hypogaea as well as for the most probable donors of the A- and B-genomes of cultivated groundnut.
Chickpea (Cicer arietinum L.), an important grain legume crop of the world, is seriously challenged by terminal drought and salinity stresses. However, only a very limited number of molecular markers and candidate genes are available for undertaking molecular breeding in chickpea to tackle these stresses. This study reports the generation and analysis of a comprehensive resource of drought- and salinity-responsive expressed sequence tags (ESTs) and gene-based markers.
A total of 20,162 (18,435 high quality) drought- and salinity-responsive ESTs were generated from ten different root tissue cDNA libraries of chickpea. Sequence editing, clustering and assembly analysis resulted in 6,404 unigenes (1,590 contigs and 4,814 singletons). Functional annotation of unigenes based on BLASTX analysis showed that 46.3% (2,965) had significant similarity (≤1E-05) to sequences in the non-redundant UniProt database. BLASTN analysis of unique sequences with ESTs of four legume species (Medicago, Lotus, soybean and groundnut) and three model plant species (rice, Arabidopsis and poplar) provided insights on conserved genes across legumes as well as novel transcripts for chickpea. Of 2,965 (46.3%) significant unigenes, only 2,071 (32.3%) unigenes could be functionally categorised according to Gene Ontology (GO) descriptions. A total of 2,029 sequences containing 3,728 simple sequence repeats (SSRs) were identified and 177 new EST-SSR markers were developed. Experimental validation of a set of 77 SSR markers on 24 genotypes revealed 230 alleles with an average of 4.6 alleles per marker and average polymorphism information content (PIC) value of 0.43. Besides SSR markers, 21,405 high confidence single nucleotide polymorphisms (SNPs) in 742 contigs (with ≥ 5 ESTs) were also identified. Recognition sites for restriction enzymes were identified for 7,884 SNPs in 240 contigs. Hierarchical clustering of 105 selected contigs provided clues about stress-responsive candidate genes and their expression profile showed predominance in specific stress-challenged libraries.
The generated set of chickpea ESTs serves as a resource of high quality transcripts for gene discovery and development of functional markers associated with abiotic stress tolerance that will help to facilitate chickpea breeding. Mapping of gene-based markers in chickpea will also add more anchoring points to align genomes of chickpea and other legume species.
Drought tolerance is a key trait for increasing and stabilizing barley productivity in dry areas worldwide. Identification of the genes responsible for drought tolerance in barley (Hordeum vulgare L.) will facilitate understanding of the molecular mechanisms of drought tolerance, and also facilitate the genetic improvement of barley through marker-assisted selection or gene transformation. To monitor the changes in gene expression at the transcriptional level in barley leaves during the reproductive stage under drought conditions, the 22K Affymetrix Barley 1 microarray was used to screen two drought-tolerant barley genotypes, Martin and Hordeum spontaneum 41-1 (HS41-1), and one drought-sensitive genotype Moroc9-75. Seventeen genes were expressed exclusively in the two drought-tolerant genotypes under drought stress, and their encoded proteins may play significant roles in enhancing drought tolerance through controlling stomatal closure via carbon metabolism (NADP malic enzyme, NADP-ME, and pyruvate dehydrogenase, PDH), synthesizing the osmoprotectant glycine-betaine (C-4 sterol methyl oxidase, CSMO), generating protectants against reactive-oxygen-species scavenging (aldehyde dehydrogenase,ALDH, ascorbate-dependent oxidoreductase, ADOR), and stabilizing membranes and proteins (heat-shock protein 17.8, HSP17.8, and dehydrin 3, DHN3). Moreover, 17 genes were abundantly expressed in Martin and HS41-1 compared with Moroc9-75 under both drought and control conditions. These genes were possibly constitutively expressed in drought-tolerant genotypes. Among them, seven known annotated genes might enhance drought tolerance through signalling [such as calcium-dependent protein kinase (CDPK) and membrane steroid binding protein (MSBP)], anti-senescence (G2 pea dark accumulated protein, GDA2), and detoxification (glutathione S-transferase, GST) pathways. 
In addition, 18 genes, including those encoding Δ1-pyrroline-5-carboxylate synthetase (P5CS), protein phosphatase 2C-like protein (PP2C), and several chaperones, were differentially expressed in all genotypes under drought; thus they were more likely to be general drought-responsive genes in barley. These results could provide new insights into further understanding of drought-tolerance mechanisms in barley.
Barley; drought stress; drought tolerance; microarray; reproductive stage
There is a need for software scripts and modules for format parsing, data manipulation, statistical analysis and annotation especially for tasks related to marker identification from sequence data and sequence diversity analysis.
Here we present several new Perl scripts and a module for sequence data diversity analysis. To enable the use of this software with other public domain tools, we also make available PISE (Pasteur Institute Software Environment) wrappers for these Perl scripts and the module. This enables the user to generate pipelines for automated analysis, since PISE is a web interface generator for bioinformatics programmes.
A new set of modules and scripts for diversity statistic calculation, format parsing and data manipulation are available with PISE wrappers that enable pipelining of these scripts with commonly used contig assembly and sequence feature prediction software, to answer specific sequence diversity related questions.
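As an illustration of the kind of diversity statistic such scripts calculate, the sketch below computes nucleotide diversity (the average number of pairwise differences per site) from aligned sequences. It is written in Python rather than Perl and is not taken from the published scripts; the function name, the requirement of equal-length pre-aligned input, and the treatment of every character (including gaps) as an ordinary state are all assumptions made for brevity.

```python
from itertools import combinations
from typing import Sequence


def nucleotide_diversity(aligned: Sequence[str]) -> float:
    """Average pairwise differences per site across aligned sequences.

    Assumes all sequences are already aligned to the same length.
    Gap characters are compared like any other symbol (a simplification).
    """
    if len(aligned) < 2:
        raise ValueError("at least two sequences are required")
    length = len(aligned[0])
    if length == 0 or any(len(s) != length for s in aligned):
        raise ValueError("sequences must be non-empty and of equal length")
    pair_count = 0
    differences = 0
    for first, second in combinations(aligned, 2):
        pair_count += 1
        # Count mismatching columns for this pair of sequences.
        differences += sum(a != b for a, b in zip(first, second))
    return differences / (pair_count * length)
```

In a pipeline of the sort described above, a function like this would sit downstream of contig assembly and alignment, receiving one alignment per locus and returning one statistic per locus.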
Plant genetic resources (PGR) are the basic raw materials for future genetic progress and an insurance against unforeseen threats to agricultural production. An extensive characterization of PGR provides an opportunity to dissect structure, mine allelic variations, and identify diverse accessions for crop improvement. The Generation Challenge Program conceptualized the development of "composite collections" and extraction of "reference sets" from these for more efficient tapping of global crop-related genetic resources. In this study, we report the genetic structure, diversity and allelic richness in a composite collection of chickpea using SSR markers, and formation of a reference set of 300 accessions.
The 48 SSR markers detected 1683 alleles in 2915 accessions, of which 935 were considered rare, 720 common and 28 most frequent. The alleles per locus ranged from 14 to 67, averaged 35, and the polymorphic information content was from 0.467 to 0.974, averaged 0.854. Marker polymorphism varied between groups of accessions in the composite collection and reference set. A number of group-specific alleles were detected: 104 in kabuli, 297 in desi, and 69 in wild Cicer; 114 each in Mediterranean and West Asia (WA), 117 in South and South East Asia (SSEA), and 10 in African region accessions. Desi and kabuli shared 436 alleles, while wild Cicer shared 17 and 16 alleles with desi and kabuli, respectively. The accessions from SSEA and WA shared 74 alleles, while those from the Mediterranean shared 38 and 33 alleles with WA and SSEA, respectively. Desi chickpea contained a higher proportion of rare alleles (53%) than kabuli (46%), while wild Cicer accessions were devoid of rare alleles. A genotype-based reference set captured 1315 (78%) of the 1683 composite collection alleles, of which 463 were rare, 826 common, and 26 the most frequent alleles. The neighbour-joining tree diagram of this reference set represents diversity from all directions of the tree diagram of the composite collection.
The genotype-based reference set, reported here, is an ideal set of germplasm for allele mining, association genetics, mapping and cloning gene(s), and in applied breeding for the development of broad-based elite breeding lines/cultivars with superior yield and enhanced adaptation to diverse environments.
Hordeum chilense, a native South American diploid wild barley, is a potential source of useful genes for cereal breeding. The use of this wild species to increase genetic variation in cereals will be greatly facilitated by marker-assisted selection. Because the species has limited direct agricultural use, different economically feasible approaches have been undertaken in a search for suitable and cost-effective markers. The availability of Expressed Sequence Tag (EST)-derived microsatellite or simple sequence repeat (SSR) markers, commonly called EST-SSRs, for barley (Hordeum vulgare) represents a promising source to increase the number of genetic markers available for the H. chilense genome.
All of the 82 barley EST-derived SSR primer pairs tested for transferability to H. chilense amplified products of correct size from this species. Of these 82 barley EST-SSRs, 21 (26%) showed polymorphism among H. chilense lines. Identified polymorphic markers were used to test the transferability and polymorphism in other Poaceae family species with the aim of establishing H. chilense phylogenetic relationships. Triticum aestivum-H. chilense addition lines allowed us to determine the chromosomal localizations of EST-SSR markers and confirm conservation of the linkage group.
In the present study, a set of 21 polymorphic EST-SSR markers has been identified as useful for diversity analysis of H. chilense and related wild barleys such as H. murinum, and for wheat marker-assisted introgression breeding. Across-genera transferability of the barley EST-SSR markers has allowed phylogenetic inference within the Triticeae complex.
Cultivated peanut or groundnut (Arachis hypogaea L.) is the fourth most important oilseed crop in the world, grown mainly in tropical, subtropical and warm temperate climates. Due to its origin through a single and recent polyploidization event, followed by successive selection during breeding efforts, cultivated groundnut has a limited genetic background. In such species, microsatellite or simple sequence repeat (SSR) markers are very informative and useful for breeding applications. The low level of polymorphism in cultivated germplasm, however, calls for a larger number of polymorphic microsatellite markers for cultivated groundnut.
A microsatellite-enriched library was constructed from the genotype TMV2. Sequencing of 720 putative SSR-positive clones from a total of 3,072 provided 490 SSRs. Of these SSRs, 71.2% were perfect, 13.1% imperfect and 15.7% compound. The GT/CA repeat motifs were the most common (37.6%), followed by GA/CT repeat motifs (25.9%). Primer pairs could be designed for a total of 170 SSRs and were optimized initially on two genotypes. Of these, 104 (61.2%) primer pairs yielded scorable amplicons, and 46 (44.2% of these) showed polymorphism among 32 cultivated groundnut genotypes. The polymorphic SSR markers detected 2 to 5 alleles with an average of 2.44 per locus. The polymorphic information content (PIC) value for these markers varied from 0.12 to 0.75 with an average of 0.46. Based on 112 alleles obtained with 46 markers, a phenogram was constructed to understand the relationships among the 32 genotypes. The majority of the genotypes representing subspecies hypogaea were grouped together in one cluster, while the genotypes belonging to subspecies fastigiata were grouped mainly under two clusters.
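Grouping repeats such as GT/CA or GA/CT into a single motif class can be sketched as follows. This is an illustrative example only, not the tool used in the study: the function names, the minimum of six repeat units, and the test sequence are all assumptions. A motif, its rotations, and its reverse complement are collapsed to one label, so GT, TG, CA and AC all map to the same class.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_motif(motif):
    """Return one label for a motif, its rotations, and their reverse complements."""
    motif = motif.upper()
    revcomp = motif.translate(_COMPLEMENT)[::-1]
    variants = set()
    for m in (motif, revcomp):
        for i in range(len(m)):
            variants.add(m[i:] + m[:i])  # every rotation of the unit
    return min(variants)  # alphabetically smallest variant is the class label


def find_dinucleotide_ssrs(seq, min_repeats=6):
    """Find perfect dinucleotide SSRs; min_repeats is an assumed threshold."""
    seq = seq.upper()
    pattern = re.compile(r"([ACGT]{2})\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        if unit[0] == unit[1]:
            continue  # homopolymer runs (e.g. AAAA...) are not dinucleotide SSRs
        n_units = len(match.group(0)) // 2
        hits.append((match.start(), canonical_motif(unit), n_units))
    return hits


# Hypothetical clone sequence with one (GT)8 and one (CT)6 repeat.
clone = "AAC" + "GT" * 8 + "AAA" + "CT" * 6 + "GGA"
print(find_dinucleotide_ssrs(clone))
```

With this labelling, the GT/CA class is reported as `AC` and the GA/CT class as `AG`; imperfect and compound repeats would need additional logic that is not shown here.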
The newly developed set of 104 markers extends the repertoire of SSR markers for cultivated groundnut. These markers showed good PIC values in cultivated germplasm and should therefore be very useful for germplasm analysis, linkage mapping, diversity studies and the analysis of phylogenetic relationships in cultivated groundnut as well as related Arachis species.
The large amounts of EST sequence data available from a single species of an organism as well as for several species within a genus provide an easy source of identification of intra- and interspecies single nucleotide polymorphisms (SNPs). In the case of model organisms, the data available are numerous, given the degree of redundancy in the deposited EST data. There are several available bioinformatics tools that can be used to mine this data; however, using them requires a certain level of expertise: the tools have to be used sequentially with accompanying format conversion and steps like clustering and assembly of sequences become time-intensive jobs even for moderately sized datasets. We report here a pipeline of open source software extended to run on multiple CPU architectures that can be used to mine large EST datasets for SNPs and identify restriction sites for assaying the SNPs so that cost-effective CAPS assays can be developed for SNP genotyping in genetics and breeding applications. At the International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), the pipeline has been implemented to run on a Paracel high-performance system consisting of four dual AMD Opteron processors running Linux with MPICH. The pipeline can be accessed through user-friendly web interfaces at http://hpc.icrisat.cgiar.org/PBSWeb and is available on request for academic use. We have validated the developed pipeline by mining chickpea ESTs for interspecies SNPs, development of CAPS assays for SNP genotyping, and confirmation of restriction digestion pattern at the sequence
Small heat shock protein 17.8 (HSP17.8) is produced abundantly in plant cells under heat and other stress conditions and may play an important role in plant tolerance to stress environments. However, HSP17.8 may be differentially expressed in different accessions of a crop species exposed to identical stress conditions. The ability of different genotypes to adapt to various stress conditions resides in their genetic diversity. Allelic variations are the most common forms of genetic variation in natural populations. In this study, single nucleotide polymorphisms (SNPs) of the HSP17.8 gene were investigated across 210 barley accessions collected from 30 countries using EcoTILLING technology. Eleven SNPs, including 10 from the coding region of HSP17.8, were detected, which form nine distinguishable haplotypes in the barley collection. Among the 10 SNPs in the coding region, six are missense mutations and four are synonymous nucleotide changes. Five of the six missense changes are predicted to be deleterious to HSP17.8 function. The accessions from Middle East Asia showed higher nucleotide diversity of HSP17.8 than those from other regions, and wild barley (H. spontaneum) accessions exhibited greater diversity than cultivated barley (H. vulgare) accessions. Four SNPs in HSP17.8 were associated with at least one of the agronomic traits evaluated (number of grains per spike, thousand kernel weight, plant height, flag leaf area and leaf color), the exception being spike length. The associations between these SNPs and agronomic traits may provide new insight into the gene's potential contribution to drought tolerance in barley.
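The distinction between missense and synonymous coding SNPs drawn above can be illustrated with a short sketch based on the standard genetic code. The function name and the nine-base toy coding sequence are hypothetical; this is not the HSP17.8 sequence, and it does not reproduce the deleteriousness prediction mentioned in the abstract.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(cds, position, alt_base):
    """Classify a substitution at a 0-based position of an in-frame coding sequence."""
    cds = cds.upper()
    alt_base = alt_base.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if cds[position] == alt_base:
        raise ValueError("alternative base equals the reference base")
    offset = position % 3
    start = position - offset
    ref_codon = cds[start:start + 3]
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        kind = "synonymous"
    elif alt_aa == "*":
        kind = "nonsense"
    else:
        kind = "missense"
    return kind, ref_aa, alt_aa


# Hypothetical toy CDS: ATG GCT GAA encodes Met-Ala-Glu.
toy_cds = "ATGGCTGAA"
print(classify_coding_snp(toy_cds, 5, "C"))  # third base of GCT
print(classify_coding_snp(toy_cds, 4, "T"))  # second base of GCT
```

A change in the third codon position often leaves the amino acid unchanged, whereas a change in the second position usually alters it, which is the pattern the two calls above are meant to show.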
Only a few genetic maps based on recombinant inbred line (RIL) and backcross (BC) populations have been developed for tetraploid groundnut. Their marker density, however, is unsatisfactory, especially given the large genome size (2800 Mb/1C) and 20 linkage groups (LGs). Therefore, using marker segregation data for 10 RILs and one BC population from the international groundnut community, and with the help of common markers across different populations, a reference consensus genetic map has been developed. This map comprises 897 marker loci, including 895 simple sequence repeat (SSR) and 2 cleaved amplified polymorphic sequence (CAPS) loci, distributed on 20 LGs (a01–a10 and b01–b10) spanning a map distance of 3,863.6 cM with an average map density of 4.4 cM. The highest number of markers (70) was integrated on a01 and the lowest (21) on b09. The marker density, however, was lowest (6.4 cM) on a08 and highest (2.5 cM) on a01. The reference consensus map has been divided into 203 BINs of 20 cM each. These BINs carry 1 (a10_02, a10_08 and a10_09) to 20 (a10_04) loci with an average of 4 marker loci per BIN. Polymorphism information content (PIC) values were available for 526 markers in 190 BINs; 36 and 111 BINs have at least one marker with PIC values >0.70 and >0.50, respectively. This information will be useful for selecting highly informative and uniformly distributed markers for developing new genetic maps, background selection and diversity analysis. Most importantly, this reference consensus map will serve as a reliable reference for aligning new genetic and physical maps, performing QTL analysis in a multi-population design, evaluating the genetic background effect on QTL expression, and supporting other genetic and molecular breeding activities in groundnut.
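Dividing a map into fixed-width BINs can be sketched as below. The label format (linkage group, underscore, two-digit BIN number, as in a10_04) follows the abstract, but the convention that BIN 01 covers 0 to just under 20 cM is an assumption, and the marker names and positions are hypothetical.

```python
from collections import defaultdict


def assign_bins(marker_positions, bin_size_cm=20.0):
    """Group markers into fixed-width BINs per linkage group.

    marker_positions maps marker name -> (linkage_group, position_cM).
    Returns a dict mapping BIN label -> list of marker names in map order.
    """
    bins = defaultdict(list)
    ordered = sorted(marker_positions.items(), key=lambda kv: (kv[1][0], kv[1][1]))
    for marker, (lg, pos_cm) in ordered:
        # Assumed convention: BIN 01 is [0, 20) cM, BIN 02 is [20, 40) cM, and so on.
        index = int(pos_cm // bin_size_cm) + 1
        bins[f"{lg}_{index:02d}"].append(marker)
    return dict(bins)


# Hypothetical markers on two linkage groups.
markers = {
    "m1": ("a01", 0.0),
    "m2": ("a01", 12.5),
    "m3": ("a01", 20.0),
    "m4": ("a01", 61.3),
    "m5": ("b09", 5.0),
}
print(assign_bins(markers))
```

Counting the markers in each BIN, or filtering them by PIC before grouping, would then give the kind of per-BIN summary described above for choosing evenly spaced, informative markers.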
Pigeonpea (Cajanus cajan) is an annual or short-lived perennial food legume of acute regional importance, providing significant protein to the human diet in less developed regions of Asia and Africa. Due to its narrow genetic base, pigeonpea improvement is increasingly reliant on introgression of valuable traits from wild forms, a practice that would benefit from knowledge of its domestication history and relationships to wild species. Here we use 752 single nucleotide polymorphisms (SNPs) derived from 670 low copy orthologous genes to clarify the evolutionary history of pigeonpea (79 accessions) and its wild relatives (31 accessions). We identified three well-supported lineages that are geographically clustered and congruent with previous nuclear and plastid sequence-based phylogenies. Among all species analyzed, Cajanus cajanifolius is the most probable progenitor of cultivated pigeonpea. Multiple lines of evidence suggest recent gene flow between cultivated and non-cultivated forms, as well as historical gene flow between diverged but sympatric species. The evidence indicates that primary domestication occurred in India, with a second and more recent nested population bottleneck focused in tropical regions that is the likely consequence of pigeonpea breeding. We find abundant allelic variation and genetic diversity among the wild relatives, with the exception of wild species from Australia, for which we report a third bottleneck unrelated to domestication within India. Domesticated C. cajan possesses 75% less allelic diversity than the progenitor clade of wild Indian species, indicating a severe “domestication bottleneck” during pigeonpea domestication.
Cultivated peanut (Arachis hypogaea L.) is an important crop worldwide, valued for its edible oil and digestible protein. It has a very narrow genetic base that may well derive from a relatively recent single polyploidization event. Accordingly, molecular markers have low levels of polymorphism, and the number of polymorphic molecular markers available for cultivated peanut is still limited.
Here, we report a large set of BAC-end sequences (BES), use them for developing SSR (BES-SSR) markers, and apply them in genetic linkage mapping. The largest fraction of BESs had no detectable homology to known genes (49.5%), followed by sequences with similarity to known genes (44.3%) and miscellaneous sequences (6.2%) such as transposable element, retroelement, and organelle sequences. A total of 1,424 SSRs were identified from 36,435 BESs. Among these identified SSRs, dinucleotide (47.4%) and trinucleotide (37.1%) SSRs were predominant. The new set of 1,152 SSRs as well as about 4,000 published or unpublished SSRs were screened against two parents of a mapping population, generating 385 polymorphic loci. A genetic linkage map was constructed, consisting of 318 loci on 21 linkage groups and covering a total of 1,674.4 cM, with an average distance of 5.3 cM between adjacent loci. Two markers related to resistance gene homologs (RGH) were mapped to two different groups, thus anchoring 1 RGH-BAC contig and 1 singleton.
The SSRs mined from BESs will be of use in further molecular analysis of the peanut genome, providing a novel set of markers, genetically anchoring BAC clones, and incorporating gene sequences into a linkage map. This will aid in the identification of markers linked to genes of interest and map-based cloning.
Chickpea (Cicer arietinum L.) is the third most important cool season food legume, cultivated in arid and semi-arid regions of the world. The goal of this study was to develop novel molecular markers, namely microsatellite or simple sequence repeat (SSR) markers from bacterial artificial chromosome (BAC)-end sequences (BESs) and diversity arrays technology (DArT) markers, and to construct a high-density genetic map based on the recombinant inbred line (RIL) population ICC 4958 (C. arietinum) × PI 489777 (C. reticulatum). A BAC library comprising 55,680 clones was constructed and 46,270 BESs were generated. Mining of these BESs provided 6,845 SSRs, and primer pairs were designed for 1,344 SSRs. In parallel, DArT arrays with ca. 15,000 clones were developed, and 5,397 clones were found to be polymorphic among 94 genotypes tested. Screening of the newly developed BES-SSR markers and DArT arrays on the parental genotypes of the RIL mapping population showed polymorphism with 253 BES-SSR markers and 675 DArT markers. Segregation data obtained for these polymorphic markers, together with data for 494 markers compiled from published reports or collaborators, were used for constructing the genetic map. As a result, a comprehensive genetic map comprising 1,291 markers on eight linkage groups (LGs) spanning a total of 845.56 cM was developed (http://cmap.icrisat.ac.in/cmap/sm/cp/thudi/). The number of markers per linkage group ranged from 68 (LG 8) to 218 (LG 3), with an average inter-marker distance of 0.65 cM. While the developed resource of molecular markers will be useful for genetic diversity, genetic mapping and molecular breeding applications, the comprehensive genetic map with integrated BES-SSR markers will facilitate its anchoring to the physical map (under construction) to accelerate map-based cloning of genes in chickpea and comparative genome evolution studies in legumes.