A large number of short grain aromatic rices suited to local agro-climatic conditions and preferences are grown in niche areas across India, and their diversity has evolved over centuries as a result of selection by traditional farmers. Systematic characterization of these specialty rices has not been attempted. An effort was made to characterize 126 aromatic short grain rice landraces collected from 19 districts in the State of Odisha in eastern India. A high level of variation for grain quality and agronomic traits was observed among these aromatic rices, and genotypes with desirable phenotypic traits such as erect flag leaf, thick culm, compact and dense panicles, short plant stature, early duration, superior yield and grain quality were identified. A total of 24 SSR markers corresponding to the hypervariable regions of rice chromosomes were used to assess genetic diversity and to establish the genetic relationships among the aromatic short grain rice landraces at the nuclear genome level. SSR analysis of 126 genotypes from Odisha and 10 genotypes from other states revealed 110 alleles with an average of 4.583 alleles per locus, and Nei’s genetic diversity (He) ranged from 0.034 to 0.880; the analysis revealed two sub-populations, SP 1 (membership percentage 27.1%) and SP 2 (72.9%). At the organelle genome level, eight different plastid subtypes and 33 haplotypes were detected for the C/A repeats in the PS1D sequence of chloroplasts. The japonica (Nipponbare) subtype (6C7A) was detected in 100 genotypes, followed by the O. rufipogon (KF428978) subtype (6C6A) in 13 genotypes, while the indica (93–11) subtype (8C8A) was seen in 14 genotypes. The tree constructed from the haplotypes suggests that the short grain aromatic landraces might have independent origins for these plastid subtypes. Notably, a wide range of diversity was observed among these landraces even though their cultivation is confined to different parts of the State of Odisha.
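The two summary statistics quoted above can be illustrated with a minimal sketch. It assumes the common form of Nei's gene diversity for a single locus, He = 1 - Σp_i², without a small-sample correction; the abstract does not state which estimator was used, and the allele calls below are invented for illustration.

```python
from collections import Counter


def gene_diversity(alleles):
    """Nei's gene diversity for one locus: 1 - sum of squared allele frequencies."""
    counts = Counter(alleles)
    n = sum(counts.values())
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def mean_alleles_per_locus(total_alleles, n_loci):
    """Average number of alleles detected per marker."""
    return total_alleles / n_loci


# Hypothetical allele calls (fragment sizes in bp) at one SSR locus.
locus = [150, 150, 152, 152, 154, 156, 150, 152]
print(gene_diversity(locus))

# 110 alleles over 24 SSR markers, as reported in the abstract.
print(round(mean_alleles_per_locus(110, 24), 3))
```

The second call reproduces the reported average of 4.583 alleles per locus from the 110 alleles and 24 markers given in the text.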
Specialty rice in general, and aromatic rice in particular, possess enormous market potential for enhancing farm profits. However, systematic characterization of the diversity present in this natural wealth is a major prerequisite for using it in breeding programs. This study reports qualitative phenotypic trait based characterization of 126 short grain aromatic rice genotypes collected from different areas of the state of Odisha, India.
Out of the 24 descriptors employed, the highest variability (8 different types) was observed for lemma-palea colour, with a genetic diversity index (He) of 0.696. The principal component analysis reveals that the tip colour of lemma, colour of awn and colour of stigma cumulatively explain 74% of the total variation. The population STRUCTURE analysis classified the population into two subpopulations, which were subdivided further into four distinct groups. The western and southern districts of Odisha are endowed with maximum diversity in comparison to the eastern and northern districts, and in district-level comparisons, Koraput and Puri districts are the richest, with genetic diversity values of 0.324 and 0.303 respectively. With this set of morphological qualitative traits, ‘phenoprinting’, a newly proposed bar coding system, can effectively generate a unique fingerprint for each genotype, which helps in easy identification of these genotypes.
Though aromatic rices represent a tiny fraction of the total rice germplasm, a small collection of 126 landraces exhibited rich diversity for all the qualitative traits. For lemma-palea colour, eight different types were detected, while for tip colour of lemma, six different types were recorded, suggesting the presence of rich variability in the short grain aromatic rices conserved in this region. The proposed ‘phenoprinting’ can be an effective descriptor, with a unique fingerprint generated for each genotype; coupled with molecular (DNA) fingerprinting, it can discriminate and identify each and every aromatic short grain rice genotype. The proposed system not only helps in conservation but can also confer IPR protection on these specialty rices.
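The abstract does not describe how a phenoprint is actually encoded, so the following is only a hypothetical sketch of the general idea: each qualitative trait is scored as a single-digit state code, the codes are concatenated in a fixed trait order, and any genotypes that end up sharing a code are flagged. The trait names, state codes and landrace labels are all invented.

```python
def phenoprint(states, trait_order):
    """Concatenate one state code per trait into a barcode-like string."""
    return "".join(str(states[trait]) for trait in trait_order)


def find_collisions(prints):
    """Return phenoprints shared by more than one genotype, with those genotypes."""
    seen = {}
    for genotype, code in prints.items():
        seen.setdefault(code, []).append(genotype)
    return {code: names for code, names in seen.items() if len(names) > 1}


# Hypothetical trait order and single-digit state codes (assumptions, not from the paper).
TRAITS = ["lemma_palea_colour", "lemma_tip_colour", "awn_colour", "stigma_colour"]
landraces = {
    "LR-A": {"lemma_palea_colour": 3, "lemma_tip_colour": 1, "awn_colour": 0, "stigma_colour": 2},
    "LR-B": {"lemma_palea_colour": 3, "lemma_tip_colour": 4, "awn_colour": 0, "stigma_colour": 2},
    "LR-C": {"lemma_palea_colour": 7, "lemma_tip_colour": 1, "awn_colour": 1, "stigma_colour": 0},
}

prints = {name: phenoprint(states, TRAITS) for name, states in landraces.items()}
print(prints)
print(find_collisions(prints))
```

An empty collision dictionary means every genotype in the set received a distinct code, which is the property the proposed system relies on for identification.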
Electronic supplementary material
The online version of this article (doi:10.1186/s12898-016-0086-8) contains supplementary material, which is available to authorized users.
Rice; Landraces; Aromatic short grain; Phenotypic; Trait; Characterization; Diversity
Heat shock protein 70 (HSP70) is an important chaperone involved in protein folding, refolding, translocation and complex remodeling reactions under normal as well as stress conditions. Under heat and cold stress, the product of the HSPA1A gene associates with other chaperones to perform its function. An experimental structure for camel HSP70 protein (cHSP70) has not been reported so far. Hence, we constructed 3D models of cHSP70 through multi-template comparative modeling with the HSP110 protein of S. cerevisiae (open state) and with the 70 kDa HSP70 protein DnaK of E. coli (closed state), and relaxed them for 100 nanoseconds (ns) using all-atom molecular dynamics (MD) simulation. Two stable conformations of cHSP70, with the substrate binding domain (SBD) in open and closed states, were obtained. The collective modes of the transitions from open to closed state and vice versa were examined via principal component analysis (PCA) and minimum distance matrix (MDM). The results provide a mechanistic representation of the communication between the nucleotide binding domain (NBD) and the SBD and identify the role of subdomains in the conformational change mechanism that drives the chaperone cycle of cHSP70. Further, residues present in the chaperone functioning site were identified through protein-peptide docking. This study provides an overall insight into the inter-domain communication mechanism and identifies the chaperone binding cavity, which helps explain the underlying mechanism involved during heat and cold stress conditions in camel.
Salinity tolerance in rice is highly desirable to sustain production in areas rendered saline due to various reasons. It is a complex quantitative trait with different components, which can be dissected effectively by a genome-wide association study (GWAS). Here, we implemented GWAS to identify loci controlling salinity tolerance in rice. A custom-designed array based on 6,000 single nucleotide polymorphisms (SNPs) in as many stress-responsive genes, distributed at an average physical interval of <100 kb on 12 rice chromosomes, was used to genotype 220 rice accessions using the Infinium high-throughput assay. Genetic association was analysed with 12 different traits recorded on these accessions under field conditions at the reproductive stage. We identified 20 SNPs (loci) significantly associated with Na+/K+ ratio, and 44 SNPs with other traits observed under stress conditions. The loci identified for various salinity indices through GWAS explained 5–18% of the phenotypic variance. The region harbouring Saltol, a major quantitative trait locus (QTL) on chromosome 1 in rice that is known to control salinity tolerance at the seedling stage, was detected as a major association with Na+/K+ ratio measured at the reproductive stage in our study. In addition to Saltol, we also found GWAS peaks representing new QTLs on chromosomes 4, 6 and 7. The current association mapping panel contained mostly indica accessions, which can serve as a source of novel salt tolerance genes and alleles. The gene-based SNP array used in this study was found to be cost-effective and efficient in unveiling genomic regions and candidate genes regulating salinity stress tolerance in rice.
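The phrase "explained 5–18% of the phenotypic variance" refers to a per-locus R² value. A minimal sketch of that quantity, assuming a naive single-marker linear regression of the trait on allele dosage, is shown below. Real GWAS models usually also correct for population structure and kinship, which this sketch omits, and the genotype and trait values are invented.

```python
def variance_explained(dosages, phenotypes):
    """R^2 of a simple linear regression of phenotype on SNP allele dosage (0/1/2)."""
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotypes))
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in phenotypes)
    if sxx == 0 or syy == 0:
        raise ValueError("monomorphic SNP or constant phenotype")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical dosages for four homozygous accessions and a made-up trait value.
dosages = [0, 0, 2, 2]
trait = [1.0, 2.0, 3.0, 4.0]
print(variance_explained(dosages, trait))
```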
genome-wide association study; infinium genotyping assay; rice; salt tolerance; single-nucleotide polymorphism
Halomonas salina strain CIFRI1 is an extremely salt-stress-tolerant bacterium isolated from the salt crystals of the east coast of India. Here we report the annotated 3.45-Mb draft genome sequence of strain CIFRI1 having 86 contigs with 3,139 protein coding loci, including 62 RNA genes.
Earlier studies were focused on the genetics of temperate and tropical maize under drought. We identified genetic loci and their association with functional mechanisms in 240 accessions of subtropical maize using a high-density marker set under water stress.
Out of 61 significant SNPs (11 of which were false-discovery-rate-corrected associations) identified across agronomic traits, models, and locations by subjecting the accessions to water stress at the flowering stage, 48% were associated with drought-tolerant genes. Maize gene models revealed that SNPs mapped for agronomic traits were in fact associated with a number of functional traits, as follows: stomatal closure, 28; flowering, 15; root development, 5; detoxification, 4; and reduced water potential, 2. Interactions of these SNPs through the functional traits could lead to drought tolerance. The SNPs associated with ABA-dependent signalling pathways played a major role in the plant’s response to stress by regulating a series of functions including flowering, root development, auxin metabolism, guard cell functions, and scavenging of reactive oxygen species (ROS). ABA signalling genes regulate flowering through epigenetic changes in stress-responsive genes. ROS generated by ABA signalling are reduced by the interplay between ethylene, ABA, and detoxification signal transduction. Integration of ABA-signalling genes with auxin-inducible genes regulates root development, which in turn maintains the water balance by regulating the electrochemical gradient in the plant.
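The abstract does not say which false-discovery-rate procedure was applied. As an illustration only, the sketch below assumes the widely used Benjamini-Hochberg step-up procedure and applies it to invented p-values.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return one boolean per p-value: True if it passes BH control at level alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Every p-value ranked at or below the cutoff is declared significant.
    passed = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            passed[idx] = True
    return passed


# Hypothetical marker-trait association p-values.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
flags = benjamini_hochberg(pvals)
print(sum(flags), "of", len(pvals), "associations pass FDR control")
```

Under this procedure, several associations with nominal p < 0.05 can fail the corrected threshold, which is consistent with only a subset of the significant SNPs being reported as FDR-corrected.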
Several genes are directly or indirectly involved in the functioning of agronomic traits related to water stress. Genes involved in these crucial biological functions interacted significantly in order to maintain the primary as well as exclusive functions related to coping with water stress. SNPs associated with drought-tolerant genes involved in strategic biological functions will be useful to understand the mechanisms of drought tolerance in subtropical maize.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-1182) contains supplementary material, which is available to authorized users.
Genome-wide SNPs; Association mapping; Functional mechanisms; Candidate SNPs; Water stress; Drought tolerance; Maize
Maize is an increasingly important food crop in southeast Asia. The elucidation of its genetic architecture, accomplished by exploring quantitative trait loci and useful alleles in various lines across numerous breeding programs, is therefore of great interest. The present study aimed to characterize subtropical maize lines using high-quality SNPs distributed throughout the genome.
We genotyped a panel of 240 subtropical elite maize inbred lines and carried out linkage disequilibrium, genetic diversity, population structure, and principal component analyses on the generated SNP data. The mean SNP distance across the genome was 70 Kb. The genome had both high and low linkage disequilibrium (LD) regions; the latter were dominant in areas near the gene-rich telomeric portions where recombination is frequent. A total of 252 haplotype blocks, ranging in size from 1 to 15.8 Mb, were identified. Slow LD decay (200–300 Kb) at r² ≤ 0.1 across all chromosomes explained the selection of favorable traits around low LD regions in different breeding programs. The association mapping panel was characterized by strong population substructure. Genotypes were grouped into three distinct clusters with a mean genetic dissimilarity coefficient of 0.36.
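LD decay of this kind is usually tracked with the pairwise statistic r² between two loci. The sketch below assumes biallelic SNPs coded 0/1 and treats each inbred line as contributing one haplotype, a simplification that is reasonable only for fully homozygous lines; the example data are invented.

```python
def ld_r2(haplotypes):
    """Pairwise LD r^2 from (allele_at_locus_A, allele_at_locus_B) pairs coded 0/1."""
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # coefficient of disequilibrium D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("at least one locus is monomorphic")
    return d * d / denom


# Hypothetical haplotypes: complete LD versus no LD.
linked = [(1, 1)] * 4 + [(0, 0)] * 4
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(ld_r2(linked), ld_r2(unlinked))
```

Plotting such r² values against the physical distance between SNP pairs, and reading off the distance at which the fitted curve drops below a threshold such as 0.1, gives the decay distance quoted in the text.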
The genotyped panel of subtropical maize lines characterized in this study should be useful for association mapping of agronomically important genes. The dissimilarity uncovered among genotypes provides an opportunity to exploit the heterotic potential of subtropical elite maize breeding lines.
Subtropical maize; Genome-wide SNPs; Linkage disequilibrium; Population structure; Association mapping; Genetic diversity
Single nucleotide polymorphism (SNP) validation and large-scale genotyping are required to maximize the use of DNA sequence variation and determine the functional relevance of candidate genes for complex stress tolerance traits through genetic association in rice. We used the bead array platform-based Illumina GoldenGate assay to validate and genotype SNPs in a select set of stress-responsive genes to understand their functional relevance and study the population structure in rice.
Of the 384 putative SNPs assayed, we successfully validated and genotyped 362 (94.3%). Of these, 325 (84.6%) showed polymorphism among the 91 rice genotypes examined. Physical distribution, degree of allele sharing, admixtures and introgression, and amino acid replacement of SNPs in 263 abiotic and 62 biotic stress-responsive genes provided clues for identification and targeted mapping of trait-associated genomic regions. We assessed the functional and adaptive significance of validated SNPs in a set of contrasting drought tolerant upland and sensitive lowland rice genotypes by correlating their allelic variation with amino acid sequence alterations in catalytic domains and three-dimensional secondary protein structure encoded by stress-responsive genes. We found a strong genetic association among SNPs in the nine stress-responsive genes with upland and lowland ecological adaptation. Higher nucleotide diversity was observed in indica accessions compared with other rice sub-populations based on different population genetic parameters. About 16% of the inferred ancestry among rice genotypes was derived from admixed populations, with the maximum admixture between upland aus and wild Oryza species.
SNPs validated in biotic and abiotic stress-responsive rice genes can be used in association analyses to identify candidate genes and develop functional markers for stress tolerance in rice.
Illumina GoldenGate assay; Population structure; Rice; SNPs; Single nucleotide polymorphisms; Stress-responsive genes
Rice is a staple food for more than half of the world’s population, including two billion Asians who obtain 60-70% of their energy intake from rice and its derivatives. To meet the growing demand from the human population, rice varieties with higher yield potential and greater yield stability need to be developed. The favourable alleles for yield and yield contributing traits are distributed among the two subspecies of cultivated rice (Oryza sativa L.), indica and japonica. Identification of novel favourable alleles in indica/japonica will pave the way for marker-assisted mobilization of these alleles into a single genetic background to break genetic barriers to yield.
A new plant type (NPT) based mapping population of 310 recombinant inbred lines (RILs) was used to map novel genomic regions and QTL hotspots influencing yield and eleven yield component traits. Using composite interval mapping, we identified major quantitative trait loci (QTLs) expressed at two or more locations for days to 50% flowering (R² = 25%, LOD = 14.3), panicles per plant (R² = 19%, LOD = 9.74), flag leaf length (R² = 22%, LOD = 3.05), flag leaf width (R² = 53%, LOD = 46.5), spikelets per panicle (R² = 16%, LOD = 13.8), filled grains per panicle (R² = 22%, LOD = 15.3), percent spikelet sterility (R² = 18%, LOD = 14.24), thousand grain weight (R² = 25%, LOD = 12.9) and spikelet setting density (R² = 23%, LOD = 15). The phenotypic variation explained (R²) ranged from 8 to 53% for the eleven QTLs expressed across all three locations. Nineteen novel QTLs were contributed by the NPT parent, Pusa1266. Fifteen QTL hotspots on eight chromosomes were identified for the correlated traits. Six epistatic QTLs affecting five traits at two locations were identified. A marker interval (RM3276-RM5709) on chromosome 4 harboring major QTLs for four traits was identified.
The present study reveals that favourable alleles for yield and yield contributing traits were distributed among two subspecies of rice and QTLs were co-localized in different genomic regions. QTL hotspots will be useful for understanding the common genetic control mechanism of the co-localized traits and selection for beneficial allele at these loci will result in a cumulative increase in yield due to the integrative positive effect of various QTLs. The information generated in the present study will be useful to fine map and to identify the genes underlying major robust QTLs and to transfer all favourable QTLs to one genetic background to break genetic barriers to yield for sustained food security.
Small hairpin RNAs (shRNAs) are useful in many ways, such as identification of trait-specific molecular markers, gene silencing and characterization of a species. There is hardly any standalone software for shRNA prediction in the public domain. Hence, the software shRNAPred (1.0) is proposed here to offer a user-friendly command-line user interface (CUI) for predicting ‘shRNA-like’ regions from a large set of nucleotide sequences. The software was developed using PERL version 5.12.5, taking into account parameters such as stem and loop length combinations, specific loop sequence, GC content, melting temperature, position-specific nucleotides, low complexity filter, etc. Each parameter is assigned a specific score, on the basis of which the software ranks the predicted shRNAs. The high-scoring shRNAs are reported as potential shRNAs and provided to the user in the form of a text file. The software also allows the user to customize certain parameters when predicting specific shRNAs of interest. shRNAPred (1.0) is open access software available for academic users. It can be downloaded freely along with a user manual, example dataset and output for easy understanding and implementation.
The database is available for free at http://bioinformatics.iasri.res.in/EDA/downloads/shRNAPred_v1.0.exe
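To make a few of the listed parameters concrete, the following sketch computes GC content, a rough melting temperature and a strict stem-loop check for a candidate sequence. It is not shRNAPred's scoring scheme, which the abstract does not specify. The Wallace rule (2 °C per A/T, 4 °C per G/C) is a standard approximation for short oligonucleotides and is swapped in here as an assumption; the stem and loop sequences are illustrative only.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_content(seq):
    """Percentage of G and C bases."""
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature in deg C by the Wallace rule (short oligos only)."""
    return 2 * (seq.count("A") + seq.count("T")) + 4 * (seq.count("G") + seq.count("C"))


def is_hairpin(seq, stem_len, loop_len):
    """True if seq is stem + loop + perfect reverse complement of the stem."""
    if len(seq) != 2 * stem_len + loop_len:
        return False
    stem5 = seq[:stem_len]
    stem3 = seq[stem_len + loop_len:]
    return stem3 == reverse_complement(stem5)


# Illustrative 19-nt stem and 9-nt loop (assumed values, not from the paper).
stem = "GCAAGCTGACCCTGAAGTT"
loop = "TTCAAGAGA"
candidate = stem + loop + reverse_complement(stem)
print(is_hairpin(candidate, len(stem), len(loop)), gc_content(stem), wallace_tm(stem))
```

A ranking tool along the lines described would combine several such per-parameter checks into a single score per candidate.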
shRNA; shRNA prediction; RNAi; Gene silencing
Pigeonpea (Cajanus cajan) is an important grain legume of the Indian subcontinent, South-East Asia and East Africa. More than eighty-five percent of the world's pigeonpea is produced and consumed in India, where it is a key crop for food and nutritional security. Here we present the first draft of the genome sequence of a popular pigeonpea variety, ‘Asha’. The genome was assembled using long sequence reads of 454 GS-FLX sequencing chemistry with mean read lengths of >550 bp and >10-fold genome coverage, resulting in 510,809,477 bp of high quality sequence. A total of 47,004 protein coding genes and 12,511 transposable element-related genes were predicted. We identified 1,213 disease resistance/defense response genes and 152 abiotic stress tolerance genes in the pigeonpea genome that make it a hardy crop. In comparison to soybean, pigeonpea has relatively fewer genes for lipid biosynthesis and more genes for cellulose synthesis. The sequence contigs were arranged into 59,681 scaffolds, which were anchored to the eleven chromosomes of pigeonpea with 347 genic-SNP markers of an intra-species reference genetic map. The eleven pigeonpea chromosomes showed low but significant synteny with the twenty chromosomes of soybean. The genome sequence was used to identify a large number of hypervariable ‘Arhar’ simple sequence repeat (HASSR) markers, 437 of which were experimentally validated for PCR amplification and high rate of polymorphism among pigeonpea varieties. These markers will be useful for fingerprinting and diversity analysis of pigeonpea germplasm and for molecular breeding applications. This is the first plant genome sequence completed entirely through a network of Indian institutions led by the Indian Council of Agricultural Research, and it provides a valuable resource for pigeonpea variety improvement.
Electronic supplementary material
The online version of this article (doi:10.1007/s13562-011-0088-8) contains supplementary material, which is available to authorized users.
Pigeonpea; Genome sequence; Disease resistance; SSR markers; Legumes
Unigene sequences constitute a rich source of functionally relevant microsatellites. The present study was undertaken to mine the microsatellites in the available unigene sequences of sugarcane for understanding their constitution in the expressed genic component of its complex polyploid/aneuploid genome, assessing their functional significance in silico, determining the extent of allelic diversity at the microsatellite loci and for evaluating their utility in large-scale genotyping applications in sugarcane.
The average frequency of perfect microsatellite was 1/10.9 kb, while it was 1/44.3 kb for the long and hypervariable class I repeats. GC-rich trinucleotides coding for alanine and the GA-rich dinucleotides were the most abundant microsatellite classes. Out of 15,594 unigenes mined in the study, 767 contained microsatellite repeats and for 672 of these putative functions were determined in silico. The microsatellite repeats were found in the functional domains of proteins encoded by 364 unigenes. Its significance was assessed by establishing the structure-function relationship for the beta-amylase and protein kinase encoding unigenes having repeats in the catalytic domains. A total of 726 allelic variants (7.42 alleles per locus) with different repeat lengths were captured precisely for a set of 47 fluorescent dye labeled primers in 36 sugarcane genotypes and five cereal species using the automated fragment analysis system, which suggested the utility of designed primers for rapid, large-scale and high-throughput genotyping applications in sugarcane. Pair-wise similarity ranging from 0.33 to 0.84 with an average of 0.40 revealed a broad genetic base of the Indian varieties in respect of functionally relevant regions of the large and complex sugarcane genome.
Microsatellite repeats were present in 4.92% of sugarcane unigenes, for most (87.6%) of which functions were determined in silico. The high level of allelic diversity in repeats, including those present in the functional domains of proteins encoded by the unigenes, demonstrated their use in assaying useful variation in the genic component of the complex polyploid sugarcane genome.
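Mining perfect microsatellites from unigene sequences can be sketched with a regular expression that looks for a short motif repeated in tandem. The minimum repeat counts below are hypothetical thresholds, not the criteria used in the study, and the test sequence is invented. The last two lines recompute the percentages reported in the abstract from its own counts.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Find perfect tandem repeats; returns (start, motif, repeat_count) tuples."""
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5}  # motif length -> minimum repeats (assumed values)
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-base motif followed by at least n - 1 further copies of itself.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (k, n - 1))
        for match in pattern.finditer(seq):
            motif = match.group(2)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA
            hits.append((match.start(), motif, len(match.group(1)) // k))
    return hits


# Hypothetical sequence containing a (GA)8 and a (GCC)5 repeat.
sequence = "TTT" + "GA" * 8 + "CCGT" + "GCC" * 5 + "AT"
print(find_ssrs(sequence))

# Percentages from the counts given in the abstract.
print(round(100 * 767 / 15594, 2))  # unigenes containing microsatellites
print(round(100 * 672 / 767, 1))    # of those, with putative function assigned
```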
Forty-four soybean genotypes with different photoperiod responses were selected after screening 1000 soybean accessions under artificial conditions and were profiled using 40 SSR and 5 AFLP primer pairs. The average polymorphism information content (PIC) for the SSR and AFLP marker systems was 0.507 and 0.120, respectively. Clustering of genotypes was done using the UPGMA method for SSR and AFLP, and the correlation was 0.337 and 0.504, respectively. Mantel's correlation coefficients between Jaccard's similarity coefficient and the cophenetic values were fairly high in both marker systems (SSR = 0.924; AFLP = 0.958), indicating a very good fit for the clustering pattern. UPGMA based cluster analysis classified the soybean genotypes into four major groups with moderate bootstrap support. These major clusters corresponded with photoperiod response and place of origin. The results indicate that the photoperiod insensitive genotypes 11/2/1939 (EC 325097) and MACS 330 would be a better choice for broadening the genetic base of soybean for this trait.
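Jaccard's similarity coefficient, which feeds the UPGMA clustering mentioned above, can be sketched for two genotypes scored as band presence (1) or absence (0) across marker alleles. The band profiles below are invented for illustration.

```python
def jaccard(profile_a, profile_b):
    """Jaccard similarity: shared bands / bands present in at least one genotype."""
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a == 1 and b == 1)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a == 1 or b == 1)
    return shared / either if either else 0.0


# Hypothetical presence/absence scores for five marker bands.
genotype_1 = [1, 1, 0, 1, 0]
genotype_2 = [1, 0, 0, 1, 1]
print(jaccard(genotype_1, genotype_2))
```

Joint absences (0 in both genotypes) are ignored by this coefficient, which is why it is commonly preferred for dominant markers such as AFLP.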
photoperiod response; SSR; AFLP; genetic diversity; soybean
Despite great advances in genomic technology observed in several crop species, the availability of molecular tools such as microsatellite markers has been limited in tea (Camellia sinensis L.). The development of microsatellite markers will have a major impact on genetic analysis, gene mapping and marker assisted breeding. Unigene derived microsatellite (UGMS) markers identified from publicly available sequence database have the advantage of assaying variation in the expressed component of the genome with unique identity and position. Therefore, they can serve as efficient and cost effective alternative markers in such species.
Considering the multiple advantages of UGMS markers, 1,223 unigenes were predicted from 2,181 expressed sequence tags (ESTs) of tea (Camellia sinensis L.). A total of 109 (8.9%) unigenes containing 120 SSRs were identified. SSR abundance was one in every 3.55 kb of EST sequences. The microsatellites mainly comprised di (50.8%), tri (30.8%), tetra (6.6%), penta (7.5%) and a few hexa (4.1%) nucleotide repeats. Among the dinucleotide repeats, (GA)n.(TC)n were most abundant (83.6%). Ninety six primer pairs could be designed from 83.5% of SSR containing unigenes. Of these, 61 (63.5%) primer pairs were experimentally validated and used to investigate the genetic diversity among 34 accessions of different Camellia spp. Fifty one primer pairs (83.6%) were successfully cross transferred to the related species at various levels. Functional annotation of the unigenes containing SSRs was done through gene ontology (GO) characterization. Thirty six (60%) of them revealed significant sequence similarity with known/putative proteins of Arabidopsis thaliana. Polymorphism information content (PIC) ranged from 0.018 to 0.972 with a mean value of 0.497. The average expected (HE) and observed (Ho) heterozygosity was 0.654 and 0.413 respectively, suggesting the highly heterogeneous nature of tea. Further, tests under the IAM and SMM models for the UGMS loci showed excess heterozygosity and did not indicate any bottleneck operating in the tea population.
UGMS markers identified and characterized in this study provided insight into the abundance and distribution of SSRs in the expressed genome of C. sinensis. The identification and validation of 61 new UGMS markers will not only help in intra- and inter-specific genetic diversity assessment but also enrich the limited microsatellite marker resources in tea. Further, the use of these markers would reduce the cost of, and facilitate, gene mapping and marker-aided selection in tea. Since 36 of these UGMS markers correspond to Arabidopsis protein sequences with known functions, they offer the opportunity to investigate the consequences of SSR polymorphism on gene function.
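The three diversity statistics reported for the UGMS loci can be illustrated on a single locus. The sketch assumes the Botstein et al. definition of PIC and the simple (uncorrected) expected heterozygosity 1 - Σp_i²; the abstract does not state which estimators were used, and the diploid genotypes are invented.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(genotypes):
    """Allele frequencies from a list of diploid genotypes given as (allele, allele)."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return [c / total for c in counts.values()]


def expected_heterozygosity(freqs):
    """He = 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def observed_heterozygosity(genotypes):
    """Ho = fraction of individuals carrying two different alleles."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


def pic(freqs):
    """Polymorphism information content: He minus the sum of 2 * p_i^2 * p_j^2 over i < j."""
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return expected_heterozygosity(freqs) - cross


# Hypothetical genotypes at one SSR locus with two alleles.
genotypes = [("A", "A"), ("A", "B"), ("B", "B"), ("A", "B")]
freqs = allele_frequencies(genotypes)
print(expected_heterozygosity(freqs), observed_heterozygosity(genotypes), pic(freqs))
```

Comparing He with Ho per locus, as in the last line, is the kind of contrast the abstract draws between its averages of 0.654 and 0.413.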
Completely sequenced plant genomes provide scope for designing a large number of microsatellite markers, which are useful in various aspects of crop breeding and genetic analysis. With the objective of developing genic but non-coding microsatellite (GNMS) markers for the rice (Oryza sativa L.) genome, we characterized the frequency and relative distribution of microsatellite repeat-motifs in 18,935 predicted protein coding genes including 14,308 putative promoter sequences.
We identified 19,555 perfect GNMS repeats with densities ranging from 306.7/Mb in chromosome 1 to 450/Mb in chromosome 12 with an average of 357.5 GNMS per Mb. The average microsatellite density was maximum in the 5' untranslated regions (UTRs) followed by those in introns, promoters, 3'UTRs and minimum in the coding sequences (CDS). Primers were designed for 17,966 (92%) GNMS repeats, including 4,288 (94%) hypervariable class I types, which were bin-mapped on the rice genome. The GNMS markers were most polymorphic in the intronic region (73.3%) followed by markers in the promoter region (53.3%) and least in the CDS (26.6%). The robust polymerase chain reaction (PCR) amplification efficiency and high polymorphic potential of GNMS markers over genic coding and random genomic microsatellite markers suggest their immediate use in efficient genotyping applications in rice. A set of these markers could assess genetic diversity and establish phylogenetic relationships among domesticated rice cultivar groups. We also demonstrated the usefulness of orthologous and paralogous conserved non-coding microsatellite (CNMS) markers, identified in the putative rice promoter sequences, for comparative physical mapping and understanding of evolutionary and gene regulatory complexities among rice and other members of the grass family. The divergence between long-grained aromatics and subspecies japonica was estimated to be more recent (0.004 Mya) compared to short-grained aromatics from japonica (0.006 Mya) and long-grained aromatics from subspecies indica (0.014 Mya).
Our analyses showed that GNMS markers, with their high polymorphic potential, would be preferred candidate functional markers in various marker-based applications in rice genetics, genomics and breeding. The CNMS markers showed encouraging potential for use in comparative genome mapping and in understanding evolutionary complexities in rice and other members of the grass family.