A high level of transgene expression is required in several applications of transgenic technology. While the use of strong promoters has been the main focus in such instances, 5′UTRs have also been shown to enhance transgene expression. Here, we present a 28-nt synthetic 5′UTR (synJ) that enhances gene expression in tobacco and cotton.
The influence of synJ on transgene expression was studied in callus cultures of cotton and in different tissues of transgenic tobacco plants. The study was based on comparing the expression of the reporter genes gus and gfp with and without synJ as their 5′UTR. Mutations in synJ were also analyzed to identify the region important for enhancement. synJ enhances gene expression 10- to 50-fold in tobacco and cotton, depending on the tissue studied. This finding is based on experiments comparing the expression of a gus gene carrying synJ as its 5′UTR under the control of the 35S promoter with that of expression cassettes based on vectors such as pBI121 or pRT100. Further, the enhancement was in most cases equivalent to that observed with viral leader sequences known to enhance translation, such as Ω and AMV. In transformed cotton callus as well as in the roots of transgenic tobacco plants, the up-regulation mediated by synJ was much higher than that observed in the presence of either Ω or AMV. The enhancement mediated by synJ was found to be at the post-transcriptional level. The study also demonstrates the importance of a 5′UTR in realizing the full potential of promoter strength. synJ has been used to design four cloning vectors (pGEN01, pBGEN02, pBGEN02-hpt and pBGEN02-ALSdm), each of which can be used for cloning the desired transgene and achieving a high level of expression in the resulting transgenic plants.
synJ, a synthetic 5′UTR, can enhance transgene expression under a strong promoter like 35S as well as under a weak promoter like nos in dicotyledonous plants. synJ can be incorporated as the 5′UTR of transgenes, especially in cases where high levels of expression are required. A set of vectors has also been designed to facilitate this process.
Synthetic 5′UTR; Transgene expression; 35S promoter; Ω leader; AMV leader
MicroRNAs (miRNAs) and other types of small regulatory RNAs play critical roles in the regulation of gene expression at the post-transcriptional level in plants. Cotton is one of the most economically important crops, but little is known about the roles of miRNAs during cotton fiber elongation.
Here, we combined high-throughput sequencing with computational analysis to identify small RNAs (sRNAs) related to cotton fiber elongation in Gossypium hirsutum L. (G. hirsutum). The sequence analysis confirmed the expression of 79 known miRNA families in elongating fiber cells and identified 257 novel miRNAs, primarily derived from corresponding specific loci in the Gossypium raimondii Ulbr. (G. raimondii) genome. Furthermore, a comparison of the miRNAomes revealed that 46 miRNA families were differentially expressed throughout the elongation period. Importantly, the predicted and experimentally validated targets of eight miRNAs were associated with fiber elongation, with obvious functional relationships with calcium and auxin signal transduction, fatty acid metabolism, anthocyanin synthesis and the xylem tissue differentiation. Moreover, one tasiRNA was also identified, and its target, ARF4, was experimentally validated in vivo.
This study not only facilitated the discovery of 257 novel low-abundance miRNAs in elongating cotton fiber cells but also revealed a potential regulatory network of nine sRNAs important for fiber elongation. The identification and characterization of miRNAs in elongating cotton fiber cells might promote the further study of fiber miRNA regulation mechanisms and provide insight into the importance of miRNAs in cotton.
Cotton; Comparative miRNAome analysis; Fiber cell elongation; High-throughput sequencing; miRNAs; tasiRNA
Cotton (Gossypium hirsutum L.) is an important crop worldwide that provides fiber for the textile industry. Cotton is a perennial plant that stores starch in stems and roots to provide carbohydrates for growth in subsequent seasons. Domesticated cotton makes these reserves available to developing seeds, which affects seed yield. The goals of these analyses were to identify genes and physiological pathways that establish cotton stems and roots as physiological sinks and to investigate the role these pathways play in cotton development during seed set.
Analysis of field-grown cotton plants indicated that starch levels peaked about the time of first anthesis and then declined similar to reports in greenhouse-grown cotton plants. Starch accumulated along the length of the stem and the shape and size of the starch grains from stems were easily distinguished from transient starch. Microarray analyses compared gene expression in tissues containing low levels of starch with tissues rapidly accumulating starch. Statistical analysis of differentially expressed genes indicated increased expression among genes associated with starch synthesis, starch degradation, hexose metabolism, raffinose synthesis and trehalose synthesis. The anticipated changes in these sugars were largely confirmed by measuring soluble sugars in selected tissues.
In domesticated cotton, starch stored prior to flowering was available to support seed production. Starch accumulation observed in young field-grown plants was not observed in greenhouse-grown plants. A suite of genes associated with starch biosynthesis was identified. The pathway for starch utilization after flowering was associated with an increase in expression of a glucan water dikinase gene, as has been implicated in the utilization of transient starch. Changes in raffinose levels and in the expression of genes controlling trehalose and raffinose biosynthesis were also observed in vegetative cotton tissues as plants aged.
The ubiquitin protein is present in all eukaryotic cells and promoters from ubiquitin genes are good candidates to regulate the constitutive expression of transgenes in plants. Therefore, two switchgrass (Panicum virgatum L.) ubiquitin genes (PvUbi1 and PvUbi2) were cloned and characterized. Reporter constructs were produced containing the isolated 5' upstream regulatory regions of the coding sequences (i.e. PvUbi1 and PvUbi2 promoters) fused to the uidA coding region (GUS) and tested for transient and stable expression in a variety of plant species and tissues.
PvUbi1 consists of 607 bp containing cis-acting regulatory elements, a 5' untranslated region (UTR) containing a 93 bp non-coding exon and a 1291 bp intron, and a 918 bp open reading frame (ORF) that encodes four tandem, head-to-tail ubiquitin monomer repeats followed by a 191 bp 3' UTR. PvUbi2 consists of 692 bp containing cis-acting regulatory elements, a 5' UTR containing a 97 bp non-coding exon and a 1072 bp intron, a 1146 bp ORF that encodes five tandem ubiquitin monomer repeats and a 183 bp 3' UTR. PvUbi1 and PvUbi2 were expressed in all examined switchgrass tissues as measured by qRT-PCR. Using biolistic bombardment, PvUbi1 and PvUbi2 promoters showed strong expression in switchgrass and rice callus, equaling or surpassing the expression levels of the CaMV 35S, 2x35S, ZmUbi1, and OsAct1 promoters. GUS staining following stable transformation in rice demonstrated that the PvUbi1 and PvUbi2 promoters drove expression in all examined tissues. When stably transformed into tobacco (Nicotiana tabacum), the PvUbi2+3 and PvUbi2+9 promoter fusion variants showed expression in vascular and reproductive tissues.
The PvUbi1 and PvUbi2 promoters drive expression in switchgrass, rice and tobacco and are strong constitutive promoter candidates that will be useful in genetic transformation of monocots and dicots.
Upland cotton, Gossypium hirsutum L., is one of the world's most important economic crops. In the absence of the entire genomic sequence, a large number of expressed sequence tag (EST) resources of upland cotton have been generated and used in several studies. However, information about the flower development of this species is rare.
To clarify the molecular mechanism of flower development in upland cotton, 22,915 high-quality ESTs were generated from a normalized, full-length cDNA library constructed from pooled RNA isolated from shoot apexes, squares, and flowers. The ESTs were assembled into 14,373 unique sequences consisting of 4,563 contigs and 9,810 singletons. Comparative analysis indicated that 5,352 unique sequences had no high-degree matches to the cotton public database. Functional annotation identified several upland cotton homologs of flowering-related genes in our library. The majority of these genes were specifically expressed in flowering-related tissues. Three GhSEP (G. hirsutum L. SEPALLATA) genes determining floral organ development were cloned, and quantitative real-time PCR (qRT-PCR) revealed that these genes were expressed preferentially in squares or flowers. Furthermore, 670 new putative microsatellites with flanking sequences sufficient for primer design were identified from 645 unigenes. Twenty-five EST–simple sequence repeats were randomly selected for validation and transferability testing in 17 Gossypium species. Of these, 23 were identified as true-to-type simple sequence repeat loci and were highly transferable among Gossypium species.
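The microsatellite mining step can be illustrated with a short sketch. It is not the authors' pipeline: the function name `find_ssrs`, the motif lengths (2 to 6 nt) and the minimum of five repeats are assumptions chosen only for illustration, and the example sequence is made up.

```python
import re

def find_ssrs(seq, min_repeats=5, motif_lengths=(2, 3, 4, 5, 6)):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Thresholds are illustrative assumptions, not the criteria used in the study.
    """
    seq = seq.upper()
    hits = []
    for k in motif_lengths:
        # A k-nt motif followed by at least (min_repeats - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "ATAT"), which belong to a shorter motif class.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

# Made-up sequence containing (AT)6 and (AAG)5.
example = "GGC" + "AT" * 6 + "CCGT" + "AAG" * 5 + "TTC"
ssrs = find_ssrs(example)
print(ssrs)
```

In practice, a flanking-sequence length check would follow so that primers can be designed on both sides of each repeat.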
A high-quality, normalized, full-length cDNA library with a total of 14,373 unique ESTs was generated to provide sequence information for gene discovery and marker development related to upland cotton flower development. These EST resources form a valuable foundation for gene expression profiling analysis, functional analysis of newly discovered genes, genetic linkage, and quantitative trait loci analysis.
Plant architecture and the timing and distribution of reproductive structures are fundamental agronomic traits shaped by patterns of determinate and indeterminate growth. Florigen, encoded by FLOWERING LOCUS T (FT) in Arabidopsis and SINGLE FLOWER TRUSS (SFT) in tomato, acts as a general growth hormone, advancing determinate growth. Domestication of upland cotton (Gossypium hirsutum) converted it from a lanky photoperiodic perennial to a highly inbred, compact day-neutral plant that is managed as an annual row-crop. This dramatic change in plant architecture provides a unique opportunity to analyze the transition from perennial to annual growth.
To explore these architectural changes, we addressed the role of day-length upon flowering in an ancestral, perennial accession and in a domesticated variety of cotton. Using a disarmed Cotton leaf crumple virus (CLCrV) as a transient expression system, we delivered FT to both cotton accessions. Ectopic expression of FT in ancestral cotton mimicked the effects of day-length, promoting photoperiod-independent flowering, precocious determinate architecture, and lanceolate leaf shape. Domesticated cotton infected with FT demonstrated more synchronized fruiting and enhanced “annualization”. Transient expression of FT also facilitated simple crosses between wild photoperiodic and domesticated day-neutral accessions, effectively demonstrating a mechanism to increase genetic diversity among cultivated lines of cotton. Virus was not detected in the F1 progeny, indicating that crosses made by this approach do not harbor recombinant DNA molecules.
These findings extend our understanding of FT as a general growth hormone that regulates shoot architecture by advancing organ-specific and age-related determinate growth. Judicious manipulation of FT could benefit cotton architecture to improve crop management.
Heat shock transcriptional factors (Hsfs) play important roles in responses to biotic and abiotic stresses as well as in plant development. Cotton (Gossypium hirsutum, 2n = 4x = (AD)2 = 52) is an important crop for natural fiber production. Due to continuous high temperature and intermittent drought, heat stress is becoming an obstacle to improving cotton yield and lint quality. Recently, the genome of the related wild diploid species Gossypium raimondii (2n = 2x = (D5)2 = 26) has been fully sequenced. In order to analyze the functions of different Hsfs at the genome-wide level, detailed characterization and analysis of the Hsf gene family in G. hirsutum is indispensable.
EST assembly and genome-wide analyses were applied to clone and identify heat shock transcription factor (Hsf) genes in Upland cotton (GhHsf). Forty GhHsf genes were cloned, identified and classified into three main classes (A, B and C) according to the characteristics of their domains. Analysis of gene duplications showed that duplications of GhHsfs have occurred more frequently than reported in plant genomes such as Arabidopsis and Populus. Quantitative real-time PCR (qRT-PCR) showed that all GhHsf transcripts are expressed in most cotton plant tissues including roots, stems, leaves and developing fibers, and abundantly in developing ovules. Three expression patterns were confirmed among GhHsfs when cotton plants were exposed to high temperature for 1 h. GhHsf39 exhibited the most immediate response to heat shock. Comparative analysis of Hsf expression differences between the wild type and a fiberless mutant suggested that Hsfs are involved in fiber development.
Comparative genome analysis showed that Upland cotton D-subgenome contains 40 Hsf members, and that the whole genome of Upland cotton contains more than 80 Hsf genes due to genome duplication. The expression patterns in different tissues in response to heat shock showed that GhHsfs are important for heat stress as well as fiber development. These results provide an improved understanding of the roles of the Hsf gene family during stress responses and fiber development.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-961) contains supplementary material, which is available to authorized users.
Heat shock transcriptional factors; Gossypium hirsutum; Heat stress; qRT-PCR; Fiber development
Verticillium wilt, caused by the fungal pathogen Verticillium dahliae, is the most severe disease in cotton (Gossypium spp.), causing great lint losses worldwide. Disease management could be achieved in the field if genetically improved, resistant plants were used. However, the interaction between V. dahliae and cotton is a complicated process, and its molecular mechanism remains obscure. To understand better the defense response to this pathogen as a means for obtaining more tolerant cultivars, we monitored the transcriptome profiles of roots from resistant plants of G. barbadense cv. Pima90-53 that were challenged with V. dahliae.
In all, 46,192 high-quality expressed sequence tags (ESTs) were generated from a full-length cDNA library of G. barbadense. They were clustered and assembled into 23,126 unigenes that comprised 2,661 contigs and 20,465 singletons. Those unigenes were assigned Gene Ontology terms and mapped to 289 KEGG pathways. A total of 3,027 unigenes were found to be homologous to known defense-related genes in other plants. They were assigned to the functional classification of plant–pathogen interactions, including disease defenses and signal transduction. The branch of "SA→NPR1→TGA→PR-1→Disease resistance" was discovered for the first time in the cotton–V. dahliae interaction, indicating that this wilt process includes both biotrophic and necrotrophic stages. In all, 4,936 genes coding for putative transcription factors (TFs) were identified in our library. The most abundant TF family was the NAC group (527), followed by G2-like (440), MYB (372), BHLH (331), bZIP (271), ERF, C3H, and WRKY. We also analyzed the expression of genes involved in pathogen-associated molecular pattern (PAMP) recognition, the activation of effector-triggered immunity, TFs, and hormone biosynthesis, as well as genes that are pathogenesis-related or have roles in signaling/regulatory functions and cell wall modification. Their differential expression patterns were compared between mock-inoculated and inoculated plants and between resistant and susceptible cotton. Our results suggest that the cotton defense response has significant transcriptional complexity and that large accumulations of defense-related transcripts may contribute to V. dahliae resistance in cotton. Therefore, these data provide a resource for cotton improvement through molecular breeding approaches.
This study generated a substantial amount of cotton transcript sequences that are related to defense responses against V. dahliae. These genomics resources and knowledge of important related genes contribute to our understanding of host–pathogen interactions and the defense mechanisms utilized by G. barbadense, a non-model plant system. These tools can be applied in establishing a modern breeding program that uses marker-assisted selections and oligonucleotide arrays to identify candidate genes that can be linked to valuable agronomic traits in cotton, including disease resistance.
Chloroplast genetic engineering overcomes concerns of gene containment, low levels of transgene expression, gene silencing, positional and pleiotropic effects, and the presence of vector sequences in transformed genomes. Several therapeutic proteins and agronomic traits have been highly expressed via the tobacco chloroplast genome, but extending this concept to important crops has been a major challenge. The main obstacles are the lack of 100% homologous species-specific chloroplast transformation vectors containing suitable selectable markers, the limited ability to regulate transgene expression in developing plastids, and inadequate tissue culture systems via somatic embryogenesis. We employed a ‘Double Gene/Single Selection (DGSS)’ plastid transformation vector that harbors two selectable marker genes (aphA-6 and nptII) to detoxify the same antibiotic by two enzymes, irrespective of the type of tissues or plastids; by combining this with an efficient regeneration system via somatic embryogenesis, cotton plastid transformation was achieved for the first time. The DGSS transformation vector is at least 8-fold more efficient (1 event per 2.4 bombarded plates) than the ‘Single Gene/Single Selection (SGSS)’ vector (aphA-6; 1 event per 20 bombarded plates). Chloroplast transgenic lines were fertile, flowered and set seeds similar to untransformed plants. Transgenes stably integrated into the cotton chloroplast genome were maternally inherited and were not transmitted via pollen when out-crossed with untransformed female plants. Cotton is one of the most important genetically modified crops ($120 billion US annual economy). Successful transformation of the chloroplast genome should address concerns about transgene escape, insects developing resistance, and inadequate insect control, and should promote public acceptance of genetically modified cotton.
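The "at least 8-fold" comparison follows directly from the two reported event rates. The arithmetic is sketched below; the helper name `events_per_plate` is hypothetical, and only the figures 1 event per 2.4 plates and 1 event per 20 plates come from the text.

```python
def events_per_plate(events, plates):
    """Transformation efficiency expressed as events per bombarded plate."""
    if plates <= 0:
        raise ValueError("plates must be positive")
    return events / plates

dgss = events_per_plate(1, 2.4)   # DGSS vector: 1 event per 2.4 plates
sgss = events_per_plate(1, 20)    # SGSS vector: 1 event per 20 plates
fold = dgss / sgss                # ratio of the two efficiencies

print(f"DGSS: {dgss:.3f} events/plate, SGSS: {sgss:.3f} events/plate, fold: {fold:.2f}")
```

The ratio is 20 / 2.4, or roughly 8.3, which is consistent with the "at least 8-fold" wording.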
chloroplast genetic engineering; genetically modified crops; transgene containment; transgenic cotton
Cotton is the world’s primary fiber crop and is a major agricultural commodity in over 30 countries. Like many other global commodities, sustainable cotton production is challenged by restricted natural resources. In response to the anticipated increase of agricultural water demand, a major research direction involves developing crops that use less water or that use water more efficiently. In this study, our objective was to identify differentially expressed genes in response to water deficit stress in cotton. A global expression analysis using cDNA-Amplified Fragment Length Polymorphism was conducted to compare root and leaf gene expression profiles from a putative drought resistant cotton cultivar grown under water deficit stressed and well watered field conditions.
We identified a total of 519 differentially expressed transcript derived fragments. Of these, 147 transcript derived fragment sequences were functionally annotated according to their gene ontology. Nearly 70 percent of transcript derived fragments belonged to four major categories: 1) unclassified, 2) stress/defense, 3) metabolism, and 4) gene regulation. We found heat shock protein-related and reactive oxygen species-related transcript derived fragments to be among the major parts of functional pathways induced by water deficit stress. Also, twelve novel transcripts were identified as both water deficit responsive and cotton specific. A subset of differentially expressed transcript derived fragments was verified using reverse transcription-polymerase chain reaction. Differential expression analysis also identified five pairs of duplicated transcript derived fragments in which four pairs responded differentially between each of their two homologues under water deficit stress.
In this study, we detected differentially expressed transcript derived fragments from water deficit stressed root and leaf tissues in tetraploid cotton and provided their gene ontology, functional/biological distribution, and possible roles of gene duplication. This discovery demonstrates complex mechanisms involved with polyploid cotton’s transcriptome response to naturally occurring field water deficit stress. The genes identified in this study will provide candidate targets to manipulate the water use characteristics of cotton at the molecular level.
Cotton (Gossypium spp.) is one of the major fibre crops of the world. Although it is classified as a salt-tolerant crop, cotton growth and productivity are adversely affected by high salinity, especially at the germination and seedling stages. Identification of genes and miRNAs responsible for salt tolerance in upland cotton (Gossypium hirsutum L.) would help reveal the molecular mechanisms of salt tolerance. We performed physiological experiments and transcriptome sequencing (mRNA-seq and small RNA-seq) of cotton leaves under salt stress using Illumina sequencing technology.
We investigated two distinct salt stress phases, dehydration (4 h) and ionic stress (osmotic restoration; 24 h), that were identified by physiological changes of 14-day-old seedlings of two cotton genotypes, one salt tolerant and the other salt sensitive, during a 72-h NaCl exposure. Comparative transcriptomics was used to monitor differential expression of genes and miRNAs at two time points (4 and 24 h) in leaves of the two cotton genotypes under salinity conditions. The expression patterns of differentially co-expressed unigenes were divided into six groups using the Short Time-series Expression Miner software. During a 24-h salt exposure, 819 transcription factor unigenes were differentially expressed in both genotypes, with 129 unigenes specifically expressed in the salt-tolerant genotype. Under salt stress, 108 conserved miRNAs from known families were differentially expressed at two time points in the salt-tolerant genotype. We further analyzed the predicted target genes of these miRNAs along with the transcriptome for each time point. Important expressed genes encoding membrane receptors, transporters, and pathways involved in biosynthesis and signal transduction of calcium-dependent protein kinase, mitogen-activated protein kinase, and hormones (abscisic acid and ethylene) were up-regulated. We also analyzed the salt stress response of some key miRNAs and their target genes and found that the expression of five of nine target genes exhibited significant inverse correlations with that of their corresponding miRNAs. On the basis of these results, we constructed molecular regulatory pathways and a potential regulatory network for these salt-responsive miRNAs.
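An inverse correlation between a miRNA and its target can be quantified with a correlation coefficient. The sketch below uses Pearson's r as one common choice; the abstract does not say which statistic the authors used. The expression values are hypothetical placeholders, not data from the study, and `pearson_r` is an illustrative helper.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)

# Hypothetical relative expression values for one miRNA and its predicted
# target at three sampling points (placeholders only).
mirna_expr = [1.0, 2.0, 3.0]
target_expr = [3.0, 2.1, 0.9]

r = pearson_r(mirna_expr, target_expr)
print(f"r = {r:.3f}")  # a value near -1 indicates an inverse relationship
```

With so few points, r alone says nothing about significance; a significance test on replicated measurements would be needed for that.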
Our comprehensive transcriptome analysis has provided new insights into salt-stress response of upland cotton. The results should contribute to the development of genetically modified cotton with salt tolerance.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-760) contains supplementary material, which is available to authorized users.
Cotton; Salt stress; Leaf transcriptome; Transcription factor; MicroRNA
The red leaf coloration of Empire Red Leaf Cotton (ERLC) (Gossypium hirsutum L.), resulting from anthocyanin accumulation in light, is a well-known dominant agricultural trait. However, the underlying molecular mechanism remains elusive. To explore this, we compared the molecular basis of anthocyanin accumulation in ERLC and the green leaf cotton variety CCRI 24 (Gossypium hirsutum L.). Introduction of the R2R3-MYB transcription factor Rosea1, the master regulator of anthocyanin biosynthesis in Antirrhinum majus, into CCRI 24 induced anthocyanin accumulation, indicating that the structural genes for anthocyanin biosynthesis are not defective and that the leaf coloration might be caused by variation in the expression of regulatory genes. Expression analysis found that a transcription factor gene, RLC1 (Red Leaf Cotton 1), which encodes the ortholog of PAP1/Rosea1, was highly expressed in leaves of ERLC but barely expressed in CCRI 24 in light. Ectopic expression of RLC1 from ERLC and CCRI 24 in hairy roots of Antirrhinum majus and CCRI 24 significantly enhanced anthocyanin accumulation. Comparison of RLC1 promoter sequences between ERLC and CCRI 24 revealed two 228-bp tandem repeats in ERLC but only one repeat in CCRI 24. Transient assays in cotton leaf tissue showed that the tandem repeats in ERLC are responsible for light-induced RLC1 expression and therefore anthocyanin accumulation. Taken together, our results represent an important step toward understanding the role of R2R3-MYB transcription factors in the regulatory mechanisms of anthocyanin accumulation in red leaf cotton under light.
Cotton fibres are unicellular seed trichomes. Our previous study suggested that the cotton R2R3 MYB transcription factor GaMYB2 is a functional homologue of the Arabidopsis trichome regulator GLABRA1 (GL1). Here, the GaMYB2 promoter activity is reported in cotton (Gossypium hirsutum), tobacco (Nicotiana tabacum), and Arabidopsis plants. A 2062 bp promoter of GaMYB2 was isolated from G. arboreum and fused to a β-glucuronidase (GUS) reporter gene. In cotton, the GaMYB2 promoter exhibited activity in developing fibre cells and trichomes of other aerial organs, including leaves, stems and bracts. In Arabidopsis the promoter was specific to trichomes. Unlike Arabidopsis and cotton, which have unicellular non-glandular simple trichomes, tobacco plants contain more than one type of trichome, including multicellular simple and glandular secreting trichomes (GSTs). Interestingly, in tobacco plants the GaMYB2 promoter directed GUS expression exclusively in glandular cells of GSTs. A series of 5′-deletions revealed that a 360 bp fragment upstream of the translation initiation codon was sufficient to drive gene expression. A putative cis-element of the T/G-box was located at -233 to -214; a yeast one-hybrid assay showed that the Arabidopsis bHLH protein GLABRA3 (GL3), also a trichome regulator, and GhDEL65, a GL3-like cotton protein, had high binding activities to the T/G-box motif. Overexpression of GL3 or GhDEL65 enhanced the GaMYB2 promoter activity in transgenic Arabidopsis plants. A comparison of GaMYB2 promoter specificities in trichomes of different plant species with different types of trichomes provides a tool for further dissection of plant trichome structure and development.
Cotton fibre; glandular; MYB; promoter; tobacco; trichome
Cotton is the world’s most important natural textile fiber and a significant oilseed crop. Decoding cotton genomes will provide the ultimate reference and resource for research and utilization of the species. Integration of high-density genetic maps with genomic sequence information will largely accelerate the process of whole-genome assembly in cotton.
In this paper, we update a high-density interspecific genetic linkage map of allotetraploid cultivated cotton. An additional 1,167 marker loci have been added to our previously published map of 2,247 loci. Three new marker types, InDel (insertion-deletion) and SNP (single nucleotide polymorphism) developed from gene information, and REMAP (retrotransposon-microsatellite amplified polymorphism), were used to increase map density. The updated map consists of 3,414 loci in 26 linkage groups covering 3,667.62 cM with an average inter-locus distance of 1.08 cM. Furthermore, genome-wide sequence analysis was performed using 3,324 informative sequence-based markers and publicly available Gossypium DNA sequence information. A total of 413,113 EST and 195 BAC sequences were physically anchored and clustered by the 3,324 sequence-based markers. Of these, 14,243 ESTs and 188 BACs from different species of Gossypium were clustered and specifically anchored to the high-density genetic map. A total of 2,748 candidate unigenes from 2,111 EST clusters and 63 BACs were mined for functional annotation and classification. The 337 ESTs/genes related to fiber quality traits were integrated with 132 previously reported cotton fiber quality quantitative trait loci, demonstrating the important roles of these genes in fiber quality. Higher-level sequence conservation between different cotton species and between the A- and D-subgenomes in tetraploid cotton was found, indicating a common evolutionary origin for orthologous and paralogous loci in Gossypium.
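The map statistics above can be cross-checked with simple arithmetic. The abstract does not state how the average inter-locus distance was computed. The sketch below assumes it is total map length divided by the number of intervals, taken as loci minus linkage groups, since each linkage group with n loci has n − 1 intervals. Under that assumption the reported 1.08 cM is recovered. Variable names are illustrative.

```python
previous_loci = 2247      # loci on the previously published map
added_loci = 1167         # loci added in this update
linkage_groups = 26
map_length_cm = 3667.62   # total map length in centimorgans

total_loci = previous_loci + added_loci

# Assumption: each linkage group with n loci contributes n - 1 intervals,
# so the map as a whole has (total loci - linkage groups) intervals.
intervals = total_loci - linkage_groups
avg_interlocus_cm = map_length_cm / intervals

print(f"total loci: {total_loci}")
print(f"average inter-locus distance: {avg_interlocus_cm:.2f} cM")
```

Dividing by the number of loci instead of the number of intervals gives about 1.07 cM, so the interval-based reading agrees better with the reported value.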
This study will serve as a valuable genomic resource for tetraploid cotton genome assembly, for cloning genes related to superior agronomic traits, and for further comparative genomic analyses in Gossypium.
Extensive studies on floral transition in model species have revealed a network of regulatory interactions between proteins that transduce and integrate developmental and environmental signals to promote or inhibit the transition to flowering. Previous studies indicated that the FLOWERING PROMOTING FACTOR 1 (FPF1) gene was involved in the promotion of flowering, but the molecular mechanism was still unclear. Here, FPF1 homologous sequences were screened from the genome databases of diploid Gossypium raimondii L. (D-genome, n = 13) and Gossypium arboreum L. (A-genome, n = 13). Comparison of orthologous genes from the two species suggested that differences at the nucleic acid and amino acid levels were not equivalent because of codon degeneracy. Six FPF1 homologous genes were identified from the cultivated allotetraploid Gossypium hirsutum L. (AD-genome, n = 26). Analysis of the relative transcript levels of the six genes in different tissues revealed that this gene family displayed strong tissue-specific expression. GhFPF1, encoding a 12.0-kDa protein (Accession No: KC832319), showed higher transcript levels in floral apices of short-season cotton, hinting that it could be involved in floral regulation. Over-expression of GhFPF1 in the Arabidopsis Columbia-0 ecotype significantly activated APETALA 1 expression and suppressed FLOWERING LOCUS C expression. In addition, transgenic Arabidopsis displayed a constitutive shade-avoiding phenotype characterized by long hypocotyls and petioles, reduced chlorophyll content, and early flowering. We propose that GhFPF1 may be involved in flowering time control and shade-avoidance responses.
Although numerous factors can influence gene expression, promoters are perhaps the most important component of the regulatory control process. Promoter regions are often defined as a region upstream of the transcriptional start site. They contain regulatory elements that interact with regulatory proteins to modulate gene expression. Most genes possess their own unique promoter, and large numbers of promoters are therefore available for study. Unfortunately, relatively few promoters have been isolated and characterized, particularly from soybean (Glycine max).
In this research, a bioinformatics approach was first used to identify members of the Gmubi (G. max ubiquitin) and the GmERF (G. max Ethylene Response Factor) gene families of soybean. Ten Gmubi and ten GmERF promoters from selected genes were cloned upstream of the gfp gene and successfully characterized using rapid validation tools developed for both transient and stable expression. Quantification of promoter strength using transient expression in lima bean (Phaseolus lunatus) cotyledonary tissue and stable expression in soybean hairy roots showed that the intensity of gfp gene expression was mostly conserved across the two expression systems. Seven of the ten Gmubi promoters yielded 2- to 7-fold higher expression than a standard CaMV35S promoter, while four of the ten GmERF promoters showed 1.5- to 2.2-fold higher GFP levels than the CaMV35S promoter. GFP expression in stably transformed hairy roots of soybean was variable among roots derived from different transformation events but consistent among secondary roots derived from the same primary transformation event. Molecular analysis of hairy root events revealed a direct relationship between copy number and expression intensity; higher copy number events displayed higher GFP expression.
In this study, we present expression intensity data on 20 novel soybean promoters from two different gene families, ubiquitin and ERF. We also demonstrate the utility of lima bean cotyledons and soybean hairy roots for rapid promoter analyses and provide novel insights towards the utilization of these expression systems. The soybean promoters characterized here will be useful for production of transgenic soybean plants for both basic research and commercial plant improvement.
Cultivated cotton is an annual fiber crop derived mainly from two perennial species, Gossypium hirsutum L. or upland cotton, and G. barbadense L., extra long-staple fiber Pima or Egyptian cotton. These two cultivated species are among five allotetraploid species presumably derived monophyletically between G. arboreum and G. raimondii. Genomic-based approaches have been hindered by the limited variation within species. Yet, population-based methods are being used for genome-wide introgression of novel alleles from G. mustelinum and G. tomentosum into G. hirsutum using combinations of backcrossing, selfing, and inter-mating. Recombinant inbred line populations between the genetic standards TM-1 (G. hirsutum) × 3-79 (G. barbadense) have been developed to allow high-density genetic mapping of traits.
This paper describes a strategy to efficiently characterize genomic variation (SNPs and indels) within and among cotton species. Over 1000 SNPs from 270 loci and 279 indels from 92 loci segregating in G. hirsutum and G. barbadense were genotyped across a standard panel of 24 lines, 16 of which are elite cotton breeding lines and 8 mapping parents of populations from six cotton species. Over 200 loci were genetically mapped in a core mapping population derived from TM-1 and 3-79 and in G. hirsutum breeding germplasm.
In this research, SNP and indel diversity is characterized for 270 single-copy polymorphic loci in cotton. A strategy for SNP discovery is defined to pre-screen loci for copy number and polymorphism. Our data indicate that the A and D genomes in both diploid and tetraploid cotton remain distinct from each other such that paralogs can be distinguished. This research provides mapped DNA markers for intra-specific crosses and introgression of exotic germplasm in cotton.
The most widely cultivated cotton (Gossypium hirsutum L., AD-genome) is derived from tetraploidization between A- and D-genome species. G. arboreum L. (A-genome) and G. raimondii Ulbr. (D-genome) are two closely related extant progenitors. Gene expression studies in allotetraploid cotton are complicated by the homoeologous loci of A- and D-genome origins. To develop genomic resources for gene expression and cotton breeding, we sequenced and assembled expressed sequence tags (ESTs) derived from G. arboreum and G. raimondii.
Roche/454 FLX sequencing technology was employed to sequence normalized cDNA libraries prepared from leaves, roots, bolls, ovules, and fibers in G. arboreum and G. raimondii, respectively. Sequencing reads from two independent libraries in each species were combined to assemble high-quality EST contigs. The combined sequencing reads included 1,699,776 from A-genome and 1,464,815 from D-genome, which were clustered into 89,588 contigs in the A-genome and 65,542 contigs in the D-genome. These contigs represented ~80% of EST collections in Cotton Gene Index 11 (CGI11, March 2011). Compared to the D-genome transcript database, 27,537 and 10,452 contigs were unique transcripts in A and D genomes, respectively. Further analysis using self-blastn reduced the unigene contig number by 52% in A-genome and 57% in D-genome, suggesting that 50% or more of contigs are paralogs or isoforms within each species. The majority of EST contigs (73–81%) were conserved between A- and D-genomes, whereas 27% and 19% contigs were specific to A- and D-genomes, respectively. Using these ESTs, we generated a total of 75,754 genome-specific single nucleotide polymorphism (SNP) (gSNPs or GNPs) or homoeologous-specific SNPs (hSNPs) of 10,885 contigs or genes between A and D genomes, indicating a possibility of separating allelic expression for those genes in allotetraploid cotton.
Expressed genes are highly redundant within each diploid progenitor and between A and D progenitor species, suggesting that diploid progenitors in cotton are likely ancient tetraploids. This large set of A- and D-genome ESTs and GNPs will be valuable resources for genome annotation, gene expression, and crop improvement in allotetraploid cotton.
Electronic supplementary material
The online version of this article (doi:10.1186/1756-0500-7-493) contains supplementary material, which is available to authorized users.
Cotton; Polyploidy; G. hirsutum; G. arboreum; G. raimondii; EST; GNPs; mRNA sequencing; Transcriptome
The majority of commercial cotton varieties planted worldwide are derived from Gossypium hirsutum, which is a naturally occurring allotetraploid produced by interspecific hybridization of A- and D-genome diploid progenitor species. While most cotton species are adapted to warm, semi-arid tropical and subtropical regions, and thus perform well in these geographical areas, cotton seedlings are sensitive to cold temperature, which can significantly reduce crop yields. One of the common biochemical responses of plants to cold temperatures is an increase in omega-3 fatty acids, which protects cellular function by maintaining membrane integrity. The purpose of our study was to identify and characterize the omega-3 fatty acid desaturase (FAD) gene family in G. hirsutum, with an emphasis on identifying omega-3 FADs involved in cold temperature adaptation.
Eleven omega-3 FAD genes were identified in G. hirsutum, and characterization of the gene family in extant A and D diploid species (G. herbaceum and G. raimondii, respectively) allowed for unambiguous genome assignment of all homoeologs in tetraploid G. hirsutum. The omega-3 FAD family of cotton includes five distinct genes, two of which encode endoplasmic reticulum-type enzymes (FAD3-1 and FAD3-2) and three that encode chloroplast-type enzymes (FAD7/8-1, FAD7/8-2, and FAD7/8-3). The FAD3-2 gene was duplicated in the A genome progenitor species after the evolutionary split from the D progenitor, but before the interspecific hybridization event that gave rise to modern tetraploid cotton. RNA-seq analysis revealed conserved, gene-specific expression patterns in various organs and cell types and semi-quantitative RT-PCR further revealed that FAD7/8-1 was specifically induced during cold temperature treatment of G. hirsutum seedlings.
The omega-3 FAD gene family in cotton was characterized at the genome-wide level in three species, showing relatively ancient establishment of the gene family prior to the split of A and D diploid progenitor species. The FAD genes are differentially expressed in various organs and cell types, including fiber, and expression of the FAD7/8-1 gene was induced by cold temperature. Collectively, these data define the genetic and functional genomic properties of this important gene family in cotton and provide a foundation for future efforts to improve cotton abiotic stress tolerance through molecular breeding approaches.
The online version of this article (doi:10.1186/s12870-014-0312-5) contains supplementary material, which is available to authorized users.
Chilling tolerance; Cotton; Drought; Fatty acid desaturase; Gossypium; Linolenic acid; Omega-3 fatty acid
Cotton, one of the world’s leading crops, is important to the world’s textile and energy industries, and is a model species for studies of plant polyploidization, cellulose biosynthesis and cell wall biogenesis. Here, we report the construction of a plant-transformation-competent binary bacterial artificial chromosome (BIBAC) library and comparative genome sequence analysis of polyploid Upland cotton (Gossypium hirsutum L.) with one of its diploid putative progenitor species, G. raimondii Ulbr.
We constructed the cotton BIBAC library in a vector competent for high-molecular-weight DNA transformation in different plant species through either Agrobacterium or particle bombardment. The library contains 76,800 clones with an average insert size of 135 kb, providing an approximate 99% probability of obtaining at least one positive clone from the library using a single-copy probe. The quality and utility of the library were verified by identifying BIBACs containing genes important for fiber development, fiber cellulose biosynthesis, seed fatty acid metabolism, cotton-nematode interaction, and bacterial blight resistance. In order to gain an insight into the Upland cotton genome and its relationship with G. raimondii, we sequenced nearly 10,000 BIBAC ends (BESs) randomly selected from the library, generating approximately one BES for every 250 kb along the Upland cotton genome. The retroelement Gypsy/DIRS1 family predominates in the Upland cotton genome, accounting for over 77% of all transposable elements. From the BESs, we identified 1,269 simple sequence repeats (SSRs), of which 1,006 were new, thus providing additional markers for cotton genome research. Surprisingly, comparative sequence analysis showed that Upland cotton is much more diverged from G. raimondii at the genomic sequence level than expected. There seems to be no significant difference between the relationships of the Upland cotton D- and A-subgenomes with the G. raimondii genome, even though G. raimondii contains a D genome (D5).
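The library statistics above can be checked with a short calculation. The sketch below uses the Clarke-Carbon formula, P = 1 - (1 - I/G)^N, which is one common way to estimate the probability of recovering a single-copy sequence from a clone library; the text does not state which formula was used. The clone number and insert size come from the text, whereas the haploid genome size of about 2.4 Gb is an assumption introduced here for illustration, as is treating "nearly 10,000" BESs as exactly 10,000.

```python
def clone_probability(n_clones: int, insert_kb: float, genome_kb: float) -> float:
    """Clarke-Carbon estimate: P = 1 - (1 - insert/genome)^N."""
    return 1.0 - (1.0 - insert_kb / genome_kb) ** n_clones


N_CLONES = 76_800        # from the text
INSERT_KB = 135.0        # average insert size, from the text
GENOME_KB = 2_400_000.0  # ASSUMPTION: ~2.4 Gb haploid genome, not stated in the text
N_BES = 10_000           # ASSUMPTION: "nearly 10,000" rounded to 10,000

# Genome equivalents represented by the library
coverage = N_CLONES * INSERT_KB / GENOME_KB
# Probability that a single-copy probe hits at least one clone
p_hit = clone_probability(N_CLONES, INSERT_KB, GENOME_KB)
# Average spacing between BAC end sequences along the genome
kb_per_bes = GENOME_KB / N_BES

print(f"genome equivalents: {coverage:.2f}")
print(f"P(at least one positive clone): {p_hit:.3f}")
print(f"average kb per BES: {kb_per_bes:.0f}")
```

Under the assumed genome size, the library represents roughly 4.3 genome equivalents, the probability comes out close to 0.99, and the BES spacing is of the same order as the reported 250 kb. A different genome size estimate would shift all three values.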
The library represents the first BIBAC library in cotton and related species, thus providing tools useful for integrative physical mapping, large-scale genome sequencing and large-scale functional analysis of the Upland cotton genome. Comparative sequence analysis provides insights into the Upland cotton genome, and a possible mechanism underlying the divergence and evolution of polyploid Upland cotton from its diploid putative progenitor species, G. raimondii.
BIBAC library; Gossypium hirsutum; Gossypium raimondii; BIBAC end sequence (BES); Genome evolution; SSR; Polyploidization and evolution
Cotton (Gossypium hirsutum L.) is one of the world’s most economically-important crops. However, its entire genome has not been sequenced, and limited resources are available in GenBank for understanding the molecular mechanisms underlying leaf development and senescence.
In this study, 9,874 high-quality ESTs were generated from a normalized, full-length cDNA library derived from pooled RNA isolated throughout leaf development during the plant blooming stage. After clustering and assembly of these ESTs, 5,191 unique sequences, representing 1,652 contigs and 3,539 singletons, were obtained. The average unique sequence length was 682 bp. Annotation of these unique sequences revealed that 84.4% showed significant homology to sequences in the NCBI non-redundant protein database, and 57.3% had significant hits to known proteins in the Swiss-Prot database. Comparative analysis indicated that our library added 2,400 ESTs and 991 unique sequences to those known for cotton. The unigenes were functionally characterized by gene ontology annotation. We identified 1,339 and 200 unigenes as potential leaf senescence-related genes and transcription factors, respectively. Moreover, nine genes related to leaf senescence and eleven MYB transcription factors were randomly selected for quantitative real-time PCR (qRT-PCR), which revealed that these genes were regulated differentially during senescence. The qRT-PCR for three GhYLSs revealed that these genes are expressed preferentially in senescent leaves.
These EST resources will provide valuable sequence information for gene expression profiling analyses and functional genomics studies to elucidate their roles, as well as for studying the mechanisms of leaf development and senescence in cotton and discovering candidate genes related to important agronomic traits of cotton. These data will also facilitate future whole-genome sequence assembly and annotation in G. hirsutum and comparative genomics among Gossypium species.
A central question in evolutionary biology concerns the developmental processes by which new phenotypes arise. An exceptional example of evolutionary innovation is the single-celled seed trichome in Gossypium (“cotton fiber”). We have used fiber development in Gossypium as a system to understand how morphology can rapidly evolve. Fiber has undergone considerable morphological changes between the short, tightly adherent fibers of G. longicalyx and the derived long, spinnable fibers of its closest relative, G. herbaceum, which facilitated cotton domestication. We conducted comparative gene expression profiling across a developmental time-course of fibers from G. longicalyx and G. herbaceum using microarrays with ∼22,000 genes. Expression changes between stages were temporally protracted in G. herbaceum relative to G. longicalyx, reflecting a prolongation of the ancestral developmental program. Gene expression and GO analyses showed that many genes involved with stress responses were upregulated early in G. longicalyx fiber development. Several candidate genes upregulated in G. herbaceum have been implicated in regulating redox levels and cell elongation processes. Three genes previously shown to modulate hydrogen peroxide levels were consistently expressed in domesticated and wild cotton species with long fibers, but expression was not detected by quantitative real time-PCR in wild species with short fibers. Hydrogen peroxide is important for cell elongation, but at high concentrations it becomes toxic, activating stress processes that may lead to early onset of secondary cell wall synthesis and the end of cell elongation. These observations suggest that the evolution of long spinnable fibers in cotton was accompanied by novel expression of genes assisting in the regulation of reactive oxygen species levels. 
Our data suggest a model for the evolutionary origin of a novel morphology through differential gene regulation causing prolongation of an ancestral developmental program.
Human domestication of plants has resulted in dramatic changes in mature structures, often over relatively short time frames. The availability of both wild and domesticated forms of domesticated species provides an opportunity to understand the genetic and developmental steps involved in domestication, thereby providing a model of how the evolutionary process shapes phenotypes. Here we use a comparative approach to explore the evolutionary innovations leading to modern cotton fiber, which represents some of the more remarkable single-celled hairs in the plant kingdom. We used microarrays assaying approximately 22,000 genes to elucidate expression differences across a developmental time-course of fibers from G. longicalyx, representing wild cotton, and G. herbaceum, a cultivated species. Expression changes between stages were temporally elongated in G. herbaceum relative to G. longicalyx, showing that domestication involved a prolongation of an ancestral developmental program. These data and quantitative real time-PCR experiments showed that long, spinnable fiber is associated with a number of genes implicated in regulating redox levels and cell elongation processes, suggesting that the evolution of spinnable cotton fiber entailed a novel metabolic regulatory program.
Cotton is a major fibre crop grown worldwide that suffers extensive damage from chewing insects, including the cotton boll weevil larvae (Anthonomus grandis). Transcriptome analysis was performed to understand the molecular interactions between Gossypium hirsutum L. and cotton boll weevil larvae. The Illumina HiSeq 2000 platform was used to sequence the transcriptome of cotton flower buds infested with boll weevil larvae.
The analysis generated a total of 327,489,418 sequence reads that were aligned to the G. hirsutum reference transcriptome. The total number of expressed genes was over 21,697 per sample with an average length of 1,063 bp. The DEGseq analysis identified 443 differentially expressed genes (DEG) in cotton flower buds infected with boll weevil larvae. Among them, 402 (90.7%) were up-regulated, 41 (9.3%) were down-regulated and 432 (97.5%) were identified as orthologues of A. thaliana genes using Blastx. Mapman analysis of DEG indicated that many genes were involved in the biotic stress response spanning a range of functions, from a gene encoding a receptor-like kinase to genes involved in triggering defensive responses such as MAPK, transcription factors (WRKY and ERF) and signalling by ethylene (ET) and jasmonic acid (JA) hormones. Furthermore, the spatial expression pattern of 32 of the genes responsive to boll weevil larvae feeding was determined by “in situ” qPCR analysis from RNA isolated from two flower structures, the stamen and the carpel, by laser microdissection (LMD).
A large number of cotton transcripts were significantly altered upon infestation by larvae. Among the changes in gene expression, we highlighted the transcription of receptors/sensors that recognise chitin or insect oral secretions; the altered regulation of transcripts encoding enzymes related to kinase cascades, transcription factors, Ca2+ influxes, and reactive oxygen species; and the modulation of transcripts encoding enzymes from phytohormone signalling pathways. These data will aid in the selection of target genes to genetically engineer cotton to control the cotton boll weevil.
The online version of this article (doi:10.1186/1471-2164-15-854) contains supplementary material, which is available to authorized users.
Cotton; Larvae; Transcriptome sequencing; Biotic stress; WRKY FT; Laser microdissection (LMD)
Normalizing against reference genes, or housekeeping genes, can yield more accurate and reliable results from reverse transcription real-time quantitative polymerase chain reaction (qPCR). Recent studies have shown that no single housekeeping gene is universal for all experiments. Thus, selecting suitable reference genes should be the first step of any qPCR analysis. Only a few studies on the identification of housekeeping genes have been carried out in plants. Therefore, qPCR studies on important crops such as cotton have been hampered by the lack of suitable reference genes.
Using two distinct algorithms, implemented by geNorm and NormFinder, we have assessed the gene expression of nine candidate reference genes in cotton: GhACT4, GhEF1α5, GhFBX6, GhPP2A1, GhMZA, GhPTB, GhGAPC2, GhβTUB3 and GhUBQ14. The candidate reference genes were evaluated in 23 experimental samples consisting of six distinct plant organs, eight stages of flower development, four stages of fruit development and in flower verticils. The expression of the GhPP2A1 and GhUBQ14 genes was the most stable across all samples and also when distinct plant organs were examined. GhACT4 and GhUBQ14 present more stable expression during flower development, GhACT4 and GhFBX6 in the floral verticils and GhMZA and GhPTB during fruit development. Our analysis provided the most suitable combination of reference genes for each experimental set tested as internal control for reliable qPCR data normalization. In addition, to illustrate the use of cotton reference genes we checked the expression of two cotton MADS-box genes in distinct plant and floral organs and also during flower development.
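To make the notion of expression stability concrete, the following minimal sketch computes a geNorm-style stability measure M: for each gene, the mean over all other genes of the standard deviation of the log2 expression ratios across samples. Lower M indicates more stable expression. This is a single-pass illustration only; the geNorm program also removes the least stable gene iteratively, and NormFinder uses a different, model-based approach. The gene names and expression values below are hypothetical and are not data from the study.

```python
import math
from statistics import mean, stdev
from typing import Dict, List


def genorm_m(expr: Dict[str, List[float]]) -> Dict[str, float]:
    """Return a geNorm-style stability value M for each gene.

    expr maps gene name -> relative expression quantities (linear scale, > 0),
    one value per sample, in the same sample order for every gene.
    """
    genes = list(expr)
    m_values = {}
    for j in genes:
        pairwise_sd = []
        for k in genes:
            if k == j:
                continue
            # Pairwise variation: SD of log2 ratios of gene j to gene k
            log_ratios = [math.log2(a / b) for a, b in zip(expr[j], expr[k])]
            pairwise_sd.append(stdev(log_ratios))
        m_values[j] = mean(pairwise_sd)
    return m_values


# HYPOTHETICAL values for illustration, not measurements from the study
toy = {
    "refA": [1.00, 1.10, 0.95, 1.05],
    "refB": [2.00, 2.20, 1.90, 2.10],  # tracks refA at a constant ratio
    "refC": [1.00, 3.00, 0.40, 2.00],  # varies independently of the others
}
m = genorm_m(toy)
ranked = sorted(m, key=m.get)  # most stable first
for gene in ranked:
    print(f"{gene}: M = {m[gene]:.3f}")
```

In this toy set, refA and refB keep a constant ratio to each other and therefore receive the same low M, while refC ranks last.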
We have tested the expression stabilities of nine candidate genes in a set of 23 tissue samples from cotton plants divided into five different experimental sets. As a result of this evaluation, we recommend the use of GhUBQ14 and GhPP2A1 housekeeping genes as superior references for normalization of gene expression measures in different cotton plant organs; GhACT4 and GhUBQ14 for flower development, GhACT4 and GhFBX6 for the floral organs and GhMZA and GhPTB for fruit development. We also provide the primer sequences whose performance in qPCR experiments is demonstrated. These genes will enable more accurate and reliable normalization of qPCR results for gene expression studies in this important crop, the major source of natural fiber and also an important source of edible oil. The use of bona fide reference genes allowed a detailed and accurate characterization of the temporal and spatial expression pattern of two MADS-box genes in cotton.
Tetraploid cotton contains two sets of homologous chromosomes, the At- and Dt-subgenomes. Consequently, many markers in cotton were mapped to multiple positions during linkage genetic map construction, posing a challenge to anchoring linkage groups and mapping economically-important genes to particular chromosomes. Chromosome-specific markers could solve this problem. Recently, the genomes of two diploid species were sequenced whose progenitors were putative contributors of the At- and Dt-subgenomes to tetraploid cotton. These sequences provide a powerful tool for developing chromosome-specific markers given the high level of synteny among tetraploid and diploid cotton genomes. In this study, simple sequence repeats (SSRs) on each chromosome in the two diploid genomes were characterized. Chromosome-specific SSRs were developed by comparative analysis and proved to distinguish chromosomes.
A total of 200,744 and 142,409 SSRs were detected on the 13 chromosomes of Gossypium arboreum L. and Gossypium raimondii Ulbrich, respectively. Chromosome-specific SSRs were obtained by comparing SSR flanking sequences from each chromosome with those from the other 25 chromosomes. The average was 7,996 per chromosome. To confirm their chromosome specificity, these SSRs were used to distinguish two homologous chromosomes in tetraploid cotton through linkage group construction. The chromosome-specific SSRs and previously-reported chromosome markers were grouped together, and no marker mapped to another homologous chromosome, proving that the chromosome-specific SSRs were unique and could distinguish homologous chromosomes in tetraploid cotton. Because longer dinucleotide AT-rich repeats were the most polymorphic in previous reports, the SSRs on each chromosome were sorted by motif type and repeat length for convenient selection. The primer sequences of all chromosome-specific SSRs were also made publicly available.
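The comparison of SSR flanking sequences across chromosomes can be sketched as follows. This is a simplified illustration and not the pipeline used in the study: SSRs are detected with a regular expression, flanks are compared by exact string match as a stand-in for a sequence similarity search, and the minimum repeat count, the flank length, and the toy sequences are all assumptions introduced here.

```python
import re
from collections import defaultdict

MIN_REPEATS = 4  # ASSUMPTION: minimum number of motif copies to call an SSR
# A 2-6 nt motif followed by at least MIN_REPEATS - 1 further copies of itself
SSR_RE = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (MIN_REPEATS - 1))


def find_ssrs(seq, flank=5):
    """Yield (start, motif, copies, left_flank, right_flank) for each SSR."""
    for match in SSR_RE.finditer(seq):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip mononucleotide runs such as AAAAAAAA
        copies = len(match.group(0)) // len(motif)
        left = seq[max(0, match.start() - flank):match.start()]
        right = seq[match.end():match.end() + flank]
        yield match.start(), motif, copies, left, right


def chromosome_specific(chromosomes, flank=5):
    """Keep SSRs whose flank pair occurs on exactly one chromosome."""
    seen = defaultdict(set)  # (left, right) flank pair -> chromosome names
    hits = []
    for name, seq in chromosomes.items():
        for start, motif, copies, left, right in find_ssrs(seq, flank):
            seen[(left, right)].add(name)
            hits.append((name, start, motif, copies, left, right))
    return [h for h in hits if len(seen[(h[4], h[5])]) == 1]


# HYPOTHETICAL toy sequences: the AT repeat shares its flanks on both
# chromosomes, while the GA and CTT repeats each have unique flanks.
toy = {
    "chrA": "GGCTA" + "AT" * 6 + "CCGTA" + "TTTGC" + "GA" * 5 + "ACGTC",
    "chrB": "GGCTA" + "AT" * 7 + "CCGTA" + "CATCG" + "CTT" * 4 + "GGACA",
}
specific = chromosome_specific(toy)
for name, start, motif, copies, _, _ in specific:
    print(f"{name}: ({motif}){copies} at position {start}")
```

Sorting the retained SSRs by motif type and repeat count, as described above, would then be a matter of ordering the returned tuples on their motif and copies fields.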
Chromosome-specific SSRs are efficient tools for chromosome identification by anchoring linkage groups to particular chromosomes during genetic mapping and are especially useful in mapping of qualitative-trait genes or quantitative trait loci with just a few markers. The SSRs reported here will facilitate a number of genetic and genomic studies in cotton, including construction of high-density genetic maps, positional gene cloning, fingerprinting, and genetic diversity and comparative evolutionary analyses among Gossypium species.
The online version of this article (doi:10.1186/s12864-015-1265-2) contains supplementary material, which is available to authorized users.
Chromosome-specific; SSR; Tetraploid cotton; Genome-wide