Brachypodium distachyon is a close relative of many important cereal crops. Abiotic stress tolerance has a significant impact on productivity of agriculturally important food and feedstock crops. Analysis of the transcriptome of Brachypodium after chilling, high-salinity, drought, and heat stresses revealed diverse differential expression of many transcripts. Weighted Gene Co-Expression Network Analysis revealed 22 distinct gene modules with specific profiles of expression under each stress. Promoter analysis implicated short DNA sequences directly upstream of module members in the regulation of 21 of 22 modules. Functional analysis of module members revealed enrichment in functional terms for 10 of 22 network modules. Analysis of condition-specific correlations between differentially expressed gene pairs revealed extensive plasticity in the expression relationships of gene pairs. Photosynthesis, cell cycle, and cell wall expression modules were down-regulated by all abiotic stresses. Modules that were up-regulated by each abiotic stress fell into diverse and unique gene ontology (GO) categories. This study provides genomics resources and improves our understanding of abiotic stress responses of Brachypodium.
The wild grass Brachypodium distachyon has emerged as a model system for temperate grasses and biofuel plants. However, the global analysis of miRNAs, molecules known to be key for eukaryotic gene regulation, has been limited in B. distachyon to studies that examine a few samples or rely on computational predictions. Similarly, an in-depth global analysis of miRNA-mediated target cleavage using parallel analysis of RNA ends (PARE) data is lacking in B. distachyon.
B. distachyon small RNAs were cloned and deeply sequenced from 17 libraries that represent different tissues and stresses. Using a computational pipeline, we identified 116 miRNAs including not only conserved miRNAs that have not been reported in B. distachyon, but also non-conserved miRNAs that were not found in other plants. To investigate miRNA-mediated cleavage function, four PARE libraries were constructed from key tissues and sequenced to a total depth of approximately 70 million sequences. The roughly 5 million distinct genome-matched sequences that resulted represent an extensive dataset for analyzing small RNA-guided cleavage events. Analysis of the PARE and miRNA data provided experimental evidence for miRNA-mediated cleavage of 264 sites in predicted miRNA targets. In addition, PARE analysis revealed that differentially expressed miRNAs in the same family guide specific target RNA cleavage in a correspondingly tissue-preferential manner.
B. distachyon miRNAs and target RNAs were experimentally identified and analyzed. Knowledge gained from this study should provide insights into the roles of miRNAs and the regulation of their targets in B. distachyon and related plants.
Lignin is a significant barrier in the conversion of plant biomass to bioethanol. Cinnamyl alcohol dehydrogenase (CAD) and caffeic acid O-methyltransferase (COMT) catalyze key steps in the pathway of lignin monomer biosynthesis. Brown midrib mutants in Zea mays and Sorghum bicolor with impaired CAD or COMT activity have attracted considerable agronomic interest for their altered lignin composition and improved digestibility. Here, we identified and functionally characterized candidate genes encoding CAD and COMT enzymes in the grass model species Brachypodium distachyon with the aim of improving crops for efficient biofuel production.
We developed transgenic plants overexpressing artificial microRNA designed to silence BdCAD1 or BdCOMT4. Both transgenes caused altered flowering time and increased stem count and weight. Downregulation of BdCAD1 caused a leaf brown midrib phenotype, the first time this phenotype has been observed in a C3 plant. While acetyl bromide soluble lignin measurements were equivalent in BdCAD1 downregulated and control plants, histochemical staining and thioacidolysis indicated a decrease in lignin syringyl units and reduced syringyl/guaiacyl ratio in the transgenic plants. BdCOMT4 downregulated plants exhibited a reduction in total lignin content and decreased Maule staining of syringyl units in stem. Ethanol yield by microbial fermentation was enhanced in amiR-cad1-8 plants.
These results have elucidated two key genes in the lignin biosynthetic pathway in B. distachyon that, when perturbed, may result in greater stem biomass yield and bioconversion efficiency.
Cytosine DNA methylation (5mC) is an epigenetic modification that is important to genome stability and regulation of gene expression. Perturbations of 5mC have been implicated as a cause of phenotypic variation among plants regenerated through in vitro culture systems. However, the pattern of change in 5mC and its functional role with respect to gene expression are poorly understood at the genome scale. A fuller understanding of how 5mC changes during in vitro manipulation may aid the development of methods for reducing or amplifying the mutagenic and epigenetic effects of in vitro culture and plant transformation.
We investigated the in vitro methylome of the model tree species Populus trichocarpa in a system that mimics routine methods for regeneration and plant transformation in the genus Populus (poplar). Using methylated DNA immunoprecipitation followed by high-throughput sequencing (MeDIP-seq), we compared the methylomes of internode stem segments from micropropagated explants, dedifferentiated calli, and internodes from regenerated plants. We found that more than half (56%) of the methylated portion of the genome appeared to be differentially methylated among the three tissue types. Surprisingly, gene promoter methylation varied little among tissues; however, the percentage of body-methylated genes increased from 9% to 14% between explants and callus tissue, then decreased to 8% in regenerated internodes. Forty-five percent of differentially methylated genes underwent transient methylation, becoming methylated in calli and demethylated in regenerants. These genes were more frequent in chromosomal regions with higher gene density. Comparisons with an expression microarray dataset showed that genes methylated at both promoters and gene bodies had lower expression than genes that were unmethylated or only promoter-methylated in all three tissues. Four types of abundant transposable elements showed their highest levels of 5mC in regenerated internodes.
DNA methylation varies in a highly gene- and chromosome-differential manner during in vitro differentiation and regeneration. 5mC in redifferentiated tissues was not reset to that in original explants during the study period. Hypermethylation of gene bodies in dedifferentiated cells did not interfere with transcription, and may serve a protective role against activation of abundant transposable elements.
Sacred lotus is a basal eudicot with agricultural, medicinal, cultural and religious importance. It was domesticated in Asia about 7,000 years ago, and cultivated for its rhizomes and seeds as a food crop. It is particularly noted for its 1,300-year seed longevity and exceptional water repellency, known as the lotus effect. The latter property is due to the nanoscopic closely packed protuberances of its self-cleaning leaf surface, which have been adapted for the manufacture of a self-cleaning industrial paint, Lotusan.
The genome of the China Antique variety of the sacred lotus was sequenced with Illumina and 454 technologies, at respective depths of 101× and 5.2×. The final assembly has a contig N50 of 38.8 kbp and a scaffold N50 of 3.4 Mbp, and covers 86.5% of the estimated 929 Mbp total genome size. The genome notably lacks the paleo-triplication observed in other eudicots, but reveals a lineage-specific duplication. The genome has evidence of slow evolution, with a 30% slower nucleotide mutation rate than observed in grape. Comparisons of the available sequenced genomes suggest a minimum gene set for vascular plants of 4,223 genes. Strikingly, the sacred lotus has 16 COG2132 multi-copper oxidase family proteins with root-specific expression; these are involved in root meristem phosphate starvation, reflecting adaptation to limited nutrient availability in an aquatic environment.
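The contig and scaffold N50 values quoted above summarize assembly contiguity. As a minimal sketch in Python, assuming the common definition (the largest length L such that sequences of length at least L contain at least half of the assembled bases), N50 might be computed as follows. The toy lengths are hypothetical and are not lotus data; the last line only restates the coverage arithmetic from the text.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Assumes the common definition: the largest length L such that
    sequences of length >= L together contain at least half of all bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (not lotus data): total 300 bases.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)  # 80 + 70 reaches half of 300, so N50 is 70

# Figures from the text: the assembly covers 86.5% of an estimated 929 Mbp.
assembled_mbp = 0.865 * 929
```

With the figures in the text, the assembled sequence comes to roughly 804 Mbp of the estimated 929 Mbp genome.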
The slow nucleotide substitution rate makes the sacred lotus a better resource than the current standard, grape, for reconstructing the pan-eudicot genome, and should therefore accelerate comparative analysis between eudicots and monocots.
High-throughput genome scans are important tools for genetic studies and breeding applications. Here, a 6K SNP array for use with the Illumina Infinium® system was developed for diploid sweet cherry (Prunus avium) and allotetraploid sour cherry (P. cerasus). This effort was led by RosBREED, a community initiative to enable marker-assisted breeding for rosaceous crops. Next-generation sequencing in diverse breeding germplasm provided 25 billion base pairs (25 Gb) of cherry DNA sequence, from which genome-wide SNPs were identified for sweet cherry and for the two sour cherry subgenomes derived from sweet cherry (avium subgenome) and P. fruticosa (fruticosa subgenome). Anchoring to the peach genome sequence, recently released by the International Peach Genome Initiative, predicted relative physical locations of the 1.9 million putative SNPs detected, preliminarily filtered to 368,943 SNPs. Further filtering was guided by results of a 144-SNP subset examined with the Illumina GoldenGate® assay on 160 accessions. A 6K Infinium® II array was designed with SNPs evenly spaced genetically across the sweet and sour cherry genomes. SNPs were developed for each sour cherry subgenome by using minor allele frequency in the sour cherry detection panel to enrich for subgenome-specific SNPs, followed by targeting to either subgenome according to alleles observed in sweet cherry. The array was evaluated using panels of sweet (n = 269) and sour (n = 330) cherry breeding germplasm. Approximately one third of array SNPs were informative for each crop. A total of 1825 polymorphic SNPs were verified in sweet cherry, 13% of which were originally developed for sour cherry. Allele dosage was resolved for 2058 polymorphic SNPs in sour cherry, one third of which were originally developed for sweet cherry.
This publicly available genomics resource represents a significant advance in cherry genome-scanning capability that will accelerate marker-locus-trait association discovery, genome structure investigation, and genetic diversity assessment in this diploid-tetraploid crop group.
Pyrenophora tritici-repentis (Ptr), a necrotrophic fungus and the causal agent of tan spot of wheat, produces one or a combination of host-selective toxins (HSTs) necessary for disease development. The two most studied toxins produced by Ptr, Ptr ToxA (ToxA) and Ptr ToxB (ToxB), are proteins that cause necrotic or chlorotic symptoms, respectively. Investigation of host responses induced by HSTs provides better insight into the nature of host susceptibility. Microarray analysis of ToxA has provided evidence that it can elicit responses similar to those associated with defense. In order to evaluate whether there are consistent host responses associated with susceptibility, a similar analysis of ToxB-induced changes in the same sensitive cultivar was conducted. Comparative analysis of ToxA- and ToxB-induced transcriptional changes showed that similar groups of genes encoding WRKY transcription factors, RLKs, PRs, and components of the phenylpropanoid and jasmonic acid pathways are activated. ROS accumulation and photosystem dysfunction proved to be a common mechanism of action for these toxins. Despite similarities in defense responses, transcriptional and biochemical responses as well as symptom development occur more rapidly for ToxA than for ToxB, which could be explained by differences in perception as well as by differences in activation of a specific process, for example, ethylene biosynthesis in ToxA treatment. Results of this study suggest that perception of HSTs results in activation of defense responses as part of a susceptible interaction and further support the hypothesis that necrotrophic fungi exploit defense responses in order to induce cell death.
Recent mapping of eukaryotic transcriptomes and spliceomes using massively parallel RNA sequencing (RNA-seq) has revealed that the extent of alternative splicing has been considerably underestimated. Evidence also suggests that many pre-mRNAs undergo unproductive alternative splicing resulting in incorporation of in-frame premature termination codons (PTCs). The destinies and potential functions of the PTC-harboring mRNAs remain poorly understood. Unproductive alternative splicing in circadian clock genes presents a special case study because the daily oscillations of protein expression levels require rapid and steep adjustments in mRNA levels.
We conducted a systematic survey of alternative splicing of plant circadian clock genes using RNA-seq and found that many Arabidopsis thaliana circadian clock-associated genes are alternatively spliced. Results were confirmed using reverse transcription polymerase chain reaction (RT-PCR), quantitative RT-PCR (qRT-PCR), and/or Sanger sequencing. Intron retention events were frequently observed in mRNAs of the CCA1/LHY-like subfamily of MYB transcription factors. In contrast, the REVEILLE2 (RVE2) transcript was alternatively spliced via inclusion of a "poison cassette exon" (PCE). The PCE type events introducing in-frame PTCs are conserved in some mammalian and plant serine/arginine-rich splicing factors. For some circadian genes such as CCA1 the ratio of the productive isoform (i.e., a representative splice variant encoding the full-length protein) to its PTC counterpart shifted sharply under specific environmental stress conditions.
Our results demonstrate that unproductive alternative splicing is a widespread phenomenon among plant circadian clock genes that frequently generates mRNA isoforms harboring in-frame PTCs. Because LHY and CCA1 are core components of the plant central circadian oscillator, the conservation of alternatively spliced variants between CCA1 and LHY and for CCA1 across phyla indicates a potential role of nonsense transcripts in regulation of circadian rhythms. Most of the alternatively spliced isoforms harbor in-frame PTCs that arise from full or partial intron retention events. However, a PTC in the RVE2 transcript is introduced through a PCE event. The conservation of AS events and the modulation of the relative abundance of nonsense isoforms by environmental and diurnal conditions suggest possible regulatory roles for these alternatively spliced transcripts in circadian clock function. The temperature-dependent expression of the PTC transcripts among members of the CCA1/LHY subfamily indicates that alternative splicing may be involved in regulation of the clock temperature compensation mechanism.
This article was reviewed by Dr. Eugene Koonin, Dr. Chungoo Park (nominated by Dr. Kateryna Makova), and Dr. Marcelo Yanovsky (nominated by Dr. Valerian Dolja).
Arabidopsis thaliana; Alternative splicing; Circadian clock; RNA-seq; Intron retention; Cassette exon; Nonsense mRNAs; Premature termination codon; CIRCADIAN CLOCK ASSOCIATED 1 (CCA1); LATE ELONGATED HYPOCOTYL (LHY); REVEILLE 2 (RVE2).
Little is known about the potential of Brachypodium distachyon as a model for low temperature stress responses in Pooideae. The ice recrystallization inhibition protein (IRIP) genes, fructosyltransferase (FST) genes, and many C-repeat binding factor (CBF) genes are Pooideae specific and important in low temperature responses. Here we used comparative analyses to study conservation and evolution of these gene families in B. distachyon to better understand its potential as a model species for agriculturally important temperate grasses.
Brachypodium distachyon contains cold responsive IRIP genes which have evolved through Brachypodium specific gene family expansions. A large cold responsive CBF3 subfamily was identified in B. distachyon, while CBF4 homologs are absent from the genome. No B. distachyon FST gene homologs encode typical core Pooideae FST-motifs and low temperature induced fructan accumulation was dramatically different in B. distachyon compared to core Pooideae species.
We conclude that B. distachyon can serve as an interesting model for specific molecular mechanisms involved in low temperature responses in core Pooideae species. However, the evolutionary history of key genes involved in low temperature responses has been different in Brachypodium and core Pooideae species. These differences limit the use of B. distachyon as a model for holistic studies relevant for agricultural core Pooideae species.
Brachypodium distachyon; Cold climate adaptation; Ice recrystallization inhibition protein; Gene expression; Fructosyltransferase; C-repeat binding factor; Gene family evolution
Although a large number of single nucleotide polymorphism (SNP) markers covering the entire genome are needed to enable molecular breeding efforts such as genome wide association studies, fine mapping, genomic selection and marker-assisted selection in peach [Prunus persica (L.) Batsch] and related Prunus species, only a limited number of genetic markers, including simple sequence repeats (SSRs), have been available to date. To address this need, an international consortium (The International Peach SNP Consortium; IPSC) has pursued a coordinated effort to perform genome-scale SNP discovery in peach using next generation sequencing platforms to develop and characterize a high-throughput Illumina Infinium® SNP genotyping array platform. We performed whole genome re-sequencing of 56 peach breeding accessions using the Illumina and Roche/454 sequencing technologies. Polymorphism detection algorithms identified a total of 1,022,354 SNPs. Validation with the Illumina GoldenGate® assay was performed on a subset of the predicted SNPs, verifying ∼75% of genic (exonic and intronic) SNPs, whereas only about a third of intergenic SNPs were verified. Conservative filtering was applied to arrive at a set of 8,144 SNPs that were included on the IPSC peach SNP array v1, distributed over all eight peach chromosomes with an average spacing of 26.7 kb between SNPs. Use of this platform to screen a total of 709 accessions of peach in two separate evaluation panels identified a total of 6,869 (84.3%) polymorphic SNPs.
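The polymorphism rate reported above is simple arithmetic on the array counts. The sketch below, in Python, reproduces that percentage and adds a hypothetical helper, `is_polymorphic`, which assumes a SNP counts as polymorphic when at least two distinct alleles are observed across the genotype calls of a panel. That rule and the two-letter genotype encoding are assumptions for illustration, not the consortium's actual calling criteria.

```python
def percent_polymorphic(n_polymorphic, n_total):
    """Percentage of array SNPs that were polymorphic in a panel."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_polymorphic / n_total


def is_polymorphic(genotype_calls):
    """Hypothetical rule: a SNP is polymorphic if >= 2 alleles are seen.

    genotype_calls is an iterable of strings such as "AA", "AB", "BB";
    None or "" marks a missing call and is ignored.
    """
    observed = {allele for call in genotype_calls if call for allele in call}
    return len(observed) > 1


# Counts from the text: 6,869 polymorphic SNPs out of 8,144 on the array.
array_rate = percent_polymorphic(6869, 8144)
```

Rounded to one decimal place, `array_rate` matches the 84.3% quoted in the text.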
The almost 7,000 SNPs verified as polymorphic through extensive empirical evaluation represent an excellent source of markers for future studies in genetic relatedness, genetic mapping, and dissecting the genetic architecture of complex agricultural traits. The IPSC peach SNP array v1 is commercially available and we expect that it will be used worldwide for genetic studies in peach and related stone fruit and nut species.
The woodland strawberry, Fragaria vesca (2n = 2x = 14), is a versatile experimental plant system. This diminutive herbaceous perennial has a small genome (240 Mb), is amenable to genetic transformation and shares substantial sequence identity with the cultivated strawberry (Fragaria × ananassa) and other economically important rosaceous plants. Here we report the draft F. vesca genome, which was sequenced to 39× coverage using second-generation technology, assembled de novo and then anchored to the genetic linkage map into seven pseudochromosomes. This diploid strawberry sequence lacks the large genome duplications seen in other rosids. Gene prediction modeling identified 34,809 genes, with most being supported by transcriptome mapping. Genes critical to valuable horticultural traits including flavor, nutritional value and flowering time were identified. Macrosyntenic relationships between Fragaria and Prunus predict a hypothetical ancestral Rosaceae genome that had nine chromosomes. New phylogenetic analysis of 154 protein-coding genes suggests that assignment of Populus to Malvidae, rather than Fabidae, is warranted.
Switchgrass (Panicum virgatum L.) is a C4 perennial grass and widely popular as an important bioenergy crop. To accelerate the pace of developing high yielding switchgrass cultivars adapted to diverse environmental niches, the generation of genomic resources for this plant is necessary. The large genome size and polyploid nature of switchgrass make whole genome sequencing a daunting task even with current technologies. Exploring the transcriptional landscape using next generation sequencing technologies provides a viable alternative to whole genome sequencing in switchgrass.
Switchgrass cDNA libraries from germinating seedlings, emerging tillers, flowers, and dormant seeds were sequenced using Roche 454 GS-FLX Titanium technology, generating 980,000 reads with an average read length of 367 bp. De novo assembly generated 243,600 contigs with an average length of 535 bp. Using the foxtail millet genome as a reference greatly improved the assembly and annotation of switchgrass ESTs. Comparative analysis of the 454-derived switchgrass EST reads with other sequenced monocots including Brachypodium, sorghum, rice and maize indicated a 70–80% overlap. RPKM analysis demonstrated unique transcriptional signatures of the four tissues analyzed in this study. More than 24,000 ESTs were identified in the dormant seed library. In silico analysis indicated that there are more than 2000 EST-SSRs in this collection. Expression of several orphan ESTs was confirmed by RT-PCR.
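The RPKM analysis mentioned above normalizes read counts for both transcript length and library size. A minimal Python sketch of the standard formula (reads per kilobase of transcript per million mapped reads) is given below. The example read count is hypothetical, and using the 980,000 generated reads as the mapped-read total is an assumption made only for illustration.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = reads * 1e9 / (transcript length in bp * total mapped reads)
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical example: 50 reads on a contig of the average length (535 bp)
# in a library of 980,000 reads (treated here as all mapped, an assumption).
example_rpkm = rpkm(50, 535, 980_000)
```

Because the value is scaled by both length and depth, contigs of different sizes from the four tissue libraries can be compared on one scale.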
We estimate that about 90% of the switchgrass gene space has been covered in this analysis. This study nearly doubles the amount of EST information for switchgrass currently in the public domain. The speed and economy of second-generation sequencing technologies provide an in-depth view of the gene space of complex genomes like switchgrass. Sequence analysis of closely related members of the NAD+-malic enzyme type C4 grasses such as the model system Setaria viridis can serve as a viable proxy for the switchgrass genome.
DNA cytosine methylation is an epigenetic modification that has been implicated in many biological processes. However, large-scale epigenomic studies have been applied to very few plant species, and variability in methylation among specialized tissues and its relationship to gene expression is poorly understood.
We surveyed DNA methylation from seven distinct tissue types (vegetative bud, male inflorescence [catkin], female catkin, leaf, root, xylem, phloem) in the reference tree species black cottonwood (Populus trichocarpa). Using 5-methyl-cytosine DNA immunoprecipitation followed by Illumina sequencing (MeDIP-seq), we mapped a total of 129,360,151 36- or 32-mer reads to the P. trichocarpa reference genome. We validated MeDIP-seq results by bisulfite sequencing, and compared methylation and gene expression using published microarray data. Qualitative DNA methylation differences among tissues were obvious on a chromosome scale. Methylated genes had lower expression than unmethylated genes, but genes with methylation in transcribed regions ("gene body methylation") had even lower expression than genes with promoter methylation. Promoter methylation was more frequent than gene body methylation in all tissues except male catkins. Male catkins differed in demethylation of particular transposable element categories, in level of gene body methylation, and in expression range of genes with methylated transcribed regions. Tissue-specific gene expression patterns were correlated with both gene body and promoter methylation.
We found striking differences among tissues in methylation, which were apparent at the chromosomal scale and when genes and transposable elements were examined. In contrast to other studies in plants, gene body methylation had a more repressive effect on transcription than promoter methylation.
Epigenetics; epigenomics; DNA methylation; 5-methylcytosine; Populus
GENE-counter is a complete Perl-based computational pipeline for analyzing RNA-Sequencing (RNA-Seq) data for differential gene expression. In addition to its use in studying transcriptomes of eukaryotic model organisms, GENE-counter is applicable to prokaryotes and non-model organisms without an available genome reference sequence. For alignments, GENE-counter is configured for CASHX, Bowtie, and BWA, but an end user can use any Sequence Alignment/Map (SAM)-compliant program of preference. To analyze data for differential gene expression, GENE-counter can be run with any one of three statistics packages that are based on variations of the negative binomial distribution. The default method is a new and simple statistical test we developed based on an over-parameterized version of the negative binomial distribution. GENE-counter also includes three different methods for assessing differentially expressed features for enriched gene ontology (GO) terms. Results are transparent and data are systematically stored in a MySQL relational database to facilitate additional analyses as well as quality assessment. We used next generation sequencing to generate a small-scale RNA-Seq dataset derived from the heavily studied defense response of Arabidopsis thaliana and used GENE-counter to process the data. Collectively, the support from analysis of microarrays and the substantial overlap observed in results from each of the three statistics packages demonstrate that GENE-counter is well suited for handling the unique characteristics of small sample sizes and high variability in gene counts.
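To illustrate why negative binomial models suit read counts with high variability, the Python sketch below evaluates the distribution in a mean and dispersion parameterization, where the variance is mean + dispersion × mean². This parameterization is a common convention and an assumption here; it is not GENE-counter's over-parameterized test, and the function names are hypothetical.

```python
import math


def nb_log_pmf(k, mean, dispersion):
    """Log-probability of count k under a negative binomial distribution.

    Parameterized by mean > 0 and dispersion > 0 (assumed convention),
    with size = 1 / dispersion.
    """
    if mean <= 0 or dispersion <= 0 or k < 0:
        raise ValueError("mean and dispersion must be > 0 and k >= 0")
    size = 1.0 / dispersion
    return (
        math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
        + size * math.log(size / (size + mean))
        + k * math.log(mean / (size + mean))
    )


def nb_variance(mean, dispersion):
    """Variance implied by the mean-dispersion parameterization."""
    return mean + dispersion * mean ** 2


# Hypothetical gene with mean count 10 and dispersion 0.5: the variance is
# 60, six times the value of 10 that a Poisson model would assume.
example_variance = nb_variance(10.0, 0.5)
```

The extra dispersion term is what lets such tests tolerate the gene-to-gene variability seen with small numbers of replicates.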
Snakes provide a unique vertebrate system for studying a diversity of extreme adaptations, including those related to development, metabolism, physiology, and venom. Despite their importance as research models, genomic resources for snakes are few. Among snakes, the Burmese python is the premier model for studying extremes of metabolic fluctuation and physiological remodelling. In this species, the consumption of large infrequent meals can induce a 40-fold increase in metabolic rate and more than a doubling in size of some organs. To provide a foundation for research utilizing the python, our aim was to assemble and annotate a transcriptome reference from the heart and liver. To accomplish this aim, we used the 454-FLX sequencing platform to collect sequence data from multiple cDNA libraries.
We collected nearly 1 million 454 sequence reads, and assembled these into 37,245 contigs with a combined length of 13,409,006 bp. To identify known genes, these contigs were compared to chicken and lizard gene sets, and to all Genbank sequences. A total of 13,286 of these contigs were annotated based on similarity to known genes or Genbank sequences. We used gene ontology (GO) assignments to characterize the types of genes in this transcriptome resource. The raw data, transcript contig assembly, and transcript annotations are made available online for use by the broader research community.
These data should facilitate future studies using pythons and snakes in general, helping to further contribute to the utilization of snakes as a model evolutionary and physiological system. This sequence collection represents a major genomic resource for the Burmese python, and the large number of transcript sequences characterized should contribute to future research in this and other snake species.
Circadian clocks provide an adaptive advantage through anticipation of daily and seasonal environmental changes. In plants, the central clock oscillator is regulated by several interlocking feedback loops. It was shown that a substantial proportion of the Arabidopsis genome cycles with phases of peak expression covering the entire day. Synchronized transcriptome cycling is driven through an extensive network of diurnal and clock-regulated transcription factors and their target cis-regulatory elements. Study of the cycling transcriptome in other plant species could thus help elucidate the similarities and differences and identify hubs of regulation common to monocot and dicot plants.
Using a combination of oligonucleotide microarrays and data mining pipelines, we examined daily rhythms in gene expression in one monocotyledonous and one dicotyledonous plant, rice and poplar, respectively. Cycling transcriptomes were interrogated under different diurnal (driven) and circadian (free running) light and temperature conditions. Collectively, photocycles and thermocycles regulated about 60% of the expressed nuclear genes in rice and poplar. Depending on the condition tested, up to one third of oscillating Arabidopsis-poplar-rice orthologs were phased within three hours of each other suggesting a high degree of conservation in terms of rhythmic gene expression. We identified clusters of rhythmically co-expressed genes and searched their promoter sequences to identify phase-specific cis-elements, including elements that were conserved in the promoters of Arabidopsis, poplar, and rice.
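The comparison of ortholog phases described above depends on measuring time differences on a circular 24-hour clock, so that peaks at 23 h and 1 h are 2 hours apart rather than 22. A minimal Python sketch follows; the function names, the inclusive three-hour window, and the example phases are assumptions for illustration and are not the study's pipeline.

```python
def phase_difference_hours(phase_a, phase_b, period=24.0):
    """Smallest circular distance between two peak phases, in hours."""
    diff = abs(phase_a - phase_b) % period
    return min(diff, period - diff)


def fraction_similarly_phased(phase_pairs, window=3.0, period=24.0):
    """Fraction of ortholog pairs whose phases differ by at most `window`."""
    if not phase_pairs:
        raise ValueError("phase_pairs must not be empty")
    hits = sum(
        1 for a, b in phase_pairs
        if phase_difference_hours(a, b, period) <= window
    )
    return hits / len(phase_pairs)


# Hypothetical peak phases (hours after dawn) for three ortholog pairs.
example_pairs = [(23.0, 1.0), (6.0, 12.0), (10.0, 8.0)]
example_fraction = fraction_similarly_phased(example_pairs)
```

In this toy input, two of the three pairs fall within the window.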
Our results show that the cycling patterns of many circadian clock genes are highly conserved across poplar, rice, and Arabidopsis. The expression of many orthologous genes in key metabolic and regulatory pathways is diurnal and/or circadian regulated and phased to similar times of day. Our results confirm previous findings in Arabidopsis of three major classes of cis-regulatory modules within the plant circadian network: the morning (ME, GBOX), evening (EE, GATA), and midnight (PBX/TBX/SBX) modules. Identification of identical overrepresented motifs in the promoters of cycling genes from different species suggests that the core diurnal/circadian cis-regulatory network is deeply conserved between mono- and dicotyledonous species.
DNA methylation is an epigenetic mark affecting genes and transposons. We screened for mutations that fail to establish DNA methylation, yielding two mutants termed involved in de novo (idn). IDN1 encodes DMS3, an SMC-related protein, and IDN2 encodes a novel double-stranded RNA-binding protein with homology to SGS3. IDN1 and IDN2 control de novo methylation and siRNA-mediated maintenance methylation and are components of the RNA-directed DNA methylation pathway.
Motivation: High-throughput sequencing technologies have recently made deep interrogation of expressed transcript sequences practical, both economically and temporally. Identification of intron/exon boundaries is an essential part of genome annotation, yet remains a challenge. Here, we present supersplat, a method for unbiased splice-junction discovery through empirical RNA-seq data.
Results: Using a genomic reference and RNA-seq high-throughput sequencing datasets, supersplat empirically identifies potential splice junctions at a rate of ∼11.4 million reads per hour. We further benchmark the performance of the algorithm by mapping Illumina RNA-seq reads to identify introns in the genome of the reference dicot plant Arabidopsis thaliana and we demonstrate the utility of supersplat for de novo empirical annotation of splice junctions using the reference monocot plant Brachypodium distachyon.
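The general idea of empirical splice-junction discovery can be shown with a toy example: split a read into two pieces and look for exact matches of both pieces in the reference, separated by an intron-sized gap. The Python sketch below is an assumption-laden illustration only. It is not supersplat's algorithm or interface (supersplat is implemented in C++), and the function name, parameters, and sequences are all hypothetical.

```python
def find_split_alignments(read, reference, min_anchor=3,
                          min_intron=4, max_intron=50):
    """Return (left_start, split, right_start) for each way `read` can be
    split into two exact matches separated by an intron-sized gap."""
    hits = []
    for split in range(min_anchor, len(read) - min_anchor + 1):
        left, right = read[:split], read[split:]
        left_start = reference.find(left)
        while left_start != -1:
            left_end = left_start + split  # first base after the left piece
            lowest = left_end + min_intron
            highest = left_end + max_intron
            right_start = reference.find(right, lowest)
            while right_start != -1 and right_start <= highest:
                hits.append((left_start, split, right_start))
                right_start = reference.find(right, right_start + 1)
            left_start = reference.find(left, left_start + 1)
    return hits


# Hypothetical reference: flank + exon 1 + 8-bp intron + exon 2 + flank.
toy_reference = "CC" + "ACGTAC" + "GTTTTTAG" + "TTGCAA" + "GG"
toy_read = "GTAC" + "TTGC"  # spans the exon 1 / exon 2 junction
toy_hits = find_split_alignments(toy_read, toy_reference)
```

Here the only hit places the first four bases of the read at reference position 4 and the remainder at position 16, implying an 8-bp intron. A practical tool would also index the reference rather than scan it and would aggregate support across many reads.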
Availability: Implemented in C++, supersplat source code and binaries are freely available on the web at http://mocklerlab-tools.cgrb.oregonstate.edu/
Our aim is to improve knowledge of gene regulatory circuits important to dedifferentiation, redifferentiation, and adventitious meristem organization during in vitro regeneration of plants. Regeneration of transgenic cells remains a major obstacle to research and commercial deployment of most taxa of transgenic plants, and woody species are particularly recalcitrant. The model woody species Populus, due to its genome sequence and amenability to in vitro manipulation, is an excellent species for study in this area. The genes recognized may help to guide the development of new tools for improving the efficiency of plant regeneration and transformation.
We analyzed gene expression during poplar in vitro dedifferentiation and shoot regeneration using an Affymetrix array representing over 56,000 poplar transcripts. We focused on callus induction and shoot formation, and therefore sampled RNA from tissues at five time points: prior to callus induction, 3 days and 15 days after callus induction, and 3 days and 8 days after the start of shoot induction. We used a female hybrid white poplar clone (INRA 717-1 B4, Populus tremula × P. alba) that is used widely as a model transgenic genotype. Approximately 15% of the monitored genes were significantly up- or down-regulated when controlling the false discovery rate (FDR) at 0.01; over 3,000 genes had a 5-fold or greater change in expression. We found a large initial change in expression after the beginning of hormone treatment (at the earliest stage of callus induction), and then a much smaller number of additional differentially expressed genes at subsequent regeneration stages. A total of 588 transcription factors that were distributed in 45 gene families were differentially regulated. Genes that showed strong differential expression included components of auxin and cytokinin signaling, selected cell division genes, and genes related to plastid development and photosynthesis. When compared with data on in vitro callogenesis in Arabidopsis, 25% (1,260) of up-regulated and 22% (748) of down-regulated genes were in common with the genes regulated in poplar during callus induction.
The major regulatory events during plant cell organogenesis occur at early stages of dedifferentiation. The regulatory circuits reflect the combinational effects of transcriptional control and hormone signaling, and associated changes in light environment imposed during dedifferentiation.
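The expression analysis above reports genes called significant while controlling the FDR at 0.01, but does not name the procedure used. As an illustration only, the sketch below assumes the widely used Benjamini-Hochberg step-up procedure; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.01):
    """Return a list of booleans, True where the test is rejected at FDR alpha.

    Assumption: Benjamini-Hochberg step-up procedure. Find the largest rank k
    (1-based, p-values sorted ascending) with p_(k) <= (k / m) * alpha and
    reject every hypothesis ranked 1..k.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical per-gene p-values from a differential expression test.
p_values = [0.0001, 0.5, 0.004, 0.03, 0.0019]
print(benjamini_hochberg(p_values, alpha=0.01))
```

Because the procedure is step-up, a gene can be rejected even if its own p-value exceeds its rank-specific threshold, provided a higher-ranked p-value passes.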
New rapid high-throughput sequencing technologies have sparked the creation of a new class of assembler. Since all high-throughput sequencing platforms incorporate errors in their output, short-read assemblers must be designed to account for this error while utilizing all available data.
We have designed and implemented an assembler, Quality-value guided Short Read Assembler (QSRA), created to take advantage of quality-value scores as a further method of dealing with error. Compared to previously published algorithms, our assembler shows significant improvements not only in speed but also in output quality.
QSRA generally produced the highest genomic coverage, while being faster than VCAKE. QSRA is extremely competitive in its longest contig and N50/N80 contig lengths, producing results of similar quality to those of EDENA and VELVET. QSRA provides a step closer to the goal of de novo assembly of complex genomes, improving upon the original VCAKE algorithm by not only drastically reducing runtimes but also increasing the viability of the assembly algorithm through further error handling capabilities.
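The N50 and N80 contig lengths used in this comparison follow a standard definition: the N50 is the length of the shortest contig in the smallest set of longest contigs that together contain at least 50% of the assembled bases, and the N80 uses 80%. A minimal Python sketch of that calculation follows; the function name and the example contig lengths are illustrative and are not taken from the QSRA evaluation.

```python
def nxx(contig_lengths, percent=50):
    """Return the Nxx contig length (N50 when percent=50, N80 when percent=80).

    Contigs are visited from longest to shortest; the result is the length of
    the contig at which the running total first reaches percent% of all
    assembled bases. Integer arithmetic avoids floating-point rounding.
    """
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 100 >= percent * total:
            return length
    return 0  # empty assembly


# Hypothetical contig lengths in bases.
contigs = [80, 70, 50, 40, 30, 20, 10]
print("N50:", nxx(contigs, 50))
print("N80:", nxx(contigs, 80))
```

The N80 is never larger than the N50 for the same assembly, since reaching 80% of the bases requires including shorter contigs.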
Most organisms have internal circadian clocks that mediate responses to daily environmental changes in order to synchronize biological functions to the correct times of the day. Previous studies have focused on plants found in temperate and sub-tropical climates, and little is known about the circadian transcriptional networks of plants that typically grow under conditions with relatively constant day lengths and temperatures over the year. In this study we conducted a genomic and computational analysis of the circadian biology of Carica papaya, a tropical tree. We found that predicted papaya circadian clock genes cycle with the same phase as Arabidopsis genes. The patterns of time-of-day overrepresentation of circadian-associated promoter elements were nearly identical across papaya, Arabidopsis, rice, and poplar. Evolution of promoter structure predicts the observed morning- and evening-specific expression profiles of the papaya PRR5 paralogs. The strong conservation of previously identified circadian transcriptional networks in papaya, despite its tropical habitat and distinct life-style, suggests that circadian timing has played a major role in the evolution of plant genomes, consistent with the selective pressure of anticipating daily environmental changes. Further studies could exploit this conservation to elucidate general design principles that will facilitate engineering plant growth pathways for specific environments.
Electronic supplementary material
The online version of this article (doi:10.1007/s12042-008-9020-3) contains supplementary material, which is available to authorized users.
Carica papaya; Circadian clock; Cis-acting element; Diurnal
Most organisms use daily light/dark cycles as timing cues to control many essential physiological processes. In plants, growth rates of the embryonic stem (hypocotyl) are maximal at different times of day, depending on external photoperiod and the internal circadian clock. However, the interactions between light signaling, the circadian clock, and growth-promoting hormone pathways in growth control remain poorly understood. At the molecular level, such growth rhythms could be attributed to several different layers of time-specific control such as phasing of transcription, signaling, or protein abundance. To determine the transcriptional component associated with the rhythmic control of growth, we applied temporal analysis of the Arabidopsis thaliana seedling transcriptome under multiple growth conditions and mutant backgrounds using DNA microarrays. We show that a group of plant hormone-associated genes are coexpressed at the time of day when hypocotyl growth rate is maximal. This expression correlates with overrepresentation of a cis-acting element (CACATG) in phytohormone gene promoters, which is sufficient to confer the predicted diurnal and circadian expression patterns in vivo. Using circadian clock and light signaling mutants, we show that both internal coincidence of phytohormone signaling capacity and external coincidence with darkness are required to coordinate wild-type growth. From these data, we argue that the circadian clock indirectly controls growth by permissive gating of light-mediated phytohormone transcript levels to the proper time of day. This temporal integration of hormone pathways allows plants to fine tune phytohormone responses for seasonal and shade-appropriate growth regulation.
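Overrepresentation of a cis-acting element such as CACATG in a set of promoters is commonly assessed by comparing how many promoters in the gene set contain the element with how many contain it genome-wide. The sketch below is one way to do this, assuming a one-sided hypergeometric test and presence/absence scoring on either strand; the abstract does not state which statistic was used, and the promoter sequences and gene names here are invented toy data.

```python
from math import comb

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def has_motif(promoter, motif):
    """True if the motif occurs on either strand of the promoter."""
    promoter = promoter.upper()
    return motif in promoter or reverse_complement(motif) in promoter


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n promoters are drawn from N, of which K carry the motif."""
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / comb(N, n)


def motif_enrichment(promoters, gene_set, motif="CACATG"):
    """Return (k, K, n, N, p_value) for motif presence in gene_set vs. all promoters."""
    N = len(promoters)
    K = sum(has_motif(seq, motif) for seq in promoters.values())
    n = len(gene_set)
    k = sum(has_motif(promoters[gene], motif) for gene in gene_set)
    return k, K, n, N, hypergeom_upper_tail(k, K, n, N)


# Toy promoter sequences keyed by hypothetical gene names.
promoters = {
    "g1": "TTCACATGAA", "g2": "AACATGTGTT", "g3": "TTTTTTTT",
    "g4": "GGGGCCCC", "g5": "ACACACAC", "g6": "ccacatgc",
}
print(motif_enrichment(promoters, ["g1", "g2"]))
```

With thousands of promoters, a library routine for the hypergeometric distribution would be preferable to exact binomial coefficients, and positional or copy-number information could be added to the scoring.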
In plants, stems elongate faster at dawn. This time-of-day–specific growth is controlled by integration of environmental cues and the circadian clock. The specific effectors of growth in plants are the phytohormones: auxin, ethylene, gibberellins, abscisic acid, brassinosteroids, and cytokinins. Each phytohormone plays an independent as well as an overlapping role in growth, and understanding the interactions of the phytohormones has dominated plant research over the past century. The authors present a model in which the circadian clock coordinates growth by synchronizing phytohormone gene expression at dawn, allowing a plant to control growth in a condition-specific manner. Furthermore, the results presented provide a new framework for future experiments aimed at understanding the integration and crosstalk of the phytohormones.
Why do plants grow faster at dawn? New results suggest that light and the circadian clock coordinate growth by synchronizing the expression of plant hormone genes at dawn.
Correct daily phasing of transcription confers an adaptive advantage to almost all organisms, including higher plants. In this study, we describe a hypothesis-driven network discovery pipeline that identifies biologically relevant patterns in genome-scale data. To demonstrate its utility, we analyzed a comprehensive matrix of time courses interrogating the nuclear transcriptome of Arabidopsis thaliana plants grown under different thermocycles, photocycles, and circadian conditions. We show that 89% of Arabidopsis transcripts cycle in at least one condition and that most genes have peak expression at a particular time of day, which shifts depending on the environment. Thermocycles alone can drive at least half of all transcripts critical for synchronizing internal processes such as cell cycle and protein synthesis. We identified at least three distinct transcription modules controlling phase-specific expression, including a new midnight specific module, PBX/TBX/SBX. We validated the network discovery pipeline, as well as the midnight specific module, by demonstrating that the PBX element was sufficient to drive diurnal and circadian condition-dependent expression. Moreover, we show that the three transcription modules are conserved across Arabidopsis, poplar, and rice. These results confirm the complex interplay between thermocycles, photocycles, and the circadian clock on the daily transcription program, and provide a comprehensive view of the conserved genomic targets for a transcriptional network key to successful adaptation.
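One simple way to decide whether a transcript cycles and to estimate its time of peak expression is to correlate its time course with a family of phase-shifted model waveforms and keep the best match. The sketch below assumes cosine models with a 24-hour period, Pearson correlation as the score, and a correlation cutoff of 0.8; these choices, like the function names, are illustrative assumptions and are not taken from the pipeline described above.

```python
from math import cos, pi, sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def best_phase(times, values, period=24.0, cutoff=0.8):
    """Return (phase_in_hours, correlation); phase is None if no model reaches cutoff.

    Each candidate phase (whole hours) defines a cosine peaking at that time;
    the phase whose model correlates best with the data is reported.
    """
    best_phase_hours, best_r = None, -1.0
    for phase in range(int(period)):
        model = [cos(2 * pi * (t - phase) / period) for t in times]
        r = pearson(values, model)
        if r > best_r:
            best_phase_hours, best_r = phase, r
    if best_r < cutoff:
        return None, best_r
    return best_phase_hours, best_r


# Synthetic two-day time course sampled every 4 h, peaking 8 h after dawn.
times = list(range(0, 48, 4))
values = [5 + 2 * cos(2 * pi * (t - 8) / 24) for t in times]
print(best_phase(times, values))
```

Running the same function on time courses from different conditions (for example thermocycles versus photocycles) would show how the estimated peak phase of a transcript shifts with the environment.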
As the earth rotates, environmental conditions oscillate between illuminated warm days and dark cool nights. Plants have adapted to these changes by timing physiological processes to specific times of the day or night. Light and temperature signaling and the circadian clock regulate this adaptive response. To determine the contributions of each of these factors to gene regulation, we analyzed microarray time course experiments interrogating light, temperature, and circadian conditions. We discovered that almost all Arabidopsis genes cycle in at least one condition. From a signaling perspective, this suggests that light, temperature, and the circadian clock play an important role in modulating many physiological pathways. To clarify the contribution of transcriptional regulation to this process, we mined the promoters of cycling genes to identify DNA elements associated with expression at specific times of day. This confirmed the importance of several DNA motifs such as the G-box and the evening element in the regulation of gene expression by light and the circadian clock, but also facilitated the discovery of new elements linked to a novel midnight regulatory module. Identification of orthologous promoter elements in rice and poplar revealed a conserved transcriptional regulatory network that allows global adaptation to the ever-changing daily environment.
How growth regulators provoke context-specific signals is a fundamental question in developmental biology. In plants, both auxin and brassinosteroids (BRs) promote cell expansion, and it was thought that they activated this process through independent mechanisms. In this work, we describe a shared auxin:BR pathway required for seedling growth. Genetic, physiological, and genomic analyses demonstrate that response from one pathway requires the function of the other, and that this interdependence does not act at the level of hormone biosynthetic control. Increased auxin levels saturate the BR-stimulated growth response and greatly reduce BR effects on gene expression. Integration of these two pathways is downstream from BES1 and Aux/IAA proteins, the last known regulatory factors acting downstream of each hormone, and is likely to occur directly on the promoters of auxin:BR target genes. We have developed a new approach to identify potential regulatory elements acting in each hormone pathway, as well as in the shared auxin:BR pathway. We show that one element highly overrepresented in the promoters of auxin- and BR-induced genes is responsive to both hormones and requires BR biosynthesis for normal expression. This work fundamentally alters our view of BR and auxin signaling and describes a powerful new approach to identify regulatory elements required for response to specific stimuli.
Two distinct sets of growth regulators, auxin and brassinosteroids, are required for cell expansion; rather than acting as independent signals, the response from each pathway requires the other.