1.  A genetic linkage map of black raspberry (Rubus occidentalis) and the mapping of Ag4 conferring resistance to the aphid Amphorophora agathonica 
Key message
We have constructed a densely populated, saturated genetic linkage map of black raspberry and successfully placed a locus for aphid resistance.
Black raspberry (Rubus occidentalis L.) is a high-value crop in the Pacific Northwest of North America with an international marketplace. Few genetic resources are readily available and little improvement has been achieved through breeding efforts to address production challenges involved in growing this crop. Contributing to its lack of improvement is low genetic diversity in elite cultivars and an untapped reservoir of genetic diversity from wild germplasm. In the Pacific Northwest, where most production is centered, the current standard commercial cultivar is highly susceptible to the aphid Amphorophora agathonica Hottes, which is a vector for the Raspberry mosaic virus complex. Infection with the virus complex leads to a rapid decline in plant health resulting in field replacement after only 3–4 growing seasons. Sources of aphid resistance have been identified in wild germplasm and are used to develop mapping populations to study the inheritance of these valuable traits. We have constructed a genetic linkage map using single-nucleotide polymorphism and transferable (primarily simple sequence repeat) markers for F1 population ORUS 4305 consisting of 115 progeny that segregate for aphid resistance. Our linkage map of seven linkage groups representing the seven haploid chromosomes of black raspberry consists of 274 markers on the maternal map and 292 markers on the paternal map including a morphological locus for aphid resistance. This is the first linkage map of black raspberry and will aid in developing markers for marker-assisted breeding, comparative mapping with other Rubus species, and enhancing the black raspberry genome assembly.
Electronic supplementary material
The online version of this article (doi:10.1007/s00122-015-2541-x) contains supplementary material, which is available to authorized users.
PMCID: PMC4477079  PMID: 26037086
2.  A new alternative in plant retrograde signaling 
Genome Biology  2014;15(5):117.
The reduced or oxidized state of plastoquinone in chloroplasts regulates splicing in the nucleus to control nuclear gene expression in response to changing environmental conditions.
PMCID: PMC4072986  PMID: 25001637
3.  Analysis of Global Gene Expression in Brachypodium distachyon Reveals Extensive Network Plasticity in Response to Abiotic Stress 
PLoS ONE  2014;9(1):e87499.
Brachypodium distachyon is a close relative of many important cereal crops. Abiotic stress tolerance has a significant impact on productivity of agriculturally important food and feedstock crops. Analysis of the transcriptome of Brachypodium after chilling, high-salinity, drought, and heat stresses revealed diverse differential expression of many transcripts. Weighted Gene Co-Expression Network Analysis revealed 22 distinct gene modules with specific profiles of expression under each stress. Promoter analysis implicated short DNA sequences directly upstream of module members in the regulation of 21 of 22 modules. Functional analysis of module members revealed enrichment in functional terms for 10 of 22 network modules. Analysis of condition-specific correlations between differentially expressed gene pairs revealed extensive plasticity in the expression relationships of gene pairs. Photosynthesis, cell cycle, and cell wall expression modules were down-regulated by all abiotic stresses. Modules that were up-regulated by each abiotic stress fell into diverse and unique gene ontology (GO) categories. This study provides genomics resources and improves our understanding of abiotic stress responses of Brachypodium.
PMCID: PMC3906199  PMID: 24489928
4.  Parallel analysis of RNA ends enhances global investigation of microRNAs and target RNAs of Brachypodium distachyon 
Genome Biology  2013;14(12):R145.
The wild grass Brachypodium distachyon has emerged as a model system for temperate grasses and biofuel plants. However, the global analysis of miRNAs, molecules known to be key for eukaryotic gene regulation, has been limited in B. distachyon to studies that examine a few samples or rely on computational predictions. Similarly, an in-depth global analysis of miRNA-mediated target cleavage using parallel analysis of RNA ends (PARE) data is lacking in B. distachyon.
B. distachyon small RNAs were cloned and deeply sequenced from 17 libraries that represent different tissues and stresses. Using a computational pipeline, we identified 116 miRNAs including not only conserved miRNAs that have not been reported in B. distachyon, but also non-conserved miRNAs that were not found in other plants. To investigate miRNA-mediated cleavage function, four PARE libraries were constructed from key tissues and sequenced to a total depth of approximately 70 million sequences. The roughly 5 million distinct genome-matched sequences that resulted represent an extensive dataset for analyzing small RNA-guided cleavage events. Analysis of the PARE and miRNA data provided experimental evidence for miRNA-mediated cleavage of 264 sites in predicted miRNA targets. In addition, PARE analysis revealed that differentially expressed miRNAs in the same family guide specific target RNA cleavage in a correspondingly tissue-preferential manner.
B. distachyon miRNAs and target RNAs were experimentally identified and analyzed. Knowledge gained from this study should provide insights into the roles of miRNAs and the regulation of their targets in B. distachyon and related plants.
PMCID: PMC4053937  PMID: 24367943
5.  Functional characterization of cinnamyl alcohol dehydrogenase and caffeic acid O-methyltransferase in Brachypodium distachyon 
BMC Biotechnology  2013;13:61.
Lignin is a significant barrier in the conversion of plant biomass to bioethanol. Cinnamyl alcohol dehydrogenase (CAD) and caffeic acid O-methyltransferase (COMT) catalyze key steps in the pathway of lignin monomer biosynthesis. Brown midrib mutants in Zea mays and Sorghum bicolor with impaired CAD or COMT activity have attracted considerable agronomic interest for their altered lignin composition and improved digestibility. Here, we identified and functionally characterized candidate genes encoding CAD and COMT enzymes in the grass model species Brachypodium distachyon with the aim of improving crops for efficient biofuel production.
We developed transgenic plants overexpressing artificial microRNA designed to silence BdCAD1 or BdCOMT4. Both transgenes caused altered flowering time and increased stem count and weight. Downregulation of BdCAD1 caused a leaf brown midrib phenotype, the first time this phenotype has been observed in a C3 plant. While acetyl bromide soluble lignin measurements were equivalent in BdCAD1 downregulated and control plants, histochemical staining and thioacidolysis indicated a decrease in lignin syringyl units and reduced syringyl/guaiacyl ratio in the transgenic plants. BdCOMT4 downregulated plants exhibited a reduction in total lignin content and decreased Maule staining of syringyl units in stem. Ethanol yield by microbial fermentation was enhanced in amiR-cad1-8 plants.
These results have elucidated two key genes in the lignin biosynthetic pathway in B. distachyon that, when perturbed, may result in greater stem biomass yield and bioconversion efficiency.
PMCID: PMC3734214  PMID: 23902793
6.  Methylome reorganization during in vitro dedifferentiation and regeneration of Populus trichocarpa 
BMC Plant Biology  2013;13:92.
Cytosine DNA methylation (5mC) is an epigenetic modification that is important to genome stability and regulation of gene expression. Perturbations of 5mC have been implicated as a cause of phenotypic variation among plants regenerated through in vitro culture systems. However, the pattern of change in 5mC and its functional role with respect to gene expression are poorly understood at the genome scale. A fuller understanding of how 5mC changes during in vitro manipulation may aid the development of methods for reducing or amplifying the mutagenic and epigenetic effects of in vitro culture and plant transformation.
We investigated the in vitro methylome of the model tree species Populus trichocarpa in a system that mimics routine methods for regeneration and plant transformation in the genus Populus (poplar). Using methylated DNA immunoprecipitation followed by high-throughput sequencing (MeDIP-seq), we compared the methylomes of internode stem segments from micropropagated explants, dedifferentiated calli, and internodes from regenerated plants. We found that more than half (56%) of the methylated portion of the genome appeared to be differentially methylated among the three tissue types. Surprisingly, gene promoter methylation varied little among tissues; however, the percentage of body-methylated genes increased from 9% to 14% between explants and callus tissue, then decreased to 8% in regenerated internodes. Forty-five percent of differentially methylated genes underwent transient methylation, becoming methylated in calli and demethylated in regenerants. These genes were more frequent in chromosomal regions with higher gene density. Comparisons with an expression microarray dataset showed that genes methylated at both promoters and gene bodies had lower expression than genes that were unmethylated or only promoter-methylated in all three tissues. Four types of abundant transposable elements showed their highest levels of 5mC in regenerated internodes.
DNA methylation varies in a highly gene- and chromosome-differential manner during in vitro differentiation and regeneration. 5mC in redifferentiated tissues was not reset to that in original explants during the study period. Hypermethylation of gene bodies in dedifferentiated cells did not interfere with transcription, and may serve a protective role against activation of abundant transposable elements.
PMCID: PMC3728041  PMID: 23799904
7.  Genome of the long-living sacred lotus (Nelumbo nucifera Gaertn.) 
Genome Biology  2013;14(5):R41.
Sacred lotus is a basal eudicot with agricultural, medicinal, cultural and religious importance. It was domesticated in Asia about 7,000 years ago, and cultivated for its rhizomes and seeds as a food crop. It is particularly noted for its 1,300-year seed longevity and exceptional water repellency, known as the lotus effect. The latter property is due to the nanoscopic closely packed protuberances of its self-cleaning leaf surface, which have been adapted for the manufacture of a self-cleaning industrial paint, Lotusan.
The genome of the China Antique variety of the sacred lotus was sequenced with Illumina and 454 technologies, at respective depths of 101× and 5.2×. The final assembly has a contig N50 of 38.8 kbp and a scaffold N50 of 3.4 Mbp, and covers 86.5% of the estimated 929 Mbp total genome size. The genome notably lacks the paleo-triplication observed in other eudicots, but reveals a lineage-specific duplication. The genome has evidence of slow evolution, with a 30% slower nucleotide mutation rate than observed in grape. Comparisons of the available sequenced genomes suggest a minimum gene set for vascular plants of 4,223 genes. Strikingly, the sacred lotus has 16 COG2132 multi-copper oxidase family proteins with root-specific expression; these are involved in root meristem phosphate starvation, reflecting adaptation to limited nutrient availability in an aquatic environment.
The slow nucleotide substitution rate makes the sacred lotus a better resource than the current standard, grape, for reconstructing the pan-eudicot genome, and should therefore accelerate comparative analysis between eudicots and monocots.
PMCID: PMC4053705  PMID: 23663246
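The contig and scaffold N50 values quoted in entry 7 follow a standard assembly statistic: the length of the sequence at which the running total, taken from longest to shortest, first reaches half of the assembled bases. The sketch below illustrates that definition only. The function name and the toy lengths are hypothetical and are not taken from the lotus assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp).

    N50 is the length L such that sequences of length >= L together
    contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest sequence down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Hypothetical toy lengths, not data from the paper.
toy_contigs = [2, 3, 4, 5, 6]
print(n50(toy_contigs))  # prints 5 (6 + 5 = 11 reaches half of 20)
```

Because N50 depends only on the length distribution, a higher value indicates a more contiguous assembly but says nothing about its correctness.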
8.  Development and Evaluation of a Genome-Wide 6K SNP Array for Diploid Sweet Cherry and Tetraploid Sour Cherry 
PLoS ONE  2012;7(12):e48305.
High-throughput genome scans are important tools for genetic studies and breeding applications. Here, a 6K SNP array for use with the Illumina Infinium® system was developed for diploid sweet cherry (Prunus avium) and allotetraploid sour cherry (P. cerasus). This effort was led by RosBREED, a community initiative to enable marker-assisted breeding for rosaceous crops. Next-generation sequencing in diverse breeding germplasm provided 25 billion basepairs (Gb) of cherry DNA sequence, from which genome-wide SNPs were identified for sweet cherry and for the two sour cherry subgenomes derived from sweet cherry (avium subgenome) and P. fruticosa (fruticosa subgenome). Anchoring to the peach genome sequence, recently released by the International Peach Genome Initiative, predicted relative physical locations of the 1.9 million putative SNPs detected, preliminarily filtered to 368,943 SNPs. Further filtering was guided by results of a 144-SNP subset examined with the Illumina GoldenGate® assay on 160 accessions. A 6K Infinium® II array was designed with SNPs evenly spaced genetically across the sweet and sour cherry genomes. SNPs were developed for each sour cherry subgenome by using minor allele frequency in the sour cherry detection panel to enrich for subgenome-specific SNPs followed by targeting to either subgenome according to alleles observed in sweet cherry. The array was evaluated using panels of sweet (n = 269) and sour (n = 330) cherry breeding germplasm. Approximately one third of array SNPs were informative for each crop. A total of 1825 polymorphic SNPs were verified in sweet cherry, 13% of these originally developed for sour cherry. Allele dosage was resolved for 2058 polymorphic SNPs in sour cherry, one third of these being originally developed for sweet cherry.
This publicly available genomics resource represents a significant advance in cherry genome-scanning capability that will accelerate marker-locus-trait association discovery, genome structure investigation, and genetic diversity assessment in this diploid-tetraploid crop group.
PMCID: PMC3527432  PMID: 23284615
9.  Host-Selective Toxins of Pyrenophora tritici-repentis Induce Common Responses Associated with Host Susceptibility 
PLoS ONE  2012;7(7):e40240.
Pyrenophora tritici-repentis (Ptr), a necrotrophic fungus and the causal agent of tan spot of wheat, produces one or a combination of host-selective toxins (HSTs) necessary for disease development. The two most studied toxins produced by Ptr, Ptr ToxA (ToxA) and Ptr ToxB (ToxB), are proteins that cause necrotic or chlorotic symptoms, respectively. Investigation of host responses induced by HSTs provides better insight into the nature of host susceptibility. Microarray analysis of ToxA has provided evidence that it can elicit responses similar to those associated with defense. In order to evaluate whether there are consistent host responses associated with susceptibility, a similar analysis of ToxB-induced changes in the same sensitive cultivar was conducted. Comparative analysis of ToxA- and ToxB-induced transcriptional changes showed that similar groups of genes encoding WRKY transcription factors, RLKs, PRs, and components of the phenylpropanoid and jasmonic acid pathways are activated. ROS accumulation and photosystem dysfunction proved to be common mechanisms of action for these toxins. Despite similarities in defense responses, transcriptional and biochemical responses as well as symptom development occur more rapidly for ToxA compared to ToxB, which could be explained by differences in perception as well as by differences in activation of a specific process, for example, ethylene biosynthesis in ToxA treatment. Results of this study suggest that perception of HSTs will result in activation of defense responses as part of a susceptible interaction and further support the hypothesis that necrotrophic fungi exploit defense responses in order to induce cell death.
PMCID: PMC3391247  PMID: 22792250
10.  Unproductive alternative splicing and nonsense mRNAs: A widespread phenomenon among plant circadian clock genes 
Biology Direct  2012;7:20.
Recent mapping of eukaryotic transcriptomes and spliceomes using massively parallel RNA sequencing (RNA-seq) has revealed that the extent of alternative splicing has been considerably underestimated. Evidence also suggests that many pre-mRNAs undergo unproductive alternative splicing resulting in incorporation of in-frame premature termination codons (PTCs). The destinies and potential functions of the PTC-harboring mRNAs remain poorly understood. Unproductive alternative splicing in circadian clock genes presents a special case study because the daily oscillations of protein expression levels require rapid and steep adjustments in mRNA levels.
We conducted a systematic survey of alternative splicing of plant circadian clock genes using RNA-seq and found that many Arabidopsis thaliana circadian clock-associated genes are alternatively spliced. Results were confirmed using reverse transcription polymerase chain reaction (RT-PCR), quantitative RT-PCR (qRT-PCR), and/or Sanger sequencing. Intron retention events were frequently observed in mRNAs of the CCA1/LHY-like subfamily of MYB transcription factors. In contrast, the REVEILLE2 (RVE2) transcript was alternatively spliced via inclusion of a "poison cassette exon" (PCE). The PCE type events introducing in-frame PTCs are conserved in some mammalian and plant serine/arginine-rich splicing factors. For some circadian genes such as CCA1 the ratio of the productive isoform (i.e., a representative splice variant encoding the full-length protein) to its PTC counterpart shifted sharply under specific environmental stress conditions.
Our results demonstrate that unproductive alternative splicing is a widespread phenomenon among plant circadian clock genes that frequently generates mRNA isoforms harboring in-frame PTCs. Because LHY and CCA1 are core components of the plant central circadian oscillator, the conservation of alternatively spliced variants between CCA1 and LHY and for CCA1 across phyla [2] indicates a potential role of nonsense transcripts in regulation of circadian rhythms. Most of the alternatively spliced isoforms harbor in-frame PTCs that arise from full or partial intron retention events. However, a PTC in the RVE2 transcript is introduced through a PCE event. The conservation of AS events and modulation of the relative abundance of nonsense isoforms by environmental and diurnal conditions suggests possible regulatory roles for these alternatively spliced transcripts in circadian clock function. The temperature-dependent expression of the PTC transcripts among members of CCA1/LHY subfamily indicates that alternative splicing may be involved in regulation of the clock temperature compensation mechanism.
This article was reviewed by Dr. Eugene Koonin, Dr. Chungoo Park (nominated by Dr. Kateryna Makova), and Dr. Marcelo Yanovsky (nominated by Dr. Valerian Dolja).
PMCID: PMC3403997  PMID: 22747664
Arabidopsis thaliana; Alternative splicing; Circadian clock; RNA-seq; Intron retention; Cassette exon; Nonsense mRNAs; Premature termination codon; CIRCADIAN CLOCK ASSOCIATED 1 (CCA1); LATE ELONGATED HYPOCOTYL (LHY); REVEILLE 2 (RVE2).
12.  Comparative analyses reveal potential uses of Brachypodium distachyon as a model for cold stress responses in temperate grasses 
BMC Plant Biology  2012;12:65.
Little is known about the potential of Brachypodium distachyon as a model for low temperature stress responses in Pooideae. The ice recrystallization inhibition protein (IRIP) genes, fructosyltransferase (FST) genes, and many C-repeat binding factor (CBF) genes are Pooideae specific and important in low temperature responses. Here we used comparative analyses to study conservation and evolution of these gene families in B. distachyon to better understand its potential as a model species for agriculturally important temperate grasses.
Brachypodium distachyon contains cold responsive IRIP genes which have evolved through Brachypodium specific gene family expansions. A large cold responsive CBF3 subfamily was identified in B. distachyon, while CBF4 homologs are absent from the genome. No B. distachyon FST gene homologs encode typical core Pooideae FST-motifs and low temperature induced fructan accumulation was dramatically different in B. distachyon compared to core Pooideae species.
We conclude that B. distachyon can serve as an interesting model for specific molecular mechanisms involved in low temperature responses in core Pooideae species. However, the evolutionary history of key genes involved in low temperature responses has been different in Brachypodium and core Pooideae species. These differences limit the use of B. distachyon as a model for holistic studies relevant for agricultural core Pooideae species.
PMCID: PMC3487962  PMID: 22569006
Brachypodium distachyon; Cold climate adaptation; Ice recrystallization inhibition protein; Gene expression; Fructosyltransferase; C-repeat binding factor; Gene family evolution
13.  Development and Evaluation of a 9K SNP Array for Peach by Internationally Coordinated SNP Detection and Validation in Breeding Germplasm 
PLoS ONE  2012;7(4):e35668.
Although a large number of single nucleotide polymorphism (SNP) markers covering the entire genome are needed to enable molecular breeding efforts such as genome wide association studies, fine mapping, genomic selection and marker-assisted selection in peach [Prunus persica (L.) Batsch] and related Prunus species, only a limited number of genetic markers, including simple sequence repeats (SSRs), have been available to date. To address this need, an international consortium (The International Peach SNP Consortium; IPSC) has pursued a coordinated effort to perform genome-scale SNP discovery in peach using next generation sequencing platforms to develop and characterize a high-throughput Illumina Infinium® SNP genotyping array platform. We performed whole genome re-sequencing of 56 peach breeding accessions using the Illumina and Roche/454 sequencing technologies. Polymorphism detection algorithms identified a total of 1,022,354 SNPs. Validation with the Illumina GoldenGate® assay was performed on a subset of the predicted SNPs, verifying ∼75% of genic (exonic and intronic) SNPs, whereas only about a third of intergenic SNPs were verified. Conservative filtering was applied to arrive at a set of 8,144 SNPs that were included on the IPSC peach SNP array v1, distributed over all eight peach chromosomes with an average spacing of 26.7 kb between SNPs. Use of this platform to screen a total of 709 accessions of peach in two separate evaluation panels identified a total of 6,869 (84.3%) polymorphic SNPs.
The almost 7,000 SNPs verified as polymorphic through extensive empirical evaluation represent an excellent source of markers for future studies in genetic relatedness, genetic mapping, and dissecting the genetic architecture of complex agricultural traits. The IPSC peach SNP array v1 is commercially available and we expect that it will be used worldwide for genetic studies in peach and related stone fruit and nut species.
PMCID: PMC3334984  PMID: 22536421
14.  The genome of woodland strawberry (Fragaria vesca) 
Nature Genetics  2010;43(2):109-116.
The woodland strawberry, Fragaria vesca (2n = 2x = 14), is a versatile experimental plant system. This diminutive herbaceous perennial has a small genome (240 Mb), is amenable to genetic transformation and shares substantial sequence identity with the cultivated strawberry (Fragaria × ananassa) and other economically important rosaceous plants. Here we report the draft F. vesca genome, which was sequenced to ×39 coverage using second-generation technology, assembled de novo and then anchored to the genetic linkage map into seven pseudochromosomes. This diploid strawberry sequence lacks the large genome duplications seen in other rosids. Gene prediction modeling identified 34,809 genes, with most being supported by transcriptome mapping. Genes critical to valuable horticultural traits including flavor, nutritional value and flowering time were identified. Macrosyntenic relationships between Fragaria and Prunus predict a hypothetical ancestral Rosaceae genome that had nine chromosomes. New phylogenetic analysis of 154 protein-coding genes suggests that assignment of Populus to Malvidae, rather than Fabidae, is warranted.
PMCID: PMC3326587  PMID: 21186353
15.  Exploring the Switchgrass Transcriptome Using Second-Generation Sequencing Technology 
PLoS ONE  2012;7(3):e34225.
Switchgrass (Panicum virgatum L.) is a C4 perennial grass and widely popular as an important bioenergy crop. To accelerate the pace of developing high yielding switchgrass cultivars adapted to diverse environmental niches, the generation of genomic resources for this plant is necessary. The large genome size and polyploid nature of switchgrass make whole genome sequencing a daunting task even with current technologies. Exploring the transcriptional landscape using next generation sequencing technologies provides a viable alternative to whole genome sequencing in switchgrass.
Principal Findings
Switchgrass cDNA libraries from germinating seedlings, emerging tillers, flowers, and dormant seeds were sequenced using Roche 454 GS-FLX Titanium technology, generating 980,000 reads with an average read length of 367 bp. De novo assembly generated 243,600 contigs with an average length of 535 bp. Using the foxtail millet genome as a reference greatly improved the assembly and annotation of switchgrass ESTs. Comparative analysis of the 454-derived switchgrass EST reads with other sequenced monocots including Brachypodium, sorghum, rice and maize indicated a 70–80% overlap. RPKM analysis demonstrated unique transcriptional signatures of the four tissues analyzed in this study. More than 24,000 ESTs were identified in the dormant seed library. In silico analysis indicated that there are more than 2000 EST-SSRs in this collection. Expression of several orphan ESTs was confirmed by RT-PCR.
We estimate that about 90% of the switchgrass gene space has been covered in this analysis. This study nearly doubles the amount of EST information for switchgrass currently in the public domain. The celerity and economical nature of second-generation sequencing technologies provide an in-depth view of the gene space of complex genomes like switchgrass. Sequence analysis of closely related members of the NAD+-malic enzyme type C4 grasses such as the model system Setaria viridis can serve as a viable proxy for the switchgrass genome.
PMCID: PMC3315583  PMID: 22479570
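Entry 15 mentions RPKM analysis without spelling out the measure. As commonly defined, RPKM (reads per kilobase of transcript per million mapped reads) normalizes a raw read count by both feature length and sequencing depth. The sketch below shows that standard definition only. The function name and the example numbers are hypothetical, and the abstract does not state how the authors computed their values.

```python
def rpkm(read_count, feature_length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads.

    RPKM = read_count * 1e9 / (feature_length_bp * total_mapped_reads)
    """
    if feature_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)


# Hypothetical example: 100 reads on a 1,000 bp contig in a library
# of 1,000,000 mapped reads.
print(rpkm(100, 1_000, 1_000_000))
```

Dividing by length keeps long transcripts from looking more highly expressed simply because they collect more reads, and dividing by library size makes values comparable across the four tissue libraries.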
16.  Dynamic DNA cytosine methylation in the Populus trichocarpa genome: tissue-level variation and relationship to gene expression 
BMC Genomics  2012;13:27.
DNA cytosine methylation is an epigenetic modification that has been implicated in many biological processes. However, large-scale epigenomic studies have been applied to very few plant species, and variability in methylation among specialized tissues and its relationship to gene expression is poorly understood.
We surveyed DNA methylation from seven distinct tissue types (vegetative bud, male inflorescence [catkin], female catkin, leaf, root, xylem, phloem) in the reference tree species black cottonwood (Populus trichocarpa). Using 5-methyl-cytosine DNA immunoprecipitation followed by Illumina sequencing (MeDIP-seq), we mapped a total of 129,360,151 36- or 32-mer reads to the P. trichocarpa reference genome. We validated MeDIP-seq results by bisulfite sequencing, and compared methylation and gene expression using published microarray data. Qualitative DNA methylation differences among tissues were obvious on a chromosome scale. Methylated genes had lower expression than unmethylated genes, but genes with methylation in transcribed regions ("gene body methylation") had even lower expression than genes with promoter methylation. Promoter methylation was more frequent than gene body methylation in all tissues except male catkins. Male catkins differed in demethylation of particular transposable element categories, in level of gene body methylation, and in expression range of genes with methylated transcribed regions. Tissue-specific gene expression patterns were correlated with both gene body and promoter methylation.
We found striking differences among tissues in methylation, which were apparent at the chromosomal scale and when genes and transposable elements were examined. In contrast to other studies in plants, gene body methylation had a more repressive effect on transcription than promoter methylation.
PMCID: PMC3298464  PMID: 22251412
Epigenetics; epigenomics; DNA methylation; 5-methylcytosine; Populus
17.  GENE-Counter: A Computational Pipeline for the Analysis of RNA-Seq Data for Gene Expression Differences 
PLoS ONE  2011;6(10):e25279.
GENE-counter is a complete Perl-based computational pipeline for analyzing RNA-Sequencing (RNA-Seq) data for differential gene expression. In addition to its use in studying transcriptomes of eukaryotic model organisms, GENE-counter is applicable for prokaryotes and non-model organisms without an available genome reference sequence. For alignments, GENE-counter is configured for CASHX, Bowtie, and BWA, but an end user can use any Sequence Alignment/Map (SAM)-compliant program of preference. To analyze data for differential gene expression, GENE-counter can be run with any one of three statistics packages that are based on variations of the negative binomial distribution. The default method is a new and simple statistical test we developed based on an over-parameterized version of the negative binomial distribution. GENE-counter also includes three different methods for assessing differentially expressed features for enriched gene ontology (GO) terms. Results are transparent and data are systematically stored in a MySQL relational database to facilitate additional analyses as well as quality assessment. We used next generation sequencing to generate a small-scale RNA-Seq dataset derived from the heavily studied defense response of Arabidopsis thaliana and used GENE-counter to process the data. Collectively, the support from analysis of microarrays as well as the observed and substantial overlap in results from each of the three statistics packages demonstrates that GENE-counter is well suited for handling the unique characteristics of small sample sizes and high variability in gene counts.
PMCID: PMC3188579  PMID: 21998647
18.  A multi-organ transcriptome resource for the Burmese Python (Python molurus bivittatus) 
BMC Research Notes  2011;4:310.
Snakes provide a unique vertebrate system for studying a diversity of extreme adaptations, including those related to development, metabolism, physiology, and venom. Despite their importance as research models, genomic resources for snakes are few. Among snakes, the Burmese python is the premier model for studying extremes of metabolic fluctuation and physiological remodelling. In this species, the consumption of large infrequent meals can induce a 40-fold increase in metabolic rate and more than a doubling in size of some organs. To provide a foundation for research utilizing the python, our aim was to assemble and annotate a transcriptome reference from the heart and liver. To accomplish this aim, we used the 454-FLX sequencing platform to collect sequence data from multiple cDNA libraries.
We collected nearly 1 million 454 sequence reads, and assembled these into 37,245 contigs with a combined length of 13,409,006 bp. To identify known genes, these contigs were compared to chicken and lizard gene sets, and to all Genbank sequences. A total of 13,286 of these contigs were annotated based on similarity to known genes or Genbank sequences. We used gene ontology (GO) assignments to characterize the types of genes in this transcriptome resource. The raw data, transcript contig assembly, and transcript annotations are made available online for use by the broader research community.
These data should facilitate future studies using pythons and snakes in general, helping to further contribute to the utilization of snakes as a model evolutionary and physiological system. This sequence collection represents a major genomic resource for the Burmese python, and the large number of transcript sequences characterized should contribute to future research in this and other snake species.
PMCID: PMC3173347  PMID: 21867488
19.  Global Profiling of Rice and Poplar Transcriptomes Highlights Key Conserved Circadian-Controlled Pathways and cis-Regulatory Modules 
PLoS ONE  2011;6(6):e16907.
Circadian clocks provide an adaptive advantage through anticipation of daily and seasonal environmental changes. In plants, the central clock oscillator is regulated by several interlocking feedback loops. It was shown that a substantial proportion of the Arabidopsis genome cycles with phases of peak expression covering the entire day. Synchronized transcriptome cycling is driven through an extensive network of diurnal and clock-regulated transcription factors and their target cis-regulatory elements. Study of the cycling transcriptome in other plant species could thus help elucidate the similarities and differences and identify hubs of regulation common to monocot and dicot plants.
Methodology/Principal Findings
Using a combination of oligonucleotide microarrays and data mining pipelines, we examined daily rhythms in gene expression in one monocotyledonous and one dicotyledonous plant, rice and poplar, respectively. Cycling transcriptomes were interrogated under different diurnal (driven) and circadian (free running) light and temperature conditions. Collectively, photocycles and thermocycles regulated about 60% of the expressed nuclear genes in rice and poplar. Depending on the condition tested, up to one third of oscillating Arabidopsis-poplar-rice orthologs were phased within three hours of each other, suggesting a high degree of conservation in terms of rhythmic gene expression. We identified clusters of rhythmically co-expressed genes and searched their promoter sequences to identify phase-specific cis-elements, including elements that were conserved in the promoters of Arabidopsis, poplar, and rice.
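As an illustration of the "phased within three hours of each other" criterion, the sketch below computes phase differences on a 24-hour clock, where 23 h and 1 h are two hours apart rather than 22. It is not the authors' pipeline: the function names, the all-pairs rule for a three-species ortholog group, and the example phases are assumptions made only for this example.

```python
from itertools import combinations


def phase_difference(phase_a, phase_b, period=24.0):
    """Shortest distance between two phases on a circular clock, in hours."""
    diff = abs(phase_a - phase_b) % period
    return min(diff, period - diff)


def fraction_conserved(ortholog_phases, window=3.0, period=24.0):
    """Fraction of ortholog groups whose members all peak within `window` hours.

    `ortholog_phases` is a list of tuples, one tuple of peak-expression
    phases (in hours) per ortholog group. Requiring every pair in a group
    to fall within the window is an assumption of this sketch.
    """
    if not ortholog_phases:
        raise ValueError("need at least one ortholog group")
    conserved = 0
    for phases in ortholog_phases:
        if all(phase_difference(a, b, period) <= window
               for a, b in combinations(phases, 2)):
            conserved += 1
    return conserved / len(ortholog_phases)


# Hypothetical phases (Arabidopsis, poplar, rice) for three ortholog groups.
example_groups = [(23, 1, 0), (6, 12, 18), (10, 11, 13)]
conserved_fraction = fraction_conserved(example_groups)
```

Taking the minimum of the two arc lengths keeps genes that peak just before and just after midnight from being scored as far apart.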
Our results show that the cycling patterns of many circadian clock genes are highly conserved across poplar, rice, and Arabidopsis. The expression of many orthologous genes in key metabolic and regulatory pathways is diurnal and/or circadian regulated and phased to similar times of day. Our results confirm previous findings in Arabidopsis of three major classes of cis-regulatory modules within the plant circadian network: the morning (ME, GBOX), evening (EE, GATA), and midnight (PBX/TBX/SBX) modules. Identification of identical overrepresented motifs in the promoters of cycling genes from different species suggests that the core diurnal/circadian cis-regulatory network is deeply conserved between mono- and dicotyledonous species.
PMCID: PMC3111414  PMID: 21694767
20.  IDN1 and IDN2: two proteins required for de novo DNA methylation in Arabidopsis thaliana 
Nature structural & molecular biology  2009;16(12):1325-1327.
DNA methylation is an epigenetic mark affecting genes and transposons. We screened for mutations that fail to establish DNA methylation, yielding two mutants termed involved in de novo (idn). IDN1 encodes DMS3, an SMC-related protein, and IDN2 encodes a novel double-stranded RNA-binding protein with homology to SGS3. IDN1 and IDN2 control de novo methylation and siRNA-mediated maintenance methylation and are components of the RNA-directed DNA methylation pathway.
PMCID: PMC2842998  PMID: 19915591
21.  Supersplat—spliced RNA-seq alignment 
Bioinformatics  2010;26(12):1500-1505.
Motivation: High-throughput sequencing technologies have recently made deep interrogation of expressed transcript sequences practical, both economically and temporally. Identification of intron/exon boundaries is an essential part of genome annotation, yet remains a challenge. Here, we present supersplat, a method for unbiased splice-junction discovery through empirical RNA-seq data.
Results: Using a genomic reference and RNA-seq high-throughput sequencing datasets, supersplat empirically identifies potential splice junctions at a rate of ∼11.4 million reads per hour. We further benchmark the performance of the algorithm by mapping Illumina RNA-seq reads to identify introns in the genome of the reference dicot plant Arabidopsis thaliana and we demonstrate the utility of supersplat for de novo empirical annotation of splice junctions using the reference monocot plant Brachypodium distachyon.
Availability: Implemented in C++, supersplat source code and binaries are freely available on the web at
PMCID: PMC2881391  PMID: 20410051
22.  Genome scale transcriptome analysis of shoot organogenesis in Populus 
BMC Plant Biology  2009;9:132.
Our aim is to improve knowledge of gene regulatory circuits important to dedifferentiation, redifferentiation, and adventitious meristem organization during in vitro regeneration of plants. Regeneration of transgenic cells remains a major obstacle to research and commercial deployment of most taxa of transgenic plants, and woody species are particularly recalcitrant. The model woody species Populus, due to its genome sequence and amenability to in vitro manipulation, is an excellent species for study in this area. The genes recognized may help to guide the development of new tools for improving the efficiency of plant regeneration and transformation.
We analyzed gene expression during poplar in vitro dedifferentiation and shoot regeneration using an Affymetrix array representing over 56,000 poplar transcripts. We focused on callus induction and shoot formation, and therefore sampled RNA from tissues at five time points: prior to callus induction, 3 days and 15 days after callus induction, and 3 days and 8 days after the start of shoot induction. We used a female hybrid white poplar clone (INRA 717-1 B4, Populus tremula × P. alba) that is used widely as a model transgenic genotype. Approximately 15% of the monitored genes were significantly up- or down-regulated when controlling the false discovery rate (FDR) at 0.01; over 3,000 genes had a 5-fold or greater change in expression. We found a large initial change in expression after the beginning of hormone treatment (at the earliest stage of callus induction), and then a much smaller number of additional differentially expressed genes at subsequent regeneration stages. A total of 588 transcription factors, distributed across 45 gene families, were differentially regulated. Genes that showed strong differential expression included components of auxin and cytokinin signaling, selected cell division genes, and genes related to plastid development and photosynthesis. When compared with data on in vitro callogenesis in Arabidopsis, 25% (1,260) of up-regulated and 22% (748) of down-regulated genes were in common with the genes regulated in poplar during callus induction.
The major regulatory events during plant cell organogenesis occur at early stages of dedifferentiation. The regulatory circuits reflect the combinational effects of transcriptional control and hormone signaling, and associated changes in light environment imposed during dedifferentiation.
PMCID: PMC2784466  PMID: 19919717
23.  QSRA – a quality-value guided de novo short read assembler 
BMC Bioinformatics  2009;10:69.
New rapid high-throughput sequencing technologies have sparked the creation of a new class of assembler. Since all high-throughput sequencing platforms incorporate errors in their output, short-read assemblers must be designed to account for this error while utilizing all available data.
We have designed and implemented an assembler, the Quality-value guided Short Read Assembler (QSRA), created to take advantage of quality-value scores as a further method of dealing with error. Compared to previously published algorithms, our assembler shows significant improvements not only in speed but also in output quality.
QSRA generally produced the highest genomic coverage, while being faster than VCAKE. QSRA is extremely competitive in its longest contig and N50/N80 contig lengths, producing results of similar quality to those of EDENA and VELVET. QSRA provides a step closer to the goal of de novo assembly of complex genomes, improving upon the original VCAKE algorithm by not only drastically reducing runtimes but also increasing the viability of the assembly algorithm through further error handling capabilities.
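For readers unfamiliar with the N50/N80 statistics mentioned above, the following sketch shows how they are conventionally computed from a list of contig lengths: Nx is the length of the shortest contig in the smallest set of longest contigs that together cover at least x% of the total assembled bases. The function name and the example lengths are hypothetical, and this is not code from QSRA or any of the assemblers compared.

```python
def nx_length(contig_lengths, percent=50):
    """Return the Nx contig length (N50 when percent=50, N80 when percent=80).

    Contigs are visited from longest to shortest; the result is the length
    of the contig at which the running total first reaches `percent` of the
    summed contig lengths. Integer arithmetic avoids rounding surprises.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range (0, 100]")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 100 >= percent * total:
            return length


# Hypothetical contig lengths in base pairs (not data from the paper).
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
n50 = nx_length(example_lengths, 50)
n80 = nx_length(example_lengths, 80)
```

Because N80 requires covering more of the assembly, it is never larger than N50 for the same set of contigs.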
PMCID: PMC2653489  PMID: 19239711
24.  Conserved Daily Transcriptional Programs in Carica papaya 
Tropical Plant Biology  2008;1(3-4):236-245.
Most organisms have internal circadian clocks that mediate responses to daily environmental changes in order to synchronize biological functions to the correct times of the day. Previous studies have focused on plants found in temperate and sub-tropical climates, and little is known about the circadian transcriptional networks of plants that typically grow under conditions with relatively constant day lengths and temperatures over the year. In this study we conducted a genomic and computational analysis of the circadian biology of Carica papaya, a tropical tree. We found that predicted papaya circadian clock genes cycle with the same phase as Arabidopsis genes. The patterns of time-of-day overrepresentation of circadian-associated promoter elements were nearly identical across papaya, Arabidopsis, rice, and poplar. Evolution of promoter structure predicts the observed morning- and evening-specific expression profiles of the papaya PRR5 paralogs. The strong conservation of previously identified circadian transcriptional networks in papaya, despite its tropical habitat and distinct life-style, suggests that circadian timing has played a major role in the evolution of plant genomes, consistent with the selective pressure of anticipating daily environmental changes. Further studies could exploit this conservation to elucidate general design principles that will facilitate engineering plant growth pathways for specific environments.
Electronic supplementary material
The online version of this article (doi:10.1007/s12042-008-9020-3) contains supplementary material, which is available to authorized users.
PMCID: PMC2890329  PMID: 20671772
Carica papaya; Circadian clock; Cis-acting element; Diurnal
25.  A Morning-Specific Phytohormone Gene Expression Program underlying Rhythmic Plant Growth 
PLoS Biology  2008;6(9):e225.
Most organisms use daily light/dark cycles as timing cues to control many essential physiological processes. In plants, growth rates of the embryonic stem (hypocotyl) are maximal at different times of day, depending on external photoperiod and the internal circadian clock. However, the interactions between light signaling, the circadian clock, and growth-promoting hormone pathways in growth control remain poorly understood. At the molecular level, such growth rhythms could be attributed to several different layers of time-specific control such as phasing of transcription, signaling, or protein abundance. To determine the transcriptional component associated with the rhythmic control of growth, we applied temporal analysis of the Arabidopsis thaliana seedling transcriptome under multiple growth conditions and mutant backgrounds using DNA microarrays. We show that a group of plant hormone-associated genes are coexpressed at the time of day when hypocotyl growth rate is maximal. This expression correlates with overrepresentation of a cis-acting element (CACATG) in phytohormone gene promoters, which is sufficient to confer the predicted diurnal and circadian expression patterns in vivo. Using circadian clock and light signaling mutants, we show that both internal coincidence of phytohormone signaling capacity and external coincidence with darkness are required to coordinate wild-type growth. From these data, we argue that the circadian clock indirectly controls growth by permissive gating of light-mediated phytohormone transcript levels to the proper time of day. This temporal integration of hormone pathways allows plants to fine tune phytohormone responses for seasonal and shade-appropriate growth regulation.
Author Summary
In plants, stems elongate faster at dawn. This time-of-day–specific growth is controlled by integration of environmental cues and the circadian clock. The specific effectors of growth in plants are the phytohormones: auxin, ethylene, gibberellins, abscisic acid, brassinosteroids, and cytokinins. Each phytohormone plays an independent as well as an overlapping role in growth, and understanding the interactions of the phytohormones has dominated plant research over the past century. The authors present a model in which the circadian clock coordinates growth by synchronizing phytohormone gene expression at dawn, allowing a plant to control growth in a condition-specific manner. Furthermore, the results presented provide a new framework for future experiments aimed at understanding the integration and crosstalk of the phytohormones.
Why do plants grow faster at dawn? New results suggest that light and the circadian clock coordinate growth by synchronizing the expression of plant hormone genes at dawn.
PMCID: PMC2535664  PMID: 18798691
