Data Brief. 2017 October; 14: 260–266.
Published online 2017 July 20. doi:  10.1016/j.dib.2017.07.043
PMCID: PMC5540711

RNA-seq data of Oryza sativa cultivar Kuku Belang under PEG treatment

Abstract

Drought stress is the main abiotic factor affecting rice production. Rain-fed upland rice, which is grown on unbounded fields and depends entirely on rainfall for moisture, is more prone to drought stress than rice from other ecosystems. However, upland rice has adapted to this limited water availability and is therefore more drought tolerant than rice from other ecosystems. We performed the first transcriptome sequencing of the drought-tolerant indica upland rice cultivar Kuku Belang to identify differentially expressed genes related to the drought tolerance mechanism. Raw reads for non-treated and PEG-treated Oryza sativa subspecies indica cv. Kuku Belang were deposited in the NCBI SRA database under accession number SRP074520 (https://www.ncbi.nlm.nih.gov/sra?term=SRP074520).

Specification Table

Value of data

  • Upland rice, which is better adapted to drought conditions, is more drought tolerant than lowland, irrigated or deep-water rice.
  • Identifying the genes responsible for the drought-tolerant traits of upland rice is therefore important for improving rice production under unfavorable conditions such as drought, which is worsening due to global climate change and diminishing water resources.
  • Sequencing of the drought-tolerant upland indica rice cv. Kuku Belang and RNA-seq analysis of its transcriptome help to identify differentially expressed genes related to the drought tolerance mechanism, and thus to unravel the mechanism of drought tolerance in upland rice at the molecular level.

1. Data

Transcriptome data of Oryza sativa subspecies indica cv. Kuku Belang were generated from polyA-enriched cDNA libraries prepared from total RNA extracted from two-week-old seedlings treated with PEG (treated sample) or distilled water (non-treated sample). Short reads were filtered, processed, assembled and analysed as described in the next section. Raw data for this project were deposited in the NCBI SRA database under accession number SRP074520 (https://www.ncbi.nlm.nih.gov/sra?term=SRP074520).

2. Experimental design, materials and methods

2.1. Plant materials and sample preparation

Seeds of O. sativa indica cv. Kuku Belang obtained from the Malaysian Agricultural Research and Development Institute (MARDI), Seberang Prai, were sterilised, germinated, and sown in a glasshouse (2°55′14.5′′N 101°47′01.4′′E) at 26/22 °C (day/night), 75/70% humidity, a day length of 12 h, and a light intensity of 700 µmol m−2 s−1. To mimic drought stress, two-week-old seedlings were treated with PEG by immersing their roots in 20% PEG-6000 solution for 6, 12, 18, 48, 72, and 96 h, whereas for non-treated samples the roots were immersed in distilled water. Samples were collected at the designated time points and frozen in liquid nitrogen before being stored at −80 °C.

2.2. Total RNA extraction and quality control, library preparation and RNA-seq

Equal masses of total RNA extracted from rice seedlings treated with 20% PEG-6000 for 6, 12, 18, 48, 72 and 96 h were combined into one sample (treated sample). Similarly, equal masses of total RNA extracted from rice seedlings treated with distilled water for 6, 12, 18, 48, 72 and 96 h were combined into one sample (non-treated sample). Total RNA was extracted using TRIzol reagent as described by the manufacturer (Life Technologies). Total RNA purity was confirmed using a Nanodrop 1000 (Thermo Fisher Scientific Inc., USA), whereas total RNA integrity was confirmed using 1% agarose gel electrophoresis. DNA contamination was removed using an RNase-free DNase kit as described by the manufacturer (Thermo Scientific). Both treated and non-treated samples were sent for sequencing at the Malaysian Genome Institute (MGI).
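If the pooling step is read as combining the same mass of RNA from every time point, the volume to draw from each extract is the target mass divided by that extract's concentration. The sketch below illustrates this arithmetic only. The concentrations, the 1000 ng target and the function name `pooling_volumes` are hypothetical and are not taken from the data set.

```python
def pooling_volumes(concentrations_ng_per_ul, mass_per_sample_ng):
    """Return the volume (in µL) to draw from each extract so that every
    time point contributes the same RNA mass to the pooled sample."""
    volumes = {}
    for label, concentration in concentrations_ng_per_ul.items():
        if concentration <= 0:
            raise ValueError(f"non-positive concentration for {label}")
        # volume (µL) = mass (ng) / concentration (ng/µL)
        volumes[label] = mass_per_sample_ng / concentration
    return volumes


# Hypothetical spectrophotometer readings (ng/µL) for the six time points.
treated_concentrations = {
    "6h": 500.0,
    "12h": 400.0,
    "18h": 250.0,
    "48h": 800.0,
    "72h": 1000.0,
    "96h": 625.0,
}

# Assumed target: 1000 ng of total RNA from each time point.
volumes = pooling_volumes(treated_concentrations, mass_per_sample_ng=1000.0)
total_volume = sum(volumes.values())

for label, volume in volumes.items():
    print(f"{label}: {volume:.2f} µL")
print(f"pooled volume: {total_volume:.2f} µL")
```

The same calculation would apply to the six non-treated extracts, giving one pooled sample per condition.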

PolyA-enriched cDNA libraries were prepared using the TruSeq Stranded Total RNA Sample Preparation with Ribo-Zero Plant kit as described by the manufacturer (Illumina). The PEG-treated sample was indexed using TruSeq Adapter Index 14, whereas the non-treated sample was indexed using TruSeq Adapter Index 7. The quality of the prepared cDNA libraries was analysed using an Agilent Technologies 2100 Bioanalyzer (Agilent Technologies, USA). Clustering was performed using cBot (version 1.4) and the TruSeq PE Cluster v3 kit (Illumina). Paired-end sequencing of 101 bp was then performed using an Illumina HiSeq™ 2500 and the TruSeq SBS v3 kit (Illumina).

2.3. Assembly and RNA-seq analysis

Reads with a Phred score ≥ 30 from the PEG-treated and non-treated samples were kept as high-quality reads for assembly. Genome-guided assembly followed the Tuxedo protocol [1]: the high-quality reads of each sample were mapped independently to the reference genome, O. sativa subspecies indica ASM465v1.15, using TopHat (v2.0.4) [2]. The alignment files of both samples were then fed independently to Cufflinks (v2.0.1) [3]. Next, the assembled transcripts from both samples were merged using Cuffmerge [4] to produce the final transcriptome assembly. Cuffmerge [4] was also used to merge the final transcriptome assembly with the reference genome annotation. Cuffdiff was used to quantify transcript abundance (FPKM) in both samples and to identify differentially expressed genes based on gene expression level and a statistical significance test. Genes with log2 fold change ≥ 2, p-value ≤ 0.001 and q-value ≤ 0.05 were considered differentially expressed. Expression plots such as a scatter plot (Fig. 1) and a density plot (Fig. 2) were generated using CummeRbund (v2.0.0) [5]. A heat map was generated using Cluster 3.0 [6] and Treeview (v1.1.6r4) [7] (Fig. 3). Table 1 shows the sequencing and RNA-seq statistics. Lists of differentially expressed genes are provided as Supplementary material.
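The thresholds above can be applied to a Cuffdiff gene-level result table, as in the minimal sketch below. It is an illustration, not the authors' script. It assumes the tab-separated column layout of Cuffdiff's `gene_exp.diff` output (`status`, `log2(fold_change)`, `p_value`, `q_value`). The gene IDs and values in the toy table are hypothetical. Treating the fold-change cut-off as two-sided, so that down-regulated genes are also retained, is an assumption controlled by the `two_sided` parameter.

```python
import csv
import io
import math


def is_differentially_expressed(row, min_log2_fc=2.0, max_p=0.001,
                                max_q=0.05, two_sided=True):
    """Apply the fold-change, p-value and q-value cut-offs to one row."""
    if row["status"] != "OK":  # skip genes Cuffdiff could not test
        return False
    log2_fc = float(row["log2(fold_change)"])  # also parses "inf" / "-inf"
    if math.isnan(log2_fc):
        return False
    # Assumption: two_sided=True keeps down-regulated genes as well.
    magnitude = abs(log2_fc) if two_sided else log2_fc
    return (magnitude >= min_log2_fc
            and float(row["p_value"]) <= max_p
            and float(row["q_value"]) <= max_q)


def filter_cuffdiff(handle, **thresholds):
    """Return the gene IDs that pass the cut-offs, in file order."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [row["gene_id"] for row in reader
            if is_differentially_expressed(row, **thresholds)]


# Toy table with hypothetical genes, in the gene_exp.diff column layout.
HEADER = ["test_id", "gene_id", "gene", "locus", "sample_1", "sample_2",
          "status", "value_1", "value_2", "log2(fold_change)", "test_stat",
          "p_value", "q_value", "significant"]
ROWS = [
    ["A", "GENE_A", "-", "1:1-100", "control", "PEG", "OK",
     "5", "40", "3.0", "4.1", "0.0001", "0.01", "yes"],
    ["B", "GENE_B", "-", "1:200-300", "control", "PEG", "OK",
     "10", "12", "0.263", "0.5", "0.4", "0.6", "no"],
    ["C", "GENE_C", "-", "2:1-100", "control", "PEG", "OK",
     "64", "8", "-3.0", "-3.9", "0.0005", "0.02", "yes"],
    ["D", "GENE_D", "-", "2:200-300", "control", "PEG", "NOTEST",
     "0", "0", "0", "0", "1", "1", "no"],
    ["E", "GENE_E", "-", "3:1-100", "control", "PEG", "OK",
     "4", "20", "2.32", "2.8", "0.004", "0.03", "yes"],
]
EXAMPLE = "\n".join("\t".join(fields) for fields in [HEADER] + ROWS)

selected = filter_cuffdiff(io.StringIO(EXAMPLE))
up_only = filter_cuffdiff(io.StringIO(EXAMPLE), two_sided=False)
print(selected)
print(up_only)
```

In this toy table GENE_E is excluded even though Cuffdiff marks it significant, because its p-value exceeds the stricter 0.001 cut-off stated above.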

Fig. 1
Scatter plot created with CummeRbund from gene expression data (FPKM values) of PEG-treated and non-treated samples, showing the distribution of genes with similar expression values, which concentrate near the diagonal dotted line, and outliers ...
Fig. 2
Density plot showing the distribution of RNA-seq read counts (FPKM) of PEG-treated (orange area) and non-treated (blue area) samples created using CummeRbund. Most genes in PEG-treated and non-treated samples have a similar distribution of RNA-seq read counts ...
Fig. 3
Heat map of differentially expressed genes in non-treated and PEG-treated samples created using Cluster and Treeview. Gene expression values used to create the heat map are the log2 FPKM of the differentially expressed genes in non-treated and treated ...
Table 1
Sequencing and RNA-seq statistics of O. sativa indica cv. Kuku Belang.

Acknowledgements

This research is funded by grant PJ008574 from the National Academy of Agricultural Science, RDA, Suwon, Republic of Korea.

Footnotes

Transparency document. The Transparency document associated with this article can be found in the online version at doi:10.1016/j.dib.2017.07.043.

Appendix A. Supplementary data associated with this article can be found in the online version at doi:10.1016/j.dib.2017.07.043.

References

1. Trapnell C., Roberts A., Goff L., Pertea G., Kim D., Kelley D.R., Pimentel H., Salzberg S.L., Rinn J.L., Pachter L. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat. Protoc. 2012;7:562–578.
2. Trapnell C., Pachter L., Salzberg S.L. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. 2009;25:1105–1111.
3. Trapnell C., Williams B.A., Pertea G., Mortazavi A., Kwan G., van Baren M.J., Salzberg S.L., Wold B.J., Pachter L. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotechnol. 2010;28:511–515.
4. Trapnell C., Williams B.A., Pertea G., Mortazavi A., Kwan G., van Baren M.J., Salzberg S.L., Wold B.J., Pachter L. Transcript assembly and abundance estimation from RNA-Seq reveals thousands of new transcripts and switching among isoforms. Nat. Biotechnol. 2011;28:511–515.
5. Goff L., Trapnell C., Kelley D. cummeRbund: analysis, exploration, manipulation, and visualization of Cufflinks high-throughput sequencing data. R Packag Version. 2013
6. De Hoon M.J.L., Imoto S., Nolan J., Miyano S. Open source clustering software. Bioinformatics. 2004;20:1453–1454. <http://dx.doi.org/10.1093/bioinformatics/bth078>
7. Page R.D.M. TREEVIEW: an application to display phylogenetic trees on personal computers. Comput. Appl. Biosci. 1996;12:357–358.

Articles from Data in Brief are provided here courtesy of Elsevier